![Page 1: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/1.jpg)
Network Systems Design (CS490N)
Douglas Comer
Computer Science Department
Purdue University
West Lafayette, IN 47907
http://www.cs.purdue.edu/people/comer
Copyright 2003. All rights reserved. This document may not be reproduced by any means without the express written consent of the author.
![Page 2: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/2.jpg)
I
Course Introduction And Overview
CS490N -- Chapt. 1 1 2003
![Page 3: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/3.jpg)
Topic And Scope
The concepts, principles, and technologies that underlie the design of hardware and software systems used in computer networks and the Internet, focusing on the emerging field of network processors.
![Page 4: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/4.jpg)
You Will Learn
• Review of
– Network systems
– Protocols and protocol processing tasks
• Hardware architectures for protocol processing
• Software-based network systems and software architectures
• Classification
– Concept
– Software implementation
– Special languages
• Switching fabrics
![Page 5: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/5.jpg)
You Will Learn (continued)
• Network processors: definition, architectures, and use
• Design tradeoffs and consequences
• Survey of commercial network processors
• Details of one example network processor
– Architecture and instruction set(s)
– Programming model and program optimization
– Cross-development environment
![Page 6: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/6.jpg)
What You Will NOT Learn
• EE details
– VLSI technology and design rules
– Chip interfaces: ICs and pin-outs
– Waveforms, timing, or voltage
– How to wire wrap or solder
• Economic details
– Comprehensive list of vendors and commercial products
– Price points
![Page 7: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/7.jpg)
Background Required
• Basic knowledge of
– Network and Internet protocols
– Packet headers
• Basic understanding of hardware architecture
– Registers
– Memory organization
– Typical instruction set
• Willingness to use an assembly language
![Page 8: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/8.jpg)
Schedule Of Topics
• Quick review of basic networking
• Protocol processing tasks and classification
• Software-based systems using conventional hardware
• Special-purpose hardware for high speed
• Motivation and role of network processors
• Network processor architectures
![Page 9: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/9.jpg)
Schedule Of Topics (continued)
• An example network processor technology in detail
– Hardware architecture and parallelism
– Programming model
– Testbed architecture and features
• Design tradeoffs
• Scaling a network processor
• Classification languages and programs
![Page 10: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/10.jpg)
Course Administration
• Textbook
– D. Comer, Network Systems Design Using Network Processors, Prentice Hall, 2003.
• Grade
– Quizzes 5%
– Midterm and final exam 35%
– Programming projects 60%
![Page 11: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/11.jpg)
Lab Facilities Available
• Extensive network processor testbed facilities (more than any other university)
• Donations from
– Agere Systems
– IBM
– Intel
• Includes hardware and cross-development software
![Page 12: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/12.jpg)
What You Will Do In The Lab
• Write and compile software for a network processor (NP)
• Download software into an NP
• Monitor the NP as it runs
• Interconnect Ethernet ports on an NP board
– To other ports on other NP boards
– To other computers in the lab
• Send Ethernet traffic to the NP
• Receive Ethernet traffic from the NP
![Page 13: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/13.jpg)
Programming Projects
• A packet analyzer
– IP datagrams
– TCP segments
• An Ethernet bridge
• An IP fragmenter
• A classification program
• A bump-in-the-wire system using low-level packet processors
![Page 14: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/14.jpg)
Questions?
![Page 15: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/15.jpg)
A QUICK OVERVIEW
OF NETWORK PROCESSORS
![Page 16: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/16.jpg)
The Network Systems Problem
• Data rates keep increasing
• Protocols and applications keep evolving
• System design is expensive
• System implementation and testing take too long
• Systems often contain errors
• Special-purpose hardware designed for one system cannot be reused
![Page 17: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/17.jpg)
The Challenge
Find ways to improve the design and manufacture of complex networking systems.
![Page 18: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/18.jpg)
The Big Questions
• What systems?
– Everything we have now
– New devices not yet designed
• What physical communication mechanisms?
– Everything we have now
– New communication systems not yet designed / standardized
• What speeds?
– Everything we have now
– New speeds much faster than those in use
![Page 19: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/19.jpg)
More Big Questions
• What protocols?
– Everything we have now
– New protocols not yet designed / standardized
• What applications?
– Everything we have now
– New applications not yet designed / standardized
![Page 20: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/20.jpg)
The Challenge (restated)
Find flexible, general technologies that enable rapid, low-cost design and manufacture of a variety of scalable, robust, efficient network systems that run a variety of existing and new protocols, perform a variety of existing and new functions for a variety of existing and new, higher-speed networks to support a variety of existing and new applications.
![Page 21: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/21.jpg)
Special Difficulties
• Ambitious goal
• Vague problem statement
• Problem is evolving with the solution
• Pressure from
– Changing infrastructure
– Changing applications
![Page 22: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/22.jpg)
Desiderata
• High speed
• Flexible and extensible to accommodate
– Arbitrary protocols
– Arbitrary applications
– Arbitrary physical layer
• Low cost
![Page 24: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/24.jpg)
Statement Of Hope (1995 version)
If there is hope, it lies in ASIC designers.
![Page 25: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/25.jpg)
Statement Of Hope (1999 version)
If there is hope, it lies in ASIC designers.
???
![Page 26: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/26.jpg)
Statement Of Hope (2003 version)
If there is hope, it lies in ASIC designers.
programmers!
![Page 27: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/27.jpg)
Programmability
• Key to low-cost hardware for next generation network systems
• More flexibility than ASIC designs
• Easier / faster to update than ASIC designs
• Less expensive to develop than ASIC designs
• What we need: a programmable device with more capability than a conventional CPU
![Page 28: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/28.jpg)
The Idea In A Nutshell
• Devise new hardware building blocks
• Make them programmable
• Include support for protocol processing and I/O
– General-purpose processor(s) for control tasks
– Special-purpose processor(s) for packet processing and table lookup
• Include functional units for tasks such as checksum computation
• Integrate as much as possible onto one chip
• Call the result a network processor
![Page 29: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/29.jpg)
The Rest Of The Course
• We will
– Examine general problem being solved
– Survey some approaches vendors have taken
– Explore possible architectures
– Study example technologies
– Consider how to implement systems using network processors
![Page 30: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/30.jpg)
Disclaimer #1
In the field of network processors, I am a tyro.
![Page 31: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/31.jpg)
Definition
Tyro \Ty’ro\, n.; pl. Tyros. A beginner in learning; one who is in the rudiments of any branch of study; a person imperfectly acquainted with a subject; a novice.
![Page 32: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/32.jpg)
By Definition
In the field of network processors, you are all tyros.
![Page 33: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/33.jpg)
In Our Defense
When it comes to network processors, everyone is a tyro.
![Page 34: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/34.jpg)
Questions?
![Page 35: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/35.jpg)
II
Basic Terminology And Example Systems (A Quick Review)
![Page 36: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/36.jpg)
Packets Cells And Frames
• Packet
– Generic term
– Small unit of data being transferred
– Travels independently
– Upper and lower bounds on size
![Page 37: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/37.jpg)
Packets Cells And Frames (continued)
• Cell
– Fixed-size packet (e.g., ATM)
• Frame or layer-2 packet
– Packet understood by hardware
• IP datagram
– Internet packet
![Page 38: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/38.jpg)
Types Of Networks
• Paradigm
– Connectionless
– Connection-oriented
• Access type
– Shared (i.e., multiaccess)
– Point-To-Point
![Page 39: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/39.jpg)
Point-To-Point Network
• Connects exactly two systems
• Often used for long distance
• Example: data circuit connecting two routers
![Page 40: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/40.jpg)
Data Circuit
• Leased from phone company
• Also called serial line because data is transmitted bit-serially
• Originally designed to carry digital voice
• Cost depends on speed and distance
• T-series standards define low speeds (e.g., T1)
• STS and OC standards define high speeds
![Page 41: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/41.jpg)
Digital Circuit Speeds
| Standard Name | Bit Rate | Voice Circuits |
|---|---|---|
| – | 0.064 Mbps | 1 |
| T1 | 1.544 Mbps | 24 |
| T3 | 44.736 Mbps | 672 |
| OC-1 | 51.840 Mbps | 810 |
| OC-3 | 155.520 Mbps | 2430 |
| OC-12 | 622.080 Mbps | 9720 |
| OC-24 | 1,244.160 Mbps | 19440 |
| OC-48 | 2,488.320 Mbps | 38880 |
| OC-192 | 9,953.280 Mbps | 155520 |
| OC-768 | 39,813.120 Mbps | 622080 |
• Holy grail of networking: devices capable of accepting and forwarding data at 10 Gbps (OC-192).
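The OC rows of the speed table follow a regular pattern: each OC-n bit rate is n times the OC-1 rate, and the voice-circuit column equals the bit rate divided by the 0.064 Mbps of a single voice circuit. The Python sketch below reproduces that arithmetic in integer kbps. The function names are illustrative and not from the course, and the ratio is only claimed for the OC rows.

```python
OC1_KBPS = 51_840     # OC-1 bit rate from the table, in kbps
VOICE_KBPS = 64       # one voice circuit (0.064 Mbps), in kbps


def oc_rate_kbps(n: int) -> int:
    """Bit rate of OC-n: n times the OC-1 rate."""
    return n * OC1_KBPS


def voice_circuits(rate_kbps: int) -> int:
    """Voice-circuit column as listed in the table: rate / 64 kbps."""
    return rate_kbps // VOICE_KBPS


for n in (1, 3, 12, 24, 48, 192, 768):
    rate = oc_rate_kbps(n)
    print(f"OC-{n:<4}{rate / 1000:>12,.3f} Mbps{voice_circuits(rate):>8}")
```

For example, OC-192 works out to 192 × 51.840 = 9,953.280 Mbps, which is the "10 Gbps" figure quoted above.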
![Page 43: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/43.jpg)
Local Area Networks
• Ethernet technology dominates
• Layer 1 standards
– Media and wiring
– Signaling
– Handled by dedicated interface chips
– Unimportant to us
• Layer 2 standards
– MAC framing and addressing
![Page 44: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/44.jpg)
MAC Addressing
• Three address types
– Unicast (single computer)
– Broadcast (all computers in broadcast domain)
– Multicast (some computers in broadcast domain)
![Page 45: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/45.jpg)
More Terminology
• Internet
– Interconnection of multiple networks
– Allows heterogeneity of underlying networks
• Network scope
– Local Area Network (LAN) covers limited distance
– Wide Area Network (WAN) covers arbitrary distance
![Page 46: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/46.jpg)
Network System
• Individual hardware component
• Serves as fundamental building block
• Used in networks and internets
• May contain processor and software
• Operates at one or more layers of the protocol stack
![Page 47: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/47.jpg)
Example Network Systems
• Layer 2
– Bridge
– Ethernet switch
– VLAN switch
![Page 48: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/48.jpg)
VLAN Switch
• Similar to conventional layer 2 switch
– Connects multiple computers
– Forwards frames among them
– Each computer has unique unicast address
• Differs from conventional layer 2 switch
– Allows manager to configure broadcast domains
• Broadcast domain known as virtual network
![Page 49: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/49.jpg)
Broadcast Domain
• Determines propagation of broadcast / multicast
• Originally corresponded to fixed hardware
– One per cable segment
– One per hub or switch
• Now configurable via VLAN switch
– Manager assigns ports to VLANs
![Page 50: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/50.jpg)
Example Network Systems (continued)
• Layer 3
– Internet host computer
– IP router (layer 3 switch)
• Layer 4
– Basic Network Address Translator (NAT)
– Round-robin Web load balancer
– TCP terminator
![Page 51: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/51.jpg)
Example Network Systems (continued)
• Layer 5
– Firewall
– Intrusion Detection System (IDS)
– Virtual Private Network (VPN)
– Softswitch running SIP
– Application gateway
– TCP splicer (also known as NAPT, Network Address and Protocol Translator)
– Smart Web load balancer
– Set-top box
![Page 52: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/52.jpg)
Example Network Systems (continued)
• Network control systems
– Packet / flow analyzer
– Traffic monitor
– Traffic policer
– Traffic shaper
![Page 53: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/53.jpg)
Questions?
![Page 54: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/54.jpg)
III
Review Of Protocols And Packet Formats
![Page 55: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/55.jpg)
Protocol Layering
Layer 5: Application
Layer 4: Transport
Layer 3: Internet
Layer 2: Network Interface
Layer 1: Physical
• Five-layer Internet reference model
• Multiple protocols can occur at each layer
![Page 56: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/56.jpg)
Layer 2 Protocols
• Two protocols are important
– Ethernet
– ATM
• We will concentrate on Ethernet
![Page 57: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/57.jpg)
Ethernet Addressing
• 48-bit addressing
• Unique address assigned to each station (NIC)
• Destination address in each packet can specify delivery to
– A single computer (unicast)
– All computers in broadcast domain (broadcast)
– Some computers in broadcast domain (multicast)
![Page 58: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/58.jpg)
Ethernet Addressing (continued)
• Broadcast address is all 1s
• Single bit determines whether remaining addresses are unicast or multicast
xxxxxxxm xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
(m = multicast bit)
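The address test described above can be sketched in a few lines of Python. The sketch assumes addresses are written as six colon-separated hexadecimal octets, and takes the multicast bit to be the low-order bit of the first octet, matching the position of m in the diagram. The helper names `parse_mac` and `classify_mac` are illustrative only.

```python
def parse_mac(text: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw octets."""
    octets = bytes(int(part, 16) for part in text.split(":"))
    if len(octets) != 6:
        raise ValueError("an Ethernet address has exactly 6 octets (48 bits)")
    return octets


def classify_mac(addr: bytes) -> str:
    """Return 'broadcast', 'multicast', or 'unicast' for a 48-bit address."""
    if addr == b"\xff" * 6:          # all 1s
        return "broadcast"
    # Low-order bit of the first octet is the multicast bit.
    return "multicast" if addr[0] & 0x01 else "unicast"


print(classify_mac(parse_mac("ff:ff:ff:ff:ff:ff")))
print(classify_mac(parse_mac("01:00:5e:00:00:01")))
print(classify_mac(parse_mac("00:1a:2b:3c:4d:5e")))
```

Broadcast is checked first because the all-1s address also has the multicast bit set.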
![Page 59: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/59.jpg)
Ethernet Frame Processing
| Field | Size (octets) | Part of frame |
|---|---|---|
| Dest. Address | 6 | Header |
| Source Address | 6 | Header |
| Frame Type | 2 | Header |
| Data In Frame | 46 - 1500 | Payload |

• Dedicated physical layer hardware
– Checks and removes preamble and CRC on input
– Computes and appends CRC and preamble on output
• Layer 2 systems use source, destination and (possibly) type fields
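A minimal sketch of how software might split a frame into the fields listed above, assuming the preamble and CRC have already been removed by the interface hardware. It uses Python's standard `struct` module; the function name and the sample frame are made up for illustration.

```python
import struct

# 6-octet destination, 6-octet source, 2-octet frame type (network byte order)
ETH_HEADER = struct.Struct("!6s6sH")


def parse_ethernet(frame: bytes):
    """Return (dest, source, frame_type, payload) for a frame without preamble/CRC."""
    if len(frame) < ETH_HEADER.size:
        raise ValueError("frame shorter than the 14-octet header")
    dest, source, frame_type = ETH_HEADER.unpack_from(frame)
    return dest, source, frame_type, frame[ETH_HEADER.size:]


# Hypothetical frame: broadcast destination, type 0x0800 (IP), 46 octets of zero payload
sample = b"\xff" * 6 + bytes.fromhex("001122334455") + b"\x08\x00" + bytes(46)
dest, source, frame_type, payload = parse_ethernet(sample)
print(dest.hex(":"), source.hex(":"), hex(frame_type), len(payload))
```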
![Page 60: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/60.jpg)
Internet
• Set of (heterogeneous) computer networks interconnected by IP routers
• End-user computers, called hosts, each attach to specific network
• Protocol software
– Runs on both hosts and routers
– Provides illusion of homogeneity
![Page 61: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/61.jpg)
Internet Protocols Of Interest
• Layer 2
– Address Resolution Protocol (ARP)
• Layer 3
– Internet Protocol (IP)
• Layer 4
– User Datagram Protocol (UDP)
– Transmission Control Protocol (TCP)
![Page 62: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/62.jpg)
IP Datagram Format
0 4 8 16 19 24 31
VERS HLEN SERVICE TOTAL LENGTH
ID FLAGS F. OFFSET
TTL TYPE HDR CHECKSUM
SOURCE
DESTINATION
IP OPTIONS (MAY BE OMITTED) PADDING
BEGINNING OF PAYLOAD...
• Format of each packet sent across Internet
• Fixed-size fields make parsing efficient
![Page 63: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/63.jpg)
IP Datagram Fields
| Field | Meaning |
|---|---|
| VERS | Version number of IP being used (4) |
| HLEN | Header length measured in 32-bit units |
| SERVICE | Level of service desired |
| TOTAL LENGTH | Datagram length in octets including header |
| ID | Unique value for this datagram |
| FLAGS | Bits to control fragmentation |
| F. OFFSET | Position of fragment in original datagram |
| TTL | Time to live (hop countdown) |
| TYPE | Contents of payload area |
| HDR CHECKSUM | One’s-complement checksum over header |
| SOURCE | IP address of original sender |
| DESTINATION | IP address of ultimate destination |
| IP OPTIONS | Special handling parameters |
| PADDING | To make options a 32-bit multiple |
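Because the fields are fixed-size, the first 20 octets of a datagram can be unpacked in one step. The sketch below is one possible way to do so in Python and to verify the one's-complement header checksum; the dictionary keys reuse the field names from the table, while the function names and the sample header are assumptions chosen for illustration.

```python
import struct


def ones_complement_sum16(data: bytes) -> int:
    """Add 16-bit big-endian words with end-around carry."""
    if len(data) % 2:                       # pad odd-length input with a zero octet
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                      # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def header_checksum(header: bytes) -> int:
    """Checksum for a header whose HDR CHECKSUM field has been set to zero."""
    return ~ones_complement_sum16(header) & 0xFFFF


def parse_ip_header(datagram: bytes) -> dict:
    (vers_hlen, service, total_length, ident, flags_offset, ttl, ptype,
     checksum, source, destination) = struct.unpack_from("!BBHHHBBH4s4s", datagram)
    hlen = vers_hlen & 0x0F                 # header length in 32-bit units
    return {
        "VERS": vers_hlen >> 4,
        "HLEN": hlen,
        "SERVICE": service,
        "TOTAL LENGTH": total_length,
        "ID": ident,
        "FLAGS": flags_offset >> 13,
        "F. OFFSET": flags_offset & 0x1FFF,
        "TTL": ttl,
        "TYPE": ptype,
        "HDR CHECKSUM": checksum,
        "SOURCE": ".".join(map(str, source)),
        "DESTINATION": ".".join(map(str, destination)),
        # A valid header, checksum included, sums to 0xFFFF.
        "checksum_ok": ones_complement_sum16(datagram[:hlen * 4]) == 0xFFFF,
    }


# Hypothetical 20-octet header with no options (UDP payload, TTL 64)
SAMPLE = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
fields = parse_ip_header(SAMPLE)
print(fields)
```

Note that HLEN must be multiplied by 4 to obtain the header length in octets, which is where the payload (or the options, if present) begins.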
![Page 64: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/64.jpg)
IP Addressing
• 32-bit Internet address assigned to each computer
• Virtual, hardware independent value
• Prefix identifies network; suffix identifies host
• Network systems use address mask to specify boundary between prefix and suffix
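As a small worked example of the mask idea, the following Python sketch splits an address into its network prefix and host suffix with bitwise AND. The address and mask are arbitrary illustrative values, and `split_address` is a hypothetical helper name.

```python
import ipaddress


def split_address(addr: str, mask: str):
    """Return (network prefix in dotted form, host suffix as an integer)."""
    a = int(ipaddress.IPv4Address(addr))
    m = int(ipaddress.IPv4Address(mask))
    prefix = a & m                      # bits covered by the mask
    suffix = a & ~m & 0xFFFFFFFF        # remaining bits identify the host
    return str(ipaddress.IPv4Address(prefix)), suffix


print(split_address("128.10.2.30", "255.255.0.0"))
print(split_address("128.10.2.30", "255.255.255.0"))
```

The same address yields a different prefix and suffix under each mask, which is why the mask must be stored alongside the address.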
![Page 65: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/65.jpg)
Next-Hop Forwarding
• Routing table
– Found in both hosts and routers
– Stores ( destination, mask, next_hop ) tuples
• Route lookup
– Takes destination address as argument
– Finds next hop
– Uses longest-prefix match
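The lookup described above can be sketched as a linear scan over ( destination, mask, next_hop ) tuples that keeps the matching entry with the longest mask. This is only an illustration of the matching rule, with made-up table entries; it makes no claim about the data structures real routers use, which are typically far more efficient than a scan.

```python
import ipaddress


def ip(text: str) -> int:
    """Dotted-decimal address to 32-bit integer."""
    return int(ipaddress.IPv4Address(text))


# Hypothetical routing table: (destination, mask, next_hop)
ROUTES = [
    (ip("0.0.0.0"),    ip("0.0.0.0"),       "10.0.0.1"),   # default route
    (ip("128.10.0.0"), ip("255.255.0.0"),   "10.0.0.2"),
    (ip("128.10.2.0"), ip("255.255.255.0"), "10.0.0.3"),
]


def lookup(table, dest: str):
    """Return the next hop of the longest-prefix match, or None if nothing matches."""
    d = ip(dest)
    best = None
    for destination, mask, next_hop in table:
        if d & mask == destination:
            # For contiguous masks, a numerically larger mask means a longer prefix.
            if best is None or mask > best[0]:
                best = (mask, next_hop)
    return best[1] if best else None


print(lookup(ROUTES, "128.10.2.30"))
print(lookup(ROUTES, "128.10.9.1"))
print(lookup(ROUTES, "8.8.8.8"))
```

The address 128.10.2.30 matches all three entries; the /24 entry wins because its prefix is longest.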
![Page 67: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/67.jpg)
UDP Datagram Format
0 16 31
SOURCE PORT DESTINATION PORT
MESSAGE LENGTH CHECKSUM
BEGINNING OF PAYLOAD...
| Field | Meaning |
|---|---|
| SOURCE PORT | ID of sending application |
| DESTINATION PORT | ID of receiving application |
| MESSAGE LENGTH | Length of datagram including the header |
| CHECKSUM | One’s-complement checksum over entire datagram |
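The four 16-bit fields can be unpacked directly. A minimal Python sketch, with an invented sample datagram and an illustrative function name:

```python
import struct

# SOURCE PORT, DESTINATION PORT, MESSAGE LENGTH, CHECKSUM: four 16-bit fields
UDP_HEADER = struct.Struct("!HHHH")


def parse_udp(datagram: bytes):
    """Return (source_port, dest_port, length, checksum, payload)."""
    source_port, dest_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    # MESSAGE LENGTH counts the 8-octet header plus the payload.
    payload = datagram[UDP_HEADER.size:length]
    return source_port, dest_port, length, checksum, payload


# Hypothetical datagram; the checksum field is left as zero in this sketch
sample = UDP_HEADER.pack(5000, 53, 8 + 5, 0) + b"hello"
print(parse_udp(sample))
```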
![Page 68: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/68.jpg)
TCP Segment Format
0 4 10 16 24 31
SOURCE PORT DESTINATION PORT
SEQUENCE
ACKNOWLEDGEMENT
HLEN NOT USED CODE BITS WINDOW
CHECKSUM URGENT PTR
OPTIONS (MAY BE OMITTED) PADDING
BEGINNING OF PAYLOAD...
• Sent end-to-end
• Fixed-size fields make parsing efficient
![Page 69: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/69.jpg)
TCP Segment Fields
| Field | Meaning |
|---|---|
| SOURCE PORT | ID of sending application |
| DESTINATION PORT | ID of receiving application |
| SEQUENCE | Sequence number for data in payload |
| ACKNOWLEDGEMENT | Acknowledgement of data received |
| HLEN | Header length measured in 32-bit units |
| NOT USED | Currently unassigned |
| CODE BITS | URGENT, ACK, PUSH, RESET, SYN, FIN |
| WINDOW | Receiver’s buffer size for additional data |
| CHECKSUM | One’s-complement checksum over entire segment |
| URGENT PTR | Pointer to urgent data in segment |
| OPTIONS | Special handling |
| PADDING | To make options a 32-bit multiple |
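A sketch of unpacking the fixed 20-octet part of a segment, following the layout shown earlier (4-bit HLEN, 6 unused bits, 6 code bits sharing one 16-bit word). The function name and sample values are illustrative assumptions.

```python
import struct

# ports, sequence, acknowledgement, HLEN/NOT USED/CODE BITS word, window, checksum, urgent ptr
TCP_HEADER = struct.Struct("!HHIIHHHH")

# Code bits from high-order to low-order position within the 6-bit field
CODE_BITS = [("URGENT", 0x20), ("ACK", 0x10), ("PUSH", 0x08),
             ("RESET", 0x04), ("SYN", 0x02), ("FIN", 0x01)]


def parse_tcp_header(segment: bytes) -> dict:
    (source_port, dest_port, sequence, acknowledgement,
     hlen_code, window, checksum, urgent_ptr) = TCP_HEADER.unpack_from(segment)
    hlen = hlen_code >> 12              # top 4 bits: header length in 32-bit units
    code = hlen_code & 0x3F             # low 6 bits: code bits
    return {
        "SOURCE PORT": source_port,
        "DESTINATION PORT": dest_port,
        "SEQUENCE": sequence,
        "ACKNOWLEDGEMENT": acknowledgement,
        "HLEN": hlen,
        "CODE BITS": [name for name, mask in CODE_BITS if code & mask],
        "WINDOW": window,
        "CHECKSUM": checksum,
        "URGENT PTR": urgent_ptr,
        "payload": segment[hlen * 4:],  # options, if any, are skipped via HLEN
    }


# Hypothetical SYN+ACK segment with no options; checksum left as zero in this sketch
SAMPLE = TCP_HEADER.pack(80, 40000, 1000, 2001, (5 << 12) | 0x12, 8192, 0, 0) + b"data"
segment_fields = parse_tcp_header(SAMPLE)
print(segment_fields)
```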
![Page 70: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/70.jpg)
Illustration Of Encapsulation
ETHERNET HDR. ETHERNET PAYLOAD
IP HEADER IP PAYLOAD
UDP HEADER UDP PAYLOAD
• Field in each header specifies type of encapsulated packet
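To make the nesting concrete, the sketch below wraps a payload in UDP, IP, and Ethernet headers and then unwraps it again, using the Ethernet frame type (0x0800 for IP) and the IP TYPE field (17 for UDP) to decide what each payload contains. Addresses and ports are invented, checksums are left as zero, and the frame is not padded to the 46-octet minimum payload, so this illustrates demultiplexing only and is not a generator of valid frames.

```python
import struct

ETHERTYPE_IP = 0x0800   # Ethernet frame type for an IP datagram
IPPROTO_UDP = 17        # IP TYPE value for UDP


def encapsulate(payload: bytes) -> bytes:
    """Wrap payload in UDP, then IP, then Ethernet (checksums left as 0)."""
    udp = struct.pack("!HHHH", 5000, 7, 8 + len(payload), 0) + payload
    ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(udp), 1, 0,
                            64, IPPROTO_UDP, 0,
                            bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    eth_header = struct.pack("!6s6sH", bytes.fromhex("020000000002"),
                             bytes.fromhex("020000000001"), ETHERTYPE_IP)
    return eth_header + ip_header + udp


def decapsulate(frame: bytes):
    """Follow the type field in each header; return (layers seen, innermost payload)."""
    layers = ["ethernet"]
    (frame_type,) = struct.unpack_from("!H", frame, 12)
    if frame_type != ETHERTYPE_IP:
        return layers, frame[14:]
    layers.append("ip")
    ip_start = 14
    header_len = (frame[ip_start] & 0x0F) * 4    # HLEN is in 32-bit units
    if frame[ip_start + 9] != IPPROTO_UDP:       # TYPE field is octet 9 of the header
        return layers, frame[ip_start + header_len:]
    layers.append("udp")
    udp_start = ip_start + header_len
    (length,) = struct.unpack_from("!H", frame, udp_start + 4)
    return layers, frame[udp_start + 8:udp_start + length]


print(decapsulate(encapsulate(b"hello")))
```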
![Page 71: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/71.jpg)
Example ARP Packet Format
0 8 16 24 31
ETHERNET ADDRESS TYPE (1) IP ADDRESS TYPE (0800)
ETH ADDR LEN (6) IP ADDR LEN (4) OPERATION
SENDER’S ETH ADDR (first 4 octets)
SENDER’S ETH ADDR (last 2 octets) SENDER’S IP ADDR (first 2 octets)
SENDER’S IP ADDR (last 2 octets) TARGET’S ETH ADDR (first 2 octets)
TARGET’S ETH ADDR (last 4 octets)
TARGET’S IP ADDR (all 4 octets)
• Format when ARP used with Ethernet and IP
• Each Ethernet address is six octets
• Each IP address is four octets
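The 28-octet layout above maps onto a single `struct` format string. The sketch below builds and parses an ARP request in Python; the addresses are invented, the helper names are illustrative, and it assumes the usual OPERATION code of 1 for a request.

```python
import struct

# addr types (2+2), addr lengths (1+1), operation (2),
# sender Ethernet (6), sender IP (4), target Ethernet (6), target IP (4)
ARP_PACKET = struct.Struct("!HHBBH6s4s6s4s")
FIELD_NAMES = ("eth_addr_type", "ip_addr_type", "eth_addr_len", "ip_addr_len",
               "operation", "sender_eth", "sender_ip", "target_eth", "target_ip")
ARP_REQUEST = 1


def build_arp_request(sender_eth: bytes, sender_ip: bytes, target_ip: bytes) -> bytes:
    """Build a request; the target Ethernet address is unknown, so it is zero-filled."""
    return ARP_PACKET.pack(1, 0x0800, 6, 4, ARP_REQUEST,
                           sender_eth, sender_ip, bytes(6), target_ip)


def parse_arp(packet: bytes) -> dict:
    return dict(zip(FIELD_NAMES, ARP_PACKET.unpack_from(packet)))


request = build_arp_request(bytes.fromhex("001122334455"),
                            bytes([128, 10, 2, 30]), bytes([128, 10, 2, 1]))
arp_fields = parse_arp(request)
print(len(request), arp_fields)
```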
![Page 72: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/72.jpg)
End Of Review
![Page 73: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/73.jpg)
Questions?
![Page 74: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/74.jpg)
IV
Conventional Computer Hardware Architecture
![Page 75: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/75.jpg)
Software-Based Network System
• Uses conventional hardware (e.g., PC)
• Software
– Runs the entire system
– Allocates memory
– Controls I/O devices
– Performs all protocol processing
![Page 76: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/76.jpg)
Why Study Protocol Processing On Conventional Hardware?
• Past
– Employed in early IP routers
– Many algorithms developed / optimized for conventional hardware
• Present
– Used in low-speed network systems
– Easiest to create / modify
– Costs less than special-purpose hardware
![Page 77: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/77.jpg)
Why Study Protocol Processing On Conventional Hardware? (continued)
• Future
– Processors continue to increase in speed
– Some conventional hardware present in all systems
– You will build software-based systems in lab!
![Page 79: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/79.jpg)
Serious Question
• Which is growing faster?
– Processing power
– Network bandwidth
• Note: if network bandwidth growing faster
– Need special-purpose hardware
– Conventional hardware will become irrelevant
![Page 80: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/80.jpg)
Growth Of Technologies
[Figure: two growth curves plotted against years 1990 to 2002 on a logarithmic vertical axis marked 10, 100, 1,000, and 10,000. Processor curve: 486-33, 486-66, Pent.-166, Pent.-400, Pent.-3GHz. Network curve: 10 Mbps Ethernet, 100 Mbps FDDI, 622 Mbps OC-12, 2.4 Gbps OC-48, 10 Gbps OC-192.]
![Page 81: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/81.jpg)
Conventional Computer Hardware
• Four important aspects
– Processor
– Memory
– I/O interfaces
– One or more buses
![Page 82: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/82.jpg)
Illustration Of Conventional Computer Architecture
[Figure: a CPU, a MEMORY, and network interfaces and other I/O devices, all attached to a single bus]
• Bus is central, shared interconnect
• All components contend for use
![Page 83: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/83.jpg)
Bus Organization And Operations
[Figure: the bus drawn as three groups of parallel wires: control lines, address lines, data lines]
• Parallel wires (K+N+C total)
• Used to pass
– An address of K bits
– A data value of N bits (width of the bus)
– Control information of C bits
![Page 84: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/84.jpg)
Bus Width
• Wider bus
– Transfers more data per unit time
– Costs more
– Requires more physical space
• Compromise: to simulate wider bus, use hardware that multiplexes transfers
![Page 85: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/85.jpg)
Bus Paradigm
• Only two basic operations
– Fetch
– Store
• All operations cast as forms of the above
![Page 86: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/86.jpg)
Fetch/Store
• Fundamental paradigm
• Used throughout hardware, including network processors
![Page 87: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/87.jpg)
Fetch Operation
• Place address of a device on address lines
• Issue fetch on control lines
• Wait for device that owns the address to respond
• If successful, extract value (response) from data lines
CS490N -- Chapt. 4 12 2003
![Page 88: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/88.jpg)
Store Operation
• Place address of a device on address lines
• Place value on data lines
• Issue store on control lines
• Wait for device that owns the address to respond
• If unsuccessful, report error
CS490N -- Chapt. 4 13 2003
![Page 89: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/89.jpg)
Example Of Operations Mapped Into Fetch/Store Paradigm
• Imagine disk device attached to a bus
• Assume the hardware can perform three (nontransfer) operations:
– Start disk spinning
– Stop disk
– Determine current status
CS490N -- Chapt. 4 14 2003
![Page 90: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/90.jpg)
Example Of Operations Mapped Into Fetch/Store Paradigm (continued)
• Assign the disk two contiguous bus addresses D and D+1
• Arrange for store of nonzero to address D to start disk spinning
• Arrange for store of zero to address D to stop disk
• Arrange for fetch from address D+1 to return current status
• Note: effect of store to address D+1 can be defined as either
– Appears to work, but has no effect
– Returns an error
CS490N -- Chapt. 4 15 2003
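The disk example can be sketched in software as a toy bus that routes every fetch and store to the device owning the address. This is only an illustration: the class names `Bus` and `Disk`, the base address, and the choice that a fetch from address D returns zero are assumptions, not part of the slides. A store to D+1 is modeled with the first option above (it appears to work but has no effect).

```python
class Disk:
    """Hypothetical disk with two CSRs at bus addresses D and D+1."""

    def __init__(self, base):
        self.base = base          # address D
        self.spinning = False

    def owns(self, addr):
        return addr in (self.base, self.base + 1)

    def store(self, addr, value):
        if addr == self.base:
            # Nonzero starts the disk spinning, zero stops it.
            self.spinning = value != 0
        # A store to D+1 appears to work but has no effect.

    def fetch(self, addr):
        if addr == self.base + 1:
            return 1 if self.spinning else 0   # current status
        return 0                               # assumption: D reads as zero


class Bus:
    """Toy bus: each operation goes to the device that owns the address."""

    def __init__(self):
        self.devices = []

    def attach(self, device):
        self.devices.append(device)

    def _owner(self, addr):
        for device in self.devices:
            if device.owns(addr):
                return device
        # No device responds: the address falls in a hole.
        raise LookupError(f"no device at bus address {addr:#x}")

    def fetch(self, addr):
        return self._owner(addr).fetch(addr)

    def store(self, addr, value):
        self._owner(addr).store(addr, value)
```

With `D = 0x1000`, `bus.store(0x1000, 1)` starts the disk and `bus.fetch(0x1001)` then reports status 1.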
![Page 91: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/91.jpg)
Bus Address Space
• Arbitrary hardware can be attached to bus
• K address lines result in 2^K possible bus addresses
• Address can refer to
– Memory (e.g., RAM or ROM)
– I/O device
• Arbitrary devices can be placed at arbitrary addresses
• Address space can contain "holes"
CS490N -- Chapt. 4 16 2003
![Page 92: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/92.jpg)
Bus Address Terminology
• Device on bus known as memory mapped I/O
• Locations that correspond to nontransfer operations known as Control and Status Registers (CSRs)
CS490N -- Chapt. 4 17 2003
![Page 93: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/93.jpg)
Example Bus Address Space
[Figure: bus address space running from lowest to highest bus address, with regions assigned to disk, NIC, and memory, plus three unassigned holes]
CS490N -- Chapt. 4 18 2003
![Page 94: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/94.jpg)
Network I/O On Conventional Hardware
• Network Interface Card (NIC)
– Attaches between bus and network
– Operates like other I/O devices
– Handles electrical/optical details of network
– Handles electrical details of bus
– Communicates over bus with CPU or other devices
CS490N -- Chapt. 4 19 2003
![Page 95: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/95.jpg)
Making Network I/O Fast
• Key idea: migrate more functionality onto NIC
• Four techniques used with bus
– Onboard address recognition & filtering
– Onboard packet buffering
– Direct Memory Access (DMA)
– Operation and buffer chaining
CS490N -- Chapt. 4 20 2003
![Page 96: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/96.jpg)
Onboard Address Recognition And Filtering
• NIC given set of addresses to accept
– Station’s unicast address
– Network broadcast address
– Zero or more multicast addresses
• When packet arrives, NIC checks destination address
– Accept packet if address on list
– Discard others
CS490N -- Chapt. 4 21 2003
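The filtering rule above amounts to a set-membership test on the destination address. The sketch below assumes Ethernet framing (destination address in the first 6 octets) and uses a hypothetical class name, `AddressFilter`; real NICs perform this check in hardware.

```python
BROADCAST = bytes([0xFF] * 6)   # Ethernet broadcast address


class AddressFilter:
    """Software model of onboard address recognition (illustrative only)."""

    def __init__(self, unicast, multicast=()):
        # The accept list: own unicast address, broadcast, and any multicast.
        self.accept = {bytes(unicast), BROADCAST}
        self.accept.update(bytes(m) for m in multicast)

    def accepts(self, frame):
        # Assumes the destination address occupies the first 6 octets.
        return bytes(frame[:6]) in self.accept
```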
![Page 97: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/97.jpg)
Onboard Packet Buffering
• NIC given high-speed local memory
• Incoming packet placed in NIC’s memory
• Allows computer’s memory/bus to operate slower than network
• Handles small packet bursts
CS490N -- Chapt. 4 22 2003
![Page 98: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/98.jpg)
Direct Memory Access (DMA)
• CPU
– Allocates packet buffer in memory
– Passes buffer address to NIC
– Goes on with other computation
• NIC
– Accepts incoming packet from network
– Copies packet over bus to buffer in memory
– Informs CPU that packet has arrived
CS490N -- Chapt. 4 23 2003
![Page 99: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/99.jpg)
Buffer Chaining
• CPU
– Allocates multiple buffers
– Passes linked list to NIC
• NIC
– Receives next packet
– Divides into one or more buffers
• Advantage: a buffer can be smaller than packet
CS490N -- Chapt. 4 24 2003
![Page 100: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/100.jpg)
Operation Chaining
• CPU
– Allocates multiple buffers
– Builds linked list of operations
– Passes list to NIC
• NIC
– Follows list and performs instructions
– Interrupts CPU after each operation
• Advantage: multiple operations proceed without CPU intervention
CS490N -- Chapt. 4 25 2003
![Page 101: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/101.jpg)
Illustration Of Operation Chaining
[Figure: chain of three operations labeled r, each pointing to its own packet buffer]
• Optimizes movement of data to memory
CS490N -- Chapt. 4 26 2003
![Page 102: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/102.jpg)
Data Flow Diagram
[Figure: data flow between NIC and memory, with arrows labeled "data arrives" and "data leaves"]
• Depicts flow of data through hardware units
• Used throughout the course and text
CS490N -- Chapt. 4 27 2003
![Page 103: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/103.jpg)
Summary
• Software-based network systems run on conventional hardware
– Processor
– Memory
– I/O devices
– Bus
• Network interface cards can be optimized to reduce CPU load
CS490N -- Chapt. 4 28 2003
![Page 104: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/104.jpg)
Questions?
![Page 105: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/105.jpg)
V
Basic Packet Processing: Algorithms And Data Structures
CS490N -- Chapt. 5 1 2003
![Page 106: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/106.jpg)
Copying
• Used when packet moved from one memory location to another
• Expensive
• Must be avoided whenever possible
– Leave packet in buffer
– Pass buffer address among threads/layers
CS490N -- Chapt. 5 2 2003
![Page 107: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/107.jpg)
Buffer Allocation
• Possibilities
– Large, fixed buffers
– Variable-size buffers
– Linked list of fixed-size blocks
CS490N -- Chapt. 5 3 2003
![Page 108: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/108.jpg)
Buffer Addressing
• Buffer address must be resolvable in all contexts
• Easiest implementation: keep buffers in kernel space
CS490N -- Chapt. 5 4 2003
![Page 109: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/109.jpg)
Integer Representation
• Two standards
– Little endian (least-significant byte at lowest address)
– Big endian (most-significant byte at lowest address)
CS490N -- Chapt. 5 5 2003
![Page 110: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/110.jpg)
Illustration Of Big And Little Endian Integers
[Figure: the four bytes of an integer shown in little endian order and in big endian order, each laid out along increasing memory addresses]
CS490N -- Chapt. 5 6 2003
![Page 111: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/111.jpg)
Integer Conversion
• Needed when heterogeneous computers communicate
• Protocols define network byte order
• Computers convert to network byte order
• Typical library functions

Function   Data size   Translation
ntohs      16 bits     Network byte order to host’s byte order
htons      16 bits     Host’s byte order to network byte order
ntohl      32 bits     Network byte order to host’s byte order
htonl      32 bits     Host’s byte order to network byte order
CS490N -- Chapt. 5 7 2003
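As a small illustration of the conversion functions, Python's standard `socket` module exposes `htons`, `ntohs`, `htonl`, and `ntohl`, and the `struct` module's `"!"` prefix packs integers in network (big endian) byte order. The helper names `to_wire32` and `from_wire32` below are hypothetical.

```python
import socket
import struct


def to_wire32(value):
    """Encode a 32-bit integer in network byte order (big endian)."""
    return struct.pack("!I", value)


def from_wire32(data):
    """Decode 4 octets in network byte order into a host integer."""
    return struct.unpack("!I", data)[0]


# Converting to network order and back always recovers the original value,
# whatever the host's byte order happens to be.
assert socket.ntohl(socket.htonl(0x12345678)) == 0x12345678
assert socket.ntohs(socket.htons(0x1234)) == 0x1234
```

On the wire, the most-significant byte comes first, so `to_wire32(0x12345678)` yields the octets 12 34 56 78 on any host.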
![Page 112: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/112.jpg)
Examples Of Algorithms Implemented With Software-Based Systems
• Layer 2
– Ethernet bridge
• Layer 3
– IP forwarding
– IP fragmentation and reassembly
• Layer 4
– TCP connection recognition and splicing
• Other
– Hash table lookup
CS490N -- Chapt. 5 8 2003
![Page 117: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/117.jpg)
Why Study These Algorithms?
• Provide insight on packet processing tasks
• Reinforce concepts
• Help students recall protocol details
• You will need them in lab!
CS490N -- Chapt. 5 9 2003
![Page 118: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/118.jpg)
Ethernet Bridge
• Used between a pair of Ethernets
• Provides transparent connection
• Listens in promiscuous mode
• Forwards frames in both directions
• Uses addresses to filter
CS490N -- Chapt. 5 9 2003
![Page 119: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/119.jpg)
Bridge Filtering
• Uses source address in frames to identify computers on each network
• Uses destination address to decide whether to forward frame
CS490N -- Chapt. 5 10 2003
![Page 120: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/120.jpg)
Bridge Algorithm
Assume: two network interfaces each operating in promiscuous mode.
Create an empty list, L, that will contain pairs of values;
Do forever {
    Acquire the next frame to arrive;
    Set I to the interface over which the frame arrived;
    Extract the source address, S;
    Extract the destination address, D;
    Add the pair ( S, I ) to list L if not already present;
    If the pair ( D, I ) appears in list L {
        Drop the frame;
    } Else {
        Forward the frame over the other interface;
    }
}
CS490N -- Chapt. 5 11 2003
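The bridge algorithm translates almost line for line into code. The sketch below is a minimal model, assuming two interfaces numbered 0 and 1 and Ethernet frames whose first 12 octets hold the destination and source addresses; the class name `Bridge` and the use of a Python set for list L are illustrative choices.

```python
class Bridge:
    """Two-port learning bridge following the algorithm above."""

    def __init__(self):
        self.table = set()   # list L of (address, interface) pairs

    def handle(self, frame, iface):
        """Return the interface to forward on, or None to drop the frame."""
        dst = bytes(frame[0:6])     # destination address, D
        src = bytes(frame[6:12])    # source address, S
        self.table.add((src, iface))
        if (dst, iface) in self.table:
            # Destination is on the network the frame arrived from.
            return None
        return 1 - iface            # the other interface
```

Until the bridge has seen a station's address as a source, frames to that station are forwarded; once it learns both stations are on the same network, it filters traffic between them.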
![Page 121: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/121.jpg)
Implementation Of Table Lookup
• Need high speed (more on this later)
• Software-based systems typically use hashing for table lookup
CS490N -- Chapt. 5 12 2003
![Page 122: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/122.jpg)
Hashing
• Optimizes number of probes
• Works well if table not full
• Practical technique: double hashing
CS490N -- Chapt. 5 13 2003
![Page 123: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/123.jpg)
Hashing Algorithm
Given: a key, a table in memory, and the table size N.
Produce: a slot in the table that corresponds to the key or an empty table slot if the key is not in the table.
Method: double hashing with open addressing.
Choose P1 and P2 to be prime numbers;
Fold the key to produce an integer, K;
Compute table pointer Q equal to ( P1 × K ) modulo N;
Compute increment R equal to ( P2 × K ) modulo N;
While (table slot Q not equal to K and nonempty) {
    Q ← (Q + R) modulo N;
}
At this point, Q either points to an empty table slot or to the slot containing the key.
CS490N -- Chapt. 5 14 2003
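A minimal sketch of the probe loop follows. Two safeguards are added that the slide does not state and should be read as assumptions: an increment of zero is replaced by one so the probe always advances, and the loop is bounded so a full table raises an error instead of looping forever. The specific primes and the function name `probe` are also illustrative. Choosing a prime table size N ensures every nonzero increment visits all slots.

```python
P1, P2 = 7919, 104729   # assumed primes; the slide does not fix values


def probe(table, key):
    """Return the slot holding `key`, or an empty slot if key is absent.

    `table` is a list whose empty slots are None and `key` is the
    already-folded integer K.
    """
    n = len(table)
    q = (P1 * key) % n
    r = (P2 * key) % n or 1       # assumption: never use a zero increment
    for _ in range(n):            # assumption: give up after n probes
        if table[q] is None or table[q] == key:
            return q
        q = (q + r) % n
    raise RuntimeError("table is full and key is not present")
```

Insertion and lookup share the same routine: `table[probe(table, k)] = k` stores a key, and a later `probe(table, k)` finds the same slot.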
![Page 124: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/124.jpg)
Address Lookup
• Computer can compare integer in one operation
• Network address can be longer than integer (e.g., 48 bits)
• Two possibilities
– Use multiple comparisons per probe
– Fold address into integer key
CS490N -- Chapt. 5 15 2003
![Page 125: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/125.jpg)
Folding
• Maps N-bit value into M-bit key, M < N
• Typical technique: exclusive or
• Potential problem: two values map to same key
• Solution: compare full value when key matches
CS490N -- Chapt. 5 16 2003
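One way to fold with exclusive or, shown here as a sketch, is to split a 48-bit address into three 16-bit pieces and XOR them together. The function name `fold48` and the 16-bit key width are assumptions chosen for illustration.

```python
def fold48(addr):
    """Fold a 6-octet (48-bit) address into a 16-bit key with XOR."""
    value = int.from_bytes(addr, "big")
    high = value >> 32
    middle = (value >> 16) & 0xFFFF
    low = value & 0xFFFF
    return high ^ middle ^ low
```

Distinct addresses can collide: `00:01:00:00:00:00` and `00:00:00:00:00:01` both fold to 1, which is why the full address must still be compared when keys match.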
![Page 126: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/126.jpg)
IP Forwarding
• Used in hosts as well as routers
• Conceptual mapping
(next hop, interface) ← f(datagram, routing table)
• Table driven
CS490N -- Chapt. 5 17 2003
![Page 127: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/127.jpg)
IP Routing Table
• One entry per destination
• Entry contains
– 32-bit IP address of destination
– 32-bit address mask
– 32-bit next-hop address
– N-bit interface number
CS490N -- Chapt. 5 18 2003
![Page 128: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/128.jpg)
Example IP Routing Table
Destination Address   Address Mask      Next-Hop Address   Interface Number
192.5.48.0            255.255.255.0     128.210.30.5       2
128.10.0.0            255.255.0.0       128.210.141.12     1
0.0.0.0               0.0.0.0           128.210.30.5       2
• Values stored in binary
• Interface number is for internal use only
• Zero mask produces default route
CS490N -- Chapt. 5 19 2003
![Page 129: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/129.jpg)
IP Forwarding Algorithm
Given: destination address A and routing table R.
Find: a next hop and interface used to route datagrams to A.
For each entry in table R {
    Set MASK to the Address Mask in the entry;
    Set DEST to the Destination Address in the entry;
    If (A & MASK) == DEST {
        Stop; use the next hop and interface in the entry;
    }
}
If this point is reached, declare error: no route exists;
CS490N -- Chapt. 5 20 2003
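The forwarding loop can be tried against the example routing table. In this sketch the helper `ip` and the function `forward` are hypothetical names; addresses are held as 32-bit integers, matching the note that values are stored in binary. Because the loop stops at the first match, the table is assumed to be ordered from longest mask to shortest so that the default route is tried last.

```python
import ipaddress


def ip(text):
    """Convert dotted-decimal text to a 32-bit integer."""
    return int(ipaddress.IPv4Address(text))


# (destination, mask, next hop, interface), longest mask first.
TABLE = [
    (ip("192.5.48.0"), ip("255.255.255.0"), ip("128.210.30.5"), 2),
    (ip("128.10.0.0"), ip("255.255.0.0"), ip("128.210.141.12"), 1),
    (ip("0.0.0.0"), ip("0.0.0.0"), ip("128.210.30.5"), 2),
]


def forward(addr, table=TABLE):
    """Return (next hop, interface) for destination address `addr`."""
    for dest, mask, next_hop, iface in table:
        if addr & mask == dest:
            return next_hop, iface
    raise LookupError("no route exists")
```

A destination such as 128.10.2.3 matches the second entry, while an address that matches neither specific entry falls through to the zero-mask default route.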
![Page 130: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/130.jpg)
IP Fragmentation
• Needed when datagram larger than network MTU
• Divides IP datagram into fragments
• Uses FLAGS bits in datagram header

FLAGS bits: 0 D M
M: 0 = last fragment; 1 = more fragments
D: 0 = may fragment; 1 = do not fragment
0: Reserved (must be zero)
CS490N -- Chapt. 5 21 2003
![Page 131: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/131.jpg)
IP Fragmentation Algorithm (Part 1: Initialization)
Given: an IP datagram, D, and a network MTU.
Produce: a set of fragments for D.
If the DO NOT FRAGMENT bit is set {
    Stop and report an error;
}
Compute the size of the datagram header, H;
Choose N to be the largest multiple of 8 such that H+N ≤ MTU;
Initialize an offset counter, O, to zero;
CS490N -- Chapt. 5 22 2003
![Page 132: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/132.jpg)
IP Fragmentation Algorithm (Part 2: Processing)
Repeat until datagram empty {
    Create a new fragment that has a copy of D’s header;
    Extract up to the next N octets of data from D and place the data in the fragment;
    Set the MORE FRAGMENTS bit in fragment header, except in the final fragment;
    Set TOTAL LENGTH field in fragment header to H plus the number of data octets in the fragment;
    Set FRAGMENT OFFSET field in fragment header to O;
    Compute and set the CHECKSUM field in fragment header;
    Increment O by N/8;
}
CS490N -- Chapt. 5 23 2003
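The arithmetic of fragmentation (choosing N as a multiple of 8 and counting offsets in 8-octet units) can be sketched without building real headers. The function below is a simplified model under stated assumptions: it works on the payload only, returns tuples of (fragment offset, more-fragments flag, data) instead of complete datagrams, omits the checksum, and checks the DO NOT FRAGMENT bit only when the datagram does not already fit in the MTU. The name `fragment` and its parameters are hypothetical.

```python
def fragment(payload, header_len, mtu, dont_fragment=False):
    """Split `payload` into (offset, more_fragments, data) tuples."""
    if header_len + len(payload) <= mtu:
        return [(0, False, payload)]          # fits; nothing to do
    if dont_fragment:
        raise ValueError("DO NOT FRAGMENT is set and datagram exceeds MTU")
    n = (mtu - header_len) // 8 * 8           # largest multiple of 8 that fits
    if n <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    for start in range(0, len(payload), n):
        data = payload[start:start + n]
        more = start + n < len(payload)       # clear only on final fragment
        fragments.append((start // 8, more, data))
    return fragments
```

For a 100-octet payload, a 20-octet header, and an MTU of 60, N is 40, giving fragments at offsets 0, 5, and 10 that carry 40, 40, and 20 octets.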
![Page 133: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/133.jpg)
Reassembly
• Complement of fragmentation
• Uses IP SOURCE ADDRESS and IDENTIFICATION fields in datagram header to group related fragments
• Joins fragments to form original datagram
CS490N -- Chapt. 5 24 2003
![Page 134: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/134.jpg)
Reassembly Algorithm
Given: a fragment, F, add to a partial reassembly.
Method: maintain a set of fragments for each datagram.
Extract the IP source address, S, and ID fields from F;
Combine S and ID to produce a lookup key, K;
Find the fragment set with key K or create a new set;
Insert F into the set;
If the set contains all the data for the datagram {
Form a completely reassembled datagram and process it;
}
CS490N -- Chapt. 5 25 2003
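A minimal sketch of the reassembly algorithm keeps one fragment set per (source address, identification) key and checks for completeness after each insertion. It assumes fragments do not overlap and ignores timeouts for abandoned sets, both of which a real implementation must handle. The class name `Reassembler` and the tuple-style arguments are illustrative.

```python
class Reassembler:
    """Collect fragments by (source, ident) and join them when complete."""

    def __init__(self):
        self.sets = {}   # key K -> {"pieces": {byte offset: data}, "total": n}

    def add(self, src, ident, offset, more, data):
        """Insert one fragment; return the full payload once complete."""
        key = (src, ident)
        entry = self.sets.setdefault(key, {"pieces": {}, "total": None})
        start = offset * 8                     # offsets count 8-octet units
        entry["pieces"][start] = data
        if not more:                           # final fragment fixes the size
            entry["total"] = start + len(data)
        if entry["total"] is None:
            return None
        out, pos = bytearray(), 0
        while pos < entry["total"]:
            piece = entry["pieces"].get(pos)
            if not piece:                      # hole: data still missing
                return None
            out += piece
            pos += len(piece)
        del self.sets[key]
        return bytes(out)
```

Fragments may arrive in any order; the payload is returned only by the call that fills the last hole.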
![Page 135: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/135.jpg)
Data Structure For Reassembly
• Two parts
– Buffer large enough to hold original datagram
– Linked list of pieces that have arrived
[Figure: reassembly buffer holding fragments that have arrived, with pieces labeled 40, 80, and 40]
CS490N -- Chapt. 5 26 2003
![Page 136: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/136.jpg)
TCP Connection
• Involves a pair of endpoints
• Started with SYN segment
• Terminated with FIN or RESET segment
• Identified by 4-tuple
( src addr, dest addr, src port, dest port )
CS490N -- Chapt. 5 27 2003
![Page 137: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/137.jpg)
TCP Connection Recognition Algorithm (Part 1)
Given: a copy of traffic passing across a network.
Produce: a record of TCP connections present in the traffic.
Initialize a connection table, C, to empty;
For each IP datagram that carries a TCP segment {
Extract the IP source, S, and destination, D, addresses;
Extract the source, P1, and destination, P2, port numbers;
Use (S,D,P1,P2) as a lookup key for table C and
create a new entry, if needed;
CS490N -- Chapt. 5 28 2003
![Page 138: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/138.jpg)
TCP Connection Recognition Algorithm (Part 2)
    If the segment has the RESET bit set, delete the entry;
    Else if the segment has the FIN bit set, mark the connection closed in one direction, removing the entry from C if the connection was previously closed in the other;
    Else if the segment has the SYN bit set, mark the connection as being established in one direction, making it completely established if it was previously marked as being established in the other;
}
CS490N -- Chapt. 5 29 2003
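The two-part algorithm can be sketched as a single pass over observed segments. One detail is an assumption of this sketch: so that both directions of a connection share one table entry, the lookup key is the pair of endpoints in sorted order instead of the literal (S, D, P1, P2) tuple. The function name `track` and the input format (a tuple per segment with the TCP flag bits as an integer) are hypothetical.

```python
FIN, SYN, RST = 0x01, 0x02, 0x04   # TCP flag bit values


def track(segments):
    """Return the connection table after processing all segments.

    Each segment is (src addr, dst addr, src port, dst port, flags).
    """
    table = {}
    for src, dst, sport, dport, flags in segments:
        a, b = (src, sport), (dst, dport)
        key = (a, b) if a <= b else (b, a)     # same key in both directions
        direction = 0 if key[0] == a else 1
        entry = table.setdefault(key, {"syn": set(), "fin": set()})
        if flags & RST:
            del table[key]
        elif flags & FIN:
            entry["fin"].add(direction)
            if len(entry["fin"]) == 2:         # closed in both directions
                del table[key]
        elif flags & SYN:
            entry["syn"].add(direction)        # established when both seen
    return table
```

A connection is fully established once its `syn` set contains both directions, and its entry disappears after a RESET or after FIN segments in both directions.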
![Page 139: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/139.jpg)
TCP Splicing
• Join two TCP connections
• Allow data to pass between them
• To avoid termination overhead translate segment header fields
– Acknowledgement number
– Sequence number
CS490N -- Chapt. 5 30 2003
![Page 140: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/140.jpg)
Illustration Of TCP Splicing
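[Figure: Host A and Host B joined through a splicer. TCP connection #1 (Host A to splicer) carries sequence 200 toward the splicer and sequence 50 toward Host A; TCP connection #2 (splicer to Host B) carries sequence 860 toward Host B and sequence 1200 toward the splicer]

Connection & Direction   Sequence Number     Connection & Direction   Sequence Number
Incoming #1              200                 Incoming #2              1200
Outgoing #2              860                 Outgoing #1              50
Change                   660                 Change                   -1150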
CS490N -- Chapt. 5 31 2003
![Page 141: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/141.jpg)
TCP Splicing Algorithm (Part 1)
Given: two TCP connections.
Produce: sequence translations for splicing the connection.
Compute D1, the difference between the starting sequences
on incoming connection 1 and outgoing connection 2;
Compute D2, the difference between the starting sequences
on incoming connection 2 and outgoing connection 1;
CS490N -- Chapt. 5 32 2003
![Page 142: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/142.jpg)
TCP Splicing Algorithm (Part 2)
For each segment {
If segment arrived on connection 1 {
Add D1 to sequence number;
Subtract D2 from acknowledgement number;
} else if segment arrived on connection 2 {
Add D2 to sequence number;
Subtract D1 from acknowledgement number;
}
}
CS490N -- Chapt. 5 33 2003
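The splicing translation can be checked with the numbers from the illustration (D1 = 660, D2 = -1150). The sketch below assumes 32-bit sequence arithmetic with wraparound, which the slides do not spell out; the function name `make_splicer` and its parameters are hypothetical.

```python
MOD = 2 ** 32   # TCP sequence numbers are 32 bits and wrap around


def make_splicer(in1, out2, in2, out1):
    """Build a translator from the four starting sequence numbers."""
    d1 = (out2 - in1) % MOD   # incoming connection 1 -> outgoing connection 2
    d2 = (out1 - in2) % MOD   # incoming connection 2 -> outgoing connection 1

    def translate(conn, seq, ack):
        """Return the (seq, ack) pair to place in the forwarded segment."""
        if conn == 1:
            return (seq + d1) % MOD, (ack - d2) % MOD
        return (seq + d2) % MOD, (ack - d1) % MOD

    return translate
```

With the illustrated values, a segment arriving on connection 1 with sequence 200 and acknowledgement 50 leaves on connection 2 with sequence 860 and acknowledgement 1200.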
![Page 143: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/143.jpg)
Summary
• Packet processing algorithms include
– Ethernet bridging
– IP fragmentation and reassembly
– IP forwarding
– TCP splicing
• Table lookup important
– Full match for layer 2
– Longest prefix match for layer 3
CS490N -- Chapt. 5 34 2003
![Page 144: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/144.jpg)
Questions?
![Page 145: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/145.jpg)
For Hands-On Experience With
A Software-Based System:
Enroll in CS 636 !
CS490N -- Chapt. 5 36 2003
![Page 146: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/146.jpg)
VI
Packet Processing Functions
CS490N -- Chapt. 6 1 2003
![Page 147: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/147.jpg)
Goal
• Identify functions that occur in packet processing
• Devise set of operations sufficient for all packet processing
• Find an efficient implementation for the operations
CS490N -- Chapt. 6 2 2003
![Page 148: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/148.jpg)
Packet Processing Functions We Will Consider
• Address lookup and packet forwarding
• Error detection and correction
• Fragmentation, segmentation, and reassembly
• Frame and protocol demultiplexing
• Packet classification
• Queueing and packet discard
• Scheduling and timing
• Security: authentication and privacy
• Traffic measurement, policing, and shaping
CS490N -- Chapt. 6 3 2003
![Page 149: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/149.jpg)
Address Lookup And Packet Forwarding
• Forwarding requires address lookup
• Lookup is table driven
• Two types
– Exact match (typically layer 2)
– Longest-prefix match (typically layer 3)
• Cost depends on size of table and type of lookup
CS490N -- Chapt. 6 4 2003
![Page 150: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/150.jpg)
Error Detection And Correction
• Data sent with packet used as verification
– Checksum
– CRC
• Cost proportional to size of packet
• Often implemented with special-purpose hardware
CS490N -- Chapt. 6 5 2003
![Page 151: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/151.jpg)
An Important Note About Cost
The cost of an operation is proportional to the amount of data processed. An operation such as checksum computation that requires examination of all the data in a packet is among the most expensive.
CS490N -- Chapt. 6 6 2003
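To see why a checksum touches every octet, consider the 16-bit ones' complement Internet checksum used by IP, UDP, and TCP. The slides mention checksums only in general, so picking this particular checksum is an illustrative choice, and the function name `internet_checksum` is hypothetical.

```python
def internet_checksum(data):
    """Compute the 16-bit ones' complement Internet checksum of `data`."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):          # one step per 16 bits of data
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                        # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

The loop runs once per 16 bits of input, so the cost grows linearly with packet size. A receiver verifies by running the same computation over the data with the checksum field included and expecting zero.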
![Page 152: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/152.jpg)
Fragmentation, Segmentation, And Reassembly
• IP fragments and reassembles datagrams
• ATM segments and reassembles AAL5 packets
• Same idea; details differ
• Cost is high because
– State must be kept and managed
– Unreassembled fragments occupy memory
CS490N -- Chapt. 6 7 2003
![Page 153: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/153.jpg)
Frame And Protocol Demultiplexing
• Traditional technique used in layered protocols
• Type appears in each header
– Assigned on output
– Used on input to select "next" protocol
• Cost of demultiplexing proportional to number of layers
CS490N -- Chapt. 6 8 2003
![Page 154: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/154.jpg)
Packet Classification
• Alternative to demultiplexing
• Crosses multiple layers
• Achieves lower cost
• More on classification later in the course
CS490N -- Chapt. 6 9 2003
![Page 155: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/155.jpg)
Queueing And Packet Discard
• General paradigm is store-and-forward
– Incoming packet placed in queue
– Outgoing packet placed in queue
• When queue is full, choose packet to discard
• Affects throughput of higher-layer protocols
CS490N -- Chapt. 6 10 2003
![Page 156: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/156.jpg)
Queueing Priorities
• Multiple queues used to enforce priority among packets
• Incoming packet
– Assigned priority as function of contents
– Placed in appropriate priority queue
• Queueing discipline
– Examines priority queues
– Chooses which packet to send
CS490N -- Chapt. 6 11 2003
![Page 157: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/157.jpg)
Examples Of Queueing Disciplines
• Priority Queueing
– Assign unique priority number to each queue
– Choose packet from highest priority queue that is nonempty
– Known as strict priority queueing
– Can lead to starvation
CS490N -- Chapt. 6 12 2003
![Page 158: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/158.jpg)
Examples Of Queueing Disciplines (continued)
• Weighted Round Robin (WRR)
– Assign unique priority number to each queue
– Process all queues round-robin
– Compute N, max number of packets to select from a queue proportional to priority
– Take up to N packets before moving to next queue
– Works well if all packets equal size
CS490N -- Chapt. 6 13 2003
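One round of weighted round robin can be sketched as follows. The weights are assumed to have been computed already from the queue priorities, and the function name `wrr_round` is hypothetical.

```python
from collections import deque


def wrr_round(queues, weights):
    """Serve each queue once, taking up to weights[i] packets from queue i."""
    sent = []
    for queue, n in zip(queues, weights):
        for _ in range(n):
            if not queue:
                break                 # queue drained; move to the next one
            sent.append(queue.popleft())
    return sent
```

Because the limit counts packets instead of octets, a queue holding large packets receives more bandwidth than its weight suggests, which is why the scheme works well only when all packets are the same size.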
![Page 159: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/159.jpg)
Examples Of Queueing Disciplines (continued)
• Weighted Fair Queueing (WFQ)
– Make selection from queue proportional to priority
– Use packet size rather than number of packets
– Allocates priority to amount of data from a queue rather than number of packets
CS490N -- Chapt. 6 14 2003
![Page 160: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/160.jpg)
Scheduling And Timing
• Important mechanisms
• Used to coordinate parallel and concurrent tasks
– Processing on multiple packets
– Processing on multiple protocols
– Multiple processors
• Scheduler attempts to achieve fairness
CS490N -- Chapt. 6 15 2003
![Page 161: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/161.jpg)
Security: Authentication And Privacy
• Authentication mechanisms
– Ensure sender’s identity
• Confidentiality mechanisms
– Ensure that intermediaries cannot interpret packet contents
• Note: in common networking terminology, privacy refers to confidentiality
– Example: Virtual Private Networks
CS490N -- Chapt. 6 16 2003
![Page 162: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/162.jpg)
Traffic Measurement And Policing
• Used by network managers
• Can measure aggregate traffic or per-flow traffic
• Often related to Service Level Agreement (SLA)
• Cost is high if performed in real-time
CS490N -- Chapt. 6 17 2003
![Page 163: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/163.jpg)
Traffic Shaping
• Make traffic conform to statistical bounds
• Typical use
– Smooth bursts
– Avoid packet trains
• Only possibilities
– Discard packets (seldom used)
– Delay packets
CS490N -- Chapt. 6 18 2003
![Page 164: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/164.jpg)
Example Traffic Shaping Mechanisms
• Leaky bucket
– Easy to implement
– Popular
– Sends steady number of packets per second
– Rate depends on number of packets waiting
– Does not guarantee steady data rate
CS490N -- Chapt. 6 19 2003
![Page 165: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/165.jpg)
Example Traffic Shaping Mechanisms (continued)
• Token bucket
– Sends steady number of bits per second
– Rate depends on number of bits waiting
– Achieves steady data rate
– More difficult to implement
CS490N -- Chapt. 6 20 2003
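A common formulation of the token bucket, sketched below, credits tokens at a fixed bit rate and lets a packet leave only when enough tokens have accumulated to cover its size. This is one textbook variant and not necessarily the exact mechanism the slides have in mind; the class name `TokenBucket`, its parameters, and the caller-supplied clock are assumptions.

```python
class TokenBucket:
    """Bit-based token bucket; the caller supplies timestamps in seconds."""

    def __init__(self, rate_bits_per_sec, capacity_bits):
        self.rate = rate_bits_per_sec
        self.capacity = capacity_bits
        self.tokens = capacity_bits    # start with a full bucket
        self.last = 0.0

    def allow(self, size_bits, now):
        """Return True if a packet of `size_bits` may be sent at time `now`."""
        # Credit tokens for the elapsed time, up to the bucket capacity.
        earned = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + earned)
        self.last = now
        if size_bits <= self.tokens:
            self.tokens -= size_bits
            return True
        return False                   # packet must wait (or be discarded)
```

Counting bits instead of packets is what lets the shaper hold the data rate steady even when packet sizes vary.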
![Page 166: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/166.jpg)
Illustration Of Traffic Shaper
[Figure: packets arrive at a packet queue; the shaper forwards packets from the queue at a steady rate, and packets leave]
• Packets
– Arrive in bursts
– Leave at steady rate
CS490N -- Chapt. 6 21 2003
![Page 167: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/167.jpg)
Timer Management
• Fundamental piece of network system
• Needed for
– Scheduling
– Traffic shaping
– Other protocol processing (e.g., retransmission)
• Cost
– Depends on number of timer operations (e.g., set, cancel)
– Can be high
CS490N -- Chapt. 6 22 2003
![Page 168: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/168.jpg)
Summary
• Primary packet processing functions are
– Address lookup and forwarding
– Error detection and correction
– Fragmentation and reassembly
– Demultiplexing and classification
– Queueing and discard
– Scheduling and timing
– Security functions
– Traffic measurement, policing, and shaping
CS490N -- Chapt. 6 23 2003
![Page 169: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/169.jpg)
Questions?
![Page 170: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/170.jpg)
VII
Protocol Software On A Conventional Processor
![Page 171: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/171.jpg)
Possible Implementations Of Protocol Software
• In an application program
– Easy to program
– Runs as user-level process
– No direct access to network devices
– High cost to copy data from kernel address space
– Cannot run at wire speed
![Page 173: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/173.jpg)
Possible Implementations Of Protocol Software (continued)
• In an embedded system
– Special-purpose hardware device
– Dedicated to specific task
– Ideal for stand-alone system
– Software has full control
– You will experience this in lab!
![Page 174: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/174.jpg)
Possible Implementations Of Protocol Software (continued)
• In an operating system kernel
– More difficult to program than application
– Runs with kernel privilege
– Direct access to network devices
![Page 175: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/175.jpg)
Interface To The Network
• Known as Application Program Interface (API)
• Can be
– Asynchronous
– Synchronous
• Synchronous interface can use
– Blocking
– Polling
![Page 176: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/176.jpg)
Asynchronous API
• Also known as event-driven
• Programmer
– Writes set of functions
– Specifies which function to invoke for each event type
• Programmer has no control over function invocation
• Functions keep state in shared memory
• Difficult to program
• Example: function f() called when packet arrives
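The event-driven style can be sketched as a table of handlers plus a dispatcher. Only `f()` comes from the slide; `register`, `dispatch`, the `"packet_arrival"` event name, and the `state` dictionary are hypothetical names chosen for the example. The dispatcher stands in for the runtime system, which decides when each function is invoked.

```python
handlers = {}                      # event type -> function to invoke
state = {"packets_seen": 0}        # shared state kept between invocations


def register(event_type, func):
    """Specify which function to invoke for an event type."""
    handlers[event_type] = func


def f(packet):
    """Called when a packet arrives; keeps its state in shared memory."""
    state["packets_seen"] += 1


def dispatch(events):
    """Stand-in for the runtime: it, not the programmer, invokes handlers."""
    for event_type, payload in events:
        handler = handlers.get(event_type)
        if handler is not None:    # events without a handler are ignored
            handler(payload)


register("packet_arrival", f)
```

Because `f()` cannot keep local variables between calls, anything it needs to remember lives in `state`, which is one reason this style is harder to program.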
![Page 177: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/177.jpg)
Synchronous API Using Blocking
• Programmer
– Writes main flow-of-control
– Explicitly invokes functions as needed
– Built-in functions block until request satisfied
• Example: function wait_for_packet() blocks until packet arrives
• Easier to program
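A blocking interface can be imitated with a thread-safe queue. The name `wait_for_packet()` is taken from the slide; the queue `_arrivals`, the helper `deliver()`, and `main_loop()` are assumptions that stand in for the network device and the application.

```python
import queue

_arrivals = queue.Queue()          # hypothetical stand-in for the device


def deliver(packet):
    """Simulate the arrival of a packet from the network."""
    _arrivals.put(packet)


def wait_for_packet():
    """Block the caller until a packet is available, then return it."""
    return _arrivals.get()


def main_loop(count):
    """Main flow of control: the program asks for each packet explicitly."""
    received = []
    for _ in range(count):
        received.append(wait_for_packet())
    return received
```

The program reads top to bottom like ordinary sequential code, which is why this form is easier to program than the event-driven form.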
![Page 178: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/178.jpg)
Synchronous API Using Polling
• Nonblocking form of synchronous API
• Each function call returns immediately
– Performs operation if available
– Returns error code otherwise
• Example: function try_for_packet() either returns next packet or error code if no packet has arrived
• Closer to underlying hardware
![Page 179: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/179.jpg)
Typical Implementations And APIs
• Application program
– Synchronous API using blocking (e.g., socket API)
– Another application thread runs while an application blocks
• Embedded systems
– Synchronous API using polling
– CPU dedicated to one task
• Operating systems
– Asynchronous API
– Built on interrupt mechanism
![Page 180: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/180.jpg)
Example Asynchronous API
• Design goals
– For use with network processor
– Simplest possible interface
– Sufficient for basic packet processing tasks
• Includes
– I/O functions
– Timer manipulation functions
![Page 181: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/181.jpg)
Example Asynchronous API (continued)
• Initialization and termination functions
– on_startup()
– on_shutdown()
• Input function (called asynchronously)
– recv_frame()
• Output functions
– new_fbuf()
– send_frame()
![Page 182: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/182.jpg)
Example Asynchronous API (continued)
• Timer functions (called asynchronously)
– delayed_call()
– periodic_call()
– cancel_call()
• Invoked by outside application
– console_command()
![Page 183: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/183.jpg)
Processing Priorities
• Determine which code CPU runs at any time
• General idea
– Hardware devices need highest priority
– Protocol software has medium priority
– Application programs have lowest priority
• Queues provide buffering across priorities
![Page 184: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/184.jpg)
Illustration Of Priorities
[Figure: three processing levels with a packet queue between levels. Device drivers handling frames from NIC1 and NIC2 run at highest priority, protocol processing runs at medium priority, and applications run at lowest priority]
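The three priority levels and the queues between them can be imitated by a loop that always runs the highest-priority level with pending work. This is a single-threaded simulation under stated assumptions: the function `run` and the level names in the trace are invented, and real systems rely on interrupts or threads rather than such a loop.

```python
from collections import deque


def run(frames):
    """Process frames through three priority levels; return a trace."""
    device_q = deque(frames)       # frames waiting at the NIC
    protocol_q = deque()           # queue between driver and protocol code
    app_q = deque()                # queue between protocol code and application
    trace = []
    while device_q or protocol_q or app_q:
        if device_q:               # highest priority: device driver
            frame = device_q.popleft()
            protocol_q.append(frame)
            trace.append(("driver", frame))
        elif protocol_q:           # medium priority: protocol processing
            frame = protocol_q.popleft()
            app_q.append(frame)
            trace.append(("protocol", frame))
        else:                      # lowest priority: application
            frame = app_q.popleft()
            trace.append(("application", frame))
    return trace
```

With two frames waiting, the driver handles both before any protocol code runs, and the application runs last; the queues hold each frame until a lower-priority level gets the CPU.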
![Page 185: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/185.jpg)
Implementation Of Priorities In An Operating System
• Two possible approaches
– Interrupt mechanism
– Kernel threads
![Page 186: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/186.jpg)
Interrupt Mechanism
• Built into hardware
• Operates asynchronously
• Saves current processing state
• Changes processor status
• Branches to specified location
![Page 187: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/187.jpg)
Two Types Of Interrupts
• Hardware interrupt
– Caused by device (bus)
– Must be serviced quickly
• Software interrupt
– Caused by executing program
– Lower priority than hardware interrupt
– Higher priority than other OS code
![Page 188: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/188.jpg)
Software Interrupts And Protocol Code
• Protocol stack operates as software interrupt
• When packet arrives
– Hardware interrupts
– Device driver raises software interrupt
• When device driver finishes
– Hardware interrupt clears
– Protocol code is invoked
![Page 189: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/189.jpg)
Kernel Threads
• Alternative to interrupts
• Familiar to programmer
• Finer-grain control than software interrupts
• Can be assigned arbitrary range of priorities
![Page 190: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/190.jpg)
Conceptual Organization
• Packet passes among multiple threads of control
• Queue of packets between each pair of threads
• Threads synchronize to access queues
![Page 191: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/191.jpg)
Possible Organization Of Kernel Threads For Layered Protocols
• One thread per layer
• One thread per protocol
• Multiple threads per protocol
• Multiple threads per protocol plus timer management thread(s)
• One thread per packet
![Page 192: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/192.jpg)
One Thread Per Layer
• Easy for programmer to understand
• Implementation matches concept
• Allows priority to be assigned to each layer
• Means packet is enqueued once per layer
![Page 193: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/193.jpg)
Illustration Of One Thread Per Layer
[Figure: threads T2, T3, and T4 handle Layer 2, Layer 3, and Layer 4, each with its own queue, below the applications. Packets arrive and leave at Layer 2; the application sends and receives at the top]
![Page 194: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/194.jpg)
One Thread Per Protocol
• Like one thread per layer
– Implementation matches concept
– Means packet is enqueued once per layer
• Advantages over one thread per layer
– Easier for programmer to understand
– Finer-grain control
– Allows priority to be assigned to each protocol
![Page 195: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/195.jpg)
Illustration Of One Thread Per Protocol
[Figure: separate udp and tcp threads, each with its own queue, below the applications]
• TCP and UDP reside at same layer
• Separation allows priority
![Page 196: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/196.jpg)
Multiple Threads Per Protocol
• Further division of duties
• Simplifies programming
• More control than single thread
• Typical division
– Thread for incoming packets
– Thread for outgoing packets
– Thread for management/timing
![Page 197: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/197.jpg)
Illustration Of Multiple Threads Used With TCP
[Figure: a tcp thread and a separate timer thread (tim.) with a queue, below the applications]
• Separate timer makes programming easier
![Page 198: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/198.jpg)
Timers And Protocols
• Many protocols implement timeouts
– TCP
* Retransmission timeout
* 2MSL timeout
– ARP
* Cache entry timeout
– IP
* Reassembly timeout
![Page 199: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/199.jpg)
Multiple Threads Per Protocol Plus Timer Management Thread(s)
• Observations
– Many protocols each need timer functionality
– Each timer thread incurs overhead
• Solution: consolidate timers for multiple protocols
![Page 200: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/200.jpg)
Is One Timer Thread Sufficient?
• In theory
– Yes
• In practice
– Large range of timeouts (microseconds to tens of seconds)
– May want to give priority to some timeouts
• Solution: two or more timer threads
![Page 201: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/201.jpg)
Multiple Timer Threads
• Two threads usually suffice
• Large-granularity timer
– Values specified in seconds
– Operates at lower priority
• Small-granularity timer
– Values specified in microseconds
– Operates at higher priority
![Page 203: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/203.jpg)
Thread Synchronization
• Thread for layer i
– Needs to pass a packet to layer i + 1
– Enqueues the packet
• Thread for layer i + 1
– Retrieves packet from the queue
• Context switch required!
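The hand-off between layer threads can be sketched with Python's `queue.Queue`, which supplies the synchronization. The functions `layer_thread` and `run_stack`, the layer names, and the use of `None` as a shutdown sentinel are assumptions made for the example, and user-level Python threads only approximate kernel threads.

```python
import queue
import threading


def layer_thread(name, inbox, outbox, log):
    """Thread for one layer: take a packet, record it, pass it upward."""
    while True:
        packet = inbox.get()       # blocks until the layer below enqueues
        if packet is None:         # sentinel: forward it, then shut down
            if outbox is not None:
                outbox.put(None)
            return
        log.append((name, packet))
        if outbox is not None:
            outbox.put(packet)     # hand-off to the thread for layer i + 1


def run_stack(packets):
    """Pass packets through a layer 2 thread and a layer 3 thread."""
    q2, q3 = queue.Queue(), queue.Queue()
    log = []
    t2 = threading.Thread(target=layer_thread, args=("layer2", q2, q3, log))
    t3 = threading.Thread(target=layer_thread, args=("layer3", q3, None, log))
    t2.start()
    t3.start()
    for packet in packets:
        q2.put(packet)
    q2.put(None)
    t2.join()
    t3.join()
    return log
```

Every `put` followed by a `get` in another thread implies that the CPU must eventually switch threads, so a packet crossing k layers costs on the order of k context switches.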
![Page 204: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/204.jpg)
Context Switch
• OS function
• CPU passes from current thread to a waiting thread
• High cost
• Must be minimized
![Page 205: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/205.jpg)
One Thread Per Packet
• Preallocate set of threads
• Thread operation
– Waits for packet to arrive
– Moves through protocol stack
– Returns to wait for next packet
• Minimizes context switches
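One thread per packet can be sketched as a preallocated worker pool in which each worker carries a packet through every layer by ordinary function calls. The layer functions, `worker`, and `process` are hypothetical, and each "layer" here merely appends a tag so the path of a packet is visible.

```python
import queue
import threading


def layer2(packet):
    return packet + ["L2"]


def layer3(packet):
    return packet + ["L3"]


def layer4(packet):
    return packet + ["L4"]


def worker(inbox, results):
    """Wait for a packet, move it through the stack, then wait again."""
    while True:
        packet = inbox.get()
        if packet is None:         # sentinel: no more packets
            return
        for layer in (layer2, layer3, layer4):
            packet = layer(packet)  # plain call, no queue between layers
        results.put(packet)


def process(packets, num_threads=2):
    """Run packets through a preallocated pool of worker threads."""
    inbox, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(inbox, results))
               for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for packet in packets:
        inbox.put(packet)
    for _ in threads:
        inbox.put(None)            # one sentinel per worker
    for thread in threads:
        thread.join()
    return [results.get() for _ in packets]
```

Only the initial hand-off to a worker involves a queue; moving between layers requires no context switch. The order of results may differ from arrival order.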
![Page 206: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/206.jpg)
Summary
• Packet processing software usually runs in OS
• API can be synchronous or asynchronous
• Priorities achieved with
– Software interrupts
– Threads
• Variety of thread architectures possible
![Page 207: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/207.jpg)
Questions?
![Page 208: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/208.jpg)
VIII
Hardware Architectures For Protocol Processing And Aggregate Rates
![Page 209: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/209.jpg)
A Brief History Of Computer Hardware
• 1940s
– Beginnings
• 1950s
– Consolidation on von Neumann architecture
– I/O controlled by CPU
• 1960s
– I/O becomes important
– Evolution of third generation architecture with interrupts
![Page 210: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/210.jpg)
I/O Processing
• Evolved from after-thought to central influence
• Low-end systems (e.g., microcontrollers)
– Dumb I/O interfaces
– CPU does all the work (polls devices)
– Single, shared memory
– Low cost, but low speed
![Page 211: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/211.jpg)
I/O Processing (continued)
• Mid-range systems (e.g., minicomputers)
– Single, shared memory
– I/O interfaces contain logic for transfer and status operations
– CPU
* Starts device then resumes processing
– Device
* Transfers data to / from memory
* Interrupts when operation complete
![Page 212: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/212.jpg)
I/O Processing (continued)
• High-end systems (e.g., mainframes)
– Separate, programmable I/O processor
– OS downloads code to be run
– Device has private on-board buffer memory
– Examples: IBM channel, CDC peripheral processor
![Page 213: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/213.jpg)
Networking Systems Evolution
• Twenty year history
• Same trend as computer architecture
– Began with central CPU
– Shift to emphasis on I/O
• Three main generations
![Page 214: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/214.jpg)
First Generation Network Systems
• Traditional software-based router
• Used conventional (minicomputer) hardware
– Single general-purpose processor
– Single shared memory
– I/O over a bus
– Network interface cards use same design as other I/O devices
![Page 215: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/215.jpg)
Protocol Processing In First Generation Network Systems
[Figure: NIC1 and NIC2 each perform framing and address recognition; a standard CPU performs all other processing]
• General-purpose processor handles most tasks
• Sufficient for low-speed systems
• Note: we will examine other generations later in the course
![Page 216: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/216.jpg)
How Fast Does A CPU Need To Be?
• Depends on
– Rate at which data arrives
– Amount of processing to be performed
![Page 217: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/217.jpg)
Two Measures Of Speed
• Data rate (bits per second)
– Per interface rate
– Aggregate rate
• Packet rate (packets per second)
– Per interface rate
– Aggregate rate
![Page 219: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/219.jpg)
How Fast Is A Fast Connection?
• Definition of fast data rate keeps changing
– 1960: 10 Kbps
– 1970: 1 Mbps
– 1980: 10 Mbps
– 1990: 100 Mbps
– 2000: 1000 Mbps (1 Gbps)
– 2003: 2400 Mbps
– Soon: 10 Gbps???
![Page 220: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/220.jpg)
Aggregate Rate Vs. Per-Interface Rate
• Interface rate
– Rate at which data enters / leaves
• Aggregate
– Sum of interface rates
– Measure of total data rate system can handle
• Note: aggregate rate crucial if CPU handles traffic from all interfaces
![Page 221: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/221.jpg)
A Note About System Scale
The aggregate data rate is defined to be the sum of the rates at which traffic enters or leaves a system. The maximum aggregate data rate of a system is important because it limits the type and number of network connections the system can handle.
![Page 222: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/222.jpg)
Packet Rate Vs. Data Rate
• Sources of CPU overhead
– Per-bit processing
– Per-packet processing
• Interface hardware handles much of per-bit processing
![Page 223: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/223.jpg)
A Note About System Scale
For protocol processing tasks that have a fixed cost per packet, the number of packets processed is more important than the aggregate data rate.
![Page 224: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/224.jpg)
Example Packet Rates
| Technology | Network Data Rate In Gbps | Packet Rate For Small Packets In Kpps | Packet Rate For Large Packets In Kpps |
|---|---|---|---|
| 10Base-T | 0.010 | 19.5 | 0.8 |
| 100Base-T | 0.100 | 195.3 | 8.2 |
| OC-3 | 0.156 | 303.8 | 12.8 |
| OC-12 | 0.622 | 1,214.8 | 51.2 |
| 1000Base-T | 1.000 | 1,953.1 | 82.3 |
| OC-48 | 2.488 | 4,860.0 | 204.9 |
| OC-192 | 9.953 | 19,440.0 | 819.6 |
| OC-768 | 39.813 | 77,760.0 | 3,278.4 |
• Key concept: maximum packet rate occurs with minimum-size packets
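The packet rates follow directly from the data rate and the packet size. The function below is a sketch; its name is invented, and the sizes of 64 bytes for small packets and 1518 bytes for large packets are assumptions (the slide does not state them) that reproduce the Ethernet rows to the precision shown.

```python
def packet_rate_kpps(data_rate_gbps, packet_bytes):
    """Packets per second, in thousands, at a given data rate and size."""
    bits_per_packet = packet_bytes * 8
    packets_per_second = data_rate_gbps * 1e9 / bits_per_packet
    return packets_per_second / 1e3


# Assumed sizes: 64-byte small packets, 1518-byte large packets.
SMALL, LARGE = 64, 1518
```

For example, 10Base-T at 0.010 Gbps gives 10,000,000 / 512, about 19.5 Kpps for small packets, and 10,000,000 / 12,144, about 0.8 Kpps for large ones. The OC rows appear to have been computed from the unrounded line rates, so rounded inputs can differ in the last digit.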
![Page 225: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/225.jpg)
Bar Chart Of Example Packet Rates
[Figure: bar chart of packet rates for small packets on a logarithmic scale from 10^0 to 10^5 Kpps: 10Base-T 19.5, 100Base-T 195.3, OC-3 303.8, OC-12 1214.8, 1000Base-T 1953.1, OC-48 4860.0, OC-192 19440.0, OC-768 77760.0]
• Gray areas show rates for large packets
![Page 227: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/227.jpg)
Time Per Packet
| Technology | Time per packet for small packets (in µs) | Time per packet for large packets (in µs) |
|---|---|---|
| 10Base-T | 51.20 | 1,214.40 |
| 100Base-T | 5.12 | 121.44 |
| OC-3 | 3.29 | 78.09 |
| OC-12 | 0.82 | 19.52 |
| 1000Base-T | 0.51 | 12.14 |
| OC-48 | 0.21 | 4.88 |
| OC-192 | 0.05 | 1.22 |
| OC-768 | 0.01 | 0.31 |
• Note: these numbers are for a single connection!
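The time per packet is the inverse of the packet rate, and the note about a single connection can be made concrete by dividing that time among several interfaces. Both function names are hypothetical, and the packet sizes of 64 and 1518 bytes are assumptions that match the Ethernet rows of the table.

```python
def time_per_packet_us(data_rate_gbps, packet_bytes):
    """Microseconds between back-to-back packets on one interface."""
    return packet_bytes * 8 / (data_rate_gbps * 1e9) * 1e6


def cpu_budget_us(interface_rates_gbps, packet_bytes):
    """Microseconds a single CPU may spend per packet when it handles
    the aggregate packet rate of all the listed interfaces."""
    aggregate_pps = sum(rate * 1e9 / (packet_bytes * 8)
                        for rate in interface_rates_gbps)
    return 1e6 / aggregate_pps
```

One 100Base-T interface carrying 64-byte packets allows 5.12 µs per packet; a CPU serving four such interfaces has only 1.28 µs per packet.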
![Page 228: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/228.jpg)
Conclusion
Software running on a general-purpose processor is an insufficient architecture to handle high-speed networks because the aggregate packet rate exceeds the capabilities of a CPU.
![Page 229: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/229.jpg)
Possible Ways To Solve The CPU Bottleneck
• Fine-grain parallelism
• Symmetric coarse-grain parallelism
• Asymmetric coarse-grain parallelism
• Special-purpose coprocessors
• NICs with onboard processing
• Smart NICs with onboard stacks
• Cell switching
• Data pipelines
![Page 230: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/230.jpg)
Fine-Grain Parallelism
• Multiple processors
• Instruction-level parallelism
• Example:
– Parallel checksum: add values of eight consecutive memory locations at the same time
• Assessment: insignificant advantages for packet processing
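The parallel checksum works because one's-complement addition is associative, so partial sums can be formed independently and combined at the end. The sketch below uses the 16-bit Internet checksum as the example (an assumption; the slide does not name a specific checksum) and models eight adders as eight interleaved partial sums. Python runs the lanes one after another; only the structure of the computation is shown.

```python
def fold16(total):
    """Fold carries back into the low 16 bits (one's-complement sum)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def words16(data):
    """Split bytes into big-endian 16-bit words, padding odd lengths."""
    if len(data) % 2:
        data += b"\x00"
    return [int.from_bytes(data[i:i + 2], "big")
            for i in range(0, len(data), 2)]


def checksum_sequential(data):
    """Sum the words one at a time, then complement."""
    return ~fold16(sum(words16(data))) & 0xFFFF


def checksum_lanes(data, lanes=8):
    """Sum in `lanes` independent accumulators, then combine them."""
    words = words16(data)
    # Lane i adds words i, i + lanes, i + 2*lanes, ...; in hardware the
    # lanes could add eight consecutive words in a single step.
    partial = [sum(words[i::lanes]) for i in range(lanes)]
    return ~fold16(sum(partial)) & 0xFFFF
```

Both functions return the same value for any input. The per-packet work outside the checksum (lookups, queueing) is untouched, which is consistent with the slide's assessment.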
![Page 231: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/231.jpg)
Symmetric Coarse-Grain Parallelism
• Symmetric multiprocessor hardware
– Multiple, identical processors
• Typical design: each CPU operates on one packet
• Requires coordination
• Assessment: coordination and data access means N processors cannot handle N times more packets than one processor
![Page 232: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/232.jpg)
Asymmetric Coarse-Grain Parallelism
• Multiple processors
• Each processor
– Optimized for specific task
– Includes generic instructions for control
• Assessment
– Same problems of coordination and data access as symmetric case
– Designer must choose how many copies of each processor type
![Page 233: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/233.jpg)
Special-Purpose Coprocessors
• Special-purpose hardware
• Added to conventional processor to speed computation
• Invoked like software subroutine
• Typical implementation: ASIC chip
• Choose operations that yield greatest improvement in speed
![Page 235: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/235.jpg)
General Principle
To optimize computation, move operations that account for the most CPU time from software into hardware.
• Idea known as Amdahl’s law (performance improvement from faster hardware technology is limited to the fraction of time the faster technology can be used)
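Amdahl's law is usually written as speedup = 1 / ((1 - f) + f / s), where f is the fraction of time spent in the accelerated operation and s is how much faster that operation becomes. The function below is a direct transcription; its name and parameter names are chosen for the example.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the run time is made
    `factor` times faster and the rest is unchanged."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)
```

If an operation accounts for half of the CPU time, making it twice as fast in hardware yields an overall speedup of about 1.33; even infinitely fast hardware could not exceed 2. This is why the operations that consume the most CPU time are the ones worth moving.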
![Page 236: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/236.jpg)
NICs And Onboard Processing
• Basic optimizations
– Onboard address recognition and filtering
– Onboard buffering
– DMA
– Buffer and operation chaining
• Further optimization possible
![Page 237: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/237.jpg)
Smart NICs With Onboard Stacks
• Add hardware to NIC
– Off-the-shelf chips for layer 2
– ASICs for layer 3
• Allows each NIC to operate independently
– Effectively a multiprocessor
– Total processing power increased dramatically
![Page 238: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/238.jpg)
Illustration Of Smart NICs With Onboard Processing
[Figure: Smart NIC1 and Smart NIC2 each perform most layer 2 processing and some layer 3 processing; a standard CPU performs all other processing]
• NIC handles layers 2 and 3
• CPU only handles exceptions
![Page 239: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/239.jpg)
Cell Switching
• Alternative to new hardware
• Changes
– Basic paradigm
– All details (e.g., protocols)
• Connection-oriented
![Page 240: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/240.jpg)
Cell Switching Details
• Fixed-size packets
– Allows fixed-size buffers
– Guaranteed time to transmit/receive
• Relative (connection-oriented) addressing
– Smaller address size
– Label on packet changes at each switch
– Requires connection setup
• Example: ATM
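Relative addressing can be sketched as a per-switch table that maps an incoming port and label to an outgoing port and a new label. The class `CellSwitch`, its methods, and the port and label numbers are hypothetical; real ATM switches use VPI/VCI fields and a signaling protocol for setup, which this sketch omits.

```python
class CellSwitch:
    """A switch that forwards cells by label lookup (illustrative only)."""

    def __init__(self):
        self.table = {}            # (in_port, in_label) -> (out_port, out_label)

    def setup(self, in_port, in_label, out_port, out_label):
        """Connection setup: install one table entry for the connection."""
        self.table[(in_port, in_label)] = (out_port, out_label)

    def forward(self, in_port, label):
        """Look up the cell's label; return the output port and new label."""
        return self.table[(in_port, label)]
```

A connection that crosses two switches might be installed as `s1.setup(0, 17, 2, 42)` and `s2.setup(1, 42, 3, 5)`. A cell entering `s1` on port 0 with label 17 leaves on port 2 with label 42, and `s2` then rewrites the label to 5. The label only needs to be unique per link, which is why it can be smaller than a full destination address.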
![Page 241: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/241.jpg)
Data Pipeline
• Move each packet through series of processors
• Each processor handles some tasks
• Assessment
– Well-suited to many protocol processing tasks
– Individual processor can be fast
![Page 242: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/242.jpg)
Illustration Of Data Pipeline
[Figure: packets enter the pipeline, pass through stages 1 to 5 with an interstage packet buffer between stages, and leave the pipeline]
• Pipeline can contain heterogeneous processors
• Packets pass through each stage
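A pipeline with interstage buffers can be simulated one step at a time. The function `run_pipeline` and its step-based clock are assumptions for illustration; in hardware all stages operate concurrently, which the loop imitates by letting every stage handle at most one packet per step.

```python
from collections import deque


def run_pipeline(stages, packets):
    """Move packets through `stages`; return (outputs, steps taken)."""
    # buffers[i] feeds stage i; the last buffer collects finished packets.
    buffers = [deque(packets)] + [deque() for _ in stages]
    steps = 0
    while any(buffers[:-1]):
        # Visit later stages first so that a packet advances exactly one
        # stage per step, as it would with concurrent hardware stages.
        for i in reversed(range(len(stages))):
            if buffers[i]:
                packet = buffers[i].popleft()
                buffers[i + 1].append(stages[i](packet))
        steps += 1
    return list(buffers[-1]), steps
```

With three stages and four packets the simulation finishes in 6 steps (4 + 3 - 1) rather than the 12 steps needed if each packet completed all stages before the next began.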
![Page 243: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/243.jpg)
Summary
• Packet rate can be more important than data rate
• Highest packet rate achieved with smallest packets
• Rates measured per interface or aggregate
• Special hardware needed for highest-speed network systems
– Smart NIC can include part of protocol stack
– Parallel and pipelined hardware also possible
![Page 244: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/244.jpg)
?????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
Questions?
![Page 245: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/245.jpg)
IX
Classification And Forwarding
CS490N -- Chapt. 9 1 2003
![Page 246: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/246.jpg)
Recall
• Packet demultiplexing
– Used with layered protocols
– Packet proceeds through one layer at a time
– On input, software in each layer chooses module at next higher layer
– On output, type field in each header specifies encapsulation
![Page 247: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/247.jpg)
The Disadvantage Of Demultiplexing
Although it provides freedom to define and use arbitrary protocols without introducing transmission overhead, demultiplexing is inefficient because it imposes sequential processing among layers.
![Page 248: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/248.jpg)
Packet Classification
• Alternative to demultiplexing
• Designed for higher speed
• Considers all layers at the same time
• Linear in number of fields
• Two possible implementations
– Software
– Hardware
![Page 249: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/249.jpg)
Example Classification
• Classify Ethernet frames carrying traffic to Web server
• Specify exact header contents in rule set
• Example
– Ethernet type field specifies IP
– IP type field specifies TCP
– TCP destination port specifies Web server
![Page 250: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/250.jpg)
Example Classification (continued)
• Field sizes and values
– 2-octet Ethernet type is 0x0800
– 1-octet IP type is 6
– 2-octet TCP destination port is 80
![Page 251: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/251.jpg)
Illustration Of Encapsulated Headers
Bit positions within each 32-bit row: 0 4 8 10 16 19 24 31
ETHERNET DEST. (0-1)
ETHERNET DESTINATION (2-5)
ETHERNET SOURCE (0-3)
ETHERNET SOURCE (4-5) | ETHERNET TYPE
VERS | HLEN | SERVICE | IP TOTAL LENGTH
IP IDENT | FLAGS | FRAG. OFFSET
IP TTL | IP TYPE | IP HDR. CHECKSUM
IP SOURCE ADDRESS
IP DESTINATION ADDRESS
TCP SOURCE PORT | TCP DESTINATION PORT
TCP SEQUENCE
TCP ACKNOWLEDGEMENT
HLEN | NOT USED | CODE BITS | TCP WINDOW
TCP CHECKSUM | TCP URGENT PTR
Start Of TCP Data . . .
• Highlighted fields (ETHERNET TYPE, IP TYPE, and TCP DESTINATION PORT) are used for classification of Web server traffic
![Page 252: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/252.jpg)
Software Implementation Of Classification
• Compare values in header fields
• Conceptually a logical and of all field comparisons
• Example
if ( (frame type == 0x0800) && (IP type == 6) && (TCP port == 80) )
    declare the packet matches the classification;
else
    declare the packet does not match the classification;
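The pseudocode above can be sketched in C as follows. This is a minimal illustration, not code from the course: the function name `is_web_traffic`, the helper `get_u16`, and the fixed offsets are assumptions, and the offsets hold only for an untagged Ethernet frame carrying an IP header without options (20 octets).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 14-octet Ethernet header followed by a 20-octet IP header. */
enum {
    OFF_ETH_TYPE  = 12,  /* 2-octet Ethernet type */
    OFF_IP_TYPE   = 23,  /* 1-octet IP type (14 + 9) */
    OFF_TCP_DPORT = 36,  /* 2-octet TCP destination port (14 + 20 + 2) */
    MIN_FRAME_LEN = 38   /* last octet examined is at offset 37 */
};

/* Read a 2-octet field stored in network byte order. */
static uint16_t get_u16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Return 1 if the frame matches the Web server rule set, 0 otherwise. */
int is_web_traffic(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < MIN_FRAME_LEN)
        return 0;                     /* too short to hold all three fields */
    return get_u16(frame + OFF_ETH_TYPE) == 0x0800
        && frame[OFF_IP_TYPE] == 6
        && get_u16(frame + OFF_TCP_DPORT) == 80;
}
```

Because `&&` short-circuits in C, the comparisons run sequentially and stop at the first mismatch, which is the property the next slides exploit.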
![Page 253: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/253.jpg)
Optimizing Software Classification
• Comparisons performed sequentially
• Can reorder comparisons to minimize effort
![Page 254: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/254.jpg)
Example Of Optimizing Software Classification
• Assume
– 95.0% of all frames have frame type 0x0800
– 87.4% of all frames have IP type 6
– 74.3% of all frames have TCP port 80
• Also assume values 6 and 80 do not occur in corresponding positions in non-IP packet headers
• Reordering tests can optimize processing time
![Page 255: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/255.jpg)
Example Of Optimizing Software Classification (continued)
if ( (TCP port == 80) && (IP type == 6) && (frame type == 0x0800) )
    declare the packet matches the classification;
else
    declare the packet does not match the classification;
• At each step, test the field that will eliminate the most packets
![Page 256: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/256.jpg)
Note About Optimization
Although the maximum number of comparisons in a software classifier is fixed, the average number of comparisons is determined by the order of the tests; minimum comparisons result if, at each step, the classifier tests the field that eliminates the most packets.
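The effect of test order can be checked with a little arithmetic. The sketch below is illustrative only: the function name `expected_comparisons` is hypothetical, and it assumes the matching sets are nested (every port-80 frame is a TCP frame, and every TCP frame is an IP frame), which is how the assumption stated in the example is read here. With the percentages from the example, the original order (frame type, IP type, TCP port) averages 1 + 0.950 + 0.874 = 2.824 comparisons per frame, while the reordered tests average 1 + 0.743 + 0.743 = 2.486.

```c
#include <assert.h>
#include <stddef.h>

/*
 * frac[i] is the fraction of ALL frames that satisfy test i, listed in the
 * order the tests are performed.  Assumes nested matching sets, so the
 * fraction of frames surviving a sequence of tests is the smallest
 * fraction seen so far.
 */
double expected_comparisons(const double *frac, size_t n)
{
    double reach = 1.0;   /* fraction of frames that reach the current test */
    double total = 0.0;   /* average comparisons per frame */

    assert(frac != NULL);
    for (size_t i = 0; i < n; i++) {
        total += reach;           /* each frame that gets here costs one comparison */
        if (frac[i] < reach)
            reach = frac[i];      /* only frames passing this test continue */
    }
    return total;
}
```

Calling the function with `{0.950, 0.874, 0.743}` and then with `{0.743, 0.874, 0.950}` reproduces the two averages given above.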
![Page 257: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/257.jpg)
Hardware Implementation Of Classification
• Can build special-purpose hardware
• Steps
– Extract needed fields
– Concatenate bits
– Place result in register
– Perform comparison
• Hardware can operate in parallel
![Page 258: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/258.jpg)
Illustration Of Hardware Classifier
[Figure: a packet in memory is moved over a wide data path to a hardware register; specific header bytes are extracted for comparison and fed, together with a constant to compare, into a comparator that produces the result of the comparison]
• Constant for Web classifier is 08.00.06.00.50 (hexadecimal)
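The extract, concatenate, and compare steps can be imitated in software to see where the constant comes from. This is a sketch under assumptions, not a description of any real classifier chip: the names `build_key`, `hw_style_match`, and `WEB_KEY` are hypothetical, and the offsets assume an untagged Ethernet frame with a 20-octet IP header. The 2-octet Ethernet type (08.00), the 1-octet IP type (06), and the 2-octet TCP destination port (80 decimal is 00.50 hexadecimal) concatenate to a 40-bit key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WEB_KEY UINT64_C(0x0800060050)   /* 08.00.06.00.50 in hexadecimal */

/* Concatenate the three header fields into one 40-bit value. */
uint64_t build_key(const uint8_t *frame, size_t len)
{
    assert(frame != NULL && len >= 38);
    return ((uint64_t)frame[12] << 32)    /* Ethernet type, high octet */
         | ((uint64_t)frame[13] << 24)    /* Ethernet type, low octet */
         | ((uint64_t)frame[23] << 16)    /* IP type */
         | ((uint64_t)frame[36] << 8)     /* TCP destination port, high octet */
         |  (uint64_t)frame[37];          /* TCP destination port, low octet */
}

/* A single comparison replaces the three sequential tests. */
int hw_style_match(const uint8_t *frame, size_t len)
{
    return build_key(frame, len) == WEB_KEY;
}
```

In hardware the extraction and the comparison of all bits happen in parallel; the shifts here only model the concatenation.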
![Page 259: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/259.jpg)
Special Cases Of Classification
• Multiple categories
• Variable-size headers
• Dynamic classification
![Page 260: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/260.jpg)
In Practice
• Classification usually involves multiple categories
• Packets grouped together into flows
• May have a default category
• Each category specified with rule set
![Page 261: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/261.jpg)
Example Multi-Category Classification
• Flow 1: traffic destined for Web server
• Flow 2: traffic consisting of ICMP echo request packets
• Flow 3: all other traffic (default)
![Page 262: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/262.jpg)
Rule Sets
• Web server traffic
– 2-octet Ethernet type is 0x0800
– 1-octet IP type is 6
– 2-octet TCP destination port is 80
• ICMP echo traffic
– 2-octet Ethernet type is 0x0800
– 1-octet IP type is 1
– 1-octet ICMP type is 8
![Page 263: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/263.jpg)
Software Implementation Of Multiple Rules
if (frame type != 0x0800) {
    send frame to flow 3;
} else if (IP type == 6 && TCP destination port == 80) {
    send packet to flow 1;
} else if (IP type == 1 && ICMP type == 8) {
    send packet to flow 2;
} else {
    send frame to flow 3;
}
• Further optimization possible
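A compilable version of the multi-rule pseudocode might look like the sketch below. The function name `classify_flow`, the flow constants, and the offsets are assumptions made for illustration; the offsets again presume an untagged Ethernet frame with a 20-octet IP header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { FLOW_WEB = 1, FLOW_ICMP_ECHO = 2, FLOW_DEFAULT = 3 };

/* Assumed offsets: 14-octet Ethernet header, 20-octet IP header. */
enum {
    OFF_ETH_TYPE  = 12,
    OFF_IP_TYPE   = 23,
    OFF_ICMP_TYPE = 34,   /* first octet after the IP header */
    OFF_TCP_DPORT = 36
};

/* Return the flow number (1, 2, or 3) for a frame. */
int classify_flow(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len <= (size_t)OFF_ICMP_TYPE)
        return FLOW_DEFAULT;              /* too short to examine */
    if (frame[OFF_ETH_TYPE] != 0x08 || frame[OFF_ETH_TYPE + 1] != 0x00)
        return FLOW_DEFAULT;              /* not IP: tested once for both rule sets */
    if (frame[OFF_IP_TYPE] == 6
            && len >= (size_t)OFF_TCP_DPORT + 2
            && frame[OFF_TCP_DPORT] == 0 && frame[OFF_TCP_DPORT + 1] == 80)
        return FLOW_WEB;
    if (frame[OFF_IP_TYPE] == 1 && frame[OFF_ICMP_TYPE] == 8)
        return FLOW_ICMP_ECHO;
    return FLOW_DEFAULT;
}
```

As in the pseudocode, the Ethernet type test is shared by both rule sets, so a non-IP frame reaches the default flow after a single comparison.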
![Page 264: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/264.jpg)
Variable-Size Packet Headers
• Fields not at fixed offsets
• Easily handled with software
• Finite cases can be specified in rules
![Page 265: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/265.jpg)
Example Variable-Size Header: IP Options
• Rule Set 1
– 2-octet frame type field contains 0x0800
– 1-octet field at the start of the datagram contains 0x45
– 1-octet type field in the IP datagram contains 6
– 2-octet field 22 octets from the start of the datagram contains 80
• Rule Set 2
– 2-octet frame type field contains 0x0800
– 1-octet field at the start of the datagram contains 0x46
– 1-octet type field in the IP datagram contains 6
– 2-octet field 26 octets from the start of the datagram contains 80
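The two rule sets enumerate the cases HLEN = 5 (first octet 0x45, port at offset 20 + 2 = 22) and HLEN = 6 (first octet 0x46, port at offset 24 + 2 = 26). Software can instead compute the offset from the HLEN field, which counts 4-octet words. The sketch below shows that computation; the function name `tcp_dest_port` and the convention of returning -1 for frames that are not IPv4/TCP are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14   /* untagged Ethernet header, an assumption */

/* Return the TCP destination port, or -1 if the frame is not IPv4/TCP. */
int tcp_dest_port(const uint8_t *frame, size_t len)
{
    size_t hlen, off;

    assert(frame != NULL);
    if (len < ETH_HDR_LEN + 20)
        return -1;                                  /* no room for a minimal IP header */
    if (frame[12] != 0x08 || frame[13] != 0x00)
        return -1;                                  /* Ethernet type is not IP */
    if ((frame[ETH_HDR_LEN] >> 4) != 4)
        return -1;                                  /* not IP version 4 */
    hlen = (size_t)(frame[ETH_HDR_LEN] & 0x0F) * 4; /* header length in octets */
    if (hlen < 20 || frame[ETH_HDR_LEN + 9] != 6)
        return -1;                                  /* malformed header or not TCP */
    off = ETH_HDR_LEN + hlen + 2;                   /* destination port follows source port */
    if (len < off + 2)
        return -1;
    return (frame[off] << 8) | frame[off + 1];
}
```

The extra multiply and add is the "one computation step" that a variable-size header costs compared with a fixed offset.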
![Page 266: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/266.jpg)
Effect Of Protocol Design On Classification
• Fixed headers fastest to classify
• Each variable-size header adds one computation step
• In worst case, classification no faster than demultiplexing
• Extreme example: IPv6
![Page 267: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/267.jpg)
Hybrid Classification
[Figure: packets arrive for classification at a hardware classifier, which classifies recognized packets into flows; packets unrecognized by hardware pass to a software classifier, which classifies them into flows or sends them to an exit for unclassified packets]
• Combines hardware and software mechanisms
– Hardware used for standard cases
– Software used for exceptions
• Note: software classifier can operate at slower rate
![Page 268: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/268.jpg)
Two Basic Types Of Classification
• Static
– Flows specified in rule sets
– Header fields and values known a priori
• Dynamic
– Flows created by observing packet stream
– Values taken from headers
– Allows fine-grain flows
– Requires state information
![Page 269: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/269.jpg)
Example Static Classification
• Allocate one flow per service type
• One header field used to identify flow
– IP TYPE OF SERVICE (TOS)
• Use DIFFSERV interpretation
• Note: Ethernet type field also checked
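A static classifier of this kind needs only two fields. Under the DIFFSERV interpretation, the upper six bits of the TOS octet form the codepoint (DSCP), so one possible flow identifier is the codepoint itself. The sketch below assumes that mapping; the function name `diffserv_flow` and the use of -1 for non-IP frames are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return a flow ID in 0..63 taken from the DIFFSERV codepoint, or -1 if not IP. */
int diffserv_flow(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < 16)
        return -1;                       /* TOS octet is at offset 14 + 1 */
    if (frame[12] != 0x08 || frame[13] != 0x00)
        return -1;                       /* Ethernet type field also checked */
    return frame[15] >> 2;               /* upper 6 bits of the TOS octet */
}
```

Because the set of possible codepoints is known a priori, no per-flow state has to be created while packets arrive.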
![Page 270: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/270.jpg)
Example Dynamic Classification
• Allocate flow per TCP connection
• Header fields used to identify flow
– IP source address
– IP destination address
– TCP source port number
– TCP destination port number
• Note: Ethernet type and IP type fields also checked
![Page 271: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/271.jpg)
Implementation Of Dynamic Classification
• Usually performed in software
• State kept in memory
• State information created/updated at wire speed
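The state kept by a dynamic classifier can be pictured as a table keyed by the four header fields that identify a TCP connection. The sketch below uses a small open-addressing hash table; the table size, the hash function, and all names (`flow_key`, `flow_lookup_or_create`, `flow_packets`) are illustrative assumptions, and a production classifier would also need to expire idle flows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 1024          /* hypothetical capacity; must be a power of two */

struct flow_key {
    uint32_t src_ip, dst_ip;     /* IP source and destination addresses */
    uint16_t src_port, dst_port; /* TCP source and destination ports */
};

struct flow_entry {
    struct flow_key key;
    uint64_t packets;            /* per-flow state: packets seen so far */
    int in_use;
};

static struct flow_entry table[TABLE_SIZE];

static uint32_t hash_key(const struct flow_key *k)
{
    uint32_t h = k->src_ip * 2654435761u;
    h ^= k->dst_ip * 2246822519u;
    h ^= ((uint32_t)k->src_port << 16) | k->dst_port;
    h ^= h >> 15;
    return h & (TABLE_SIZE - 1);
}

static int same_key(const struct flow_key *a, const struct flow_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip
        && a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Return the flow ID (table index) for a packet, creating the flow if it is
 * new; return -1 if the table is full. */
int flow_lookup_or_create(const struct flow_key *k)
{
    uint32_t start;

    assert(k != NULL);
    start = hash_key(k);
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        uint32_t slot = (start + i) & (TABLE_SIZE - 1);   /* linear probing */
        struct flow_entry *e = &table[slot];
        if (!e->in_use) {                 /* first packet of a new flow */
            e->key = *k;
            e->packets = 1;
            e->in_use = 1;
            return (int)slot;
        }
        if (same_key(&e->key, k)) {       /* existing flow: update state */
            e->packets++;
            return (int)slot;
        }
    }
    return -1;
}

/* Return the packet count recorded for a flow ID. */
uint64_t flow_packets(int id)
{
    assert(id >= 0 && id < TABLE_SIZE);
    return table[id].packets;
}
```

Every arriving packet triggers one lookup and possibly one insertion, which is why the state must be created and updated at wire speed.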
![Page 272: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/272.jpg)
Two Conceptual Bindings
classification: packet → flow
forwarding: flow → packet disposition
• Classification binding is usually 1-to-1
• Forwarding binding can be 1-to-1 or many-to-1
![Page 273: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/273.jpg)
Flow Identification
• Connection-oriented network
– Per-flow SVC can be created on demand
– Flow ID equals connection ID
• Connectionless network
– Flow ID used internally
– Each flow ID mapped to ( next hop, interface )
![Page 274: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/274.jpg)
Relationship Of Classification And Forwarding In A Connection-Oriented Network
In a connection-oriented network, flow identifiers assigned by classification can be chosen to match connection identifiers used by the underlying network. Doing so makes forwarding more efficient by eliminating one binding.
![Page 275: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/275.jpg)
Forwarding In A Connectionless Network
• Route for flow determined when flow created
• Indexing used in place of route lookup
• Flow identifier corresponds to index of entry in forwarding cache
• Forwarding cache must be changed when route changes
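Indexing in place of route lookup can be sketched as an array of ( next hop, interface ) pairs addressed by flow identifier. The names below (`fwd_install`, `fwd_lookup`, `fwd_flush`) and the cache size are hypothetical, and flushing the whole cache on a route change is only the simplest possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FLOWS 256            /* hypothetical cache size */

struct fwd_entry {
    uint32_t next_hop;           /* IP address of the next hop */
    int      ifindex;            /* outgoing interface */
    int      valid;
};

static struct fwd_entry fwd_cache[MAX_FLOWS];

/* Record the route chosen when the flow is created. */
void fwd_install(unsigned flow_id, uint32_t next_hop, int ifindex)
{
    assert(flow_id < MAX_FLOWS);
    fwd_cache[flow_id].next_hop = next_hop;
    fwd_cache[flow_id].ifindex = ifindex;
    fwd_cache[flow_id].valid = 1;
}

/* Forwarding is a single array index; NULL means no cached route. */
const struct fwd_entry *fwd_lookup(unsigned flow_id)
{
    if (flow_id >= MAX_FLOWS || !fwd_cache[flow_id].valid)
        return NULL;
    return &fwd_cache[flow_id];
}

/* Invalidate every entry, for example after a routing change. */
void fwd_flush(void)
{
    for (size_t i = 0; i < MAX_FLOWS; i++)
        fwd_cache[i].valid = 0;
}
```

A lookup that returns NULL would fall back to a full route lookup, after which the result is installed again for later packets of the flow.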
![Page 276: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/276.jpg)
Second Generation Network Systems
• Designed for greater scale
• Use classification instead of demultiplexing
• Decentralized architecture
– Additional computational power on each NIC
– NIC implements classification and forwarding
• High-speed internal interconnection mechanism
– Interconnects NICs
– Provides fast data path
![Page 277: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/277.jpg)
Illustration Of Second Generation Network Systems Architecture
[Figure: Interface 1 and Interface 2 each contain Layer 1 & 2 (framing), classification, and forwarding stages, joined by a fast data path; a standard CPU handles control and exceptions]
![Page 278: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/278.jpg)
Classification And Forwarding Chips
• Sold by vendors
• Implement hardware classification and forwarding
• Typical configuration: rule sets given in ROM
![Page 279: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/279.jpg)
Summary
• Classification faster than demultiplexing
• Can be implemented in hardware or software
• Dynamic classification
– Uses packet contents to assign flows
– Requires state information
![Page 280: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/280.jpg)
Questions?
![Page 281: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/281.jpg)
XI
Network Processors: Motivation And Purpose
CS490N -- Chapt. 11 1 2003
![Page 282: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/282.jpg)
Second Generation Network Systems
• Concurrent with ATM development (early 1990s)
• Purpose: scale to speeds faster than single CPU capacity
• Features
– Use classification instead of demultiplexing
– Decentralized architecture to offload CPU
– Design optimized for fast data path
![Page 283: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/283.jpg)
Second Generation Network Systems (details)
• Multiple network interfaces
– Powerful NIC
– Private buffer memory
• High-speed hardware interconnects NICs
• General-purpose processor only handles exceptions
• Sufficient for medium speed interfaces (100 Mbps)
![Page 284: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/284.jpg)
Reminder: Protocol Processing In Second Generation Network Systems
[Figure: Interface 1 and Interface 2 each contain Layer 1 & 2 (framing), classification, and forwarding stages, joined by a fast data path; a standard CPU handles control and exceptions]
• NIC handles most of layers 1 - 3
• Fast-path forwarding avoids CPU completely
![Page 285: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/285.jpg)
Third Generation Network Systems
• Late 1990s
• Functionality partitioned further
• Additional hardware on each NIC
• Almost all packet processing off-loaded from CPU
![Page 286: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/286.jpg)
Third Generation Design
• NIC contains
– ASIC hardware
– Embedded processor plus code in ROM
• NIC handles
– Classification
– Forwarding
– Traffic policing
– Monitoring and statistics
![Page 287: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/287.jpg)
Embedded Processor
• Two possibilities
– Complex Instruction Set Computer (CISC)
– Reduced Instruction Set Computer (RISC)
• RISC used often because
– Higher clock rates
– Smaller
– Lower power consumption
![Page 288: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/288.jpg)
Purpose Of Embedded Processor In Third Generation Systems
Third generation systems use an embedded processor to handle layer 4 functionality and exception packets that cannot be forwarded across the fast path. An embedded processor architecture is chosen because ease of implementation and amenability to change are more important than speed.
![Page 289: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/289.jpg)
Protocol Processing In Third Generation Systems
[Figure: Interface 1 and Interface 2 each contain Layers 1 & 2 hardware, a Layer 3 & classification ASIC, and an embedded processor for Layer 4; the interfaces connect through a switching fabric with a traffic management ASIC; a standard CPU performs other processing]
• Special-purpose ASICs handle lower layer functions
• Embedded (RISC) processor handles layer 4
• CPU only handles low-demand processing
![Page 292: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/292.jpg)
Are Third Generation Systems Sufficient?
• Almost . . . but not quite.
![Page 293: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/293.jpg)
Problems With Third Generation Systems
• High cost
• Long time to market
• Difficult to simulate/test
• Expensive and time-consuming to change
– Even trivial changes require silicon respin
– 18-20 month development cycle
• Little reuse across products
• Limited reuse across versions
![Page 294: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/294.jpg)
Problems With Third Generation Systems (continued)
• No consensus on overall framework
• No standards for special-purpose support chips
• Requires in-house expertise (ASIC designers)
![Page 295: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/295.jpg)
A Fourth Generation
• Goal: combine best features of first generation and third generation systems
– Flexibility of programmable processor
– High speed of ASICs
• Technology called network processors
![Page 296: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/296.jpg)
Definition Of A Network Processor
A network processor is a special-purpose, programmable hardware device that combines the low cost and flexibility of a RISC processor with the speed and scalability of custom silicon (i.e., ASIC chips). Network processors are building blocks used to construct network systems.
![Page 297: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/297.jpg)
Network Processors: Potential Advantages
• Relatively low cost
• Straightforward hardware interface
• Facilities to access
– Memory
– Network interface devices
• Programmable
• Ability to scale to higher
– Data rates
– Packet rates
![Page 299: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/299.jpg)
The Promise Of Programmability
• For producers
– Lower initial development costs
– Reuse software in later releases and related systems
– Faster time-to-market
– Same high speed as ASICs
• For consumers
– Much lower product cost
– Inexpensive (firmware) upgrades
![Page 300: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/300.jpg)
Choice Of Instruction Set
• Programmability alone insufficient
• Also need higher speed
• Should network processors have
– Instructions for specific protocols?
– Instructions for specific protocol processing tasks?
• Choices difficult
![Page 301: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/301.jpg)
Instruction Set
• Need to choose one instruction set
• No single instruction set best for all uses
• Other factors
– Power consumption
– Heat dissipation
– Cost
• More discussion later in the course
![Page 302: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/302.jpg)
Scalability
• Two primary techniques
– Parallelism
– Data pipelining
• Questions
– How many processors?
– How should they be interconnected?
• More discussion later
![Page 303: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/303.jpg)
Costs And Benefits Of Network Processors
• Currently
– More expensive than conventional processor
– Slower than ASIC design
• Where do network processors fit?
– Somewhere in the middle
CS490N -- Chapt. 11 20 2003
![Page 304: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/304.jpg)
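Where Network Processors Fit
[Figure: chart with axes labeled "Increasing cost" and "Increasing performance" that places three design approaches: Software On Conventional Processor, Network Processor Designs, and ASIC Designs; two question marks appear on the chart.]
• Network processors: the middle ground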
CS490N -- Chapt. 11 21 2003
![Page 305: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/305.jpg)
Achieving Higher Speed
• What is known
– Must partition packet processing into separate functions
– To achieve highest speed, must handle each function with separate hardware
• What is unknown
– Exactly what functions to choose
– Exactly what hardware building blocks to use
– Exactly how building blocks should be interconnected
CS490N -- Chapt. 11 22 2003
![Page 306: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/306.jpg)
Variety Of Network Processors
• Economics driving a gold rush
– NPs will dramatically lower production costs for network systems
– A good NP design potentially worth lots of $$
• Result
– Wide variety of architectural experiments
– Wild rush to try yet another variation
CS490N -- Chapt. 11 23 2003
![Page 307: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/307.jpg)
An Interesting Observation
• System developed using ASICs
– High development cost ($1M)
– Lower cost to replicate
• System developed using network processors
– Lower development cost
– Higher cost to replicate
• Conclusion: amortized cost favors ASICs for most high-volume systems
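The conclusion follows from simple amortization: total cost is development cost plus per-unit cost times volume. The sketch below works through the break-even point. Apart from the $1M ASIC development cost on the slide, every figure (`ASIC_UNIT`, `NP_DEV`, `NP_UNIT`) is a hypothetical value chosen only for illustration.

```python
def total_cost(development_cost, unit_cost, units):
    """Total cost of developing a design and building `units` copies."""
    return development_cost + unit_cost * units


def break_even_units(dev_a, unit_a, dev_b, unit_b):
    """Volume at which design A (high development, low unit cost)
    costs the same in total as design B (low development, high unit cost)."""
    return (dev_a - dev_b) / (unit_b - unit_a)


# Hypothetical figures; only the $1M ASIC development cost is from the slide.
ASIC_DEV, ASIC_UNIT = 1_000_000, 50
NP_DEV, NP_UNIT = 200_000, 250

crossover = break_even_units(ASIC_DEV, ASIC_UNIT, NP_DEV, NP_UNIT)
```

Under these assumed numbers, the network processor design is cheaper below the crossover volume and the ASIC design is cheaper above it, which is the sense in which amortized cost favors ASICs at high volume.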
CS490N -- Chapt. 11 24 2003
![Page 308: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/308.jpg)
Summary
• Third generation network systems have embedded processor on each NIC
• Network processor is programmable chip with facilities to process packets faster than conventional processor
• Primary motivation is economic
– Lower development cost than ASICs
– Higher processing rates than conventional processor
CS490N -- Chapt. 11 25 2003
![Page 309: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/309.jpg)
Questions?
![Page 310: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/310.jpg)
XII
The Complexity Of Network Processor Design
CS490N -- Chapt. 12 1 2003
![Page 311: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/311.jpg)
How Should A Network Processor Be Designed?
• Depends on
– Operations network processor will perform
– Role of network processor in overall system
CS490N -- Chapt. 12 2 2003
![Page 312: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/312.jpg)
Goals
• Generality
– Sufficient for all protocols
– Sufficient for all protocol processing tasks
– Sufficient for all possible networks
• High speed
– Scale to high bit rates
– Scale to high packet rates
• Elegance
– Minimality, not merely comprehensiveness
CS490N -- Chapt. 12 3 2003
![Page 313: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/313.jpg)
The Key Point
A network processor is not designed to process a specific protocol or part of a protocol. Instead, designers seek a minimal set of instructions that are sufficient to handle an arbitrary protocol processing task at high speed.
CS490N -- Chapt. 12 4 2003
![Page 314: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/314.jpg)
Network Processor Design
• To understand network processors, consider problem to be solved
– Protocols being implemented
– Packet processing tasks
CS490N -- Chapt. 12 5 2003
![Page 315: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/315.jpg)
Packet Processing Functions
• Error detection and correction
• Traffic measurement and policing
• Frame and protocol demultiplexing
• Address lookup and packet forwarding
• Segmentation, fragmentation, and reassembly
• Packet classification
• Traffic shaping
• Timing and scheduling
• Queueing
• Security: authentication and privacy
CS490N -- Chapt. 12 6 2003
![Page 316: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/316.jpg)
Questions
• Does our list of functions encompass all protocol processing?
• Which function(s) are most important to optimize?
• How do the functions map onto hardware units in a typical network system?
• Which hardware units in a network system can be replaced with network processors?
• What minimal set of instructions is sufficiently general to implement all functions?
CS490N -- Chapt. 12 7 2003
![Page 317: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/317.jpg)
Division Of Functionality
• Partition problem to reduce complexity
• Basic division into two parts
• Functions applied when packet arrives known as ingress processing
• Functions applied when packet leaves known as egress processing
CS490N -- Chapt. 12 8 2003
![Page 318: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/318.jpg)
Ingress Processing
• Security and error detection
• Classification or demultiplexing
• Traffic measurement and policing
• Address lookup and packet forwarding
• Header modification and transport splicing
• Reassembly or flow termination
• Forwarding, queueing, and scheduling
CS490N -- Chapt. 12 9 2003
![Page 319: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/319.jpg)
Egress Processing
• Addition of error detection codes
• Address lookup and packet forwarding
• Segmentation or fragmentation
• Traffic shaping
• Timing and scheduling
• Queueing and buffering
• Output security processing
CS490N -- Chapt. 12 10 2003
![Page 320: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/320.jpg)
Illustration Of Packet Flow
[Figure: packets arrive at the PHYSICAL INTERFACE, undergo Ingress Processing, and cross the FABRIC; packets then undergo Egress Processing and leave through the physical interface.]
Ingress Processing
• Error and security checking
• Classification or demultiplexing
• Traffic measurement and policing
• Address lookup and packet forwarding
• Header modification and transport splicing
• Reassembly or flow termination
• Forwarding, queueing, and scheduling
Egress Processing
• Addition of error detection codes
• Address lookup and packet forwarding
• Segmentation or fragmentation
• Traffic shaping
• Timing and scheduling
• Queueing and buffering
• Output security processing
CS490N -- Chapt. 12 11 2003
![Page 321: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/321.jpg)
A Note About Scalability
Unlike a conventional processor, scalability is essential for network processors. To achieve maximum scalability, a network processor offers a variety of special-purpose functional units, allows parallel or pipelined execution, and operates in a distributed environment.
CS490N -- Chapt. 12 12 2003
![Page 323: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/323.jpg)
How Will Network Processors Be Used?
• For ingress processing only?
• For egress processing only?
• For combination?
• Answer: No single role
CS490N -- Chapt. 12 13 2003
![Page 324: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/324.jpg)
Potential Architectural Roles For Network Processor
• Replacement for a conventional CPU
• Augmentation of a conventional CPU
• On the input path of a network interface card
• Between a network interface card and central interconnect
• Between central interconnect and an output interface
• On the output path of a network interface card
• Attached to central interconnect like other ports
CS490N -- Chapt. 12 14 2003
![Page 325: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/325.jpg)
An Interesting Potential Role For Network Processors
In addition to replacing elements of a traditional third generation architecture, network processors can be attached directly to a central interconnect and used to implement stages of a macroscopic data pipeline. The interconnect allows forwarding among stages to be optimized.
CS490N -- Chapt. 12 15 2003
![Page 326: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/326.jpg)
Conventional Processor Design
• Design an instruction set, S
• Build an emulator/simulator for S in software
• Build a compiler that translates into S
• Compile and emulate example programs
• Compare results to
– Extant processors
– Alternative designs
CS490N -- Chapt. 12 16 2003
![Page 327: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/327.jpg)
Network Processor Emulation
• Can emulate low-level logic (e.g., Verilog)
• Software implementation
– Slow
– Cannot handle real packet traffic
• FPGA implementation
– Expensive and time-consuming
– Difficult to make major changes
CS490N -- Chapt. 12 17 2003
![Page 328: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/328.jpg)
Network Processor Design
• Unlike conventional processor design
• No existing code base
• No prior hardware experience
• Each design differs
CS490N -- Chapt. 12 18 2003
![Page 329: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/329.jpg)
Hardware And Software Design
Because a network processor includes many low-level hardware details that require specialized software, the hardware and software designs are codependent; software for a network processor must be created along with the hardware.
CS490N -- Chapt. 12 19 2003
![Page 330: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/330.jpg)
Summary
• Protocol processing divided into ingress and egress operations
• Network processor design is challenging because
– Desire generality and efficiency
– No existing code base
– Software designs evolving with hardware
CS490N -- Chapt. 12 20 2003
![Page 331: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/331.jpg)
Questions?
![Page 332: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/332.jpg)
XVI
Languages Used For Classification
CS490N -- Chapt. 16 1 2003
![Page 333: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/333.jpg)
Languages For Building A Network Stack
• Many questions
– What language(s) are best?
– How should processing be expressed?
– How important is efficiency?
– Must code be optimized by hand?
CS490N -- Chapt. 16 2 2003
![Page 334: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/334.jpg)
Possible Programming Paradigms
• Declarative
• Imperative
– With explicit parallelism
– With implicit parallelism
CS490N -- Chapt. 16 3 2003
![Page 335: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/335.jpg)
Choices Depend On
• Underlying hardware architecture
• Data and packet rates
• Software support available (e.g., compilers)
CS490N -- Chapt. 16 4 2003
![Page 336: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/336.jpg)
Desiderata
• High-level
• Declarative
• Data-oriented
• Efficient
• General or extensible
• Explicit links to actions
CS490N -- Chapt. 16 5 2003
![Page 337: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/337.jpg)
Possible Hardware Architectures
• RISC processor dedicated to classification
• RISC processor shared with other tasks
• Multiple, parallel RISC processors
• State machine or FPGA
CS490N -- Chapt. 16 6 2003
![Page 338: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/338.jpg)
Example Classification Languages
• We will consider two examples
– NCL
– FPL
CS490N -- Chapt. 16 7 2003
![Page 339: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/339.jpg)
NCL
• Stands for Network Classification Language
• Developed by Intel
• Runs on StrongARM
• Not part of fast path
CS490N -- Chapt. 16 8 2003
![Page 341: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/341.jpg)
NCL Characteristics
• High level
• Mixes declarative and imperative paradigms
• Associates action with each classification
• Optimized for protocols with fixed-size header fields
• Basic unit of data is eight-bit byte
CS490N -- Chapt. 16 9 2003
![Page 342: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/342.jpg)
NCL Notation
• Item specified by tuple
– Offset beyond a base
– Size
• Syntax:
base [ offset : size ]
• Example
– Two-octet field fourteen octets beyond symbol FRAME
FRAME [ 14 : 2 ]
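As a rough analogy for the base [ offset : size ] notation, the Python sketch below reads a field of `size` octets located `offset` octets past the start of a buffer. The helper name `field`, the sample frame contents, and the choice of network (big-endian) byte order for the integer value are assumptions made for illustration; they are not part of NCL.

```python
def field(base: bytes, offset: int, size: int) -> int:
    """Integer value of the `size`-octet field that begins `offset`
    octets past the start of `base` (interpreted in network byte order)."""
    return int.from_bytes(base[offset:offset + size], "big")


# Hypothetical frame: 12 octets of addresses, a 2-octet type field,
# then the first four octets of a payload.
FRAME = bytes(12) + b"\x08\x00" + b"\x45\x00\x00\x54"

type_field = field(FRAME, 12, 2)      # analogous to FRAME [ 12 : 2 ]
first_payload = field(FRAME, 14, 2)   # analogous to FRAME [ 14 : 2 ]
```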
CS490N -- Chapt. 16 10 2003
![Page 343: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/343.jpg)
NCL Measures Of Data Size
• Octets
– Default
– Syntax consists of integers
– Usually more efficient
CS490N -- Chapt. 16 11 2003
![Page 344: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/344.jpg)
NCL Measures Of Data Size (continued)
• Bits
– Can specify arbitrary bit position / length
– Syntax uses less-than and greater-than
– Usually less efficient
– Example of a four-bit string six bits beyond FRAME
FRAME [ <6> : <4> ]
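Bit-granular access can be sketched the same way. The helper `bit_field` below is hypothetical, and it assumes that bit offsets count from the most significant bit of the first octet, which is the usual convention for protocol header diagrams; the slides do not state which numbering NCL uses.

```python
def bit_field(base: bytes, bit_offset: int, bit_size: int) -> int:
    """Value of the `bit_size`-bit string that starts `bit_offset` bits
    past the start of `base`, counting from the most significant bit."""
    total_bits = len(base) * 8
    shift = total_bits - bit_offset - bit_size
    if bit_offset < 0 or bit_size <= 0 or shift < 0:
        raise ValueError("bit field lies outside the buffer")
    value = int.from_bytes(base, "big")
    return (value >> shift) & ((1 << bit_size) - 1)


# Analogous to FRAME [ <6> : <4> ]: four bits starting six bits in.
sample = bytes([0b00000011, 0b11000000])
four_bits = bit_field(sample, 6, 4)
```

The extra shifting and masking hint at why the slide calls bit access usually less efficient than octet access.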
CS490N -- Chapt. 16 12 2003
![Page 345: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/345.jpg)
NCL Comments
• C-style
/* ... */
• C++-style
// ...
CS490N -- Chapt. 16 13 2003
![Page 346: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/346.jpg)
NCL Primitives
• Can name header fields
• Names defined by offset / length
• No predefined protocol specifications
– NCL does not understand IP, TCP, Ethernet, etc.
• No predefined network data types
– NCL does not understand IP addresses, Ethernet addresses, etc.
CS490N -- Chapt. 16 14 2003
![Page 347: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/347.jpg)
Example NCL Code
protocol ip { /* declaration of datagram header */
    vershl { ip[0:1] }
    vers { ( vershl & 0xf0 ) >> 4 }
    tmphl { ( vershl & 0x0f ) }
    hlen { tmphl << 2 }
    totlen { ip[2:2] }
    ident { ip[4:2] }
    frags { ip[6:2] }
    ttl { ip[8:1] }
    proto { ip[9:1] }
    cksum { ip[10:2] }
    source { ip[12:4] }
    dest { ip[16:4] }
    demux {
        ( proto == 6 ) { tcp at hlen }
        ( proto == 17 ) { udp at hlen }
        ( default ) { ipunknown at hlen }
        // note: other protocol types can be added here
    }
} // end of datagram header declaration
CS490N -- Chapt. 16 15 2003
![Page 348: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/348.jpg)
IP Header Length
• Found in second four bits of IP header
• Gives header size in 32-bit multiples
• NCL can
– Define fields on byte boundaries
– Extract bit fields
– Perform basic arithmetic
CS490N -- Chapt. 16 16 2003
![Page 349: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/349.jpg)
Example NCL Code To Extract Header Length
vershl { ip[0:1] }
vers { ( vershl & 0xf0 ) >> 4 }
tmphl { ( vershl & 0x0f ) }
hlen { tmphl << 2 }
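The same arithmetic, written as a Python sketch for readers who want to trace it by hand. The function name `ip_version_and_hlen` is hypothetical; the masks and shifts mirror the NCL fields above, and the left shift by two converts a count of 32-bit words into octets.

```python
def ip_version_and_hlen(ip: bytes) -> tuple[int, int]:
    """Return (version, header length in octets) from the first octet
    of an IP header, mirroring the vershl / vers / tmphl / hlen fields."""
    vershl = ip[0]                 # ip[0:1]
    vers = (vershl & 0xF0) >> 4    # high four bits
    tmphl = vershl & 0x0F          # low four bits: length in 32-bit words
    hlen = tmphl << 2              # multiply by 4 to get octets
    return vers, hlen


no_options = ip_version_and_hlen(bytes([0x45]) + bytes(19))
one_option_word = ip_version_and_hlen(bytes([0x46]) + bytes(23))
```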
CS490N -- Chapt. 16 17 2003
![Page 350: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/350.jpg)
Encapsulation
• Needed for layered protocols
• Type field from layer N specifies type of message encapsulated at layer N + 1
• Specified using at keyword
CS490N -- Chapt. 16 18 2003
![Page 351: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/351.jpg)
Example NCL Code For Encapsulation
( proto == 6 ) { tcp at hlen }
( proto == 17 ) { udp at hlen }
( default ) { ipunknown at hlen }
• If IP protocol is 6, tcp
• If IP protocol is 17, udp
• Other cases classified as ipunknown
CS490N -- Chapt. 16 19 2003
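The demultiplexing cases for encapsulation can be approximated in Python as a table lookup keyed on the protocol field, with the encapsulated message starting `hlen` octets into the datagram. The names `DEMUX` and `demux`, and the sample datagram, are hypothetical; this is a sketch of the idea, not of how an NCL compiler implements it.

```python
# Protocol numbers from the NCL example: 6 is TCP, 17 is UDP.
DEMUX = {6: "tcp", 17: "udp"}


def demux(ip: bytes) -> tuple[str, bytes]:
    """Classify the message carried in an IP datagram and return
    (classification, encapsulated bytes starting at offset hlen)."""
    hlen = (ip[0] & 0x0F) << 2     # header length in octets
    proto = ip[9]                  # ip[9:1]
    return DEMUX.get(proto, "ipunknown"), ip[hlen:]


# Hypothetical datagram: 20-octet header, protocol 17, short payload.
header = bytearray(20)
header[0] = 0x45
header[9] = 17
datagram = bytes(header) + b"payload"
result = demux(datagram)
```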
![Page 352: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/352.jpg)
Actions
• Refers to imperative execution
• Names function to execute
• Associated with each classification
• Specified by rule statement
CS490N -- Chapt. 16 20 2003
![Page 353: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/353.jpg)
Example NCL Code To Invoke An Action
rule web_filter { ip && tcp.dport == 80 } { web_proc(ip.src) }
• Specifies
– Name is web_filter
– Predicate is ip && tcp.dport == 80
– Action to be invoked is web_proc(ip.src)
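One way to picture the name / predicate / action triple is the Python sketch below. Everything in it is an illustrative assumption: packets are modeled as dictionaries of already-parsed headers, and `Rule`, `run_rules`, and the body of `web_proc` are hypothetical stand-ins, since the slides do not show what web_proc does.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    predicate: Callable[[dict], bool]   # declarative part: when to fire
    action: Callable[[dict], object]    # imperative part: what to run


def run_rules(rules, pkt):
    """Invoke the action of every rule whose predicate holds for pkt."""
    return [(r.name, r.action(pkt)) for r in rules if r.predicate(pkt)]


def web_proc(src):
    # Hypothetical action body: simply report the source address.
    return src


web_filter = Rule(
    name="web_filter",
    predicate=lambda p: "ip" in p and p.get("tcp", {}).get("dport") == 80,
    action=lambda p: web_proc(p["ip"]["src"]),
)

fired = run_rules([web_filter], {"ip": {"src": "10.0.0.1"}, "tcp": {"dport": 80}})
```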
CS490N -- Chapt. 16 21 2003
![Page 354: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/354.jpg)
Intrinsic Functions
• Built into language
• Handle checksum
– Verification
– Calculation
• Understand specific protocols
CS490N -- Chapt. 16 22 2003
![Page 355: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/355.jpg)
Available Intrinsic Functions
Function Name      Purpose
ip.chksumvalid     Verifies validity of IP checksum
ip.genchksum       Generates an IP checksum
tcp.chksumvalid    Verifies validity of TCP checksum
tcp.genchksum      Generates a TCP checksum
udp.chksumvalid    Verifies validity of UDP checksum
udp.genchksum      Generates a UDP checksum
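For reference, the computation behind the IP checksum intrinsics is the standard Internet checksum (a 16-bit one's-complement sum). The Python sketch below shows that algorithm; the function names are hypothetical analogues of ip.chksumvalid and ip.genchksum, not the NCL implementation, and the sample header is an arbitrary illustrative value.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement of the one's-complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ip_checksum_valid(header: bytes) -> bool:
    """A header that includes a correct checksum sums to zero."""
    return internet_checksum(header) == 0


def ip_gen_checksum(header: bytes) -> bytes:
    """Return a copy of `header` with octets 10-11 set to the checksum."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    checksum = internet_checksum(zeroed)
    return zeroed[:10] + checksum.to_bytes(2, "big") + zeroed[12:]


# Sample 20-octet header with the checksum field zeroed.
raw = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
filled = ip_gen_checksum(raw)
```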
CS490N -- Chapt. 16 23 2003
![Page 356: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/356.jpg)
Named Predicates
• Bind name to predicate
• Can be used later in program
• Associated with specific protocol
CS490N -- Chapt. 16 24 2003
![Page 357: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/357.jpg)
Example Predicate
predicate ip.fragmentable { ! ( (ip.frags >> 14) & 0x01 ) }
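Read as ordinary bit manipulation, the predicate tests the "do not fragment" bit of the 16-bit field at octets 6 and 7 of the IP header. A Python sketch of the same test follows; the function name is hypothetical and the field is assumed to be read in network byte order.

```python
def ip_fragmentable(ip: bytes) -> bool:
    """True when the do-not-fragment bit (bit 14 of the 16-bit frags
    field at octets 6-7) is clear."""
    frags = int.from_bytes(ip[6:8], "big")   # ip[6:2]
    return not ((frags >> 14) & 0x01)


def header_with_frags(frags: int) -> bytes:
    """Hypothetical helper: 20-octet header with only frags filled in."""
    return bytes(6) + frags.to_bytes(2, "big") + bytes(12)
```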
CS490N -- Chapt. 16 25 2003
![Page 358: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/358.jpg)
Conditional Rule Execution
• Specifies order for predicate testing
• Linearizes classification
• Helps optimize processing
• Example use: verify IP checksum before proceeding
CS490N -- Chapt. 16 26 2003
![Page 359: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/359.jpg)
Illustration Of NCL Conditional Rule Execution
predicate TCPvalid { tcp && tcp.cksumvalid }
with { ( TCPvalid ) {
predicate SynFound { (tcp.flags & 0x02) }
rule NewConnection { SynFound } { start_conn(tcp.dport) }
}
CS490N -- Chapt. 16 27 2003
![Page 360: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/360.jpg)
Incremental Protocol Definition
• Allows programmer to add definition of field to pre-existing protocol specification
• Works well with included files
• Useful when generic definition insufficient
• General form:
field protocol_name.field_name { definition }
• Example
field ip.flags { ip[6 : <3>] }
CS490N -- Chapt. 16 28 2003
![Page 361: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/361.jpg)
NCL Set Facility
• Table search mechanism
• Tables grow dynamically
• Up to seven keys (each key is 32 bits)
• Programmer gives table size
– Size must be power of two
– Set can grow larger than estimate
CS490N -- Chapt. 16 29 2003
![Page 362: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/362.jpg)
Example NCL Set Declaration
• To declare set my_addrs:
set my_addrs /* define a set of IP addresses */
< 1 > { /* number of keys in each entry */
size_hint { 256 }
}
CS490N -- Chapt. 16 30 2003
![Page 363: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/363.jpg)
Example NCL Set Declaration (continued)
• To search my_addrs:
search my_addrs.ip_src_lookup
(ip.source) /* search key is IP source addr. */
CS490N -- Chapt. 16 31 2003
![Page 364: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/364.jpg)
NCL Preprocessor
• Adopts syntax from C preprocessor
• Provides
– #include
– #ifndef
CS490N -- Chapt. 16 32 2003
![Page 365: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/365.jpg)
Questions?
![Page 366: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/366.jpg)
FPL
• Functional Programming Language
• Developed by Agere Systems
• Runs on Agere FPP
• In the fast path
CS490N -- Chapt. 16 34 2003
![Page 368: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/368.jpg)
FPL Characteristics
• High level
• Unusual, declarative language
• Follows pattern paradigm
• Designed for networking
• Differs from conventional syntactic pattern languages like SNOBOL or Awk
• Permits parallel evaluation
CS490N -- Chapt. 16 35 2003
![Page 369: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/369.jpg)
FPL Syntax
• C++ comments // I can’t understand this...
• C-like preprocessor (#define)
• Variety of constants
– Decimal
– Hexadecimal (begins with 0x)
– Bit string (begins with 0b)
– Dotted decimal IP addresses
– Dashed hexadecimal addresses
CS490N -- Chapt. 16 36 2003
![Page 370: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/370.jpg)
Two Pass Processing
• Fundamental idea in FPL
• Program contains two independent parts
• Incoming data processed by both parts
• Motivation: underlying interface hardware
CS490N -- Chapt. 16 37 2003
![Page 371: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/371.jpg)
Interface Hardware
• Divides each incoming packet into blocks of 64 octets
• Transfers one block at a time to FPP chip
• Sends additional information with each block
– Port number on which received
– Flags
* Hardware detected error (e.g., invalid CRC)
* Block is the first block of packet
* Block is the final block of packet
CS490N -- Chapt. 16 38 2003
![Page 372: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/372.jpg)
Pass 1 Processing (blocks)
• Program invoked once for each block
• Accommodates blocks from multiple interfaces
• Collects the blocks for each packet into a separate queue
• Forwards complete queue to Pass 2 when
– Final block of packet arrives
– Error detected and processing aborted
CS490N -- Chapt. 16 39 2003
![Page 373: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/373.jpg)
Collecting Blocks
• Blocks arrive asynchronously
– From arbitrary port
– In arbitrary order
• Typical trick: use incoming port number as queue ID
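The port-number trick can be sketched in Python as one queue per incoming port. The class `BlockCollector` and its method are hypothetical, and the sketch assumes that each port delivers the blocks of one packet in order and finishes a packet before starting the next; interleaving happens only between ports.

```python
from collections import defaultdict


class BlockCollector:
    """Reassemble packets from blocks, using the port number as queue ID."""

    def __init__(self):
        self.queues = defaultdict(list)

    def handle_block(self, port, data, eof, error=False):
        """Enqueue one block. Return (packet, error) once the queue is
        complete (final block or hardware error); otherwise return None."""
        self.queues[port].append(data)
        if eof or error:
            blocks = self.queues.pop(port)
            return b"".join(blocks), error
        return None


collector = BlockCollector()
first = collector.handle_block(1, b"ab", eof=False)    # port 1, not final
second = collector.handle_block(2, b"XY", eof=False)   # port 2 interleaves
third = collector.handle_block(1, b"cd", eof=True)     # port 1 completes
fourth = collector.handle_block(2, b"Z", eof=True)     # port 2 completes
```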
CS490N -- Chapt. 16 40 2003
![Page 374: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/374.jpg)
Pass 2 Processing (packets)
• Program invoked once for each packet
• Handles complete packet
• Receives
– Packet from Pass 1
– Status information for the packet
• Chooses disposition
– Forward packet for transmission
– Discard packet
CS490N -- Chapt. 16 41 2003
![Page 375: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/375.jpg)
Illustration Of FPL Processing
[Figure: inside the FPL program, a block arrives at first pass processing and is enqueued in the queue engine and memory; the complete packet is sent back to second pass processing, after which the packet leaves.]
• Every FPL program contains code for two passes
• Status information kept with packet
CS490N -- Chapt. 16 42 2003
![Page 376: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/376.jpg)
FPL Invocation
• Handled by hardware
• Pass 1 invoked when block arrives
• Pass 2 invoked when packet ready
• No explicit polling
CS490N -- Chapt. 16 43 2003
![Page 377: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/377.jpg)
Designating Passes
• Programmer specifies starting label for each pass
• Handled by language directives
• To designate first pass:
SETUP ROOT( starting_label )
• To designate second pass:
SETUP REPLAYROOT( starting_label )
CS490N -- Chapt. 16 44 2003
![Page 378: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/378.jpg)
Using A Pattern Match For Conditional Processing
• Fact 1: FPL does not provide a conditional statement
• Fact 2: FPL does provide matching among a set of patterns
• Trick: use pattern selection to emulate conditional execution
CS490N -- Chapt. 16 45 2003
![Page 379: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/379.jpg)
Example Conditional In First Pass Processing
• Hardware sets variable $framerEOF
– Value is 1 if block is final block of packet
– Value is 0 for other blocks
• Built-in functions used to enqueue block
– fQueue adds block to queue
– fQueueEof adds block to queue and marks queue ready for second pass
CS490N -- Chapt. 16 46 2003
![Page 380: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/380.jpg)
Example Conditional In First Pass Processing (continued)
• Conceptual algorithm for enqueuing a block:
if ( $framerEOF ) {
    fQueueEof( );
} else {
    fQueue( );
}
CS490N -- Chapt. 16 47 2003
![Page 381: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/381.jpg)
Pattern Used For Conditional
#include "fpp.fpl"
// Code for the first pass of a program that handles Ethernet packets
SETUP ROOT(HandleBlock);
// Pass 1
HandleBlock: EtherBlock($framerEOF:1);
EtherBlock: 0b0 fQueue(0:2, $portNumber:16, $offset:6, 0:1, 0:2);
EtherBlock: 0b1 fQueueEof(0:2, $portNumber:16, $offset:6, 0:1, 0:2, 0:24);
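The two EtherBlock alternatives act like a lookup keyed on the one-bit value of $framerEOF: whichever pattern matches determines which built-in runs. A rough Python analogy is a dispatch table. The functions below are hypothetical stand-ins that only report which branch was taken; the real fQueue and fQueueEof take the arguments shown in the listing above.

```python
def f_queue(block):
    """Stand-in for fQueue: add block to the queue."""
    return ("queued", block)


def f_queue_eof(block):
    """Stand-in for fQueueEof: add block and mark queue ready for pass 2."""
    return ("queued_eof", block)


# One entry per pattern alternative: 0b0 selects fQueue, 0b1 selects fQueueEof.
ETHER_BLOCK = {0b0: f_queue, 0b1: f_queue_eof}


def handle_block(block, framer_eof):
    """Select the alternative whose pattern equals the one-bit argument."""
    return ETHER_BLOCK[framer_eof](block)
```

No if statement appears in `handle_block`; the choice is made entirely by which key matches, which is the trick the slides describe.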
CS490N -- Chapt. 16 48 2003
![Page 382: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/382.jpg)
Code For Second Pass
// Code for the second pass of a program that classifies Ethernet frames
// Symbolic constants used for classifications
#define FRAMETYPEA 1 // ARP frames
#define FRAMETYPEI 2 // IP frames
#define FRAMETYPEO 3 // Other frames (not the above)
SETUP REPLAYROOT(HandleFrame);
// Pass 2
HandleFrame:
    fSkip(96) // skip past dest & src addresses
    cl = CLASS // compute class from frame type
    fSkipToEnd()
    fTransmit(cl:21, 0:16, 0:5, 0:6);
CLASS: 0x0806:16 fReturn(FRAMETYPEA);
CLASS: 0x0800:16 fReturn(FRAMETYPEI);
CLASS: BITS:16 fReturn(FRAMETYPEO);
CS490N -- Chapt. 16 49 2003
![Page 383: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/383.jpg)
Conceptual Pattern Match
• Program specifies sequence of patterns
• Pointer moves through packet
• Processing proceeds if bits in packet match pattern
CS490N -- Chapt. 16 50 2003
![Page 384: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/384.jpg)
Two Basic Types Of Pattern Functions
• Control function
– Sequence of items to match
– Processed in order
– Exactly one path
• Tree function
– Set of patterns
– Data in packet compared to all patterns
– Exactly one must match
CS490N -- Chapt. 16 51 2003
![Page 385: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/385.jpg)
Example Control Function
HandleFrame:
fSkip(96) // skip past dest & src addresses
cl = CLASS // compute class from frame type
fSkipToEnd()
fTransmit(cl:21, 0:16, 0:5, 0:6);
CS490N -- Chapt. 16 52 2003
![Page 386: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/386.jpg)
Example Tree Function
CLASS:0x0806:16 fReturn(FRAMETYPEA);
CLASS:0x0800:16 fReturn(FRAMETYPEI);
CLASS:BITS:16 fReturn(FRAMETYPEO);
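In conventional terms, the CLASS tree function maps the 16-bit frame type to a class, with BITS:16 serving as the default case. The Python sketch below combines it with the skip over the two 6-octet addresses performed by HandleFrame. The constant names come from the slides; the function `classify_frame` and the sample frames are hypothetical.

```python
FRAMETYPEA = 1   # ARP frames
FRAMETYPEI = 2   # IP frames
FRAMETYPEO = 3   # Other frames

# Explicit 16-bit patterns; anything else falls through to the default.
CLASS = {0x0806: FRAMETYPEA, 0x0800: FRAMETYPEI}


def classify_frame(frame: bytes) -> int:
    """Skip 96 bits (12 octets) of addresses, then classify the frame
    by its 16-bit type field."""
    frame_type = int.from_bytes(frame[12:14], "big")
    return CLASS.get(frame_type, FRAMETYPEO)


arp_frame = bytes(12) + b"\x08\x06" + bytes(46)
ip_frame = bytes(12) + b"\x08\x00" + bytes(46)
other_frame = bytes(12) + b"\x86\xdd" + bytes(46)
```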
CS490N -- Chapt. 16 53 2003
![Page 387: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/387.jpg)
Special Patterns
• fSkip
– Skips bits in input
• fSkipToEnd
– Skips to end of current packet
• fTransmit
– Sends packet to RSP chip
• BITS
– Matches arbitrary bits (default case)
• Note: second pass must match entire packet
CS490N -- Chapt. 16 54 2003
![Page 388: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/388.jpg)
Pattern Optimization
• FPL compiler optimizes patterns
• Typical optimization: eliminate common prefix
CS490N -- Chapt. 16 55 2003
![Page 389: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/389.jpg)
Example Pattern Optimization
IP: 0x4 0x5 fSkip(152) fReturn(IPOPTIONS0);
IP: 0x4 0x6 fSkip(160) fReturn(IPOPTIONS1);
IP: 0x4 0x7 fSkip(168) fReturn(IPOPTIONS2);
IP: 0x4 BITS:4 fReturn(IPUNKNOWN);
(a)
IP: 0x4 IPHDR;
IPHDR: 0x5 fSkip(152) fReturn(IPOPTIONS0);
IPHDR: 0x6 fSkip(160) fReturn(IPOPTIONS1);
IPHDR: 0x7 fSkip(168) fReturn(IPOPTIONS2);
IPHDR: BITS:4 fReturn(IPUNKNOWN);
(b)
CS490N -- Chapt. 16 56 2003
![Page 390: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/390.jpg)
IP Address Example
ipAddrMatch: *.*.*.* fReturn(0);
ipAddrMatch: 10.*.*.* fReturn(1);
ipAddrMatch: 128.10.*.* fReturn(2);
ipAddrMatch: 128.211.*.* fReturn(2);
ipAddrMatch: 128.210.*.* fReturn(3);
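One way to read the ipAddrMatch patterns is as an address-prefix table. The sketch below is Python, not FPL, and rests on two assumptions that the slide does not state: each "*" octet is treated as 8 wildcard bits, and when several patterns match, the most specific (longest prefix) one is taken. The function name ip_addr_match is illustrative only.

```python
import ipaddress

# (network, return value) pairs transcribed from the ipAddrMatch patterns.
RULES = [
    (ipaddress.ip_network("0.0.0.0/0"), 0),
    (ipaddress.ip_network("10.0.0.0/8"), 1),
    (ipaddress.ip_network("128.10.0.0/16"), 2),
    (ipaddress.ip_network("128.211.0.0/16"), 2),
    (ipaddress.ip_network("128.210.0.0/16"), 3),
]


def ip_addr_match(addr: str) -> int:
    """Return the value of the most specific pattern that matches addr."""
    ip = ipaddress.ip_address(addr)
    # Assumption: the longest matching prefix wins; *.*.*.* always matches.
    matches = [(net.prefixlen, value) for net, value in RULES if ip in net]
    return max(matches)[1]
```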
CS490N -- Chapt. 16 57 2003
![Page 391: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/391.jpg)
FPL Variables
Variable       Pass     Meaning
$framerSOF     1        Is the current block the first of a frame?
$ferr          1        Did the hardware framer detect an error?
$portNumber    1        Port number from which block arrived
$framerEOF     1        Is the current block the last of a frame?
$offset        1 or 2   Offset of data within a buffer
$currOffset    1 or 2   Current byte position being matched
$currLength    1 or 2   Length of item being processed
$pass          1 or 2   Pass being executed
$tag           1 or 2   Status value sent from pass 1 to pass 2
CS490N -- Chapt. 16 58 2003
![Page 392: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/392.jpg)
Dynamic Classification
• Can extend tree function at run-time
• Requires use of ASI
• Pattern converted to internal form
CS490N -- Chapt. 16 59 2003
![Page 393: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/393.jpg)
Summary
• Programming languages for classification are
– Special-purpose
– High-level
– Declarative
– Data-oriented
– Provide links to actions
• We examined two languages
– NCL, the Network Classification Language from Intel
– FPL, the Functional Programming Language from Agere
CS490N -- Chapt. 16 60 2003
![Page 394: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/394.jpg)
Questions?
![Page 395: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/395.jpg)
XVII
Design Tradeoffs And Consequences
CS490N -- Chapt. 17 1 2003
![Page 396: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/396.jpg)
Low Development Cost Vs. Performance
• The fundamental economic motivation
• ASIC costs $1M to develop
• Network processor costs programmer time
CS490N -- Chapt. 17 2 2003
![Page 397: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/397.jpg)
Programmability Vs. Processing Speed
• Programmable hardware is slower
• Flexibility costs...
CS490N -- Chapt. 17 3 2003
![Page 398: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/398.jpg)
Speed Vs. Functionality
• Generic idea:
– Processor with most functionality is slowest
– Adding functionality to NP lowers its overall “speed”
CS490N -- Chapt. 17 4 2003
![Page 399: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/399.jpg)
Speed
• Difficult to define
• Can include
– Packet Rate
– Data Rate
– Burst size
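Packet rate and data rate measure different things, and the gap is largest for small packets. As a worked illustration (an example chosen here, not taken from the slides), the Python sketch below computes the maximum packet rate on an Ethernet link, assuming standard framing overhead of an 8-byte preamble and a 12-byte interframe gap per frame. The function name is hypothetical.

```python
def ethernet_packet_rate(link_bps: float, frame_bytes: int) -> float:
    """Packets per second on an Ethernet link carrying back-to-back frames."""
    preamble_bytes = 8         # preamble plus start-of-frame delimiter
    interframe_gap_bytes = 12  # minimum idle time between frames, in byte times
    bits_on_wire = (frame_bytes + preamble_bytes + interframe_gap_bytes) * 8
    return link_bps / bits_on_wire
```

At 100 Mbps, minimum-size 64-byte frames arrive at roughly 148,809 packets per second, while maximum-size 1518-byte frames arrive at roughly 8,127 packets per second, so a system that handles the data rate with large packets may still fall behind on per-packet work when packets are small.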
CS490N -- Chapt. 17 5 2003
![Page 400: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/400.jpg)
Per-Interface Rates Vs. Aggregate Rates
• Per-interface rate important if
– Physical connections form bottleneck
– System scales by having faster interfaces
• Aggregate rate important if
– Fabric forms bottleneck
– System scales by having more interfaces
CS490N -- Chapt. 17 6 2003
![Page 401: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/401.jpg)
Increasing Processing Speed Vs. Increasing Bandwidth
Will network processor capabilities or the bandwidth of network connections increase more rapidly?
• What is the effect of more transistors?
• Does Moore’s Law apply to bandwidth?
CS490N -- Chapt. 17 7 2003
![Page 402: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/402.jpg)
Lookaside Coprocessors Vs. Flow-Through Coprocessors
• Flow-through pipeline
– Operates at wire speed
– Difficult to change
• Lookaside
– Modular and easy to change
– Invocation can be bottleneck
CS490N -- Chapt. 17 8 2003
![Page 403: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/403.jpg)
Uniform Pipeline Vs. Synchronized Pipeline
• Uniform pipeline
– Operates in lock-step like assembly line
– Each stage must finish in exactly the same time
• Synchronized pipeline
– Buffers allow computation at each stage to differ
– Synchronization expensive
CS490N -- Chapt. 17 9 2003
![Page 404: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/404.jpg)
Explicit Parallelism Vs. Cost And Programmability
• Explicit parallelism
– Hardware is less complex
– More difficult to program
• Implicit parallelism
– Easier to program
– Slightly lower performance
CS490N -- Chapt. 17 10 2003
![Page 405: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/405.jpg)
Parallelism Vs. Strict Packet Ordering
• Increased parallelism
– Improves performance
– Results in out-of-order packets
• Strict packet ordering
– Aids protocols such as TCP
– Can nullify use of parallelism
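One common way to reconcile the two goals is to tag each packet with a sequence number on arrival and hold completed packets until all earlier ones have finished. The Python sketch below illustrates that idea only; the slides do not prescribe it, and the class and method names are hypothetical.

```python
import heapq


class ReorderBuffer:
    """Release packets in arrival order even if workers finish out of order."""

    def __init__(self):
        self._next_seq = 0   # sequence number of the next packet to release
        self._pending = []   # min-heap of (sequence number, packet)

    def complete(self, seq, packet):
        """Record a finished packet; return the packets that can now leave."""
        heapq.heappush(self._pending, (seq, packet))
        released = []
        while self._pending and self._pending[0][0] == self._next_seq:
            released.append(heapq.heappop(self._pending)[1])
            self._next_seq += 1
        return released
```

The cost is visible in the sketch: a packet that finishes early waits in the buffer for a slow predecessor, which is how strict ordering can cancel part of the gain from parallel processing.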
CS490N -- Chapt. 17 11 2003
![Page 406: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/406.jpg)
Stateful Classification Vs. High-Speed Parallel Classification
• Static classification
– Keeps no state
– Is the fastest
• Dynamic classification
– Keeps state
– Requires synchronization for updates
CS490N -- Chapt. 17 12 2003
![Page 407: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/407.jpg)
Memory Speed Vs. Programmability
• Separate memory banks
– Allow parallel accesses
– Yield high performance
– Difficult to program
• Non-banked memory
– Easier to program
– Lower performance
CS490N -- Chapt. 17 13 2003
![Page 408: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/408.jpg)
I/O Performance Vs. Pin Count
• Bus width
– Increase to produce higher throughput
– Decrease to take fewer pins
CS490N -- Chapt. 17 14 2003
![Page 409: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/409.jpg)
Programming Languages
• A three-way tradeoff
• Can have two, but not three of
– Ease of programming
– Functionality
– Performance
CS490N -- Chapt. 17 15 2003
![Page 410: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/410.jpg)
Programming Languages That Offer High Functionality
• Ease of programming vs. speed
– High-level language offers ease of programming, but lower performance
– Low-level language offers higher performance, but makes programming more difficult
CS490N -- Chapt. 17 16 2003
![Page 411: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/411.jpg)
Programming Languages That Offer Ease Of Programming
• Speed vs. functionality
– For restricted language, compiler can generate optimized code
– Broad functionality and ease of programming lead to inefficient code
CS490N -- Chapt. 17 17 2003
![Page 412: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/412.jpg)
Programming Languages That Offer High Performance
• Ease of programming vs. functionality
– Optimizing compiler and ease of programming imply a restricted application
– Optimizing code for general applications requires more programmer effort
CS490N -- Chapt. 17 18 2003
![Page 413: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/413.jpg)
Multithreading: Throughput Vs. Ease Of Programming
• Multiple threads of control can increase throughput
• Planning the operation of threads that exhibit less contention requires more programmer effort
CS490N -- Chapt. 17 19 2003
![Page 414: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/414.jpg)
Traffic Management Vs. High-Speed Forwarding
• Traffic management
– Can manage traffic on multiple, independent flows
– Requires extra processing
• Blind forwarding
– Performed at highest speed
– Does not distinguish among flows
CS490N -- Chapt. 17 20 2003
![Page 415: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/415.jpg)
Generality Vs. Specific Architectural Role
• General-purpose network processor
– Used in any part of any system
– Used with any protocol
– More expensive
• Special-purpose network processor
– Restricted to one role / protocol
– Less expensive, but may need many types
CS490N -- Chapt. 17 21 2003
![Page 416: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/416.jpg)
Special-Purpose Memory Vs. General-Purpose Memory
• General-purpose memory
– Single type of memory serves all needs
– May not be optimal for any use
• Special-purpose memory
– Optimized for one use
– May require multiple memory types
CS490N -- Chapt. 17 22 2003
![Page 417: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/417.jpg)
Backward Compatibility Vs. Architectural Advances
• Backward compatibility
– Keeps same instruction set through multiple versions
– May not provide maximal performance
• Architectural advances
– Allows more optimizations
– Difficult for programmers
CS490N -- Chapt. 17 23 2003
![Page 418: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/418.jpg)
Parallelism Vs. Pipelining
• Both are fundamental performance techniques
• Usually used in combination: pipeline of parallel processors
– How long is pipeline?
– How much parallelism at each stage?
CS490N -- Chapt. 17 24 2003
![Page 419: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/419.jpg)
Summary
• Many design tradeoffs
• No easy answers
CS490N -- Chapt. 17 25 2003
![Page 420: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/420.jpg)
Questions?
![Page 421: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/421.jpg)
XVIII
Overview Of The Intel Network Processor
CS490N -- Chapt. 18 1 2003
![Page 422: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/422.jpg)
An Example Network Processor
• We will
– Choose one example
– Examine the hardware
– Gain first-hand experience with software
• The choice: Intel
CS490N -- Chapt. 18 2 2003
![Page 423: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/423.jpg)
Intel Network Processor Terminology
• Intel Exchange Architecture (IXA)
– Broad reference to architecture
– Both hardware and software
– Control plane and data plane
• Intel Exchange Processor (IXP)
– Network processor that implements IXA
CS490N -- Chapt. 18 3 2003
![Page 424: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/424.jpg)
Intel IXP1200
• Name of first generation IXP chip
• Four models available
Model     Pin     Support   Support   Possible Clock
Number    Count   For CRC   For ECC   Rates (in MHz)
IXP1200   432     no        no        166, 200, or 232
IXP1240   432     yes       no        166, 200, or 232
IXP1250   520     yes       yes       166, 200, or 232
IXP1250   520     yes       yes       166 only
• Differences in speed, power consumption, packaging
• Term IXP1200 refers to any model
CS490N -- Chapt. 18 4 2003
![Page 425: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/425.jpg)
IXP1200 Features
• One embedded RISC processor
• Six programmable packet processors
• Multiple, independent onboard buses
• Processor synchronization mechanisms
• Small amount of onboard memory
• One low-speed serial line interface
• Multiple interfaces for external memories
• Multiple interfaces for external I/O buses
• A coprocessor for hash computation
• Other functional units
CS490N -- Chapt. 18 5 2003
![Page 426: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/426.jpg)
IXP1200 External Connections
[Figure: IXP1200 chip and its external connections. Labels: serial line; PCI bus (optional host connection); IX bus (high-speed I/O bus); SRAM bus to SRAM, FLASH, and memory-mapped I/O; SDRAM bus to SDRAM.]
CS490N -- Chapt. 18 6 2003
![Page 427: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/427.jpg)
IXP1200 External Connection Speeds
Type          Bus Width       Clock Rate   Data Rate
Serial line   (NA)            (NA)         38.4 Kbps
PCI bus       32 bits         33-66 MHz    2.2 Gbps
IX bus        64 bits         66-104 MHz   4.4 Gbps
SDRAM bus     64 bits         ≤ 232 MHz    928.0 MBps
SRAM bus      16 or 32 bits   ≤ 232 MHz    464.0 MBps
• Note: MBps abbreviates Mega Bytes per second
CS490N -- Chapt. 18 7 2003
![Page 428: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/428.jpg)
IXP1200 Internal Units
Quantity   Component                   Purpose
1          Embedded RISC processor     Control, higher layer protocols, and exceptions
6          Packet processing engines   I/O and basic packet processing
1          SRAM access unit            Coordinate access to the external SRAM bus
1          SDRAM access unit           Coordinate access to the external SDRAM bus
1          IX bus access unit          Coordinate access to the external IX bus
1          PCI bus access unit         Coordinate access to the external PCI bus
several    Onboard buses               Internal control and data transfer
CS490N -- Chapt. 18 8 2003
![Page 429: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/429.jpg)
IXP1200 Internal Architecture
[Figure: IXP1200 internal architecture. On the chip: embedded RISC processor (StrongARM), Microengines 1 through 6, scratch memory, SRAM access unit, SDRAM access unit, PCI access unit, and IX access unit, joined by multiple, independent internal buses. External connections: serial line; PCI bus (optional host connection); IX bus (high-speed I/O bus); SRAM bus to SRAM, FLASH, and memory-mapped I/O; SDRAM bus to SDRAM.]
CS490N -- Chapt. 18 9 2003
![Page 430: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/430.jpg)
Processors On The IXP1200
Processor Type              Onboard?   Programmable?
General Purpose Processor   no         yes
Embedded RISC Processor     yes        yes
Microengines                yes        yes
Coprocessors                yes        no
Physical Interfaces         no         no
CS490N -- Chapt. 18 10 2003
![Page 431: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/431.jpg)
IXP1200 Memory Hierarchy
Memory Type    Maximum Size   On Chip?   Typical Use
GP Registers   128 regs.      yes        Intermediate computation
Inst. Cache    16 Kbytes      yes        Recently used instructions
Data Cache     8 Kbytes       yes        Recently used data
Mini Cache     512 bytes      yes        Data that is reused once
Write buffer   unspecified    yes        Write operation buffer
Scratchpad     4 Kbytes       yes        IPC and synchronization
Inst. Store    64 Kbytes      yes        Microengine instructions
FlashROM       8 Mbytes       no         Bootstrap
SRAM           8 Mbytes       no         Tables or packet headers
SDRAM          256 Mbytes     no         Packet storage
CS490N -- Chapt. 18 11 2003
![Page 432: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/432.jpg)
IXP1200 Memory Characteristics
Memory Type   Addressable Data Unit (bytes)   Relative Access Time   Special Features
Scratch       4                               12 - 14                synchronization via test-and-set and other bit manipulation, atomic increment
SRAM          4                               16 - 20                queue manipulation, bit manipulation, read/write locks
SDRAM         8                               32 - 40                direct transfer path to I/O devices
CS490N -- Chapt. 18 12 2003
![Page 433: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/433.jpg)
Memory Addressability
• Each memory specifies minimum addressable unit
– SRAM organized into 4-byte words
– SDRAM organized into 8-byte longwords
• Physical address refers to word or longword
• Examples
– SDRAM address 1 refers to bytes 8 through 15
– SRAM address 1 refers to bytes 4 through 7
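The two examples follow from multiplying the physical address by the size of the addressable unit. A minimal Python sketch of that arithmetic is shown below; the table and function names are illustrative and are not part of any Intel interface.

```python
# Addressable unit sizes in bytes, as given on the slide.
UNIT_BYTES = {"SRAM": 4, "SDRAM": 8}


def byte_range(memory, address):
    """Return (first byte, last byte) covered by a physical address."""
    unit = UNIT_BYTES[memory]
    first = address * unit
    return first, first + unit - 1
```

For instance, byte_range("SDRAM", 1) gives (8, 15) and byte_range("SRAM", 1) gives (4, 7), matching the examples above.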
CS490N -- Chapt. 18 13 2003
![Page 434: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/434.jpg)
Example Of Complexity: SRAM Access Unit
[Figure: block diagram of the SRAM access unit. Labeled blocks: SRAM pin interface, AMBA bus interface, service priority arbitration, microengine address and command queues, AMBA address queues, command decoder and address generator, memory and FIFO, and data buffers. Inputs: AMBA from StrongARM, microengine commands, and microengine data. The pin side carries clock, signals, address, and data to SRAM, Flash (BootROM), and memory-mapped I/O.]
CS490N -- Chapt. 18 14 2003
![Page 435: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/435.jpg)
Summary
• We will use Intel IXP1200 as example
• IXP1200 offers
– Embedded processor plus parallel packet processors
– Connections to external memories and buses
CS490N -- Chapt. 18 15 2003
![Page 436: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/436.jpg)
Questions?
![Page 437: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/437.jpg)
XIX
Embedded RISC Processor (StrongARM Core)
CS490N -- Chapt. 19 1 2003
![Page 438: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/438.jpg)
StrongARM Role
[Figure: two configurations, (a) and (b), each showing a General-Purpose Processor (GPP), one or more IXP1200s containing embedded RISC processors, and physical interfaces.]
• (a) Single IXP1200
• (b) Multiple IXP1200s
• Role of StrongARM differs
CS490N -- Chapt. 19 2 2003
![Page 439: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/439.jpg)
Tasks That Can Be Performed By StrongARM
• Bootstrapping
• Exception handling
• Higher-layer protocol processing
• Interactive debugging
• Diagnostics and logging
• Memory allocation
• Application programs (if needed)
• User interface and/or interface to the GPP
• Control of packet processors
• Other administrative functions
CS490N -- Chapt. 19 3 2003
![Page 440: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/440.jpg)
StrongARM Characteristics
• Reduced Instruction Set Computer (RISC)
• Thirty-two bit arithmetic
• Vector floating point provided via a coprocessor
• Byte addressable memory
• Virtual memory support
• Built-in serial port
• Facilities for a kernelized operating system
CS490N -- Chapt. 19 4 2003
![Page 441: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/441.jpg)
Arithmetic
• StrongARM is configurable in two modes
– Big endian
– Little endian
• Choice made at run-time
CS490N -- Chapt. 19 5 2003
![Page 442: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/442.jpg)
StrongARM Memory Organization
• Single, uniform address space
• Includes memories and devices
• Byte addressable
CS490N -- Chapt. 19 6 2003
![Page 443: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/443.jpg)
StrongARM Address Space
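[Figure: StrongARM address space map with columns Address and Contents. Boundary addresses shown, from top to bottom: FFFF FFFF, C000 0000, B000 0000, A000 0000, 9000 0000, 8000 0000, 4000 0000, 0000 0000. Contents shown, from top to bottom: SDRAM Bus (SDRAM, Scratch Pad, FBI CSRs, Microengine xfer, Microengine CSRs); AMBA xfer; Reserved; System regs; Reserved; PCI Bus (PCI memory, PCI I/O, PCI config, Local PCI config); SRAM Bus (SlowPort, SRAM CSRs, Push/Pop cmd, Locks, BootROM).]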
CS490N -- Chapt. 19 7 2003
![Page 444: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/444.jpg)
Summary
• Embedded processor on IXP1200 is StrongARM
• StrongARM addressing
– Single, uniform address space
– Includes all memories
– Byte addressable
CS490N -- Chapt. 19 8 2003
![Page 445: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/445.jpg)
Questions?
![Page 446: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/446.jpg)
XXI
Reference System And Software Development Kit (Bridal Veil, SDK)
CS490N -- Chapt. 21 1 2003
![Page 447: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/447.jpg)
Reference System
• Provided by vendor
• Targeted at potential customers
• Usually includes
– Hardware testbed
– Development software
– Download and bootstrap software
– Reference implementations
CS490N -- Chapt. 21 2 2003
![Page 448: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/448.jpg)
Intel Reference Hardware
• Single-board network processor testbed
• Plugs into PCI bus on a PC
CS490N -- Chapt. 21 3 2003
![Page 449: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/449.jpg)
Intel Reference Hardware (continued)
Quantity or Size   Item
1                  IXP1200 network processor (232 MHz)
8 Mbytes           SRAM memory
256 Mbytes         SDRAM memory
8 Mbytes           Flash ROM memory
4                  10/100 Ethernet ports
1                  Serial interface (console)
1                  PCI bus interface
1                  PMC expansion site
CS490N -- Chapt. 21 4 2003
![Page 450: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/450.jpg)
Intel Reference Software
• Known as Software Development Kit (SDK)
• Runs on PC
• Includes:
Software          Purpose
C compiler        Compile programs for the StrongARM
MicroC compiler   Compile programs for the microengines
Assembler         Assemble programs for the microengines
Downloader        Load software into the network processor
Monitor           Communicate with the network processor and interact with running software
Bootstrap         Start the network processor running
Reference Code    Example programs for the IXP1200 that show how to implement basic functions
CS490N -- Chapt. 21 5 2003
![Page 451: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/451.jpg)
Basic Paradigm
• Build software on conventional computer
• Load into reference system
• Test / measure results
CS490N -- Chapt. 21 6 2003
![Page 452: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/452.jpg)
Bootstrapping Procedure
1. Powering on host causes network processor board to run boot monitor from Flash memory
2. Device driver on host communicates across the PCI bus with boot monitor to load operating system (Embedded Linux) and initial RAM disk configuration into network processor’s memory
3. Host signals boot monitor to start the StrongARM
4. Operating system runs login process on serial line and starts telnet server
5. Operating systems on host and StrongARM configure PCI bus to act as Ethernet emulator; the StrongARM uses NFS to mount two file systems, R and W, from a server on the host
CS490N -- Chapt. 21 7 2003
![Page 453: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/453.jpg)
Starting Software
1. Compile code for StrongARM and microengines
2. Create system configuration file named ixsys.config
3. Copy code and configuration file to read-only public download directory, R
4. Run terminal emulation program on host and log onto StrongARM
5. Change to NFS-mounted directory R, and run shell script ixstart with argument ixsys.config
6. Later, to stop the IXP1200, run ixstop script
CS490N -- Chapt. 21 8 2003
![Page 454: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/454.jpg)
Summary
• Reference systems
– Provided by vendor
– Targeted at potential customers
– Usually include
* Hardware testbed
* Cross-development software
* Download and bootstrap software
* Reference implementations
CS490N -- Chapt. 21 9 2003
![Page 455: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/455.jpg)
Questions?
![Page 456: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/456.jpg)
XXII
Programming Model (Intel ACE)
CS490N -- Chapt. 22 1 2003
![Page 457: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/457.jpg)
Active Computing Element (ACE)
• Defined by Intel’s SDK
• Not part of hardware
CS490N -- Chapt. 22 2 2003
![Page 458: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/458.jpg)
ACE Features
• Fundamental software building block
• Used to construct packet processing systems
• Runs on StrongARM, microengine, or host
• Handles control plane and fast or slow data path processing
• Coordinates and synchronizes with other ACEs
• Can have multiple inputs or outputs
• Can serve as part of a pipeline
CS490N -- Chapt. 22 3 2003
![Page 459: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/459.jpg)
ACE Terminology
• Library ACE
– Built by Intel
– Available with SDK
• Conventional ACE
– Built by Intel customers
– Can incorporate items from Action Service Libraries
• MicroACE
– Core component runs on StrongARM
– Microblock component runs on microengine
CS490N -- Chapt. 22 4 2003
![Page 460: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/460.jpg)
More Terminology
• Source microblock
– Handles ingress from I/O device
• Sink microblock
– Handles egress to I/O device
• Transform microblock
– Intermediate position in a pipeline
CS490N -- Chapt. 22 5 2003
![Page 461: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/461.jpg)
Four Conceptual Parts Of An ACE
• Initialization
• Classification
• Actions associated with each classification
• Message and event management
CS490N -- Chapt. 22 6 2003
![Page 462: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/462.jpg)
Output Targets And Late Binding
• ACE has set of outputs
• Each output given target name
• Outputs bound dynamically at run time
• Unbound target corresponds to packet discard
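The idea of named output targets with late binding can be illustrated with a toy model. The Python sketch below is not the Intel SDK interface; the class Ace and its methods are hypothetical names chosen to show that a target exists before it is bound, and that sending to an unbound target discards the packet.

```python
class Ace:
    """Toy ACE with named output targets that are bound at run time."""

    def __init__(self, name, targets):
        self.name = name
        self.bindings = {t: None for t in targets}  # None means unbound
        self.received = []

    def bind(self, target, destination):
        """Bind a named output target to another ACE at run time."""
        self.bindings[target] = destination

    def receive(self, packet):
        self.received.append(packet)

    def send(self, target, packet):
        """Deliver packet to the bound ACE; return False if it was discarded."""
        destination = self.bindings[target]
        if destination is None:
            return False  # unbound target: packet is discarded
        destination.receive(packet)
        return True
```

In this model an ingress ACE can be created with a target named "ip" and started before any IP ACE exists; packets sent to "ip" are dropped until bind("ip", ...) is called.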
CS490N -- Chapt. 22 7 2003
![Page 463: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/463.jpg)
Conceptual ACE Interconnection
[Figure: input ports, ingress ACE, process ACE, egress ACE, output ports, connected in that order.]
• Ingress ACE acts as source
• Process ACE acts as transform
• Egress ACE acts as sink
• Connections created at run time
CS490N -- Chapt. 22 8 2003
![Page 464: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/464.jpg)
Components Of MicroACE (review)
• Single ACE with two components
– StrongARM (core component)
– Microengines (microblock component)
• Communication possible between components
CS490N -- Chapt. 22 9 2003
![Page 465: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/465.jpg)
Division Of ACE Into Components
[Figure: on the microengine, input ports, ingress ACE (microblock), IP ACE (microblock), egress ACE (microblock), and output ports; on the StrongARM, ingress ACE (core), IP ACE (core), egress ACE (core), and the Stack ACE.]
• Microblocks form fast path
• Stack ACE runs entirely on StrongARM
CS490N -- Chapt. 22 10 2003
![Page 466: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/466.jpg)
Microblock Group
• Set of one or more microblocks
• Treated as single unit
• Loaded onto microengine for execution
• Can be replicated on multiple microengines for higher speed
CS490N -- Chapt. 22 11 2003
![Page 467: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/467.jpg)
Illustration Of Microblock Groups
[Figure: ingress ACE, IP ACE, and egress ACE microblocks arranged in groups on microengine 1 and microengine 2 between input ports and output ports; ingress ACE (core), IP ACE (core), egress ACE (core), and the Stack ACE on the StrongARM.]
• Entire microblock group assigned to same microengine
CS490N -- Chapt. 22 12 2003
![Page 468: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/468.jpg)
Replication Of Microblock Group
• Typically used for
– Ingress microblock group
– Egress microblock group
• Replications depend on number and speed of interfaces
• SDK software computes replications automatically
CS490N -- Chapt. 22 13 2003
![Page 469: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/469.jpg)
Illustration Of Replication
[Figure: two copies of the ingress microblock group, each containing an ingress ACE (microblock) and an IP ACE (microblock), plus an egress ACE (microblock), placed on microengines 1, 2, and 3 between input ports and output ports.]
• Ingress microblock group replicated on microengines 1 & 3
• Incoming packet taken by either copy
CS490N -- Chapt. 22 14 2003
![Page 470: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/470.jpg)
Microblock Structure
• Asynchronous model
• Programmer creates
– Initialization macro
– Dispatch loop
CS490N -- Chapt. 22 15 2003
![Page 471: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/471.jpg)
Dispatch Loop
• Determines disposition of each packet
• Uses return code to either
– Send packet to StrongARM
– Forward packet to “next” microblock
– Discard packet
CS490N -- Chapt. 22 16 2003
![Page 472: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/472.jpg)
Dispatch Loop Algorithm
Allocate global registers;
Initialize dispatch loop;
Initialize Ethernet devices;
Initialize ingress microblock;
Initialize IP microblock;
while (1) {
    Get next packet from input device(s);
    Invoke ingress microblock;
    if ( return code == 0 ) {
        Drop the packet;
    } else if ( return code == 1 ) {
        Send packet to ingress core component;
    } else { /* IP packet */
        Invoke IP microblock;
        if ( return code == 0 ) {
            Drop packet;
        } else if ( return code == 1 ) {
            Send packet to IP core component;
        } else {
            Send packet to egress microblock;
        }
    }
}
CS490N -- Chapt. 22 17 2003
![Page 473: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/473.jpg)
Arguments Passed To Processing Macro
• A buffer handle for a frame that contains a packet
• A set of state registers
– Contain information about the frame
– Can be modified
• A variable named dl_next_block
– Used to store return code
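As a rough illustration of how the return code drives the dispatch loop, the following C sketch models dl_next_block as an ordinary variable. The enum values, the frame_state structure, the 60-octet minimum, and the function names are hypothetical stand-ins chosen for this example; the real SDK code consists of microengine macros.

```c
#include <stdint.h>

/* Hypothetical dispositions; the SDK defines its own constants. */
enum disposition { DL_DROP = 0, DL_EXCEPTION = 1, DL_NEXT = 2 };

/* Stand-in for the state registers that describe a frame. */
struct frame_state {
    uint32_t buf_handle;   /* handle of the buffer holding the frame */
    uint16_t ether_type;   /* frame type taken from the header */
    uint16_t length;       /* frame length in octets */
};

static int dl_next_block;  /* return code set by each microblock */

/* Toy ingress microblock: classify a frame and set dl_next_block. */
static void ingress_microblock(struct frame_state *st)
{
    if (st->length < 60) {
        dl_next_block = DL_DROP;        /* too short: discard */
    } else if (st->ether_type == 0x0800) {
        dl_next_block = DL_NEXT;        /* IP: pass to next microblock */
    } else {
        dl_next_block = DL_EXCEPTION;   /* everything else: to the core */
    }
}

/* One pass of the dispatch loop; returns the disposition chosen. */
int dispatch_one(struct frame_state *st)
{
    ingress_microblock(st);
    return dl_next_block;
}
```

A caller would act on the value returned by dispatch_one in the same way the algorithm above tests the return code: drop, send to the core component, or invoke the next microblock.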
CS490N -- Chapt. 22 18 2003
![Page 474: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/474.jpg)
Packet Queues
• Placed between ACE components
• Buffer packets
• Permit asynchronous operation
CS490N -- Chapt. 22 19 2003
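A packet queue of the kind described above can be pictured as a small ring of buffer handles. The sketch below is an assumption made for illustration, not the SDK's queue: the names pkt_queue, pq_enqueue, and pq_dequeue are invented, and the code is single-threaded, so it ignores the atomic operations a queue shared between processors would need.

```c
#include <stdint.h>

#define PQ_SLOTS 8   /* capacity; assumed to be a power of two */

/* Fixed-size ring of buffer handles placed between two components. */
struct pkt_queue {
    uint32_t slot[PQ_SLOTS];
    unsigned head;   /* count of handles removed so far */
    unsigned tail;   /* count of handles inserted so far */
};

/* Enqueue a handle; returns 0 on success, -1 if the queue is full. */
int pq_enqueue(struct pkt_queue *q, uint32_t handle)
{
    if (q->tail - q->head == PQ_SLOTS)
        return -1;
    q->slot[q->tail % PQ_SLOTS] = handle;
    q->tail++;
    return 0;
}

/* Dequeue a handle; returns 0 on success, -1 if the queue is empty. */
int pq_dequeue(struct pkt_queue *q, uint32_t *handle)
{
    if (q->tail == q->head)
        return -1;
    *handle = q->slot[q->head % PQ_SLOTS];
    q->head++;
    return 0;
}
```

Because the producer only advances tail and the consumer only advances head, neither side waits for the other, which is the property that permits asynchronous operation.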
![Page 475: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/475.jpg)
Illustration Of Packet Queues
[Figure: packet queues connecting the input ports, the ingress, IP, and egress ACE microblocks (microengines), the ingress, IP, and egress ACE core components and the Stack ACE (StrongARM), and the output ports.]
CS490N -- Chapt. 22 20 2003
![Page 476: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/476.jpg)
Exceptions
• Packets passed from microblock to core component
• Mechanism
– Microcode sets dl_next_block to IX_EXCEPTION
– Dispatch loop forwards packet to core
– ACE tag used to identify corresponding component
– Exception handler is invoked in core component
CS490N -- Chapt. 22 21 2003
![Page 477: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/477.jpg)
Possible Actions Core Component Applies To Exception
• Consume the packet and free the buffer
• Modify the packet before sending it on
• Send the packet back to the microblock for further processing
• Forward the packet to another ACE on the StrongARM
CS490N -- Chapt. 22 22 2003
![Page 478: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/478.jpg)
Crosscall Mechanism
• Used between
– Core component of one ACE and another
– ACE core component and non-ACE application
• Not intended for packet transfer
• Operates like Remote Procedure Call (RPC)
• Mechanism known as crosscall
CS490N -- Chapt. 22 23 2003
![Page 479: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/479.jpg)
Crosscall Implementation
• Both caller and callee programmed to use crosscall
• Declaration given in Interface Definition Language (IDL)
• IDL compiler
– Reads specification
– Generates stubs that handle calling details
CS490N -- Chapt. 22 24 2003
![Page 480: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/480.jpg)
Crosscall Implementation (continued)
• Three types of crosscalls
– Deferred: caller does not block; return notification asynchronous
– Oneway: caller does not block; no value returned
– Twoway: caller blocks; callee returns a value
• Twoway call corresponds to traditional RPC
• Type of call determined at compile time
CS490N -- Chapt. 22 25 2003
![Page 481: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/481.jpg)
Twoway Calling And Blocking
Because the core component of an ACE is prohibited from blocking, an ACE cannot make a twoway call.
CS490N -- Chapt. 22 26 2003
![Page 482: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/482.jpg)
Summary
• Intel SDK uses ACE programming model
• ACE
– Basic unit of computation
– Can include code for StrongARM (core) and microengines (microblock)
• Packet queues used to pass packets between ACEs
• Crosscall mechanism used for nonpacket communication
CS490N -- Chapt. 22 27 2003
![Page 483: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/483.jpg)
Questions?
![Page 484: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/484.jpg)
XXIII
ACE Run-Time Structure And StrongARM Facilities
CS490N -- Chapt. 23 1 2003
![Page 485: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/485.jpg)
StrongARM Responsibilities
• Loading ACE software
• Creating and initializing ACE
• Resolving names
• Managing the operation of ACEs
• Allocating and reclaiming resources (e.g., memory)
• Controlling microengine operation
• Communication among core components
• Interface to non-ACE applications
• Interface to operating system facilities
• Forwarding packets between core component and microblock(s)
CS490N -- Chapt. 23 2 2003
![Page 486: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/486.jpg)
Illustration Of ACE Components
[Figure: software on the StrongARM: ACE core components (processes), the Operating System Specific Library (shared library), the OMS with its Resolver (process) and Name Server (process), the Action Services Library (loadable kernel module), and the Resource Manager (loadable kernel module); microblocks run on the microengines.]
CS490N -- Chapt. 23 3 2003
![Page 487: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/487.jpg)
Components
• Object Management System
– Binds names to objects
– Contains
* Resolver
* Name server
• Resource Manager
– Access to OS
– Communication with microengines
– Memory management
CS490N -- Chapt. 23 4 2003
![Page 488: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/488.jpg)
Components (continued)
• Operating System Specific Library
– Shared library
– Provides virtual API
– Should be named OS independent library
• Action Services Library
– Loadable kernel module
– Run-time support for core component
– TCP/IP support
CS490N -- Chapt. 23 5 2003
![Page 489: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/489.jpg)
Microengine Assignment
• Automated by SDK
• Programmer specifies
– Type (speed) of each port
– Type of each ACE
• SDK chooses
– Number of replications for each microblock
– Assignment of microblocks to microengines
CS490N -- Chapt. 23 6 2003
![Page 490: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/490.jpg)
Type Values Used In Configuration File
• Only two port types
– Slow corresponds to 10 / 100 Ethernet
– Fast corresponds to Gigabit Ethernet

Numeric Value   Meaning
0               Slow ingress file
1               Slow egress file
2               Fast ingress file for Port 1
3               Fast ingress file for Port 2
4               Fast egress file
CS490N -- Chapt. 23 7 2003
![Page 491: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/491.jpg)
Pieces Of ACE Code Programmer Writes
• Initialization function called when ACE begins
• Exception handler receives packets sent by microblock
• Action functions that correspond to packet classifications
• Crosscalls ACE accepts
• Timer functions for timed events
• Callback functions for returned values
• Termination function called when the ACE terminates
CS490N -- Chapt. 23 8 2003
![Page 492: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/492.jpg)
Reserved Names
• SDK software uses fixed names for some functions
• Examples
– ACE initialization function must be named ix_init
– ACE termination function must be named ix_fini
CS490N -- Chapt. 23 9 2003
![Page 493: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/493.jpg)
Overall Structure Of Core Component
• Core component of ACE runs as separate Linux process
• Always the same structure
• Programmer does not write main program
• SDK supplies structure and main program
• Uses an event loop
CS490N -- Chapt. 23 10 2003
![Page 494: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/494.jpg)
Conceptual Structure Of ACE Core Component
main( ) /* Core component of an ACE */
{
    Intel_init( );        /* Perform internal initialization */
    ix_init( );           /* Call user's initialization function */
    Intel_event_loop( );  /* Perform internal event loop */
    ix_fini( );           /* Call user's termination function */
    Intel_fini( );        /* Perform internal cleanup */
    exit( );              /* Terminate the Linux process */
}
CS490N -- Chapt. 23 11 2003
![Page 495: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/495.jpg)
Event Loop
• Central to asynchronous programming model
• Uses polling
• Repeatedly checks for presence of event(s) and calls appropriate handler
• Can be hidden from programmer
• In ACE model
– Explicit
– Programmer can modify / extend
CS490N -- Chapt. 23 12 2003
![Page 496: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/496.jpg)
Illustration Of ACE Event Loop
Intel_event_loop( )
{
    do forever {
        E = getnextevent( );
        if (E is termination event) {
            return to caller;
        } else if (E is exception event) {
            call exception handler function;
        } else if (E is timer event) {
            call timer handler function;
        /* Note: additional event tests can be added here */
        }
    }
}
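The event loop above can be approximated in C as follows. This is only a sketch under stated assumptions: events come from an array rather than from the SDK, and the names ev_type, handlers, run_event_loop, and the two demo handlers are hypothetical.

```c
#include <stddef.h>

enum ev_type { EV_TERMINATE, EV_EXCEPTION, EV_TIMER };

struct event { enum ev_type type; int arg; };

/* Handlers supplied by the programmer; each must return without blocking. */
struct handlers {
    void (*on_exception)(int arg);
    void (*on_timer)(int arg);
};

/* Process events in order until a termination event or the array ends.
 * Returns the number of events handled (the terminator is not counted). */
size_t run_event_loop(const struct event *ev, size_t n,
                      const struct handlers *h)
{
    size_t handled = 0;
    for (size_t i = 0; i < n; i++) {
        switch (ev[i].type) {
        case EV_TERMINATE:
            return handled;
        case EV_EXCEPTION:
            h->on_exception(ev[i].arg);
            handled++;
            break;
        case EV_TIMER:
            h->on_timer(ev[i].arg);
            handled++;
            break;
        }
    }
    return handled;
}

/* Demo handlers: accumulate exception arguments and count timer events. */
static int exception_sum;
static int timer_count;
static void demo_exception(int arg) { exception_sum += arg; }
static void demo_timer(int arg) { (void)arg; timer_count++; }

static const struct handlers demo_handlers = { demo_exception, demo_timer };
```

Passing an array that holds an EV_TERMINATE entry to run_event_loop with demo_handlers shows the key behavior: events after the terminator are never delivered.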
CS490N -- Chapt. 23 13 2003
![Page 497: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/497.jpg)
A Note About Event Loops
Beware: although it may seem trivial, the event loop mechanism has several surprising consequences for programmers.
CS490N -- Chapt. 23 14 2003
![Page 498: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/498.jpg)
Event Loop Processing
• If event loop stops
– All processing stops
– The arrival of a new event will not trigger invocation of an event handler
• Conclusion: the event loop must go on
CS490N -- Chapt. 23 15 2003
![Page 499: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/499.jpg)
Asynchronous Programming, Event Loops, Threads, And Blocking
Because each core component executes as a single thread of control, no handler function is permitted to block: doing so would stop the event loop and block the entire core component.
CS490N -- Chapt. 23 16 2003
![Page 500: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/500.jpg)
Prohibition On Blocking Calls
• Programmer must use asynchronous call model
• Calling program specifies
– Function to invoke
– Arguments to pass
– Callback function
• Called program
– Invoked asynchronously
– Receives copy of arguments, and computes result
– Specifies callback to be invoked
CS490N -- Chapt. 23 17 2003
![Page 501: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/501.jpg)
Illustration Of Asynchronous Callback (part 1)
h { /* Synchronous version */
    y = f( x ); /* Call f (potentially blocking) */
    z = g( y ); /* Use the result from f to call g */
    q += z; /* Use the value of z to update q */
    return;
}

h1 { /* Asynchronous version */
    allocate global variables y, z, and q;
    establish cbf1 as the callback function for f1;
    establish cbg1 as the callback function for g1;
    start f1(x) with a nonblocking call;
    return;
}
CS490N -- Chapt. 23 18 2003
![Page 502: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/502.jpg)
Illustration Of Asynchronous Callback (part 2)
function cbf1(retval) { /* Callback function for f1 */
    y = retval;
    start g1(y) with a nonblocking call;
    return;
}

function cbg1(retval) { /* Callback function for g1 */
    z = retval;
    q += z;
    return;
}
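To make the callback chain concrete, here is a minimal C sketch of h1, cbf1, and cbg1. The pending-call table, start_call, and run_pending are hypothetical stand-ins for the SDK's asynchronous call machinery, and f1 and g1 are placeholder computations chosen only so the result can be checked.

```c
#include <stddef.h>

typedef int  (*work_fn)(int);      /* function invoked asynchronously */
typedef void (*callback_fn)(int);  /* receives the returned value */

struct pending { work_fn fn; int arg; callback_fn cb; };

#define MAX_PENDING 8
static struct pending pend[MAX_PENDING];
static size_t npend;

/* Record a nonblocking call; the caller returns immediately. */
static int start_call(work_fn fn, int arg, callback_fn cb)
{
    if (npend == MAX_PENDING)
        return -1;
    pend[npend++] = (struct pending){ fn, arg, cb };
    return 0;
}

/* Stand-in for the event loop: run each queued call, then its callback. */
void run_pending(void)
{
    for (size_t i = 0; i < npend; i++)   /* npend may grow while looping */
        pend[i].cb(pend[i].fn(pend[i].arg));
    npend = 0;
}

static int y, z;                         /* globals shared by the callbacks */
int q;

static int f1(int x) { return x + 1; }   /* placeholder computations */
static int g1(int v) { return v * 2; }

static void cbg1(int retval) { z = retval; q += z; }
static void cbf1(int retval) { y = retval; start_call(g1, y, cbg1); }

/* Asynchronous version of h: starts f1 and returns at once. */
void h1(int x) { start_call(f1, x, cbf1); }
```

Calling h1 changes nothing visible until run_pending executes; only then does the chain f1, cbf1, g1, cbg1 update q. That ordering is what separates the synchronous and asynchronous versions.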
CS490N -- Chapt. 23 19 2003
![Page 503: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/503.jpg)
Code Complexity
Because it uses the asynchronous paradigm, the core component of an ACE is usually significantly more complex and difficult to understand than a synchronous program that solves the same problem.
CS490N -- Chapt. 23 20 2003
![Page 504: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/504.jpg)
Some Good News
• ACE core component structured around event loop
• Underlying system is sequential
• No mutual exclusion needed!
CS490N -- Chapt. 23 21 2003
![Page 506: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/506.jpg)
Memory Allocation
• Performed by StrongARM
• Function RmMalloc (in Resource Manager) allocates memory
• Request must specify type of memory
– SRAM
– SDRAM
– Scratch
• Function RmGetPhysOffset maps virtual address to physical address
• Resulting physical address passed to microblock
CS490N -- Chapt. 23 23 2003
![Page 507: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/507.jpg)
Steps Taken At System Startup
• Load and start ASL kernel module, name server, and resolver
• Load and start device drivers for the network interfaces
• Load and initialize the Resource Manager, which determines how many copies of each ingress and egress microblock group to run
• Parse and check configuration file ixsys.config
• Start core component of each ACE, which causes ACE to invoke ix_init function
CS490N -- Chapt. 23 24 2003
![Page 508: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/508.jpg)
Steps Taken At System Startup (continued)
• Turn on interfaces, assign microblock groups to microengines, and resolve external references (known as patching)
• Bind the ACE targets and physical interfaces
• Start microengines running
CS490N -- Chapt. 23 25 2003
![Page 509: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/509.jpg)
ACE Data Structure
• Stores information needed by Intel SDK software
• Must be allocated by ix_init and deallocated by ix_fini
• Uses structure ix_ace
• Can be extended with data defined by programmer
CS490N -- Chapt. 23 26 2003
![Page 510: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/510.jpg)
Example ACE Data Declaration
#include <ix/asl.h> /* Include Intel's library declarations */

struct myace { /* Programmer's control block */
    struct ix_ace ace; /* Intel's ace embedded as first item */
    int mycount; /* Counter used by programmer's code */
    /* Programmer can insert additional data items here... */
};
CS490N -- Chapt. 23 27 2003
![Page 511: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/511.jpg)
Example Data Structure Allocation
ix_init ( ... , ix_ace **app , ... )
{
struct myace *ap; /* Ptr to programmer’s control block*/
ap = malloc ( sizeof ( struct myace ) ) ;
*app = &ap->ace ;
ix_ace_init ( &ap->ace ) ;
/* Other initialization code goes here... */
}
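Embedding Intel's structure as the first item matters because a pointer to the embedded item can be converted back to a pointer to the enclosing control block. The sketch below shows that idiom with a stand-in type; ix_ace_stub and the myace_* helpers are hypothetical names, since the real struct ix_ace layout is not shown here.

```c
#include <stdlib.h>

/* Stand-in for Intel's struct ix_ace, whose real layout is not shown. */
struct ix_ace_stub { int reserved; };

struct myace {
    struct ix_ace_stub ace;   /* embedded as the first member */
    int mycount;              /* programmer's own data */
};

/* Allocate a zeroed control block and hand back a pointer to the embedded
 * part, as ix_init does through its app argument. NULL on failure. */
struct ix_ace_stub *myace_create(void)
{
    struct myace *ap = calloc(1, sizeof *ap);
    return ap ? &ap->ace : NULL;
}

/* Recover the enclosing control block; valid only because ace is the
 * first member, so both pointers refer to the same address. */
struct myace *myace_from_ace(struct ix_ace_stub *acep)
{
    return (struct myace *)acep;
}

/* Release the block, as ix_fini would. */
void myace_destroy(struct ix_ace_stub *acep)
{
    free(myace_from_ace(acep));
}
```

A handler that receives only the embedded pointer can call myace_from_ace to reach mycount and any other fields the programmer added.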
CS490N -- Chapt. 23 28 2003
![Page 512: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/512.jpg)
Crosscall
• Uses OMS
• Three types
– Oneway: no return
– Twoway: conventional procedure call
– Deferred: asynchronous return through callback
CS490N -- Chapt. 23 29 2003
![Page 513: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/513.jpg)
Possible Crosscall Communication
Caller                 Called Procedure       Block?
ACE core component     ACE core component     no
ACE core component     non-ACE application    no
non-ACE application    ACE core component     no
non-ACE application    non-ACE application    yes
CS490N -- Chapt. 23 30 2003
![Page 514: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/514.jpg)
Crosscall Declaration
• Uses Interface Definition Language (IDL)
• Declares type of
– Exported functions
– Arguments
CS490N -- Chapt. 23 31 2003
![Page 515: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/515.jpg)
Example IDL Declaration
interface PacketCounter
{
    struct packetinfo {
        int numpackets; /* Count of total packets */
        int numbcasts; /* Count of broadcast packets */
    };

    oneway void incTcount ( in int addedpackets ) ;
    deferred int resetTcount ( in int newnumpkts ) ;
    twoway int resetPinfo ( in struct packetinfo ) ;
};
CS490N -- Chapt. 23 32 2003
![Page 516: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/516.jpg)
Timer Management
• Performed on StrongARM
• Uses OMS
• Follows asynchronous model
CS490N -- Chapt. 23 33 2003
![Page 517: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/517.jpg)
Using A Timer
1. Initialize H, a handle for an ix_event.
2. Associate handle H with a callback function, CB.
3. Calculate T, a time for the event to occur.
4. Schedule event H at time T.
5. At any time prior to T, event H can be cancelled.
6. At time T, function CB will be called if the event has not been cancelled.
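The six steps can be sketched in C as shown below. The timer_event structure and the timer_* functions are hypothetical and are not the SDK's ix_event interface. The sketch assumes the event loop polls the timer with the current time.

```c
#include <stdbool.h>

typedef void (*timer_cb)(void *arg);

/* Hypothetical event handle H; not the SDK's ix_event. */
struct timer_event {
    timer_cb cb;          /* step 2: callback CB */
    void *arg;            /* argument handed to the callback */
    unsigned long when;   /* step 3: time T */
    bool armed;           /* true between schedule and fire or cancel */
};

/* Steps 1 and 2: initialize the handle and associate the callback. */
void timer_init(struct timer_event *h, timer_cb cb, void *arg)
{
    h->cb = cb; h->arg = arg; h->when = 0; h->armed = false;
}

/* Step 4: schedule the event at time t. */
void timer_schedule(struct timer_event *h, unsigned long t)
{
    h->when = t; h->armed = true;
}

/* Step 5: cancel the event at any time before it fires. */
void timer_cancel(struct timer_event *h) { h->armed = false; }

/* Step 6: polled from the event loop with the current time; the callback
 * runs at most once per schedule. Returns true if the callback ran. */
bool timer_poll(struct timer_event *h, unsigned long now)
{
    if (!h->armed || now < h->when)
        return false;
    h->armed = false;
    h->cb(h->arg);
    return true;
}

/* Example callback: increments the int that arg points to. */
void count_fire(void *arg) { ++*(int *)arg; }
```

Because timer_poll returns instead of waiting, the scheme stays within the asynchronous model: the event loop keeps running and merely checks whether time T has arrived.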
CS490N -- Chapt. 23 34 2003
![Page 518: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/518.jpg)
Summary
• Core component of ACE
– Runs as Linux process
– Provides asynchronous API
– Uses event loop mechanism
– Includes functions named ix_init and ix_fini
– Responsible for memory allocation and timer management
CS490N -- Chapt. 23 35 2003
![Page 519: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/519.jpg)
Questions?
![Page 520: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/520.jpg)
XX
Packet Processor Hardware (Microengines And FBI)
CS490N -- Chapt. 20 1 2003
![Page 521: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/521.jpg)
Role Of Microengines
• Packet ingress from physical layer hardware
• Checksum verification
• Header processing and classification
• Packet buffering in memory
• Table lookup and forwarding
• Header modification
• Checksum computation
• Packet egress to physical layer hardware
CS490N -- Chapt. 20 2 2003
![Page 522: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/522.jpg)
Microengine Characteristics
• Programmable microcontroller
• RISC design
• One hundred twenty-eight general-purpose registers
• One hundred twenty-eight transfer registers
• Hardware support for four threads and context switching
• Five-stage execution pipeline
• Control of an Arithmetic Logic Unit (ALU)
• Direct access to various functional units
CS490N -- Chapt. 20 3 2003
![Page 523: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/523.jpg)
Microengine Level
• Not a typical CPU
• Does not contain native instruction for each operation
• Really a microsequencer
CS490N -- Chapt. 20 4 2003
![Page 524: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/524.jpg)
Consequence Of Microsequencing
Because it functions as a microsequencer, a microengine does not provide native hardware instructions for arithmetic operations, nor does it provide addressing modes for direct memory access. Instead, a program running on a microengine controls and uses functional units on the chip to access memory and perform operations.
CS490N -- Chapt. 20 5 2003
![Page 525: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/525.jpg)
Microengine Instruction Set
Instruction                               Description

Arithmetic, Rotate, And Shift Instructions
ALU                                       Perform an arithmetic operation
ALU_SHF                                   Perform an arithmetic operation and shift
DBL_SHIFT                                 Concatenate and shift two longwords

Branch and Jump Instructions
BR, BR=0, BR!=0, BR>0, BR>=0, BR<0,
BR<=0, BR=count, BR!=count                Branch or branch conditional
BR_BSET, BR_BCLR                          Branch if bit set or clear
BR=BYTE, BR!=BYTE                         Branch if byte equal or not equal
BR=CTX, BR!=CTX                           Branch on current context
BR_INP_STATE                              Branch on event state
BR_!SIGNAL                                Branch if signal deasserted
JUMP                                      Jump to label
RTN                                       Return from branch or jump

Reference Instructions
CSR                                       CSR reference
FAST_WR                                   Write immediate data to thd_done CSRs
LOCAL_CSR_RD, LOCAL_CSR_WR                Read and write CSRs
R_FIFO_RD                                 Read the receive FIFO
PCI_DMA                                   Issue a request on the PCI bus
SCRATCH                                   Scratchpad memory request
SDRAM                                     SDRAM reference
SRAM                                      SRAM reference
T_FIFO_WR                                 Write to transmit FIFO

Local Register Instructions
FIND_BSET, FIND_BSET_WITH_MASK            Find first 1 bit in a value
IMMED                                     Load immediate value and sign extend
IMMED_B0, IMMED_B1, IMMED_B2, IMMED_B3    Load immediate byte to a field
IMMED_W0, IMMED_W1                        Load immediate word to a field
LD_FIELD, LD_FIELD_W_CLR                  Load byte(s) into specified field(s)
LOAD_ADDR                                 Load instruction address
LOAD_BSET_RESULT1, LOAD_BSET_RESULT2      Load the result of find_bset

Miscellaneous Instructions
CTX_ARB                                   Perform context swap and wake on event
NOP                                       Skip to next instruction
HASH1_48, HASH2_48, HASH3_48              Perform 48-bit hash function 1, 2, or 3
HASH1_64, HASH2_64, HASH3_64              Perform 64-bit hash function 1, 2, or 3
CS490N -- Chapt. 20 6 2003
![Page 526: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/526.jpg)
Microengine View Of Memory
• Separate address spaces
• Specific instruction to reference each memory type
– Instruction sdram to access SDRAM memory
– Instruction sram to access SRAM memory
– Instruction scratch to access Scratchpad memory
• Consequence: early binding of data to memory
CS490N -- Chapt. 20 7 2003
![Page 527: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/527.jpg)
Five-Stage Instruction Pipeline
Stage   Description
1       Fetch the next instruction
2       Decode the instruction and get register address(es)
3       Extract the operands from registers
4       Perform ALU, shift, or compare operations and set the condition codes
5       Write the results to the destination register
CS490N -- Chapt. 20 8 2003
![Page 528: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/528.jpg)
Example Of Pipeline Execution
clock   stage 1   stage 2   stage 3   stage 4   stage 5
1       inst. 1   -         -         -         -
2       inst. 2   inst. 1   -         -         -
3       inst. 3   inst. 2   inst. 1   -         -
4       inst. 4   inst. 3   inst. 2   inst. 1   -
5       inst. 5   inst. 4   inst. 3   inst. 2   inst. 1
6       inst. 6   inst. 5   inst. 4   inst. 3   inst. 2
7       inst. 7   inst. 6   inst. 5   inst. 4   inst. 3
8       inst. 8   inst. 7   inst. 6   inst. 5   inst. 4

(Time increases down the table.)
• Once pipeline is started, one instruction completes per cycle
CS490N -- Chapt. 20 9 2003
![Page 529: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/529.jpg)
Instruction Stall
• Occurs when operand not available
• Processor temporarily stops execution
• Reduces overall speed
• Should be avoided when possible
CS490N -- Chapt. 20 10 2003
![Page 530: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/530.jpg)
Example Instruction Stall
• Consider two instructions:
K: ALU operation to add the contents of R1 to R2
K+1: ALU operation to add the contents of R2 to R3
• Second instruction cannot access R2 until value has been written
• Stall occurs
CS490N -- Chapt. 20 11 2003
![Page 531: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/531.jpg)
Effect Of Instruction Stall
clock   stage 1     stage 2     stage 3     stage 4     stage 5
1       inst. K     inst. K-1   inst. K-2   inst. K-3   inst. K-4
2       inst. K+1   inst. K     inst. K-1   inst. K-2   inst. K-3
3       inst. K+2   inst. K+1   inst. K     inst. K-1   inst. K-2
4       inst. K+3   inst. K+2   inst. K+1   inst. K     inst. K-1
5       inst. K+3   inst. K+2   inst. K+1   -           inst. K
6       inst. K+3   inst. K+2   inst. K+1   -           -
7       inst. K+4   inst. K+3   inst. K+2   inst. K+1   -
8       inst. K+5   inst. K+4   inst. K+3   inst. K+2   inst. K+1

(Time increases down the table.)
• Bubble develops in pipeline
• Bubble eventually reaches final stage
CS490N -- Chapt. 20 12 2003
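The two pipeline illustrations can be summarized by a simple count: the first instruction needs one cycle per stage, each later instruction completes one cycle after its predecessor, and every bubble adds a cycle. The function below is a sketch of that arithmetic; its name and parameters are chosen for this example.

```c
/* Cycles needed to complete n instructions on a k-stage pipeline that
 * suffers the given number of stall (bubble) cycles: k cycles for the
 * first instruction, one more per remaining instruction, plus stalls. */
unsigned long pipeline_cycles(unsigned long n, unsigned k,
                              unsigned long stalls)
{
    if (n == 0)
        return 0;
    return k + (n - 1) + stalls;
}
```

With k = 5, four instructions and no stalls take 8 cycles, matching inst. 4 reaching stage 5 at clock 8. In the stall example, instructions K and K+1 with two bubble cycles also take 8 cycles, matching inst. K+1 reaching stage 5 at clock 8.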
![Page 532: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/532.jpg)
Sources Of Delay
• Access to result of previous / earlier operation
• Conditional branch
• Memory access
CS490N -- Chapt. 20 13 2003
![Page 533: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/533.jpg)
Memory Access Delays
Type Of Memory   Approximate Access Time (in clock cycles)
Scratchpad       12 - 14
SRAM             16 - 20
SDRAM            32 - 40

• Delay is surprisingly large
CS490N -- Chapt. 20 14 2003
![Page 534: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/534.jpg)
Threads Of Execution
• Technique used to speed processing
• Multiple threads of execution remain ready to run
• Program defines threads and informs processor
• Processor runs one thread at a time
• Processor automatically switches context to another thread when current thread blocks
• Known as hardware threads
CS490N -- Chapt. 20 15 2003
![Page 535: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/535.jpg)
Illustration Of Hardware Threads
[Figure: timeline for threads 1 through 4 with times t1, t2, and t3 and a context switch marked.]
• White - ready but idle
• Blue - being executed by microengine
• Gray - blocked (e.g., during memory access)
CS490N -- Chapt. 20 16 2003
![Page 536: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/536.jpg)
The Point Of Hardware Threads
Hardware threads increase overall throughput by allowing a microengine to handle up to four packets concurrently; with threads, computation can proceed without waiting for memory access.
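A back-of-the-envelope model shows why four threads help. Assume, purely for illustration, that each thread alternates c cycles of computation with a memory wait of l cycles and that context switches cost nothing. The fraction of time the microengine computes is then n * c / (c + l), capped at 1. The function name and the model itself are assumptions, not figures from the IXP1200 documentation.

```c
/* Fraction of cycles a microengine spends computing when each of n
 * threads alternates c compute cycles with a memory wait of l cycles.
 * Simplified model: zero-cost context switches, identical threads. */
double utilization(unsigned n, unsigned c, unsigned l)
{
    if (n == 0 || c == 0)
        return 0.0;
    double u = (double)n * c / ((double)c + l);
    return u < 1.0 ? u : 1.0;
}
```

With c = 10 and l = 30 (chosen only as round numbers for this model), one thread keeps the microengine busy a quarter of the time, two threads half of the time, and four threads all of the time.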
CS490N -- Chapt. 20 17 2003
![Page 537: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/537.jpg)
Context Switching Time
• Low-overhead context switch means one instruction delay as hardware switches from one thread to another
• Zero-overhead context switch means no delay during context switch
• IXP1200 offers zero-overhead context switch
CS490N -- Chapt. 20 18 2003
![Page 538: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/538.jpg)
Microengine Instruction Store
• Private instruction store per microengine
• Advantage: no contention
• Disadvantage: small (1024 instructions)
CS490N -- Chapt. 20 19 2003
![Page 539: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/539.jpg)
General-Purpose Registers
• One hundred twenty-eight per microengine
• Thirty-two bits each
• Used for computation or intermediate values
• Divided into banks
• Context-relative or absolute addresses
CS490N -- Chapt. 20 20 2003
![Page 540: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/540.jpg)
Forms Of Addressing
• Absolute
– Entire set available
– Uses integer from 0 to 127
• Context-relative
– One quarter of set available to each thread
– Uses integer from 0 to 31
– Allows same code to run on multiple microengines
CS490N -- Chapt. 20 21 2003
![Page 541: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/541.jpg)
Register Banks
• Mechanism commonly used with RISC processor
• Registers divided into A bank and B bank
• Maximum performance achieved when each instruction references a register from the A bank and a register from the B bank
CS490N -- Chapt. 20 22 2003
![Page 542: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/542.jpg)
Summary Of General-Purpose Registers
Bank                Context                Absolute addr.   Relative addr.
A bank (64 regs.)   context 3 (16 regs.)   48 - 63          0 - 15
                    context 2 (16 regs.)   32 - 47          0 - 15
                    context 1 (16 regs.)   16 - 31          0 - 15
                    context 0 (16 regs.)   0 - 15           0 - 15
B bank (64 regs.)   context 3 (16 regs.)   48 - 63          0 - 15
                    context 2 (16 regs.)   32 - 47          0 - 15
                    context 1 (16 regs.)   16 - 31          0 - 15
                    context 0 (16 regs.)   0 - 15           0 - 15

• Note: half of the registers for each context are from A bank and half from B bank
CS490N -- Chapt. 20 23 2003
![Page 543: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/543.jpg)
Transfer Registers
• Used to buffer external memory transfers
• Example: read a value from memory
– Copy value from memory into transfer register
– Move value from transfer register into general-purpose register
• One hundred twenty-eight per microengine
• Divided into four types
– SRAM or SDRAM
– Read or write
CS490N -- Chapt. 20 24 2003
![Page 544: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/544.jpg)
Transfer Register Addresses
Type                      Context               Absolute addr.   Relative addr.
SDRAM read (32 regs.)     context 3 (8 regs.)   24 - 31          0 - 7
                          context 2 (8 regs.)   16 - 23          0 - 7
                          context 1 (8 regs.)   8 - 15           0 - 7
                          context 0 (8 regs.)   0 - 7            0 - 7
SDRAM write (32 regs.)    context 3 (8 regs.)   24 - 31          0 - 7
                          context 2 (8 regs.)   16 - 23          0 - 7
                          context 1 (8 regs.)   8 - 15           0 - 7
                          context 0 (8 regs.)   0 - 7            0 - 7
SRAM read (32 regs.)      context 3 (8 regs.)   24 - 31          0 - 7
                          context 2 (8 regs.)   16 - 23          0 - 7
                          context 1 (8 regs.)   8 - 15           0 - 7
                          context 0 (8 regs.)   0 - 7            0 - 7
SRAM write (32 regs.)     context 3 (8 regs.)   24 - 31          0 - 7
                          context 2 (8 regs.)   16 - 23          0 - 7
                          context 1 (8 regs.)   8 - 15           0 - 7
                          context 0 (8 regs.)   0 - 7            0 - 7
CS490N -- Chapt. 20 25 2003
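Both register summaries follow the same pattern: within a bank or transfer-register type, a context-relative number maps to an absolute number by adding the context's offset. The helper below sketches that mapping; the function name and the per_context parameter (16 for a general-purpose bank, 8 for a transfer-register type) are introduced only for this example.

```c
/* Absolute register number within one bank (or transfer-register type)
 * for a context-relative register number. per_context is the number of
 * registers each of the four contexts owns in that bank: 16 for a
 * general-purpose bank, 8 for a transfer-register type.
 * Returns -1 if the context or relative number is out of range. */
int reg_absolute(unsigned context, unsigned relative, unsigned per_context)
{
    if (context > 3 || relative >= per_context)
        return -1;
    return (int)(context * per_context + relative);
}
```

For instance, relative register 0 of context 2 in a general-purpose bank is absolute register 32, and relative register 7 of context 1 among the SRAM read transfer registers is absolute register 15.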
![Page 545: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/545.jpg)
Local Control And Status Registers
• Used to interrogate or control the IXP1200
• All mapped into StrongARM address space
• Microengine can only access its own local CSRs
CS490N -- Chapt. 20 26 2003
![Page 546: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/546.jpg)
Local CSRs
Local CSR                 Purpose
USTORE_ADDRESS            Load the microengine control store
USTORE_DATA               Load a value into the control store
ALU_OUTPUT                Debugging: allows StrongARM to read GPRs and transfer registers
ACTIVE_CTX_STS            Determine context status
ENABLE_SRAM_JOURNALING    Debugging: place journal in SRAM
CTX_ARB_CTL               Context arbiter control
CTX_ENABLES               Debugging: enable a context
CC_ENABLE                 Enable condition codes
CTX_n_STS                 Determine context status
CTX_n_SIG_EVENTS          Determine signal status
CTX_n_WAKEUP_EVENTS       Determine which wakeup events are currently enabled
Note: n is a digit from 0 through 3 (hardware contains a separate CSR for each of the four contexts).
![Page 547: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/547.jpg)
Interprocessor Communication Mechanisms
• Thread-to-StrongARM communication
– Interrupt
– Signal event
• Thread-to-thread communication within one IXP1200
– Signal event
• Thread-to-thread communication across multiple IXP1200s
– Ready bus
![Page 548: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/548.jpg)
FBI Unit
• Interface between processors and other functional units
– Scratchpad memory
– Hash unit
– FBI control and status registers
– Control and operation of the Ready bus
– Control and operation of the IX bus
– Data buffers that hold data arriving from the IX bus
– Data buffers that hold data sent to the IX bus
• Operates like DMA controller
• Uses FIFOs
![Page 549: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/549.jpg)
FIFOs
• Hardware buffers
• Transfer performed in 64-byte blocks
• Misleading name: access is random
• Only path between external device and microengine
• Receive FIFO (RFIFO)
– Handles input
– Physical interface moves data to RFIFO
– FBI moves data from RFIFO to memory
![Page 550: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/550.jpg)
FIFOs (continued)
• Transmit FIFO (TFIFO)
– Handles output
– FBI moves data from memory to TFIFO
– Physical interface extracts data from TFIFO
![Page 551: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/551.jpg)
A Note About FIFO Transfer
It is possible for a microengine to transfer data to or from a FIFO without going to memory.
![Page 552: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/552.jpg)
FBI
• Initiates and controls data transfer
• Contains active components
– Push engine
– Pull engine
![Page 553: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/553.jpg)
FBI Architecture (simplified)
[Figure: simplified FBI architecture. Components shown: TFIFO, RFIFO, pull engine, push engine, CSRs, scratchpad, hash unit, and three 8-entry command queues (push, pull, hash). Labeled paths: commands from microengine arrive over the command bus; data from SRAM transfer registers enters; data to SRAM transfer registers leaves.]
![Page 554: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/554.jpg)
ScratchPad Memory
• Organized into 1K words of 4 bytes each
• Offers special facilities
– Test-and-set of individual bits
– Autoincrement (low latency)
![Page 555: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/555.jpg)
Hash Unit
• Configurable coprocessor
• Operates asynchronously
• Intended for table lookup when multiplication or division required (ALU does not have multiply instruction)
![Page 556: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/556.jpg)
Hash Unit Computation
• Computes quotient Q(x) and remainder R(x):
$A(x) \ast M(x) \,/\, G(x) \rightarrow Q(x) + R(x)$
(that is, $A(x) \ast M(x) = Q(x) \ast G(x) + R(x)$)
• A(x) is input value
• M(x) is hash multiplier (configurable)
• G(x) is built-in value
• Two values for G: one for 48-bit hash, one for 64-bit hash
![Page 557: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/557.jpg)
Hash Mathematics
• Integer value interpreted as polynomial over the field {0,1} (each bit is one coefficient)
• Example:
$20401_{16}$
• Is interpreted as
$x^{17} + x^{10} + 1$
• Similarly, value G(x) used in 48-bit hash
$1001002000401_{16}$
• Is interpreted as
$x^{48} + x^{36} + x^{25} + x^{10} + 1$
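The integer-to-polynomial reading can be checked mechanically: bit i of the integer is the coefficient of x^i. A minimal Python sketch (the helper names `poly_exponents` and `poly_str` are illustrative only):

```python
def poly_exponents(value: int) -> list[int]:
    """Return the exponents of the nonzero terms, highest first."""
    return [i for i in range(value.bit_length() - 1, -1, -1) if (value >> i) & 1]


def poly_str(value: int) -> str:
    """Format an integer as a polynomial whose coefficients are its bits."""
    terms = ["1" if e == 0 else f"x^{e}" for e in poly_exponents(value)]
    return " + ".join(terms) if terms else "0"


print(poly_str(0x20401))          # x^17 + x^10 + 1
print(poly_str(0x1001002000401))  # x^48 + x^36 + x^25 + x^10 + 1
```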
![Page 558: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/558.jpg)
Hash Example
$A = 800000000001_{16}$ ( $x^{47} + 1$ )
$G = 1001002000401_{16}$ ( $x^{48} + x^{36} + x^{25} + x^{10} + 1$ )
$M = 20D_{16}$ ( $x^{9} + x^{3} + x^{2} + 1$ )
• Hash computes R, remainder of M times A divided by G
$H(X) = R = (M \ast A) \bmod G$
![Page 559: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/559.jpg)
Hash Example (continued)
• Multiplying yields
$M \ast A = x^{56} + x^{50} + x^{49} + x^{47} + x^{9} + x^{3} + x^{2} + 1$
• Furthermore
$M \ast A = Q \ast G + R$
• Where
$Q = x^{8} + x^{2} + x^{1}$
![Page 560: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/560.jpg)
Hash Example (continued)
• So, Q is:
$106_{16}$
• And R is:
$x^{47} + x^{44} + x^{38} + x^{37} + x^{33} + x^{27} + x^{26} + x^{18} + x^{12} + x^{11} + x^{9} + x^{8} + x^{3} + x^{1} + 1$
• The hash unit returns R as the value of the computation:
$90620C041B0B_{16}$
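The worked example can be reproduced in software. The sketch below models only the arithmetic (carry-less multiplication and polynomial division over {0,1}), not the hardware interface; the function names are illustrative, and the values of A, G, and M are the ones from the example.

```python
def gf2_mul(a: int, b: int) -> int:
    """Multiply two polynomials with 0/1 coefficients (carry-less multiply)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition of coefficients is XOR
        a <<= 1
        b >>= 1
    return result


def gf2_divmod(num: int, den: int) -> tuple[int, int]:
    """Divide polynomials with 0/1 coefficients; return (quotient, remainder)."""
    if den == 0:
        raise ZeroDivisionError("polynomial division by zero")
    quotient = 0
    deg_den = den.bit_length() - 1
    while num.bit_length() - 1 >= deg_den:
        shift = (num.bit_length() - 1) - deg_den
        quotient ^= 1 << shift
        num ^= den << shift  # subtract (XOR) the shifted divisor
    return quotient, num


A = 0x800000000001
G = 0x1001002000401
M = 0x20D

product = gf2_mul(M, A)
Q, R = gf2_divmod(product, G)
print(hex(Q), hex(R))  # 0x106 0x90620c041b0b
```

The quotient and remainder agree with the example, and `gf2_mul(Q, G) ^ R` equals the product, which is the identity M ∗ A = Q ∗ G + R.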
![Page 561: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/561.jpg)
Other IXP1200 Hardware
• The IXP1200 contains over 1500 registers used for
– Configuration and bootstrapping
– Control of functional units and buses
– Checking status of processors, threads, and onboard functional units
• Example: IX bus registers used to
– Control bus operation
– Configure the bus for 64-bit or 32-bit mode operation
– Control devices attached to the bus
– Specify whether a MAC device or the IXP handles control signals
![Page 562: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/562.jpg)
Summary
• Microengines
– Low-level, programmable packet processors
– Use RISC design with instruction pipeline
– Have hardware threads for higher throughput
– Use transfer registers to access memory
– Use FIFOs for I/O
– Do not have multiply, but have access to hash unit
![Page 563: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/563.jpg)
Questions?
![Page 564: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/564.jpg)
XXIV
Microengine Programming I
![Page 565: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/565.jpg)
Microengine Code
• Many low-level details
• Close to hardware
• Written in assembly language
![Page 566: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/566.jpg)
Features Of Intel’s Microengine Assembler
• Directives to control assembly
• Symbolic register names
• Macro preprocessor (extension of C preprocessor)
• Set of structured programming macros
![Page 567: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/567.jpg)
Statement Syntax
• General form:
label: operator operands token
• Interpretation of token depends on instruction
![Page 568: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/568.jpg)
Comment Statements
• Three styles available
– C style (between /* and */ )
– C++ style ( // until end of line )
– Traditional assembly style ( ; until end of line )
• Only traditional comments remain in code for intermediate steps of assembly
![Page 569: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/569.jpg)
Assembler Directives
• Begin with period in column one
• Can
– Generate code
– Control assembly process
• Example: associate myname with register five in the A register bank
.areg myname 5
![Page 570: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/570.jpg)
Example Operand Syntax
• Instruction alu invokes the ALU
alu [ dst, src1, op, src2 ]
• Four operands
– Destination register
– First source register
– Operation
– Second source register
• Two minus signs ( -- ) can be specified for destination, if none needed
![Page 571: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/571.jpg)
Memory Operations
• Programmer specifies
– Type of memory
– Direction of transfer
– Address in memory (two registers used)
– Starting transfer register
– Count of words to transfer
– Optional token
![Page 572: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/572.jpg)
Memory Operations (continued)
• General forms
sram [ direction, xfer_reg, addr1, addr2, count ], optional_token
sdram [ direction, xfer_reg, addr1, addr2, count ], optional_token
scratch [ direction, xfer_reg, addr1, addr2, count ], optional_token
![Page 573: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/573.jpg)
Memory Addressing
• Specified with operands addr1 and addr2
• Each operand corresponds to register
• Use of two operands allows
– Base + offset
– Scaling to large memory
![Page 574: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/574.jpg)
Immediate Instruction
• Place constant in thirty-two bit register
immed [ dst, ival, rot ]
• Upper sixteen bits of ival must be all zeros or all ones
• Operand rot specifies bit rotation
0        No rotation
<< 0     No rotation (same as 0)
<< 8     Rotate to the left by eight bits
<< 16    Rotate to the left by sixteen bits
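The two rules for immed (the constraint on ival and the rotation) can be modeled in a few lines. This is only a sketch of the description above: it assumes a plain 32-bit left rotation applied to ival as given, and the function name `immed` simply mirrors the instruction; it is not an emulation of the hardware.

```python
MASK32 = 0xFFFFFFFF


def immed(ival: int, rot: int = 0) -> int:
    """Model immed[dst, ival, rot]: return the 32-bit value placed in dst."""
    ival &= MASK32
    upper = ival >> 16
    if upper not in (0x0000, 0xFFFF):
        raise ValueError("upper sixteen bits of ival must be all zeros or all ones")
    if rot not in (0, 8, 16):
        raise ValueError("rot must be 0, 8, or 16")
    # Rotate left within 32 bits (assumed semantics of the rot operand).
    return ((ival << rot) | (ival >> (32 - rot))) & MASK32


print(hex(immed(0x1234, 8)))        # 0x123400
print(hex(immed(0xFFFF1234, 16)))   # 0x1234ffff
```

A constant such as 0x00012345 is rejected, because its upper sixteen bits are neither all zeros nor all ones.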
![Page 575: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/575.jpg)
Register Names
• Usually automated by assembler
• Directives available for manual assignment
Directive    Type Of Register Assigned
.areg        General-purpose register from the A bank
.breg        General-purpose register from the B bank
.$reg        SRAM transfer register
.$$reg       SDRAM transfer register
![Page 576: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/576.jpg)
Automated Register Assignment
• Programmer
– Uses .local directive to declare register names
– Uses .endlocal to terminate scope
– References names in instructions
• Assembler
– Assigns registers
– Chooses bank for each register
– Replaces names in code with correct reference
![Page 577: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/577.jpg)
Illustration Of Automated Register Naming
• One or more register names specified on .local
• Example
.local myreg loopctr tot
. . .        ; code in this block can use registers myreg, loopctr, and tot
.endlocal
• Names valid only within scope
• Scopes can be nested
![Page 578: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/578.jpg)
Illustration Of Nested Register Scope
.local myreg loopctr        ; outer scope that defines registers myreg and loopctr
.local rone rtwo            ; nested scope that defines registers rone and rtwo
. . .
.endlocal
.local rthree rfour         ; nested scope that defines registers rthree and rfour
. . .
.endlocal
.endlocal
![Page 579: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/579.jpg)
Conflicts
• Operands must come from separate banks
• Some code sequences cause conflict
• Example:
Z ← Q + R;
Y ← R + S;
X ← Q + S;
• No assignment is valid
• Programmer must change code
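Why no assignment is valid can be seen by trying every one: Q, R, and S must be pairwise in different banks, but only two banks exist. The brute-force check below is a sketch of that reasoning, not of how Intel's assembler works; it assumes the only constraint is that the two source operands of each instruction sit in different banks, and `find_bank_assignment` is a hypothetical helper.

```python
from itertools import product


def find_bank_assignment(instructions):
    """Return a {register: bank} dict in which the two source operands of
    every instruction are in different banks, or None if none exists."""
    registers = sorted({r for pair in instructions for r in pair})
    for banks in product("AB", repeat=len(registers)):
        assignment = dict(zip(registers, banks))
        if all(assignment[s1] != assignment[s2] for s1, s2 in instructions):
            return assignment
    return None


# Source operand pairs of Z <- Q + R; Y <- R + S; X <- Q + S
conflicting = [("Q", "R"), ("R", "S"), ("Q", "S")]
print(find_bank_assignment(conflicting))      # None

# Without the third statement, an assignment exists.
print(find_bank_assignment(conflicting[:2]))  # {'Q': 'A', 'R': 'B', 'S': 'A'}
```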
![Page 580: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/580.jpg)
Macro Preprocessor Features
• File inclusion
• Symbolic constant substitution
• Conditional assembly
• Parameterized macro expansion
• Arithmetic expression evaluation
• Iterative generation of code
![Page 581: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/581.jpg)
Macro Preprocessor Statements
Keyword         Use
#include        Include a file
#define         Define symbolic constant
#define_eval    Evaluate arithmetic expression and define constant
#undef          Remove previous definition
#macro          Start macro definition
#endm           End a macro definition

#ifdef          Start conditional compilation if defined
#ifndef         Start conditional compilation if not defined
#if             Start conditional compilation if expr. true
#else           Else part of conditional compilation
#elif           Else if part of conditional compilation
#endif          End of conditional compilation

#for            Start definite iteration
#while          Start indefinite iteration
#repeat         Start indefinite iteration (test after)
#endloop        Terminate an iteration
![Page 582: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/582.jpg)
Macro Definition
• Can occur at any point in program
• General form:
#macro name [ parameter1 , parameter2 , . . . ]
lines of text
#endm
![Page 583: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/583.jpg)
Macro Example
• Compute a = b + c + 5
/* example macro add5 computes a=b+c+5 */
#macro add5[a, b, c]
.local tmp
alu[tmp, c, +, 5]
alu[a, b, +, tmp]
.endlocal
#endm
• Assumes values a, b, and c in registers
![Page 584: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/584.jpg)
Macro Expansion Example
• Call of add5[var1, var2, var3] expands to:
.local tmp
alu[tmp, var3, +, 5]
alu[var1, var2, +, tmp]
.endlocal
• Warning: can generate illegal code
![Page 585: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/585.jpg)
Generation Of Repeated Code
• Macro preprocessor
– Supports #while statement for iteration
– Uses #define_eval for arithmetic evaluation
• Can be used to generate repeated code
![Page 586: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/586.jpg)
Example Of Repeated Code
• Preprocessor code:
#define LOOP 1
#while (LOOP < 4)
alu_shf[reg, --, B, reg, >>LOOP]
#define_eval LOOP LOOP + 1
#endloop
• Expands to:
alu_shf[reg, --, B, reg, >>1]
alu_shf[reg, --, B, reg, >>2]
alu_shf[reg, --, B, reg, >>3]
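The expansion can be mimicked outside the assembler to see how the loop unrolls. The sketch below imitates only this one pattern (a counter that starts at a value, is tested with <, and is incremented by one); `expand_while` is a hypothetical helper, not the preprocessor's algorithm, and it writes `--` for the unused operand.

```python
def expand_while(template: str, var: str, start: int, limit: int) -> list[str]:
    """Emit one copy of template per iteration while var < limit,
    substituting the current value of var and then adding one to it."""
    lines = []
    value = start
    while value < limit:
        lines.append(template.replace(var, str(value)))
        value += 1
    return lines


for line in expand_while("alu_shf[reg, --, B, reg, >>LOOP]", "LOOP", 1, 4):
    print(line)
```

With LOOP starting at 1 and the test LOOP < 4, three instructions are generated, with shift counts 1, 2, and 3.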
![Page 587: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/587.jpg)
Structured Programming Directives
• Make code appear to follow structured programming conventions
• Include break statement à la C
Directive    Meaning
.if          Conditional execution
.elif        Terminate previous conditional execution and start a new conditional execution
.else        Terminate previous conditional execution and define an alternative
.endif       End .if conditional
.while       Indefinite iteration with test before
.endw        End .while loop
.repeat      Indefinite iteration with test after
.until       End .repeat loop
.break       Leave a loop
.continue    Skip to next iteration of loop
![Page 588: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/588.jpg)
Mechanisms For Context Switching
• Context switching is voluntary
• Thread can execute:
– ctx_arb instruction
– Reference instruction (e.g., memory reference)
![Page 589: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/589.jpg)
Argument To ctx_arb Instruction
• Determines disposition of thread
– voluntary: thread suspended until later
– signal_event: thread suspended until specified event occurs
– kill: thread terminated
![Page 590: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/590.jpg)
Context Switch On Reference Instruction
• Token added to instruction to control context switch
• Two possible values
– ctx_swap: thread suspended until operation completes
– sig_done: thread continues to run, and signal posted when operation completes
• Signals available for SRAM, SDRAM, FBI, PCI, etc.
![Page 591: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/591.jpg)
Indirect Reference
• Poor choice of name
• Hardware optimization
• Found on other RISC processors
• Result of one instruction modifies next instruction
• Avoids stalls
• Typical use
– Compute N, a count of words to read from memory
– Modify memory access instruction to read N words
![Page 592: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/592.jpg)
Fields That Can Be Modified
• Microengine associated with a memory reference
• Starting transfer register
• Count of words of memory to transfer
• Thread ID of the hardware thread (i.e., thread to signal upon completion)
![Page 593: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/593.jpg)
How Indirect Reference Operates
• Programmer codes two instructions
– ALU operation
– Instruction with indirect reference set
• Note: destination of ALU operation is -- (i.e., no destination)
• Hardware
– Executes ALU instruction
– Uses result of ALU instruction to modify field in next instruction
![Page 594: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/594.jpg)
Example Of Indirect Reference
• Example code
alu_shf [ --, --, b, 0x13, << 16 ]
scratch [ read, $reg0, addr1, addr2, 0 ], indirect_ref
• Memory instruction coded with count of zero
• ALU instruction computes count
![Page 595: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/595.jpg)
External Transfers
• Microengine cannot directly access
– Memory
– Buses ( I/O devices )
• Intermediate hardware units used
– Known as transfer registers
– Multiple registers can be used as large, contiguous buffer
![Page 596: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/596.jpg)
External Transfer Procedure
• Allocate contiguous set of transfer registers to hold data
• Start reference instruction that moves data to or from allocated registers
• Arrange for thread to wait until the operation completes
![Page 597: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/597.jpg)
Allocating Contiguous Registers
• Registers assigned by assembler
• Programmer needs to ensure transfer registers contiguous
• Assembler provides .xfer_order directive
• Example: allocate four contiguous SRAM transfer registers
.local $reg1 $reg2 $reg3 $reg4
.xfer_order $reg1 $reg2 $reg3 $reg4
![Page 598: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/598.jpg)
Summary
• Microengines programmed in assembly language
• Intel’s assembler provides
– Directives for structuring code
– Macro preprocessor
– Automated register assignment
![Page 599: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/599.jpg)
Questions?
![Page 600: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/600.jpg)
XXV
Microengine Programming II
![Page 601: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/601.jpg)
Specialized Memory Operations
• Buffer pool manipulation
• Processor coordination via bit testing
• Atomic memory increment
• Processor coordination via memory locking
![Page 602: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/602.jpg)
Buffer Pool Manipulation
• SRAM facility
• Eight linked lists
• Operations push and pop
• General form of pop
sram [ pop, $xfer, --, --, listnum ]
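The facility behaves like eight independent free lists selected by listnum. The Python sketch below models only that behavior; the class name `BufferPools` is hypothetical, and returning 0 from an empty list is an assumption made for illustration rather than something stated above.

```python
class BufferPools:
    """Eight independent free lists of buffer addresses, indexed 0-7."""

    NUM_LISTS = 8

    def __init__(self):
        self._lists = [[] for _ in range(self.NUM_LISTS)]

    def _check(self, listnum: int) -> None:
        if not 0 <= listnum < self.NUM_LISTS:
            raise ValueError("listnum must be in the range 0-7")

    def push(self, listnum: int, buffer_addr: int) -> None:
        """Return a buffer to the given list."""
        self._check(listnum)
        self._lists[listnum].append(buffer_addr)

    def pop(self, listnum: int) -> int:
        """Take a buffer from the given list; 0 stands for 'list empty' (assumed)."""
        self._check(listnum)
        free = self._lists[listnum]
        return free.pop() if free else 0


pools = BufferPools()
pools.push(0, 0x1000)
pools.push(0, 0x2000)
print(hex(pools.pop(0)), hex(pools.pop(0)), pools.pop(0))  # 0x2000 0x1000 0
```

Because each list is separate, a program can keep buffers of different sizes or owners on different list numbers.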
![Page 603: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/603.jpg)
Processor Coordination Via Bit Testing
• Provided by SRAM and Scratchpad memories
• Atomic test-and-set
• Mask used to specify bit in a word
• General form
scratch [ bit_wr, $xfer, addr1, addr2, op ]
![Page 604: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/604.jpg)
Bit Manipulation Operations
Operation              Meaning
set_bits               Set the specified bits to one
clear_bits             Set the specified bits to zero
test_and_set_bits      Place the original word in the read transfer register, and set the specified bits to one
test_and_clear_bits    Place the original word in the read transfer register, and set the specified bits to zero
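The four operations, and the way test-and-set supports coordination, can be modeled on a single 32-bit word. This is a single-threaded sketch: the hardware performs each operation atomically, which plain Python does not model, and the names `ScratchWord` and `try_acquire` are hypothetical.

```python
class ScratchWord:
    """One 32-bit word supporting the four bit operations in the table."""

    def __init__(self, value: int = 0):
        self.value = value & 0xFFFFFFFF

    def set_bits(self, mask: int) -> None:
        self.value |= mask & 0xFFFFFFFF

    def clear_bits(self, mask: int) -> None:
        self.value &= ~mask & 0xFFFFFFFF

    def test_and_set_bits(self, mask: int) -> int:
        original = self.value       # what the read transfer register would receive
        self.set_bits(mask)
        return original

    def test_and_clear_bits(self, mask: int) -> int:
        original = self.value
        self.clear_bits(mask)
        return original


def try_acquire(word: ScratchWord, bit: int) -> bool:
    """Use test-and-set as a lock: succeed only if the bit was clear before."""
    mask = 1 << bit
    return (word.test_and_set_bits(mask) & mask) == 0


lock = ScratchWord()
print(try_acquire(lock, 3))   # True: bit was clear, caller now holds the lock
print(try_acquire(lock, 3))   # False: bit already set
lock.clear_bits(1 << 3)       # release
print(try_acquire(lock, 3))   # True again
```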
![Page 605: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/605.jpg)
Atomic Memory Increment
• Memory shared among
– StrongARM
– Microengines
• Need atomic increment to avoid incorrect results
• General form
scratch [ incr, --, addr1, addr2, 1 ]
![Page 606: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/606.jpg)
Processor Coordination Via Memory Locking
• Word of memory acts as mutual exclusion lock
• Achieved with memory lock instruction
• Single read_lock instruction
– Obtains a lock on location X in memory
– Reads values starting at location X
• Single write_unlock instruction
– Writes values to memory
– Unlocks the specified location
![Page 607: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/607.jpg)
Processor Coordination Via Memory Locking (continued)
• General form of read_lock
sram [ read_lock, $xfer, addr1, addr2, count ], ctx_swap
• Token ctx_swap required (thread blocks until lock obtained)
• General form of write_unlock
sram [ unlock, --, addr1, addr2, 1 ]
![Page 608: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/608.jpg)
Implementation Of Memory Locking
• Achieved with a CAM that holds eight items
• Only releases contiguous items in CAM
• Consequences
– At most eight addresses can be locked at any time
– Thread can remain blocked even if its request can be satisfied
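The first consequence (at most eight locked addresses) follows directly from the table size. The sketch below models only that capacity limit; it does not attempt to model the CAM's release policy, and the class name `LockCam` and its boolean return convention are assumptions made for illustration.

```python
class LockCam:
    """Track locked addresses in a table with a fixed number of entries."""

    def __init__(self, entries: int = 8):
        self.entries = entries
        self.locked = set()

    def read_lock(self, addr: int) -> bool:
        """Try to lock addr; False means the caller would have to block."""
        if addr in self.locked or len(self.locked) >= self.entries:
            return False
        self.locked.add(addr)
        return True

    def unlock(self, addr: int) -> None:
        self.locked.discard(addr)


cam = LockCam()
for addr in range(0x100, 0x108):          # eight distinct addresses
    assert cam.read_lock(addr)
print(cam.read_lock(0x200))               # False: table full, although 0x200 is free
cam.unlock(0x100)
print(cam.read_lock(0x200))               # True: an entry is available again
```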
![Page 609: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/609.jpg)
Control And Status Registers
• Over one hundred fifty
• Provide access to hardware units on the chip
• Allow processors to
– Configure
– Control
– Interrogate
– Monitor
• Access
– StrongARM: mapped into address space
– Microengines: special instructions
![Page 610: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/610.jpg)
Csr Instruction
• Used on microengines
• General form
csr [ cmd, $xfer, CSR, count ]
• Alternatives
– Instruction fast_wr provides access to subset of CSRs through the FBI unit
– Instruction local_csr_rd provides access to local CSRs
![Page 611: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/611.jpg)
Intel Dispatch Loop Macros
• Each microengine executes event loop
– Analogous to core event loop
– Events are low level (e.g., hardware device becomes ready)
– Known as dispatch loop
• SDK includes predefined macros related to dispatch loop
![Page 612: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/612.jpg)
Predefined Dispatch Loop Macros
Macro                  Purpose
Buf_Alloc              Allocate a packet buffer
Buf_GetData            Get the SDRAM address of a packet buffer (note: the name is misleading)
DL_Drop                Drop a packet and recycle the buffer
DL_GetBufferLength     Compute the length (in bytes) of the packet portion of a buffer
DL_GetBufferOffset     Compute the offset of packet data within a buffer
DL_GetInputPort        Obtain the input port over which the packet arrived
DL_GetOutputPort       Find the port over which the packet will be sent
DL_GetRxStat           Obtain the receive status of the packet
DL_Init                Initialize the dispatch loop macros
DL_MESink              Send a packet to next microblock group
DL_SASink              Send a packet to the StrongARM
DL_SASource            Receive a packet from the StrongARM
DL_SetAceTag           Specify the microblock that is handling a packet so the StrongARM will know
DL_SetBufferLength     Specify the length of packet data in the buffer
DL_SetBufferOffset     Specify the offset of packet data in the buffer
DL_SetExceptionCode    Specify the exception code for the StrongARM
DL_SetInputPort        Specify the port over which the packet arrived
DL_SetOutputPort       Specify the port on which the packet will be sent
DL_SetRxStat           Specify the receive status of the packet
![Page 613: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/613.jpg)
Illustration Of Packet Flow
[Figure: packet flow through the dispatch loop. Inputs: packet from microengine or physical input port; packet from core component on StrongARM. Outputs: exception passed up to the StrongARM; packet dropped; packet sent to physical device or next microACE.]
![Page 614: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/614.jpg)
Packet Selection Macros
• Also supplied by SDK
• Allow selection of packet from queues
• Two approaches
– Round-robin from a set of queues
* Used for 10 / 100 Ethernet
* Macro is RoundRobinSource
– Strict FIFO ordering
* Used for Gigabit Ethernet
* Macro is FifoSource
![Page 615: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/615.jpg)
Obtaining Packets From StrongARM
• Microengine can receive packets from
– Input port (ingress) or another microengine
– StrongARM
• Dispatch loop tests for each possibility
• Packets from StrongARM should have lower priority
![Page 616: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/616.jpg)
Implementation Of Priority
• Macro EthernetIngress obtains Ethernet frame
• Macro DL_SASource obtains packet from StrongARM
• Each returns IX_BUFFER_NULL, if no packet waiting
• To achieve priority, DL_SASource counts SA_CONSUME_NUM times before returning a packet
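One way to picture the counting scheme is a source that answers "no packet" until it has been polled SA_CONSUME_NUM times. The Python sketch below is an interpretation, not the SDK's code: the order in which the two sources are polled, the counter reset, and the value 3 are all assumptions, and `SASource` and `next_packet` are hypothetical names.

```python
from collections import deque

IX_BUFFER_NULL = None   # stand-in for the SDK constant
SA_CONSUME_NUM = 3      # illustrative value only


class SASource:
    """Hand out a StrongARM packet only on every consume_num-th call."""

    def __init__(self, consume_num: int = SA_CONSUME_NUM):
        self.consume_num = consume_num
        self.calls = 0
        self.queue = deque()

    def get(self):
        self.calls += 1
        if self.calls < self.consume_num:
            return IX_BUFFER_NULL
        self.calls = 0
        return self.queue.popleft() if self.queue else IX_BUFFER_NULL


def next_packet(ingress: deque, sa: SASource):
    """One pass of the loop: poll the StrongARM source, then the input port."""
    pkt = sa.get()
    if pkt is IX_BUFFER_NULL:
        pkt = ingress.popleft() if ingress else IX_BUFFER_NULL
    return pkt


ingress = deque(["e1", "e2", "e3", "e4"])
sa = SASource()
sa.queue.extend(["s1", "s2"])
print([next_packet(ingress, sa) for _ in range(6)])
# ['e1', 'e2', 's1', 'e3', 'e4', 's2']
```

In this model the StrongARM gets at most one packet in every SA_CONSUME_NUM passes, so ingress traffic dominates.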
![Page 617: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/617.jpg)
Packet Disposition
• Auxiliary variable used
• Dispatch loop finds a packet and calls macro X to process
• Macro X sets variable dl_buffer_next
• Dispatch loop examines dl_buffer_next and invokes
– DL_Drop to drop packet
– DL_SASink to send to StrongARM
– DL_MESink to send to "next" ACE
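The disposition step amounts to a table lookup on the value that the processing macro leaves behind. A minimal Python sketch of that control flow, with made-up disposition codes and stand-in functions (none of these names or values come from the SDK):

```python
# Hypothetical disposition codes; the real values are defined by the SDK.
DROP, TO_STRONGARM, TO_NEXT_ACE = "drop", "strongarm", "next_ace"


def dl_drop(pkt):
    return f"dropped {pkt}"


def dl_sa_sink(pkt):
    return f"sent {pkt} to StrongARM"


def dl_me_sink(pkt):
    return f"sent {pkt} to next ACE"


SINKS = {DROP: dl_drop, TO_STRONGARM: dl_sa_sink, TO_NEXT_ACE: dl_me_sink}


def dispatch_once(pkt, process):
    """Process one packet, then act on the disposition the processing step chose."""
    dl_buffer_next = process(pkt)       # plays the role of macro X
    return SINKS[dl_buffer_next](pkt)


def classify(pkt):
    """Toy processing step: forward IP packets, pass everything else up."""
    return TO_NEXT_ACE if pkt.startswith("ip") else TO_STRONGARM


print(dispatch_once("ip-1", classify))   # sent ip-1 to next ACE
print(dispatch_once("arp-1", classify))  # sent arp-1 to StrongARM
```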
![Page 618: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/618.jpg)
Other Predefined Macros
• Macros needed for
– Buffer manipulation
– Packet processing
– Other tasks
• Predefined macros available
![Page 619: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/619.jpg)
Example Code To Process Packet Header
/* Allocate eight SDRAM transfer registers to hold the packet header */
xbuf_alloc [ $$hdr, 8 ]
/* Reserve two general-purpose registers for the computation */
.local base offset
/* Compute the SDRAM address of the data buffer */
Buf_GetData [ base, dl_buffer_handle ]
/* Compute the byte offset of the start of the packet in the buffer */
DL_GetBufferOffset [ offset ]
/* Convert the byte offset to SDRAM words by dividing by eight */
/* (shift right by three bits) */
alu_shf [ offset, --, B, offset, >>3 ]
/* Load thirty-two bytes of data from SDRAM into eight SDRAM */
/* transfer registers. Start at SDRAM address base + offset */
sdram [ read, $$hdr0, base, offset, 4 ]
![Page 620: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/620.jpg)
Example Code To Process Packet Header (continued)
/* Inform the assembler that we have finished using the two */
/* registers: base and offset */
.endlocal
/* Process the packet header in the SDRAM transfer registers */
/* starting at register $$hdr */
. . .
/* Free the SDRAM transfer registers when finished */
xbuf_free [ $$hdr ]
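The numbers in the example fit together as follows: SDRAM is addressed in eight-byte words (hence the shift by three), and thirty-two bytes in eight transfer registers means four bytes per register. A small Python check of that arithmetic (`sdram_read_plan` is a hypothetical helper, and the byte offset of 64 is an arbitrary illustration):

```python
SDRAM_WORD_BYTES = 8   # the example divides the byte offset by eight
XFER_REG_BYTES = 4     # thirty-two bytes held in eight transfer registers


def sdram_read_plan(byte_offset: int, header_bytes: int):
    """Return (word_offset, word_count, transfer_registers) for a header read."""
    word_offset = byte_offset >> 3                        # same as dividing by eight
    word_count = -(-header_bytes // SDRAM_WORD_BYTES)     # round up to whole words
    registers = word_count * SDRAM_WORD_BYTES // XFER_REG_BYTES
    return word_offset, word_count, registers


print(sdram_read_plan(64, 32))  # (8, 4, 8)
```

The word count of 4 matches the count operand of the sdram instruction, and the 8 registers match the xbuf_alloc request.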
![Page 621: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/621.jpg)
Using Intel Dispatch Macros
• Programmer must perform five steps
– Define three symbolic constants
– Declare registers with a .local directive
– Use a .import_var directive to name tag values (optional)
– Include the macros pertinent to the microblock
– Initialize the macros as the first part of a dispatch loop
• Note: three constants must be defined before macros are included
– SA_CONSUME_NUM
– IX_EXCEPTION
– SEQNUM_IGNORE
![Page 622: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/622.jpg)
Required Register Declarations
• The following declarations are required
.local dl_reg1 dl_reg2 dl_reg3 dl_buffer_handle dl_next_block
![Page 623: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/623.jpg)
Naming Tag Values
• Required in microblock that sends packets (exceptions) to core component
• Uses .import_var directive
• Example
.import_var IPV4_TAG
![Page 624: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/624.jpg)
Including Intel Macros
• Must include two files
– DispatchLoop_h.uc
– DispatchLoopImportVars.h
• Ingress microblock includes
– EthernetIngress.uc
• Egress microblock includes
– EthernetEgress.uc
![Page 625: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/625.jpg)
Initialization Of Intel Macros
• Program calls DL_Init to perform initialization
• Typical initialization sequence
DL_Init [ ]
EthernetIngress_Init [ ]
. . . /* Other microblock initialization calls */
![Page 626: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/626.jpg)
Packet I/O
• Physical frame divided into sixty-four octet units for transfer
• Each unit known as mpacket
• Division performed by interface hardware
• Microengine transfers each mpacket separately
• Header uses two bits to specify position of mpacket
– Start Of Packet (SOP) set for first mpacket of frame
– End Of Packet (EOP) set for last mpacket of frame
• Note: a single mpacket can have both SOP and EOP set
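The division of a frame into mpackets can be sketched in C. This is only an illustration of the arithmetic described above, not Intel code: the flag values MP_SOP and MP_EOP and both function names are hypothetical, and only the sixty-four octet unit size comes from the slide.

```c
#include <assert.h>
#include <stddef.h>

#define MPACKET_SIZE 64u  /* octets per mpacket, as stated on the slide */
#define MP_SOP 0x1u       /* hypothetical flag value for Start Of Packet */
#define MP_EOP 0x2u       /* hypothetical flag value for End Of Packet */

/* Number of mpackets needed to carry a frame of frame_len octets. */
size_t mpacket_count(size_t frame_len)
{
    return (frame_len + MPACKET_SIZE - 1) / MPACKET_SIZE;
}

/* SOP/EOP flags for the mpacket at position index (0-based) in the frame. */
unsigned mpacket_flags(size_t frame_len, size_t index)
{
    size_t n = mpacket_count(frame_len);
    unsigned flags = 0;

    assert(index < n);          /* index must name an existing mpacket */
    if (index == 0)
        flags |= MP_SOP;        /* first mpacket of the frame */
    if (index == n - 1)
        flags |= MP_EOP;        /* last mpacket of the frame */
    return flags;
}
```

A frame of sixty-four octets or fewer fits in one mpacket, which is why that mpacket carries both flags.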
![Page 627: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/627.jpg)
Packet I/O (continued)
• No interrupts
• Dispatch loop uses polling
– Ready Bus Sequencer checks devices and sets bit
– Dispatch loop tests bit
![Page 628: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/628.jpg)
Ingress Packet Transfer
• Incoming mpacket placed in Receive FIFO (RFIFO)
• Microengine can transfer from RFIFO to
– SRAM transfer registers
– Directly into SDRAM
• SDRAM transfer has form
sdram [ r_fifo_rd, $$xfer, addr1, addr2, count ], indirect_ref
![Page 629: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/629.jpg)
Packet Egress
• Steps required
– Reserve space in Transmit FIFO (TFIFO)
– Copy mpacket from memory into TFIFO
– Set SOP and EOP bits
– Set valid flag in XMIT_VALIDATE register
• General form used to copy from SDRAM to TFIFO
sdram [ t_fifo_wr, $$xfer, addr1, addr2, count ], indirect_ref
![Page 630: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/630.jpg)
Setting The Valid Flag In A XMIT_VALIDATE Register
• Valid bit is in CSR
• Use fast_wr instruction to access
fast_wr [ 0, xmit_validate ], indirect_ref
![Page 631: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/631.jpg)
Other Details
• Microengine must check status of mpacket to determine if
– MAC hardware detected problem (e.g., bad CRC)
– Mpacket arrived with no problems
• Information found in RCV_CNTL CSR
![Page 632: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/632.jpg)
Summary
• Special operations used for
– Synchronization
– Memory access
– CSR access
• Microengine executes event loop known as dispatch loop
– Checks for packets arriving
– Calls macro(s) to process each packet
– Sends packets to next specified destination
• Intel supplies large set of dispatch loop macros
![Page 633: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/633.jpg)
Summary (continued)
• Many details required to use Intel dispatch macros
• Packet I/O performed on sixty-four byte units called mpackets
• Mpacket can be transferred from RFIFO to
– SRAM transfer registers
– SDRAM
• Many details required to perform trivial operations on packet
![Page 634: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/634.jpg)
Questions?
![Page 635: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/635.jpg)
XXVI
An Example ACE
![Page 636: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/636.jpg)
We Will
• Consider an ACE
• Examine all the user-written code
• See how the pieces fit together
![Page 637: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/637.jpg)
Choice Of Network System
• Used to demonstrate
– Basic concepts
– Code structure and organization
• Need to
– Minimize code size and complexity
– Avoid excessive detail
– Ignore performance optimizations
![Page 638: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/638.jpg)
The Example
• Trivial network system
• In-line paradigm using two ports
– Ethernet-to-Ethernet connectivity
– Known as bump-in-the-wire
• Count Web packets
– Frame carries IP
– Datagram carries TCP
– Destination port is 80
• Named WWBump (Web Wire Bump)
![Page 639: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/639.jpg)
Illustration Of WWBump
[Figure: wwbump connected in-line between a switch (connections to local systems) and a router (to Internet)]
• Passes all traffic in either direction
• Counts Web packets
![Page 640: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/640.jpg)
Design
• Code written for bridal veil testbed
• Accepts input from either port zero or port one
• Forwards incoming traffic to opposite port
• Uses the ingress and egress library ACEs supplied by Intel
• Defines a wwbump MicroACE
• Passes each Web packet to the core component (exception)
• Provides access to current packet count via the crosscall
• Uses an ixsys.config file to specify binding of targets
![Page 641: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/641.jpg)
Organization Of ACEs
[Figure: on the microengine, input ports feed the ingress ACE (microblock), then the wwbump ACE (microblock), then the egress ACE (microblock), which feeds the output ports; on the StrongARM, the wwbump ACE (core) serves a crosscall client]
![Page 642: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/642.jpg)
Static ACE Data Structure
• Intel software requires
– Static control block for ACE
– Structure is named ix_ace
• Programmer can add additional fields to control block
![Page 643: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/643.jpg)
Fields In The Wwbump Control Block
Field    Type       Purpose
name     string     Name of the ACE in ixsys.config
tag      int        ID of ACE assigned by Resource Manager
ue       int        Bitmask of microengines running the ACE
bname    string     TAG name that the microblock uses
ccbase   ix_base_t  Handle for crosscall service
![Page 644: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/644.jpg)
Example Declarations (wwbump.h)
/* wwbump.h -- global constants and declarations for wwbump */

#ifndef __WWBUMP_H
#define __WWBUMP_H

#include <ix/asl.h>
#include <ix/microace/rm.h>
#include <ix/asl/ixbasecc.h>
#include <wwbump_import.h>

typedef struct wwbump wwbump;
struct wwbump
{
    ix_ace ace;        /* Mandatory handle for all ACEs */
    char name[128];    /* Name of this ACE given in ixsys.config */
    int tag;           /* ID assigned by RM for SA-to-ME communic. */
    int ue;            /* Bitmask of MEs this MicroACE runs on */
    char bname[128];   /* Block name (need .import_var $bname_TAG) */
    ix_base_t ccbase;  /* Handle for cross call service */
};

/* Exception handler prototype */
int exception(void *ctx, ix_ring r, ix_buffer b);

#endif /* __WWBUMP_H */
![Page 645: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/645.jpg)
Shared Symbolic Constants
• Core component programmed in C
• Microblock component programmed in assembly language
• Typical trick: bracket lines of code with #ifndef to prevent multiple instantiations
• Unusual trick
– Both languages share C-preprocessor syntax
– Can place all constant definitions in shared file
![Page 646: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/646.jpg)
Distinguishing Code At Compile Time
• Preprocessor can distinguish
– Code for core component
– Code for microblock component
• Technique: symbolic constant MICROCODE only defined when assembling microcode
• Can use #ifdef
![Page 647: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/647.jpg)
Example File Of Symbolic Constants (wwbump_import.h)
/* wwbump_import.h -- define or import tag for wwbump MicroACE */

#ifndef _WWBUMP_IMPORT_H
#define _WWBUMP_IMPORT_H

#ifdef MICROCODE

.import_var WWBUMP_TAG

#else

#define WWBUMP_TAG_STR "WWBUMP_TAG"

#endif

#endif /* _WWBUMP_IMPORT_H */
![Page 648: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/648.jpg)
Division Of Microcode Into Files
• Two main files
• Follow Intel naming convention: macro and file names related to microcode start with uppercase letter
• WWBump.uc
– Code for packet processing
– Defines macro WWBump
• WWB_dl.uc
– Contains code for dispatch loop
![Page 649: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/649.jpg)
WWBump Macro
• Performs two functions
– Specifies eventual output port for packet
– Classifies the packet
• Classification
– Represented as disposition
* StrongARM (exception)
* Next microblock (Egress)
– Stored in register dl_next_block
![Page 650: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/650.jpg)
WWBump Details
• Uses Intel ingress macro
• Calls DL_GetInputPort to determine input port (0 or 1)
• Calls DL_SetOutputPort to specify output port
• When passing packet to core component
– Set exception to WWBUMP_TAG
– Set return code to IX_EXCEPTION
![Page 651: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/651.jpg)
Steps WWBump Takes During Classification
• Read packet header into six SDRAM transfer registers
• Check
– Ethernet type field (0x0800)
– IP type field (6)
– TCP destination port (80)
• Difficult (ugly) in microcode
– Must use IP header length to calculate port offset
– Port number is 16 bits; transfer register is 32
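For comparison, the same three checks are short when written in C over a byte array. The sketch below is not part of the wwbump code; the function name is_web_frame and the helper get16 are hypothetical, and the constants follow the values listed above. Like the microcode, it does not check the IP version or fragmentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14u     /* Ethernet header length in bytes */
#define ETH_IP      0x0800u /* Ethernet type for IP */
#define IPT_TCP     6u      /* IP type for TCP */
#define TCP_WWW     80u     /* Destination port for WWW */

/* Read a 16-bit value stored in network (big-endian) byte order. */
static unsigned get16(const uint8_t *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

/* Return 1 if the frame carries IP, TCP, and destination port 80; else 0. */
int is_web_frame(const uint8_t *frame, size_t len)
{
    size_t ihl, dport_off;

    assert(frame != NULL);
    if (len < ETH_HDR_LEN + 20)             /* need a minimal IP header */
        return 0;
    if (get16(frame + 12) != ETH_IP)        /* Ethernet type field */
        return 0;
    if (frame[ETH_HDR_LEN + 9] != IPT_TCP)  /* IP type (protocol) field */
        return 0;

    /* IP header length is the low nibble of the first IP byte, in 32-bit words */
    ihl = (size_t)(frame[ETH_HDR_LEN] & 0x0f) * 4;
    if (ihl < 20)
        return 0;
    dport_off = ETH_HDR_LEN + ihl + 2;      /* dest. port is 2 bytes into TCP */
    if (len < dport_off + 2)
        return 0;
    return get16(frame + dport_off) == TCP_WWW;
}
```

The port offset depends on the IP header length in both versions; the difference is that C can address individual bytes, whereas the microcode must work in 32-bit transfer registers.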
![Page 652: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/652.jpg)
File WWBump.uc (part 1)
/* WWBump.uc - microcode to process a packet */

#define ETH_IP 0x800 ; Ethernet type for IP
#define IPT_TCP 6 ; IP type for TCP
#define TCP_WWW 80 ; Dest. port for WWW

#macro WWBumpInit[]
/* empty because no initialization is needed */
#endm

#macro WWBump[]
xbuf_alloc[$$hdr,6] ; Allocate 6 SDRAM registers

/* Reserve a register (ifn) and compute the output port for the */
/* frame; a frame that arrives on port 0 will go out port 1, and */
/* vice versa */

.local ifn
    DL_GetInputPort[ifn] ; Copy input port number to ifn
    alu [ ifn, ifn, XOR, 1 ] ; XOR with 1 to reverse number
    DL_SetOutputPort[ifn] ; Set output port for egress
.endlocal
![Page 653: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/653.jpg)
File WWBump.uc (part 2)
/* Read first 24 bytes of frame header from SDRAM */

.local base off
    Buf_GetData[base, dl_buffer_handle] ; Get the base SDRAM address
    DL_GetBufferOffset[off] ; Get packet offset in bytes
    alu_shf[off, --, B, off, >>3] ; Convert to Quad-words
    sdram[read, $$hdr0, base, off, 3], ctx_swap ; Read 3 Quadwords
                                                ; (six registers)
.endlocal

/* Classify the packet. If any test fails, branch to NotWeb# */

/* Verify frame type is IP (1st two bytes of the 4th longword) */

.local etype
    immed[etype, ETH_IP]
    alu_shf[ --, etype, -, $$hdr3, >>16] ; 2nd operand is shifted
    br!=0[NotWeb#]
.endlocal

/* Verify IP type is TCP (last byte of the 6th longword) */

br!=byte[$$hdr5, 0, IPT_TCP, NotWeb#]
![Page 654: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/654.jpg)
File WWBump.uc (part 3)
/* Verify destination port is web (offset depends on IP header size) */

.local base boff dpoff dport

/* Get length of IP header (3rd byte of 4th longword), and */
/* convert to bytes by shifting by six bits instead of eight */

ld_field_w_clr[dpoff, 0001, $$hdr3, >>6] ; Extract header length

/* Mask off bits above and below the IP length */

.local mask
    immed[mask, 0x3c] ; Mask out upper and lower 2 bits
    alu [ dpoff, dpoff, AND, mask ]
.endlocal
![Page 655: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/655.jpg)
File WWBump.uc (part 4)
/* Register dpoff contains the IP header length in bytes. Add */
/* Ethernet header length (14) and offset of the destination */
/* port (2) to obtain offset from the beginning of the packet */
/* of the destination port. Add to SDRAM address of buffer, */
/* and convert to quad-word offset by dividing by 8 (shift 3). */

alu[dpoff, dpoff, +, 16] ; Add Ether+TCP offsets
Buf_GetData[base, dl_buffer_handle] ; Get buffer base address
DL_GetBufferOffset[boff] ; Get data offset in buf.
alu[boff, boff, +, dpoff] ; Compute byte address
alu_shf[boff, --, B, boff, >>3] ; Convert to Q-Word addr.
sdram[read, $$hdr0, base, boff, 1], ctx_swap ; Read 8 bytes

/* Use lower three bits of the byte offset to determine which */
/* byte the destination port will be in. If value >= 4, dest. */
/* port is in the 2nd longword; otherwise it's in the first. */

alu[ dpoff, dpoff, AND, 0x7 ] ; Get lowest three bits
alu[ --, dpoff, -, 4] ; Test and conditional
br>=0[SecondWord#] ; branch if value >=4
![Page 656: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/656.jpg)
File WWBump.uc (part 5)
FirstWord#: /* Load upper two bytes of register $$hdr0 */
    ld_field_w_clr[dport, 0011, $$hdr0, >>16] ; Shift before mask
    br[GotDstPort#] ; Check port number

SecondWord#: /* Load upper two bytes of register $$hdr1 */
    ld_field_w_clr[dport, 0011, $$hdr1, >>16] ; Shift before mask

GotDstPort#: /* Verify destination port is 80 */

.local wprt
    immed[wprt, TCP_WWW] ; Load 80 in reg. wprt
    alu[--, dport, -, wprt] ; Compare dport to wprt
    br!=0[NotWeb#] ; and branch if not equal
.endlocal
.endlocal
![Page 657: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/657.jpg)
File WWBump.uc (part 6)
IsWeb#: /* Found a web packet, so send to the StrongARM */

/* Set exception code to zero (we must set this) */
.local exc ; Declare register exc
    immed[exc, 0] ; Place zero in exc and
    DL_SetExceptionCode[exc] ; set exception code
.endlocal

/* Set tag to core component's tag (required by Intel macros) */
.local ace_tag ; Declare register ace_tag
    immed32[ace_tag, WWBUMP_TAG] ; Place wwbump tag in reg.
    DL_SetAceTag[ace_tag] ; and set tag
.endlocal

/* Set register dl_next_block to IX_EXCEPTION to cause dispatch */
/* to pass packet to StrongARM as an exception */
immed[dl_next_block, IX_EXCEPTION] ; Store return value
br[Finish#] ; Done, so branch to end

NotWeb#: /* Found a non-web packet, so forward to next microblock */
immed32[dl_next_block, 1] ; Set return code to 1
![Page 658: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/658.jpg)
File WWBump.uc (part 7)
Finish#: /* Packet processing is complete, so clean up */
xbuf_free[$$hdr] ; Release xfer registers
#endm
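The offset arithmetic in parts 3 and 4 of WWBump.uc can be checked with a small C sketch. The function names are hypothetical, and the sketch assumes the 4th longword is held big-endian (Ethernet type in the upper 16 bits, then the IP version/header-length byte, then the type-of-service byte). It works with offsets relative to the start of the frame and ignores the buffer offset that the microcode adds.

```c
#include <assert.h>
#include <stdint.h>

/* IP header length in bytes, taken from the 4th longword of the frame.   */
/* Shifting by 6 instead of 8 multiplies the 4-bit length field by four;  */
/* the 0x3c mask removes the bits above and below that field.             */
uint32_t ip_hdr_bytes(uint32_t hdr3)
{
    return (hdr3 >> 6) & 0x3c;   /* equals ((hdr3 >> 8) & 0xf) * 4 */
}

/* Byte offset of the TCP destination port from the start of the frame. */
uint32_t dport_offset(uint32_t hdr3)
{
    return ip_hdr_bytes(hdr3) + 16;   /* 14 (Ethernet) + 2 (port in TCP) */
}

/* Index of the quadword (8 bytes) that holds the given byte offset. */
uint32_t quadword_index(uint32_t byte_off)
{
    return byte_off >> 3;
}

/* 1 if the byte lies in the second 32-bit longword of its quadword. */
int in_second_longword(uint32_t byte_off)
{
    return (byte_off & 0x7) >= 4;
}
```

With a 20-byte IP header the port sits at byte 36, which is quadword 4, second longword; with a 24-byte header it sits at byte 40, which is quadword 5, first longword. Because the header length is always a multiple of four, the port always occupies the upper two bytes of whichever longword holds it.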
![Page 659: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/659.jpg)
Dispatch Loop Algorithm
initialize dispatch loop macros;
do forever {
    if (packet has arrived from the StrongARM)
        Send the packet to egress microblock;
    if (packet has arrived from Ethernet port) {
        Invoke WWBump macro to process the packet;
        if (return code specifies exception) {
            Send packet to StrongARM;
        } else {
            Send packet to egress microblock;
        }
    }
}
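The decision made on each pass of the loop can be modeled in C. This is a sketch of the control flow only: the enumerations, the classifier callback type, and dispatch_one are hypothetical names, and the two return-code values are the ones the microcode in this chapter uses. The drop path for invalid ingress frames is not modeled.

```c
#include <assert.h>
#include <stddef.h>

#define IX_EXCEPTION 0  /* return code meaning "raise an exception" */
#define NEXT_BLOCK   1  /* return code meaning "forward to next microblock" */

enum source { SRC_NONE, SRC_STRONGARM, SRC_ETHERNET };   /* hypothetical */
enum sink   { SINK_NONE, SINK_STRONGARM, SINK_EGRESS };  /* hypothetical */

/* Classifier callback that stands in for the WWBump macro. */
typedef int (*classify_fn)(const void *pkt);

/* Two trivial classifiers used to exercise the loop body. */
int classify_as_web(const void *pkt)   { (void)pkt; return IX_EXCEPTION; }
int classify_as_other(const void *pkt) { (void)pkt; return NEXT_BLOCK; }

/* One pass of the loop body: decide where a packet from src is sent. */
enum sink dispatch_one(enum source src, const void *pkt, classify_fn classify)
{
    assert(classify != NULL);

    if (src == SRC_STRONGARM)       /* packet returned by core component */
        return SINK_EGRESS;
    if (src == SRC_ETHERNET)        /* new frame: classify, then dispose */
        return classify(pkt) == IX_EXCEPTION ? SINK_STRONGARM : SINK_EGRESS;
    return SINK_NONE;               /* nothing arrived on this pass */
}
```

Every packet eventually reaches the egress microblock; a Web packet simply takes a detour through the StrongARM first so that it can be counted.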
![Page 660: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/660.jpg)
Dispatch Loop Code
• Contained in file WWB_dl.uc
• Not defined as separate macro
• Includes several files
– Standard (Intel) header files to define basic macros
– Our packet processing macro (file WWBump.uc)
![Page 661: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/661.jpg)
File WWB_dl.uc (part 1)
/* WWB_dl.uc - dispatch loop for wwbump program */

/* Constants */

#define IX_EXCEPTION 0 ; Return value to raise an exception
#define SA_CONSUME_NUM 31 ; Ignore StrongARM packets 30 of 31 times
#define SEQNUM_IGNORE 31 ; StrongARM fastport sequence num

/* Register declarations (as required for Intel dispatch loop macros) */
.local dl_reg1 dl_reg2 dl_reg3 dl_reg4 dl_buffer_handle dl_next_block

/* Include files for Intel dispatch loop macros */
#include "DispatchLoop_h.uc"
#include "DispatchLoopImportVars.h"
#include "EthernetIngress.uc"
#include "wwbump_import.h"

/* Include the packet processing macro defined previously */
#include "WWBump.uc"
![Page 662: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/662.jpg)
File WWB_dl.uc (part 2)
/* Microblock initialization */
DL_Init[]
EthernetIngress_Init[]
WWBumpInit[]

/* Dispatch loop that runs forever */
.while(1)

Top_Of_Loop#: /* Top of dispatch loop (for equivalent of C continue) */

/* Test for a frame from the StrongARM */
DL_SASource[ ] ; Get frame from SA
alu[--, dl_buffer_handle, -, IX_BUFFER_NULL] ; If no frame, go test
br=0[Test_Ingress#], guess_branch ; for ingress frame
br[Send_MB#] ; If frame, go send it

• Macro DL_SASource
– Checks for packets from StrongARM
– Returns one packet after SA_CONSUME_NUM calls
![Page 663: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/663.jpg)
File WWB_dl.uc (part 3)
Test_Ingress#: /* Test for an Ethernet frame */

EthernetIngress[ ] ; Get an Ethernet frame
alu[--, dl_buffer_handle, -, IX_BUFFER_NULL] ; If no frame, go back
br=0[Top_Of_Loop#] ; to start of loop

/* Check if ingress frame valid and drop if not */
br!=byte[dl_next_block, 0, 1, Drop_Packet#]

/* Invoke WWBump macro to set output port and classify the frame */
WWBump[]

/* Use return value from WWBump to dispose of frame: */
/* if exception, jump to code that sends to StrongARM */
/* else jump to code that sends to egress */

alu[ --, dl_next_block, -, IX_EXCEPTION] ; Return code is exception
br=0[Send_SA#] ; so send to StrongARM

br[Send_MB#] ; Otherwise, send to next
             ; microblock

• Note: dl_next_block specifies whether the frame should be treated as an exception or forwarded to next microblock
![Page 664: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/664.jpg)
File WWB_dl.uc (part 4)
Send_SA#:
/* Send the frame to the core component on the StrongARM as an */
/* exception. Note that tag and exception code are assigned by */
/* the microblock WWBump. */
DL_SASink[ ]
.continue ; Continue dispatch loop

Send_MB#:
/* Send the frame to the next microblock (egress). Note that the */
/* output port (field oface hidden in the internal structure) has */
/* been assigned by microblock WWBump. */
DL_MESink[ ]
nop
.continue

Drop_Packet#:
/* Drop the frame and start over getting a new frame */
DL_Drop[ ]

.endw

nop ; Although the purpose of these no-ops is
nop ; undocumented, Intel examples include them.
nop
.endlocal
![Page 665: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/665.jpg)
Code For Core Component
• Written in C
• Defines two basic functions
– Exception handler called when packet arrives from the microblock component (as an exception)
– Default action called when another frame arrives (e.g., from another ACE)
![Page 666: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/666.jpg)
Exception Handler
• Function named exception
• Receives packets from microblock
• Must call Intel function RmGetExceptionCode
• Can call Intel function RmSendPacket to forward packet to a microblock
• Three arguments
– Pointer to ACE control block
– Ring buffer (not used in our code)
– Pointer to ix_buffer holding packet
![Page 667: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/667.jpg)
Default Action
• Function named ix_action_default
• Required part of every ACE
• Called when frame arrives from source other than microblock component
• Our version merely discards the packet
• Code in file action.c
![Page 668: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/668.jpg)
File action.c (part 1)
/* action.c -- Core component of wwbump that handles exceptions */

#include <wwbump.h>
#include <stdio.h>
#include <stdlib.h>
#include <wwbcc.h>

ix_error exception(void *ctx, ix_ring r, ix_buffer b)
{
    struct wwbump *wwb = (struct wwbump *) ctx; /* ctx is the ACE */
    ix_error e;
    unsigned char c;
    (void) r; /* Note: not used in our example code */

    /* Get the exception code: Note: Intel code requires this step */
    e = RmGetExceptionCode(wwb->tag, &c);
    if ( e ) {
        fprintf(stderr, "%s: Error getting exception code", wwb->name);
        ix_error_dump(stderr, e);
        ix_buffer_del(b);
        return e;
    }

• Call to RmGetExceptionCode is required
![Page 669: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/669.jpg)
File action.c (part 2)
    Webcnt++; /* Count the packet as a web packet */

    /* Send the packet back to wwbump microblock */
    e = RmSendPacket(wwb->tag, b);
    if ( e ) { /* If error occurred, report the error */
        ix_error_dump(stderr, e);
        ix_buffer_del(b);
        return e;
    }

    return 0;
}

• First line of this page
– Performs entire "work" of exception handler
– Increments global counter ( Webcnt )
• Once packet has been counted, RmSendPacket enqueues packet back to microblock group
![Page 670: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/670.jpg)
File action.c (part 3)
/* A core component must define an ix_action_default function that is */
/* invoked if a frame arrives from the core component of another ACE. */
/* Because wwbump does not expect such frames, the version of the */
/* default function used with wwbump simply deletes any packet that */
/* it receives via this interface. */

int ix_action_default(ix_ace * a, ix_buffer b)
{
    (void) a; /* This line prevents a compiler warning */
    ix_buffer_del(b); /* Delete the frame */
    return RULE_DONE; /* This step required */
}

• Code for default action is trivial: discard any packet that arrives
![Page 671: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/671.jpg)
Initialization And Finalization
• Additional code needed for each ACE
• Function ix_init performs initialization
– Allocates memory for the ACE control block
– Invokes Intel macros
* ix_ace_init for internal initialization
* RmInit to form connection to Resource Manager
* RmRegister to register with Resource Manager
![Page 672: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/672.jpg)
Initialization And Finalization (continued)
• Function ix_fini performs finalization
– Invokes Intel macros
* cc_fini to terminate crosscalls
* RmTerm to terminate Resource Manager
* ix_ace_fini to deallocate ACE control block
![Page 673: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/673.jpg)
File init.c (part 1)
/* init.c -- Initialization and completion routines for the wwbump ACE */

#include <stdlib.h>
#include <string.h>
#include <wwbump.h>
#include <wwbcc.h>

/* Initialization for the wwbump ACE */
ix_error ix_init(int argc, char **argv, ix_ace ** ap)
{
    struct wwbump *wwb;
    ix_error e;
    (void)argc;

    /* Set so ix_fini won't free a random value if ix_init fails */
    *ap = 0;

    /* Allocate memory for the WWBump structure (includes ix_ace) */
    wwb = malloc(sizeof(struct wwbump));
    if ( wwb == NULL )
        return ix_error_new(IX_ERROR_LEVEL_LOCAL, IX_ERROR_OOM, 0,
                            "couldn't allocate memory for ACE");
![Page 674: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/674.jpg)
File init.c (part 2)
    /* Microengine mask is always passed as the third argument */
    wwb->ue = atoi(argv[2]);

    /* Set blockname used to associate the ACE with its microblock */
    strcpy(wwb->bname, "WWBUMP");

    /* The name of the ACE is always the 2nd argument */
    /* The first argument is the name of the executable */
    wwb->name[sizeof(wwb->name) - 1] = '\0';
    strncpy(wwb->name, argv[1], sizeof(wwb->name) - 1);

    /* Initializes the ix_ace handle (including dispatch loop, control */
    /* access point, etc) */
    e = ix_ace_init(&wwb->ace, wwb->name);
    if (e) {
        free(wwb);
        return ix_error_new(0,0,e,"Error in ix_ace_init()\n");
    }
![Page 675: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/675.jpg)
File init.c (part 3)
    /* Initialize a connection to the resource manager */
    e = RmInit();
    if (e) {
        ix_ace_fini(&wwb->ace);
        free(wwb);
        return ix_error_new(0,0,e,"Error in RmInit()\n");
    }

    /* Register with the resource manager (including exception handler) */
    e = RmRegister(&wwb->tag, wwb->bname, &wwb->ace, exception, wwb,
                   wwb->ue);
    if (e) {
        RmTerm();
        ix_ace_fini(&wwb->ace);
        free(wwb);
        return ix_error_new(0,0,e,"Error in RmRegister()\n");
    }
![Page 676: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/676.jpg)
File init.c (part 4)
    /* Initialize crosscalls */
    e = cc_init(wwb);
    if ( e ) {
        RmUnRegister(&wwb->tag);
        RmTerm();
        ix_ace_fini(&wwb->ace);
        free(wwb);
        return e;
    }

    *ap = &wwb->ace;
    return 0;
}

ix_error ix_fini(int argc, char **argv, ix_ace * ap)
{
    struct wwbump *wwb = (struct wwbump *) ap;
    ix_error e;
    (void)argc;
    (void)argv;

    /* ap == 0 if ix_init() fails */
    if ( ! ap )
        return 0;
![Page 677: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/677.jpg)
File init.c (part 5)
    /* Finalize crosscalls */
    e = cc_fini(wwb);
    if ( e )
        return e;

    /* Unregister the exception handler and microblocks */
    e = RmUnRegister(wwb->tag);
    if ( e )
        return ix_error_new(0,0,e,"Error in RmUnRegister()\n");

    /* Terminate connection with resource manager */
    e = RmTerm();
    if ( e )
        return ix_error_new(0,0,e,"Error in RmTerm()\n");

    /* Finalize the ix_ace handle */
    e = ix_ace_fini(&wwb->ace);
    if ( e )
        return ix_error_new(0,0,e,"Error in ix_ace_fini()\n");

    /* Free the malloc()ed memory */
    free(wwb);
    return 0;
}
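The structure of init.c (initialize in stages, and on failure undo only the stages that succeeded) can be isolated in a small C sketch that does not depend on the Intel SDK. Everything here is hypothetical: struct ctl stands in for the ACE control block, stage_init and stage_fini stand in for calls such as ix_ace_init or RmInit, and fail_at exists only so that a failure can be forced. The sketch also shows the bounded name copy used in part 2.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct ctl {                /* hypothetical stand-in for the control block */
    char name[16];
    int stages_up;          /* number of stages currently initialized */
};

/* Hypothetical stage: stage k fails when k equals fail_at. */
static int stage_init(struct ctl *c, int k, int fail_at)
{
    if (k == fail_at)
        return -1;
    c->stages_up++;
    return 0;
}

/* Undo the most recently completed stage. */
static void stage_fini(struct ctl *c)
{
    c->stages_up--;
}

/* Bounded copy: store the terminator first, then copy at most size-1 bytes. */
static void copy_name(char *dst, size_t size, const char *src)
{
    dst[size - 1] = '\0';
    strncpy(dst, src, size - 1);
}

/* Returns 0 and stores the block in *out, or returns -1 with *out == NULL. */
int ctl_init(struct ctl **out, const char *name, int fail_at)
{
    struct ctl *c;
    int k;

    assert(out != NULL && name != NULL);
    *out = NULL;                    /* so finalization sees nothing to free */
    c = malloc(sizeof *c);
    if (c == NULL)
        return -1;
    c->stages_up = 0;
    copy_name(c->name, sizeof c->name, name);

    for (k = 0; k < 3; k++) {
        if (stage_init(c, k, fail_at) != 0) {
            while (c->stages_up > 0)    /* undo completed stages */
                stage_fini(c);
            free(c);
            return -1;
        }
    }
    *out = c;
    return 0;
}

/* Finalization tolerates a NULL block, as ix_fini does when ix_init failed. */
void ctl_fini(struct ctl *c)
{
    if (c == NULL)
        return;
    while (c->stages_up > 0)
        stage_fini(c);
    free(c);
}
```

Clearing the output pointer before any stage runs is what lets finalization be called safely after a failed initialization.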
![Page 678: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/678.jpg)
The Important Point
• Wwbump performs a trivial task
• The code invokes Intel’s ingress and egress ACEs
• The code is written using Intel’s dispatch loop macros
• The code is large.
![Page 679: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/679.jpg)
The Important Point
• Wwbump performs a trivial task
• The code invokes Intel’s ingress and egress ACEs
• The code is written using Intel’s dispatch loop macros
• The code is huge!
![Page 680: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/680.jpg)
An Example Crosscall
• Trivial example
– Provide access to current Web packet count
– Wwbump ACE acts as crosscall server
– Only exports one function
– Can be called from non-ACE program
• Exported function named getcnt
![Page 681: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/681.jpg)
Basic Steps When Building A Crosscall Facility
• Define a set of functions that the crosscall server exports
• Create an Interface Definition Language (IDL) declaration for each function, and use the IDL compiler to generate the corresponding stub code
• Write code for each exported function with arguments that match the generated stub arguments
• Write an initialization function
• Write a finalization function
![Page 682: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/682.jpg)
Example IDL Specification
interface wwbump
{
    twoway long getcnt();
};

• IDL specification declares
– Name and type of exported function
– Type of each argument
![Page 683: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/683.jpg)
Files Generated By The IDL Compiler
File / Contents                 Description

IFACE_stub_c.h                  Crosscall data types and types for initialization and finalization functions
    IFACE_intName               Interface name as a string
    IFACE_FCN_opName            Function name as a string
    stub_IFACE_init()           Client crosscall initialization
    stub_IFACE_fini()           Client crosscall finalization
    IFACE_FCN_fptr              Function pointer for FCN
    stub_IFACE_FCN()            Client-side crosscall function
    CC_VMT_IFACE                Crosscall Virtual Method Table structure
    deferred_cb_IFACE_IFCN()    Client callback prototype
    IFACE_FCN_cb_fptr           Client callback function pointer
    CB_VMT_IFACE                Client callback VMT
    getCBVMT_IFACE()            Find VMT for callback

IFACE_sk_c.h                    Declarations of server-side interfaces
    sk_IFACE_init()             Server initialization
    sk_IFACE_fini()             Server finalization
    getCCVMT_IFACE()            Find the VMT for the interface
    invoke_IFACE_IFCN()         Invoke a crosscall
    IFACE_stub_c.h              Included file
CS490N -- Chapt. 26 47 2003
![Page 684: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/684.jpg)
More Files Generated By The IDL Compiler
File / Contents                 Description

IFACE_cc_c.h                    Prototypes for individual crosscall functions
    IFACE_FCN()                 Prototype for function FCN
    IFACE_sk_c.h                Included file

IFACE_cb_c.h                    Prototypes and Virtual Method Tables for client-side callback functions
    IFACE_FCN_cb()              Prototype for client callback
    IFACE_stub_c.h              Included file

IFACE_stub_c.c                  Code for client-side initialization and finalization crosscall functions

IFACE_sk_c.c                    Code for server-side initialization and finalization and VMT accessor functions
CS490N -- Chapt. 26 48 2003
![Page 685: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/685.jpg)
Even More IDL-Generated Files
File                            Description

IFACE_cc_c.c                    Default code for crosscall functions that merely return an error.
IFACE_cb_c.c                    Default implementations of client callback functions that merely return an error.
CS490N -- Chapt. 26 49 2003
![Page 686: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/686.jpg)
Code For A (Trivial) Crosscall Server
• Programmer must supply
– Code for exported function(s)
– Initialization function
– Finalization function
– Global variable declarations
– Code to change pointer(s) in global virtual method table
• Example code in file wwbcc.c
CS490N -- Chapt. 26 50 2003
![Page 687: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/687.jpg)
File wwbcc.c (part 1)
/* wwbcc.c -- wwbump crosscall functions */

#include <stdlib.h>
#include <string.h>
#include <wwbump.h>
#include <wwbcc.h>
#include "wwbump_sk_c.h"
#include "wwbump_cc_c.h"

long Webcnt; /* Stores the count of web packets */

/* Initialization function for the crosscall */

ix_error cc_init(struct wwbump *wwb)
{
    CC_VMT_wwbump *vmt = 0;
    ix_cap *capp;
    ix_error e;

    /* Initialize an ix_base_t structure to 0 */
    memset(&wwb->ccbase, 0, sizeof(wwb->ccbase));

    /* Get the OMS communications access point (CAP) of the ACE */
    ix_ace_to_cap(&wwb->ace, &capp);
CS490N -- Chapt. 26 51 2003
![Page 688: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/688.jpg)
File wwbcc.c (part 2)
    /* Invoke the crosscall initialization function and check for error */
    e = sk_wwbump_init(&wwb->ccbase, capp);
    if (e)
        return ix_error_new(0,0,e,"Error in sk_wwbump_init()\n");

    /* Retarget incoming crosscalls to our getcnt function */

    /* Get a pointer to the CrossCall Virtual Method Table */
    e = getCCVMT_wwbump(&wwb->ccbase, &vmt);
    if (e)
    {
        sk_wwbump_fini(&wwb->ccbase);
        return ix_error_new(0,0,e,"Error in getCCVMT_wwbump()\n");
    }

    /* Retarget function pointer in the table to getcnt */
    vmt->_pCC_wwbump_getcnt = getcnt;

    /* Set initial count of web packets to zero */
    Webcnt = 0;

    return 0;
}
CS490N -- Chapt. 26 52 2003
![Page 689: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/689.jpg)
File wwbcc.c (part 3)
/* Crosscall termination function */
ix_error cc_fini(struct wwbump *wwb)
{
    ix_error e;

    /* Finalize crosscall and check for error */
    e = sk_wwbump_fini(&wwb->ccbase);
    if (e)
        return ix_error_new(0,0,e,"Error in sk_wwbump_fini()\n");

    return 0; /* If no error, indicate successful return */
}

/* Function that is invoked each time a crosscall occurs */
ix_error getcnt(ix_base_t* bp, long* rv)
{
    (void)bp; /* Reference unused arg to prevent compiler warnings */

    /* Actual work: copy the web count into the return value */
    *rv = Webcnt;

    /* Return 0 for success */
    return 0;
}
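The retargeting step in cc_init can be illustrated without the Intel SDK. The sketch below is a stand-alone approximation, not Intel's API: the names demo_vmt, default_getcnt, my_getcnt, demo_retarget, and demo_invoke are all hypothetical. It shows only the idea that a table entry starts out pointing at a default function that returns an error, and that the programmer replaces the pointer with a real implementation.

```c
/* Hypothetical stand-in for a generated virtual method table:
 * one function pointer per exported crosscall function. */
typedef int (*getcnt_fn)(long *rv);

struct demo_vmt {
    getcnt_fn getcnt;
};

long demo_webcnt = 0;   /* stand-in for the Webcnt global */

/* Default entry, analogous to generated code that merely returns an error. */
int default_getcnt(long *rv)
{
    (void)rv;           /* leave the caller's value untouched */
    return -1;
}

/* Programmer-supplied implementation: copy the count into the return value. */
int my_getcnt(long *rv)
{
    *rv = demo_webcnt;
    return 0;
}

/* The table initially points at the default function. */
struct demo_vmt demo_table = { default_getcnt };

/* Retarget the table entry, as cc_init does for getcnt. */
void demo_retarget(void)
{
    demo_table.getcnt = my_getcnt;
}

/* Dispatch through the table, as an incoming crosscall would. */
int demo_invoke(long *rv)
{
    return demo_table.getcnt(rv);
}
```

Before demo_retarget() runs, demo_invoke() reports an error and leaves the caller's value alone; afterward it returns the current count. Dispatching through a pointer lets generated code stay unchanged while the programmer supplies only the function body.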
CS490N -- Chapt. 26 53 2003
![Page 690: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/690.jpg)
Tying It All Together
• Need to specify
– ACEs to be loaded
– Which ACEs are ingress / egress
– Number and speed of network ports
• Configuration file used
– Named ixsys.config
– Programmer usually modifies sample file
CS490N -- Chapt. 26 54 2003
![Page 691: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/691.jpg)
Configuration Examples
file 0 /mnt/SlowIngressWWBump.uof
• Defines file /mnt/SlowIngressWWBump.uof to be
– Ingress microcode (type 0)
– Associated with slow ports (10/100 ports)
CS490N -- Chapt. 26 55 2003
![Page 692: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/692.jpg)
Configuration Examples (continued)
microace wwbump /mnt/wwbump none 0 0
• Specifies
– wwbump is a microace
– Runs on the ingress side (first zero)
– Has unknown type (second zero)
CS490N -- Chapt. 26 56 2003
![Page 693: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/693.jpg)
Configuration Examples (continued)
bind static ifaceInput/default wwbump
bind static wwbump/default ifaceOutput
• Specifies
– Default binding for ingress ACE is wwbump
– Default binding for output from wwbump is ACE named ifaceOutput
CS490N -- Chapt. 26 57 2003
![Page 694: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/694.jpg)
Configuration Examples (continued)
sh command
• Specifies command to be run on the StrongARM
• Typical use: preload ARP cache
sh arp -s 10.1.0.2 01:02:03:04:05:06
CS490N -- Chapt. 26 58 2003
![Page 695: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/695.jpg)
Summary
A huge amount of low-level code is required, even for a trivial ACE that counts Web packets or a trivial crosscall that returns the contents of an integer. In addition to the code, a detailed configuration file must be created to specify bindings among ACEs.
CS490N -- Chapt. 26 59 2003
![Page 696: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/696.jpg)
Questions?
![Page 697: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/697.jpg)
X
Switching Fabrics
CS490N -- Chapt. 10 1 2003
![Page 698: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/698.jpg)
Physical Interconnection
• Physical box with backplane
• Individual blades plug into backplane slots
• Each blade contains one or more network connections
CS490N -- Chapt. 10 2 2003
![Page 699: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/699.jpg)
Logical Interconnection
• Known as switching fabric
• Handles transport from one blade to another
• Becomes bottleneck as number of interfaces scales
CS490N -- Chapt. 10 3 2003
![Page 700: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/700.jpg)
Illustration Of Switching Fabric
[Figure: a switching fabric connects input ports 1 through N, where input arrives, to output ports 1 through M, where output leaves; a CPU is also attached to the fabric.]
• Any input port can send to any output port
CS490N -- Chapt. 10 4 2003
![Page 701: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/701.jpg)
Switching Fabric Properties
• Used inside a single network system
• Interconnection among I/O ports (and possibly CPU)
• Can transfer unicast, multicast, and broadcast packets
• Scales to arbitrary data rate on any port
• Scales to arbitrary packet rate on any port
• Scales to arbitrary number of ports
• Has low overhead
• Has low cost
CS490N -- Chapt. 10 5 2003
![Page 702: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/702.jpg)
Types Of Switching Fabrics
• Space-division (separate paths)
• Time-division (shared medium)
CS490N -- Chapt. 10 6 2003
![Page 704: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/704.jpg)
Space-Division Fabric (separate paths)
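[Figure: space-division switching fabric with separate paths from input ports 1 through N, where input arrives, to output ports 1 through M, where output leaves; interface hardware attaches each port to the fabric.]
• Can use multiple paths simultaneously
• Still have port contention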
CS490N -- Chapt. 10 7 2003
![Page 708: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/708.jpg)
Desires
• High speed and low cost!
CS490N -- Chapt. 10 8 2003
![Page 709: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/709.jpg)
Possible Compromise
• Separation of physical paths
• Less parallel hardware
• Crossbar design
CS490N -- Chapt. 10 9 2003
![Page 710: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/710.jpg)
Space-Division (Crossbar Fabric)
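[Figure: crossbar switching fabric. Lines from input ports 1 through N cross lines to output ports 1 through M; a controller sets each crossing to an active or inactive connection, and interface hardware attaches each port to the fabric.]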
CS490N -- Chapt. 10 10 2003
![Page 711: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/711.jpg)
Crossbar
• Allows simultaneous transfer on disjoint pairs of ports
• Can still have port contention
CS490N -- Chapt. 10 11 2003
![Page 713: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/713.jpg)
Solving Contention
• Queues (FIFOs)
– Attached to input
– Attached to output
– At intermediate points
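To make port contention concrete, the following sketch models one scheduling slot of a crossbar with input queues. It is a simplified illustration under stated assumptions: the function name crossbar_schedule, the MAX_PORTS limit, and the fixed-priority tie-break (lowest-numbered input wins) are choices made for this example, not part of any particular fabric design.

```c
#include <stdbool.h>

#define MAX_PORTS 64

/* One scheduling slot for a crossbar with input queues.
 * req[i]  : output port wanted by the packet at the head of input i's
 *           queue, or -1 if that queue is empty.
 * grant[i]: set to true if input i may transfer during this slot.
 * Returns the number of simultaneous transfers granted.
 * Ties go to the lowest-numbered input (an arbitrary policy
 * chosen for this sketch). */
int crossbar_schedule(const int *req, int n_in, int n_out, bool *grant)
{
    bool output_busy[MAX_PORTS] = { false };
    int transfers = 0;

    for (int i = 0; i < n_in; i++) {
        int out = req[i];

        grant[i] = false;
        if (out < 0 || out >= n_out || out >= MAX_PORTS)
            continue;               /* empty queue or invalid request */
        if (output_busy[out])
            continue;               /* contention: packet stays queued */
        output_busy[out] = true;    /* activate connection (i, out) */
        grant[i] = true;
        transfers++;
    }
    return transfers;
}
```

With requests {2, 0, 2, -1} on a fabric with three outputs, inputs 0 and 1 transfer at the same time because they use disjoint ports, while input 2 loses the contest for output 2 and waits in its queue for a later slot.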
CS490N -- Chapt. 10 12 2003
![Page 714: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/714.jpg)
Crossbar Fabric With Queuing
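[Figure: crossbar switching fabric with a controller, input ports 1 through N, and output ports 1 through M; input queues sit between the input ports and the fabric, and output queues sit between the fabric and the output ports.]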
CS490N -- Chapt. 10 13 2003
![Page 715: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/715.jpg)
Time-Division Fabric (shared bus)
[Figure: input ports 1 through N and output ports 1 through M all attach to a single shared bus.]
• Chief advantage: low cost
• Chief disadvantage: low speed
CS490N -- Chapt. 10 14 2003
![Page 716: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/716.jpg)
Time-Division Fabric (shared memory)
[Figure: shared memory switching fabric. Input ports 1 through N and output ports 1 through M reach a shared memory through a memory interface, under the direction of a controller.]
• May be better than shared bus
• Usually more expensive
CS490N -- Chapt. 10 15 2003
![Page 717: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/717.jpg)
Multi-Stage Fabrics
• Compromise between pure time-division and pure space-division
• Attempt to combine advantages of each
– Lower cost from time-division
– Higher performance from space-division
• Technique: limited sharing
CS490N -- Chapt. 10 16 2003
![Page 718: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/718.jpg)
Banyan Fabric
• Example of multi-stage fabric
• Features
– Scalable
– Self-routing
– Packet queues allowed, but not required
CS490N -- Chapt. 10 17 2003
![Page 719: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/719.jpg)
Basic Banyan Building Block
[Figure: a 2-input switch with input #1 and input #2 on one side and two outputs, labeled "0" and "1", on the other.]
• Address added to front of each packet
• One bit of address used to select output
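Self-routing can be sketched in a few lines: at each stage, a 2-input switch examines one bit of the address and forwards the packet to output "0" or "1". The code below is a minimal sketch that assumes the most significant address bit is consumed by the first stage; the function names banyan_route and banyan_output are invented for this example.

```c
/* Trace a packet through a banyan fabric built from 2-input switches.
 * dest   : destination output port, 0 .. 2^stages - 1
 * stages : number of switch stages (the fabric has 2^stages outputs)
 * path[] : receives the output (0 or 1) chosen at each stage
 * Assumes the most significant address bit is used first. */
void banyan_route(unsigned dest, int stages, int *path)
{
    for (int s = 0; s < stages; s++)
        path[s] = (int)((dest >> (stages - 1 - s)) & 1u);
}

/* Reassemble the output port number from the per-stage choices,
 * confirming that the bits alone determine where the packet exits. */
unsigned banyan_output(const int *path, int stages)
{
    unsigned port = 0;

    for (int s = 0; s < stages; s++)
        port = (port << 1) | (unsigned)path[s];
    return port;
}
```

For a three-stage fabric and destination 5 (binary 101), the packet takes output 1, then 0, then 1. No central controller is consulted, which is the sense in which the fabric is self-routing.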
CS490N -- Chapt. 10 18 2003
![Page 720: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/720.jpg)
4-Input And 8-Input Banyan Switches
[Figure: (a) a 4-input switch built from four 2-input switches, SW1 through SW4, with inputs on one side and outputs labeled "00", "01", "10", and "11" on the other; (b) an 8-input switch built from 2-input switches SW1 through SW4 and two 4-input switches of the kind shown in (a), with outputs labeled "000" through "111".]
CS490N -- Chapt. 10 19 2003
![Page 721: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/721.jpg)
Summary
• Switching fabric provides connections inside single network system
• Two basic approaches
– Time-division has lowest cost
– Space-division has highest performance
• Multistage designs compromise between two
• Banyan fabric is example of multistage
CS490N -- Chapt. 10 20 2003
![Page 722: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/722.jpg)
Questions?
![Page 723: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/723.jpg)
XIII
Network Processor Architectures
CS490N -- Chapt. 13 1 2003
![Page 724: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/724.jpg)
Architectural Explosion
An excess of exuberance and a lack of experience haveproduced a wide variety of approaches and architectures.
CS490N -- Chapt. 13 2 2003
![Page 725: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/725.jpg)
Principal Components
• Processor hierarchy
• Memory hierarchy
• Internal transfer mechanisms
• External interface and communication mechanisms
• Special-purpose hardware
• Polling and notification mechanisms
• Concurrent and parallel execution support
• Programming model and paradigm
• Hardware and software dispatch mechanisms
CS490N -- Chapt. 13 3 2003
![Page 726: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/726.jpg)
Processing Hierarchy
• Consists of hardware units
• Performs various aspects of packet processing
• Includes onboard and external processors
• Individual processor can be
– Programmable
– Configurable
– Fixed
CS490N -- Chapt. 13 4 2003
![Page 727: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/727.jpg)
Typical Processor Hierarchy
Level  Processor Type        Programmable?  On Chip?

8      General purpose CPU   yes            possibly
7      Embedded processor    yes            typically
6      I/O processor         yes            typically
5      Coprocessor           no             typically
4      Fabric interface      no             typically
3      Data transfer unit    no             typically
2      Framer                no             possibly
1      Physical transmitter  no             possibly
CS490N -- Chapt. 13 5 2003
![Page 728: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/728.jpg)
Memory Hierarchy
• Memory measurements
– Random access latency
– Sequential access latency
– Throughput
– Cost
• Can be
– Internal
– External
CS490N -- Chapt. 13 6 2003
![Page 729: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/729.jpg)
Typical Memory Hierarchy
Memory Type      Rel. Speed  Approx. Size  On Chip?

Control store    100         10^3          yes
G.P. Registers†  90          10^2          yes
Onboard Cache    40          10^3          yes
Onboard RAM      7           10^3          yes
Static RAM       2           10^7          no
Dynamic RAM      1           10^8          no
CS490N -- Chapt. 13 7 2003
![Page 730: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/730.jpg)
Internal Transfer Mechanisms
• Internal bus
• Hardware FIFOs
• Transfer registers
• Onboard shared memory
CS490N -- Chapt. 13 8 2003
![Page 731: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/731.jpg)
External Interface And Communication Mechanisms
• Standard and specialized bus interfaces
• Memory interfaces
• Direct I/O interfaces
• Switching fabric interface
CS490N -- Chapt. 13 9 2003
![Page 732: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/732.jpg)
Example Interfaces
• System Packet Interface Level 3 or 4 (SPI-3 or SPI-4)
• SerDes Framer Interface (SFI)
• CSIX fabric interface
Note: The Optical Internetworking Forum (OIF) controls the SPI and SFI standards.
CS490N -- Chapt. 13 10 2003
![Page 733: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/733.jpg)
Polling And Notification Mechanisms
• Handle asynchronous events
– Arrival of packet
– Timer expiration
– Completion of transfer across the fabric
• Two paradigms
– Polling
– Notification
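The difference between the two paradigms can be shown with a small software model. This is a sketch only: struct demo_dev and the demo_* functions are hypothetical, and real hardware would set a status register or raise an interrupt rather than call a C function.

```c
#include <stdbool.h>

/* Hypothetical device model: a flag the hardware would set on an event,
 * plus an optional handler used in notification mode. */
struct demo_dev {
    bool packet_ready;
    void (*on_packet)(void *ctx);
    void *ctx;
};

/* Polling: software tests the device whenever it chooses to.
 * Returns true if an event was pending and has now been consumed. */
bool demo_poll(struct demo_dev *d)
{
    if (!d->packet_ready)
        return false;
    d->packet_ready = false;
    return true;
}

/* Notification: software registers a handler in advance. */
void demo_register(struct demo_dev *d, void (*h)(void *), void *ctx)
{
    d->on_packet = h;
    d->ctx = ctx;
}

/* Simulates packet arrival: notify the handler if one is registered;
 * otherwise leave the flag set for a later poll. */
void demo_packet_arrives(struct demo_dev *d)
{
    if (d->on_packet)
        d->on_packet(d->ctx);
    else
        d->packet_ready = true;
}

/* Example handler: count the notifications received. */
void demo_count(void *ctx)
{
    (*(int *)ctx)++;
}
```

With polling, the cost is paid on every check even when nothing has arrived; with notification, work happens only when an event occurs, but the handler runs at a time the main program does not choose.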
CS490N -- Chapt. 13 11 2003
![Page 734: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/734.jpg)
Concurrent Execution Support
• Improves overall throughput
• Multiple threads of execution
• Processor switches context when a thread blocks
CS490N -- Chapt. 13 12 2003
![Page 735: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/735.jpg)
Support For Concurrent Execution
• Embedded processor
– Standard operating system
– Context switching in software
• I/O processors
– No operating system
– Hardware support for context switching
– Low-overhead or zero-overhead
CS490N -- Chapt. 13 13 2003
![Page 736: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/736.jpg)
Concurrent Support Questions
• Local or global threads (does thread execution span multiple processors)?
• Forced or voluntary context switching (are threads preemptable)?
CS490N -- Chapt. 13 14 2003
![Page 737: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/737.jpg)
Hardware And Software Dispatch Mechanisms
• Refers to overall control of parallel operations
• Dispatcher
– Chooses operation to perform
– Assigns to a processor
CS490N -- Chapt. 13 15 2003
![Page 738: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/738.jpg)
Implicit And Explicit Parallelism
• Explicit parallelism
– Exposes parallelism to programmer
– Requires software to understand parallel hardware
• Implicit parallelism
– Hides parallel copies of functional units
– Software written as if single copy executing
CS490N -- Chapt. 13 16 2003
![Page 739: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/739.jpg)
Architecture Styles, Packet Flow, And Clock Rates
• Embedded processor plus fixed coprocessors
• Embedded processor plus programmable I/O processors
• Parallel (number of processors scales to handle load)
• Pipeline processors
• Dataflow
CS490N -- Chapt. 13 17 2003
![Page 740: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/740.jpg)
Embedded Processor Architecture
[Figure: a single processor executes f(); g(); h() on each packet.]
• Single processor
– Handles all functions
– Passes packet on
• Known as run-to-completion
CS490N -- Chapt. 13 18 2003
![Page 741: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/741.jpg)
Parallel Architecture
[Figure: N processors, each executing f(); g(); h(), joined by a coordination mechanism.]
• Each processor handles 1/N of total load
CS490N -- Chapt. 13 19 2003
![Page 742: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/742.jpg)
Pipeline Architecture
[Figure: three processors in sequence, executing f(), g(), and h() respectively.]
• Each processor handles one function
• Packet moves through "pipeline"
CS490N -- Chapt. 13 20 2003
![Page 743: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/743.jpg)
Clock Rates
• Embedded processor runs at > wire speed
• Parallel processor runs at < wire speed
• Pipeline processor runs at wire speed
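A rough arithmetic model shows where these three relationships come from. The sketch below assumes that every packet needs the same number of equal-cost processing steps (one per function, such as f, g, and h) and that "wire speed" means the packet arrival rate; the function names are invented for this example, and real workloads are far less uniform.

```c
/* Rough model: each packet needs `funcs` equal-cost processing steps,
 * and packets arrive at `pkt_rate` packets per second ("wire speed").
 * Each function returns the step rate one processor must sustain. */

/* Run-to-completion on one embedded processor: every step of every
 * packet runs on the same processor. */
unsigned long single_rate(unsigned long pkt_rate, unsigned funcs)
{
    return pkt_rate * funcs;
}

/* Parallel: each of nproc processors sees 1/N of the packets but still
 * performs every step. Result is rounded up; nproc must be at least 1. */
unsigned long parallel_rate(unsigned long pkt_rate, unsigned funcs,
                            unsigned nproc)
{
    return (pkt_rate * funcs + nproc - 1) / nproc;
}

/* Pipeline: each stage sees every packet but performs only one step. */
unsigned long pipeline_rate(unsigned long pkt_rate)
{
    return pkt_rate;
}
```

At one million packets per second with three functions, the single processor must sustain three million steps per second, each pipeline stage one million, and each of six parallel processors half a million. In this model the parallel design falls below wire speed only when the number of processors exceeds the number of steps per packet.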
CS490N -- Chapt. 13 21 2003
![Page 744: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/744.jpg)
Software Architecture
• Central program that invokes coprocessors like subroutines
• Central program that interacts with code on intelligent, programmable I/O processors
• Communicating threads
• Event-driven program
• RPC-style (program partitioned among processors)
• Pipeline (even if hardware does not use pipeline)
• Combinations of the above
CS490N -- Chapt. 13 22 2003
![Page 745: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/745.jpg)
Example Uses Of Programmable Processors
General purpose CPU
– Highest level functionality
– Administrative interface
– System control
– Overall management functions
– Routing protocols

Embedded processor
– Intermediate functionality
– Higher-layer protocols
– Control of I/O processors
– Exception and error handling
– High-level ingress (e.g., reassembly)
– High-level egress (e.g., traffic shaping)

I/O processor
– Basic packet processing
– Classification
– Forwarding
– Low-level ingress operations
– Low-level egress operations
CS490N -- Chapt. 13 23 2003
![Page 746: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/746.jpg)
Using The Processor Hierarchy
To maximize performance, packet processing tasks should be assigned to the lowest level processor capable of performing the task.
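The rule can be expressed as a search from the bottom of the hierarchy upward. The sketch below is illustrative only: the task names, the three levels, and the capability table are assumptions loosely based on the example uses of programmable processors, and the identifiers demo_task, demo_level, demo_caps, and demo_assign are invented here.

```c
/* Hypothetical packet processing tasks. */
enum demo_task {
    TASK_FORWARD,
    TASK_CLASSIFY,
    TASK_REASSEMBLE,
    TASK_TRAFFIC_SHAPE,
    TASK_ROUTING_PROTOCOL
};

/* Programmable levels, lowest first. */
enum demo_level { LVL_NONE = -1, LVL_IO = 0, LVL_EMBEDDED = 1, LVL_CPU = 2 };

#define BIT(t) (1u << (t))

/* Capability mask per level (an assumption of this sketch):
 * each higher level can do everything the level below it can. */
const unsigned demo_caps[3] = {
    /* I/O processor */
    BIT(TASK_FORWARD) | BIT(TASK_CLASSIFY),
    /* Embedded processor */
    BIT(TASK_FORWARD) | BIT(TASK_CLASSIFY) |
    BIT(TASK_REASSEMBLE) | BIT(TASK_TRAFFIC_SHAPE),
    /* General purpose CPU */
    BIT(TASK_FORWARD) | BIT(TASK_CLASSIFY) |
    BIT(TASK_REASSEMBLE) | BIT(TASK_TRAFFIC_SHAPE) |
    BIT(TASK_ROUTING_PROTOCOL)
};

/* Return the lowest level able to perform the task, or LVL_NONE. */
enum demo_level demo_assign(enum demo_task t)
{
    for (int lvl = LVL_IO; lvl <= LVL_CPU; lvl++)
        if (demo_caps[lvl] & BIT(t))
            return (enum demo_level)lvl;
    return LVL_NONE;
}
```

Classification lands on the I/O processor even though the CPU could also perform it, while a routing protocol reaches the CPU because no lower level is capable of running it.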
CS490N -- Chapt. 13 24 2003
![Page 747: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/747.jpg)
Packet Flow Through The Hierarchy
[Figure: data arrives at and leaves from the lower levels of the processor hierarchy. Moving upward, data to / from the programmable processors reaches the I/O processor, a small amount of data reaches the embedded (RISC) processor, and almost no data reaches the standard (external) CPU.]
CS490N -- Chapt. 13 25 2003
![Page 748: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/748.jpg)
Summary
• Network processor architectures characterized by
– Processor hierarchy
– Memory hierarchy
– Internal buses
– External interfaces
– Special-purpose functional units
– Support for concurrent or parallel execution
– Programming model
– Dispatch mechanisms
CS490N -- Chapt. 13 26 2003
![Page 749: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/749.jpg)
Questions?
![Page 750: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/750.jpg)
XIV
Issues In Scaling A Network Processor
![Page 751: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/751.jpg)
Design Questions
• Can we make network processors
– Faster?
– Easier to use?
– More powerful?
– More general?
– Cheaper?
– All of the above?
• Scale is fundamental
![Page 752: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/752.jpg)
Scaling The Processor Hierarchy
• Make processors faster
• Use more concurrent threads
• Increase processor types
• Increase numbers of processors
![Page 753: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/753.jpg)
The Pyramid Of Processor Scale
[Figure: pyramid with the CPU at the top, the embedded processor below it, the I/O processors below that, and the lower levels of the processor hierarchy at the base]
• Lower levels need the most increase
![Page 754: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/754.jpg)
Scaling The Memory Hierarchy
• Size
• Speed
• Throughput
• Cost
![Page 755: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/755.jpg)
Memory Speed
• Access latency
– Raw read/write access speed
– SRAM 2 - 10 ns
– DRAM 50 - 70 ns
– External memory takes order of magnitude longer than onboard
![Page 756: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/756.jpg)
Memory Speed (continued)
• Memory cycle time
– Measure of successive read/write operations
– Important for networking because packets are large
– Read Cycle time (tRC) is time for successive fetch operations
– Write Cycle time (tWC) is time for successive store operations
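Because a packet spans many memory words, the time to move it depends on cycle time rather than on the latency of a single access. The sketch below works through that arithmetic. The function names and the idea of a fixed word width per access are assumptions made for illustration; any packet sizes, word widths, or cycle times passed in are the caller's own values, not figures from the slides.

```c
#include <assert.h>

/* Number of successive memory operations needed to move a packet
   when each operation transfers word_bytes octets. */
unsigned long accesses_needed(unsigned long packet_bytes, unsigned long word_bytes)
{
    assert(word_bytes > 0);
    return (packet_bytes + word_bytes - 1) / word_bytes;   /* round up */
}

/* Approximate time in nanoseconds to read (pass tRC) or write (pass tWC)
   an entire packet, assuming back-to-back operations. */
double packet_transfer_ns(unsigned long packet_bytes, unsigned long word_bytes,
                          double cycle_ns)
{
    return (double)accesses_needed(packet_bytes, word_bytes) * cycle_ns;
}
```

For instance, with a hypothetical 4-octet access width and a 10 ns cycle time, a 1500-octet packet needs 375 successive operations, or roughly 3750 ns.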
![Page 757: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/757.jpg)
The Pyramid Of Memory Scale
[Figure: pyramid with registers at the top, then onboard memory, then external SRAM, and external DRAM at the base]
• Largest memory is least expensive
![Page 758: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/758.jpg)
Memory Bandwidth
• General measure of throughput
• More parallelism in access path yields more throughput
• Cannot scale arbitrarily
– Pinout limits
– Processor must have addresses as wide as bus
![Page 759: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/759.jpg)
Types Of Memory
Memory Technology | Abbreviation | Purpose
Synchronous DRAM | SDRAM | Synchronized with CPU for lower latency
Quad Data Rate SRAM | QDR-SRAM | Optimized for low latency and multiple access
Zero Bus Turnaround SRAM | ZBT-SRAM | Optimized for random access
Fast Cycle RAM | FCRAM | Low cycle time optimized for block transfer
Double Data Rate DRAM | DDR-DRAM | Optimized for low latency
Reduced Latency DRAM | RLDRAM | Low cycle time and low power requirements
![Page 762: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/762.jpg)
Memory Cache
• General-purpose technique
• May not work well in network systems
– Low temporal locality
– Large cache size (either more entries or larger granularity of access)
![Page 763: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/763.jpg)
Content Addressable Memory (CAM)
• Combination of mechanisms
– Random access storage
– Exact-match pattern search
• Rapid search enabled with parallel hardware
![Page 764: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/764.jpg)
Arrangement Of CAM
[Figure: CAM drawn as an array of slots, with one slot labeled]
• Organized as array of slots
![Page 765: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/765.jpg)
Lookup In Conventional CAM
• Given
– Pattern for which to search
– Known as key
• CAM returns
– First slot that matches key, or
– All slots that match key
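The two lookup results can be modeled in software. The sketch below is only a sequential model: real CAM hardware compares all slots in parallel, whereas this code loops over them. The slot count, the 64-bit slot width, the `valid` flags, and the function names are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define CAM_SLOTS 8   /* hypothetical size, chosen for illustration */

struct cam {
    uint64_t slot[CAM_SLOTS];    /* stored patterns */
    int      valid[CAM_SLOTS];   /* nonzero if the slot holds a pattern */
};

/* Return the index of the first slot whose contents equal key, or -1. */
int cam_first_match(const struct cam *c, uint64_t key)
{
    assert(c != NULL);
    for (int i = 0; i < CAM_SLOTS; i++) {
        if (c->valid[i] && c->slot[i] == key) {
            return i;
        }
    }
    return -1;
}

/* Return a bit vector with bit i set for every slot that equals key. */
unsigned cam_all_matches(const struct cam *c, uint64_t key)
{
    unsigned hits = 0;
    assert(c != NULL);
    for (int i = 0; i < CAM_SLOTS; i++) {
        if (c->valid[i] && c->slot[i] == key) {
            hits |= 1u << i;
        }
    }
    return hits;
}
```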
![Page 766: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/766.jpg)
Ternary CAM (T-CAM)
• Allows masking of entries
• Good for network processor
![Page 767: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/767.jpg)
T-CAM Lookup
• Each slot has bit mask
• Hardware uses mask to decide which bits to test
• Algorithm
for each slot do {
    if ( ( key & mask ) == ( slot & mask ) ) {
        declare key matches slot;
    } else {
        declare key does not match slot;
    }
}
![Page 768: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/768.jpg)
Partial Matching With A T-CAM
key      08 00 45 06 00 50 00 00
slot #1  08 00 45 06 00 50 00 02
mask     ff ff ff ff ff ff 00 00
slot #2  08 00 45 06 00 35 00 03
mask     ff ff ff ff ff ff 00 00
• Key matches slot #1
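The masked comparison can be checked against the example values with a short C sketch. The slot, mask, and key octets are copied from the example; the structure layout and function names are assumptions, and the loop is a sequential stand-in for what T-CAM hardware does in parallel.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define KEY_LEN 8   /* octets per key in the example */

struct tcam_slot {
    uint8_t value[KEY_LEN];
    uint8_t mask[KEY_LEN];
};

/* Nonzero if key agrees with the slot on every bit selected by the mask. */
int tcam_slot_matches(const struct tcam_slot *s, const uint8_t key[KEY_LEN])
{
    for (int i = 0; i < KEY_LEN; i++) {
        if ((key[i] & s->mask[i]) != (s->value[i] & s->mask[i])) {
            return 0;
        }
    }
    return 1;
}

/* Return the index of the first matching slot, or -1 if none matches. */
int tcam_lookup(const struct tcam_slot *slots, size_t n, const uint8_t key[KEY_LEN])
{
    assert(slots != NULL);
    for (size_t i = 0; i < n; i++) {
        if (tcam_slot_matches(&slots[i], key)) {
            return (int)i;
        }
    }
    return -1;
}

/* Slot #1 and slot #2 from the example (array indexes 0 and 1). */
static const struct tcam_slot example_slots[2] = {
    { {0x08, 0x00, 0x45, 0x06, 0x00, 0x50, 0x00, 0x02},
      {0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0x00} },
    { {0x08, 0x00, 0x45, 0x06, 0x00, 0x35, 0x00, 0x03},
      {0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0x00} },
};
static const uint8_t example_key[KEY_LEN] =
    {0x08, 0x00, 0x45, 0x06, 0x00, 0x50, 0x00, 0x00};
```

Calling `tcam_lookup(example_slots, 2, example_key)` returns index 0, which corresponds to slot #1: the last two octets are masked out, and the remaining six agree. Slot #2 fails on the sixth octet (0x35 versus 0x50).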
![Page 769: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/769.jpg)
Using A T-CAM For Classification
• Extract values from fields in headers
• Form values in contiguous string
• Use string as key for T-CAM lookup
• Store classification in slot
![Page 770: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/770.jpg)
Classification Using A T-CAM
[Figure: a CAM array providing storage for each key, paired with a RAM array holding a pointer for each slot]
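The classification steps can be sketched end to end in C. Everything specific here is an assumption made for illustration: the choice of header fields, the six-octet key layout, the `rule` structure, and the use of an integer `class_id` to stand in for the pointer held in the associated RAM.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define KEY_LEN 6

/* Hypothetical set of values extracted from packet headers. */
struct header_fields {
    uint16_t ether_type;    /* frame type */
    uint8_t  ip_ver_hlen;   /* IP version and header length octet */
    uint8_t  ip_proto;      /* IP protocol field */
    uint16_t dst_port;      /* transport destination port */
};

/* One T-CAM slot plus its associated RAM entry. */
struct rule {
    uint8_t value[KEY_LEN];
    uint8_t mask[KEY_LEN];
    int     class_id;       /* stands in for the pointer stored in RAM */
};

/* Form the extracted values into a contiguous string of octets. */
void form_key(const struct header_fields *h, uint8_t key[KEY_LEN])
{
    key[0] = (uint8_t)(h->ether_type >> 8);
    key[1] = (uint8_t)(h->ether_type & 0xff);
    key[2] = h->ip_ver_hlen;
    key[3] = h->ip_proto;
    key[4] = (uint8_t)(h->dst_port >> 8);
    key[5] = (uint8_t)(h->dst_port & 0xff);
}

/* Return the class of the first rule the packet matches, or -1. */
int classify(const struct rule *rules, size_t n, const struct header_fields *h)
{
    uint8_t key[KEY_LEN];
    assert(rules != NULL && h != NULL);
    form_key(h, key);
    for (size_t i = 0; i < n; i++) {
        int match = 1;
        for (int j = 0; j < KEY_LEN; j++) {
            if ((key[j] & rules[i].mask[j]) != (rules[i].value[j] & rules[i].mask[j])) {
                match = 0;
                break;
            }
        }
        if (match) {
            return rules[i].class_id;
        }
    }
    return -1;
}
```

Ordering the rules from most specific to least specific lets a fully masked rule (for example, one that tests only the frame type) act as a fallback for packets that match nothing narrower.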
![Page 771: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/771.jpg)
Software Scalability
• Not always easy
• Many resource constraints
• Difficulty arises from
– Explicit parallelism
– Code optimized by hand
– Pipelines on heterogeneous hardware
![Page 772: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/772.jpg)
Summary
• Scalability key issue
• Primary subsystems affecting scale
– Processor hierarchy
– Memory hierarchy
• Many memory types available
– SRAM
– SDRAM
– CAM
• T-CAM useful for classification
![Page 773: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/773.jpg)
Questions?
![Page 774: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/774.jpg)
XV
Examples Of Commercial Network Processors
![Page 775: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/775.jpg)
Commercial Products
• Emerge in late 1990s
• Become popular in early 2000s
• Exceed thirty vendors by 2003
![Page 776: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/776.jpg)
Examples
• Chosen to
– Illustrate concepts
– Show broad categories
– Expose the variety
• Not necessarily "best"
• Not meant as an endorsement of specific vendors
• Show a snapshot as of 2003
![Page 777: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/777.jpg)
Multi-Chip Pipeline (Agere)
• Brand name PayloadPlus
• Three individual chips
– Fast Pattern Processor (FPP) for classification
– Routing Switch Processor (RSP) for forwarding
– Agere System Interface (ASI) for traffic management and exceptions
![Page 778: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/778.jpg)
Multi-Chip Pipeline (Agere) (continued)
[Figure: the Fast Pattern Processor (FPP), Routing Switch Processor (RSP), and Agere System Interface (ASI) chips joined by a configuration bus; packets arrive at the FPP and are sent to the fabric by the RSP]
![Page 779: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/779.jpg)
Languages Used By Agere
• FPL
– Functional Programming Language
– Produces code for FPP
– Non-procedural
• ASL
– Agere Scripting Language
– Produces code for ASI
– Similar to shell scripts
![Page 780: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/780.jpg)
Architecture Of Agere’s FPP Chip
[Figure: FPP block diagram showing the input framer, output interface, configuration bus interface, functional bus interface, data buffer controller, ALU, checksum/CRC engine, queue engine, pattern engine, block buffers and context memory, program memory, control memory, and data buffer]
![Page 781: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/781.jpg)
Processors On Agere’s FPP
Processor Or Unit | Purpose
Pattern processing engine | Perform pattern matching on each packet
Queue engine | Control packet queueing
Checksum/CRC engine | Compute checksum or CRC for a packet
ALU | Conventional operations
Input interface and framer | Divide incoming packet into 64-octet blocks
Data buffer controller | Control access to external data buffer
Configuration bus interface | Connect to external configuration bus
Functional bus interface | Connect to external functional bus
Output interface | Connect to external RSP chip
• Note: Functional bus interface implements an RPC-like interface
![Page 782: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/782.jpg)
Architecture Of Agere’s RSP Chip
[Figure: RSP block diagram showing the input interface, packet assembler, stream editor, output interface, Transmit Queue Logic, queue manager logic, traffic manager engine, traffic shaper engine, configuration interface, external scheduler interface, external scheduler SSRAM, external queue entry SSRAM, external linked list SSRAM, and packets held in SDRAM]
![Page 783: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/783.jpg)
Processors On Agere’s RSP
Processor Or Unit | Purpose
Stream editor engine | Perform modifications on packet
Traffic manager engine | Police traffic and keep statistics
Traffic shaper engine | Ensure QoS parameters
Input interface | Accept packet from FPP
Packet assembler | Store incoming packet in memory
Queue manager logic | Interface to external traffic scheduler
Configuration bus interface | Connect to external configuration bus
Output interface | External connection for outgoing packets
![Page 784: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/784.jpg)
Augmented RISC (Alchemy)
• Based on MIPS-32 CPU
– Five-stage pipeline
• Augmented for packet processing
– Instructions (e.g. multiply-and-accumulate)
– Memory cache
– I/O interfaces
![Page 785: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/785.jpg)
Alchemy Architecture
[Figure: Alchemy block diagram showing a MIPS-32 embedded processor with instruction cache, data cache, and bus unit; an SRAM controller on the SRAM bus; an SDRAM controller with a connection to SDRAM; and peripherals: fast IrDA, EJTAG, DMA controller, Ethernet MAC, MAC, LCD controller, USB-Host controller, USB-Device controller, interrupt controller, GPIO, I2S, serial line UART (2), AC '97 controller, SSI (2), power management, and RTC (2)]
![Page 786: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/786.jpg)
Parallel Embedded Processors Plus Coprocessors (AMCC)
• One to six nP core processors
• Various engines
– Packet metering
– Packet transform
– Packet policy
![Page 787: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/787.jpg)
AMCC Architecture
[Figure: AMCC block diagram showing the control interface, debug port, "inter mod.", test interface, input, output, packet transform engine, external search interface, external memory interface, host interface, memory access unit, onboard memory, six nP cores, policy engine, and metering engine]
![Page 788: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/788.jpg)
Pipeline Of Homogeneous Processors (Cisco)
• Parallel eXpress Forwarding (PXF)
• Arranged in parallel pipelines
• Packet flows through one pipeline
• Each processor in pipeline dedicated to one task
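The idea of identical processors that each run one dedicated task can be modeled as an array of stage functions applied in order. This is a software sketch only, not Cisco's PXF code: the `packet` fields, the three stage functions, and the output port value are hypothetical, and a real pipeline would have each stage working on a different packet at the same time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet state carried from stage to stage. */
struct packet {
    int classified;   /* set once the packet has been classified */
    int out_port;     /* -1 until a forwarding decision is made */
    int dropped;      /* nonzero if a stage discards the packet */
};

typedef void (*stage_fn)(struct packet *);

/* Each stage performs exactly one task (all three are illustrative). */
static void stage_classify(struct packet *p) { p->classified = 1; }
static void stage_lookup(struct packet *p)   { if (p->classified) p->out_port = 3; }
static void stage_police(struct packet *p)   { if (p->out_port < 0) p->dropped = 1; }

/* One pipeline: the same kind of processor at every position,
   distinguished only by the task it has been given. */
static const stage_fn pipeline[] = { stage_classify, stage_lookup, stage_police };

/* Pass a packet through every stage in order, stopping if it is dropped. */
void run_pipeline(struct packet *p)
{
    assert(p != NULL);
    for (size_t i = 0; i < sizeof pipeline / sizeof pipeline[0]; i++) {
        pipeline[i](p);
        if (p->dropped) {
            break;
        }
    }
}
```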
![Page 789: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/789.jpg)
Cisco Architecture
[Figure: PXF arranged as parallel pipelines between input and output, with stages labeled MAC classify, Accounting & ICMP, FIB & Netflow, MPLS classify, Access Control, CAR, MLPPP, and WRED]
![Page 790: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/790.jpg)
Configurable Instruction Set Processors (Cognigine)
• Up to sixteen parallel processors
• Connected in a pipeline
• Processor called Reconfigurable Communication Unit (RCU)
• Interconnected by Routing Switch Fabric (RSF)
• Instruction set determined by loadable dictionary
![Page 791: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/791.jpg)
Cognigine Architecture
[Figure: Cognigine RCU block diagram showing the routing switch fabric connector, pointer file, dictionary, instruction cache, data memory, address calculation, dictionary decode, pipeline control, packet buffers, registers and scratch memory, four execution units, and four source routes]
![Page 792: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/792.jpg)
Pipeline Of Parallel Heterogeneous Processors (EZchip)
• Four processor types
• Each type optimized for specific task
![Page 793: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/793.jpg)
EZchip Architecture
[Figure: pipeline of four stages, TOPparse, TOPsearch, TOPresolve, and TOPmodify, each containing multiple parallel processors and each with its own memory]
![Page 794: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/794.jpg)
EZchip Processor Types
Processor Type | Optimized For
TOPparse | Header field extraction and classification
TOPsearch | Table lookup
TOPresolve | Queue management and forwarding
TOPmodify | Packet header and content modification
![Page 795: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/795.jpg)
Extensive And Diverse Processors (IBM)
• Multiple processor types
• Extensive use of parallelism
• Separate ingress and egress processing paths
• Multiple onboard data stores
• Model is NP4GS3
![Page 796: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/796.jpg)
IBM NP4GS3 Architecture
[Figure: NP4GS3 block diagram showing the Embedded Processor Complex (EPC), ingress data store, SRAM for ingress data, egress data store, traffic management and scheduling, ingress switch interface, egress switch interface, internal SRAM, ingress physical MAC multiplexor, egress physical MAC multiplexor, PCI bus, and external DRAM and SRAM; packets arrive from physical devices and go to the switching fabric on the ingress side, and come from the switching fabric and go to physical devices on the egress side]
![Page 797: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/797.jpg)
IBM’s Embedded Processor Complex
[Figure: Embedded Processor Complex block diagram showing the programmable protocol processors (16 picoengines), embedded PowerPC, frame dispatch, completion unit, instruction memory, classifier assist, bus arbiter, control memory arbiter with interfaces H0 through H4, S, and D0 through D6 to onboard and external memory, ingress and egress data interfaces to the ingress and egress data stores, ingress and egress queues, "inter. bus control", hardware registers, "debug & inter.", internal bus, and PCI bus; interrupts and exceptions are also indicated]
![Page 798: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/798.jpg)
Packet Engines
• Found in Embedded Processor Complex
• Programmable
• Handle many packet processing tasks
• Operate in parallel (sixteen)
• Known as picoengines
![Page 799: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/799.jpg)
Other Processors On The IBM Chip
Coprocessor | Purpose
Data Store | Provides frame buffer DMA
Checksum | Calculates or verifies header checksums
Enqueue | Passes outgoing frames to switch or target queues
Interface | Provides access to internal registers and memory
String Copy | Transfers internal bulk data at high speed
Counter | Updates counters used in protocol processing
Policy | Manages traffic
Semaphore | Coordinates and synchronizes threads
![Page 800: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/800.jpg)
Flexible RISC Plus Coprocessors (Motorola C-PORT)
• Onboard processors can be
– Dedicated
– Parallel clusters
– Pipeline
![Page 801: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/801.jpg)
C-Port Architecture
[Figure: network processors 1 through N, each attached to its own physical interface (1 through N), all connected to a switching fabric]
![Page 802: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/802.jpg)
C-Port Configured As Parallel Clusters
[Figure: channel processors CP-0 through CP-15 grouped into clusters with connections to physical interfaces, joined by multiple onboard buses to the Exec. Processor, fabric processor, queue management unit, table lookup unit, and buffer management unit; external connections lead to SRAM, the switching fabric, the PCI bus, a serial PROM, and DRAM]
![Page 803: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/803.jpg)
Internal Structure Of A C-Port Channel Processor
[Figure: channel processor showing a RISC processor with extract space and merge space, a Serial Data Processor (in) where packets arrive, a Serial Data Processor (out) where packets leave, and a memory bus to external DRAM]
• Actually a processor complex
![Page 804: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/804.jpg)
Summary
• Many network processor architecture variations
• Examples include
– Multi-chip and single-chip products
– Augmented RISC processor
– Embedded parallel processors plus coprocessors
– Pipeline of homogeneous processors
– Configurable instruction set
– Pipeline of parallel heterogeneous processors
– Extensive and diverse processors
– Flexible RISC plus coprocessors
![Page 805: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/805.jpg)
Questions?
![Page 806: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/806.jpg)
XXVII
Intel’s Second Generation Processors
![Page 807: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/807.jpg)
Terminology
In the network processor industry, one has to be careful when describing successive versions of network processors because the delay between the announcement of a new product and the actual date at which customers can obtain the product gives a second meaning to the phrase "later versions of a network processor".
![Page 808: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/808.jpg)
General Characteristics
• Same basic architecture
– Embedded processor
– Parallel microengines
• Enhancements
– Speed
– Amount of parallelism
– Functionality
![Page 809: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/809.jpg)
Models Of Next Generation Chips
• Two primary models
– IXP2400 (2.4 Gbps)
– IXP2800 (10 Gbps)
• Version of 2800 with onboard encryption processor
– IXP2850
![Page 810: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/810.jpg)
Use Of Two Chips For Higher Throughput
• Problem
– One chip insufficient for full duplex
• Solution
– Increase parallelism
– Use separate chips for ingress and egress processing
• Need communication path between them
![Page 811: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/811.jpg)
Conceptual Data Flow Through Two Chips
[Figure: conceptual data flow among the network interface, input demux, IXP2400 (ingress), IXP2400 (egress), fabric gasket, and switching fabric]
• Many details not shown
– Inter-chip communication mechanism
– Memory interfaces
![Page 812: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/812.jpg)
IXP2400 Features
• XScale embedded processor (ARM compliant) with caches
• Eight microengines (400 or 600 MHz)
• Eight hardware threads per microengine
• Multiple microengine instruction stores of one thousand instructions each
• Two hundred fifty-six general-purpose registers
• Five hundred twelve transfer registers
• Addressing for two gigabyte DDR-DRAM
• Addressing for thirty-two megabyte QDR-SRAM
• Sixteen words of Next Neighbor Registers
![Page 813: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/813.jpg)
Memory Hierarchy
• External DDR-DRAM
• External QDR-SRAM
– Q-array hardware in controller
• Onboard Scratchpad memory
• Onboard local memory
![Page 814: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/814.jpg)
External Connections And Buses
• No IX bus
• High-speed interfaces attach to Media or Switch Fabric (MSF) interface
• MSF configurable to
– Utopia 1, 2, or 3 interface
– CSIX-L1 fabric interface
– SPI-3 (POS-PHY 2/3) interface
![Page 815: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/815.jpg)
Illustration Of External Connections
[Figure: IXP2400 chip and its external connections: PCI bus (optional host connection), coprocessor bus (classification accelerator, ASIC, ...), Slow Port interface, Flash memory, QDR SRAM, DDR DRAM, flow control bus, and Media or Switch Fabric with input and output demux]
![Page 816: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/816.jpg)
Internal Architecture
[Figure: IXP2400 internal architecture showing the XScale RISC processor, microengines (MEs) 1 - 4 and 5 - 8, hash unit, scratch memory, and multiple, independent internal buses, with interfaces to the outside: PCI interface (PCI bus, optional host connection), coprocessor interface (coprocessor bus, classification accelerator, ASIC, ...), slow port (Flash memory), SRAM interface (QDR SRAM), DRAM interface (DDR DRAM), FC bus interface (flow control bus), and Media or Switch Fabric interface with receive and transmit paths to the input and output demux]
![Page 817: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/817.jpg)
Microengine Enhancements
• Multiplier unit
• Pseudo-random number generator
• CRC calculator
• Four thirty-two bit timers and timer signaling
• Sixteen entry CAM
• Timestamping unit
• Support for generalized thread signaling
• Six hundred forty words of local memory
• Queue manipulation mechanism that eliminates the need for mutual exclusion
![Page 818: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/818.jpg)
Microengine Enhancements (continued)
• ATM segmentation and reassembly hardware
• Byte alignment facilities
• Two ME clusters with independent buses
• Called Microengine Version 2
![Page 819: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/819.jpg)
Other Facilities
• Reflector mode pathways
– Internal, unidirectional buses
– Allow microengines to communicate
• Next Neighbor Registers
– Pass data from one ME to another
– Intended for software pipeline
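One way to picture how Next Neighbor Registers support a software pipeline is as a small queue between two adjacent stages: the upstream microengine deposits words and the downstream microengine removes them. The C model below illustrates only that idea. It is not the IXP2400 register interface; the ring discipline, the structure, and the names `nn_put` and `nn_get` are assumptions, and the sixteen-word capacity is taken from the feature list on the earlier slide.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define NN_WORDS 16   /* sixteen words, per the IXP2400 feature list */

/* Software model of a one-way channel between neighboring stages. */
struct nn_ring {
    uint32_t word[NN_WORDS];
    unsigned head;    /* index of the next word to read */
    unsigned tail;    /* index of the next word to write */
    unsigned count;   /* number of words currently held */
};

/* Upstream stage: deposit one word; returns 0 if the ring is full. */
int nn_put(struct nn_ring *r, uint32_t w)
{
    assert(r != NULL);
    if (r->count == NN_WORDS) {
        return 0;
    }
    r->word[r->tail] = w;
    r->tail = (r->tail + 1) % NN_WORDS;
    r->count++;
    return 1;
}

/* Downstream stage: remove one word; returns 0 if the ring is empty. */
int nn_get(struct nn_ring *r, uint32_t *w)
{
    assert(r != NULL && w != NULL);
    if (r->count == 0) {
        return 0;
    }
    *w = r->word[r->head];
    r->head = (r->head + 1) % NN_WORDS;
    r->count--;
    return 1;
}
```

In this model the upstream stage must stall or retry when `nn_put` reports a full ring, which is the back-pressure a software pipeline needs when one stage runs slower than its predecessor.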
![Page 820: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/820.jpg)
IXP2800 Features
• Same general architecture as 2400
• Sixteen microengines (1.4 GHz)
• Two unidirectional sixteen-bit Low Voltage Differential Signaling data interfaces that can be configured as
– SPI-4 Phase 2
– CSIX switching fabric interface
• Four QDR-SRAM interfaces (1.6 gigabytes per second each)
• Three RDRAM interfaces (1.6 gigabytes per second each)
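The clock rates and microengine counts can be turned into a rough per-packet cycle budget. The sketch below is a back-of-the-envelope calculation under stated assumptions: packets arrive back to back at the full line rate, framing overhead is ignored, and all microengines share the load evenly. The function names are hypothetical, and the 64-octet packet size used in the note that follows is an assumed minimum, not a figure from the slides.

```c
#include <assert.h>

/* Time in nanoseconds between back-to-back packets of the given size.
   Bits divided by gigabits per second yields nanoseconds. */
double packet_time_ns(double packet_bytes, double line_rate_gbps)
{
    assert(line_rate_gbps > 0.0);
    return packet_bytes * 8.0 / line_rate_gbps;
}

/* Aggregate microengine cycles available per packet, summed over all
   microengines (GHz times nanoseconds yields cycles). */
double cycles_per_packet(double packet_bytes, double line_rate_gbps,
                         int microengines, double clock_ghz)
{
    return packet_time_ns(packet_bytes, line_rate_gbps) * clock_ghz * microengines;
}
```

With an assumed 64-octet packet, the IXP2800 figures (10 Gbps, sixteen microengines at 1.4 GHz) give 51.2 ns per packet and about 1147 aggregate cycles; the IXP2400 figures (2.4 Gbps, eight microengines at 600 MHz) give about 213 ns and about 1024 aggregate cycles. Memory stalls are not counted, so the usable budget is smaller in practice.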
![Page 821: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/821.jpg)
Summary
• Intel is moving to second generation of network processors
• Three models announced
– IXP2400
– IXP2800
– IXP2850
• Same general architecture, except
– Faster processors
– More parallelism
– More hardware facilities
– Newer, faster external connections
![Page 822: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/822.jpg)
Questions?
![Page 823: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/823.jpg)
Stop Here!
![Page 828: Network System Design](https://reader033.vdocuments.net/reader033/viewer/2022042505/5436ca80219acd5b118b45db/html5/thumbnails/828.jpg)
Questions?