Some Thoughts on E2EPI
Shawn McKee
Pipefitters Meeting, Internet2 Spring Meeting
8 April, 2003
The Problem
How do you solve a problem along a path?
[Diagram: the end-to-end path, from one end to the other: Applications Developer, System Administrator, LAN Administrator, Campus Networking, Gigapop, Backbone, Gigapop, Campus Networking, LAN Administrator, System Administrator, Applications Developer]
The user says: “Hey, this is not working right!” The responses along the path:
• “The computer is working OK”
• “Talk to the other guys”
• “Everything is AOK”
• “No other complaints”
• “The network is lightly loaded”
• “All the lights are green”
• “We don’t see anything wrong”
• “Looks fine”
• “Others are getting in ok”
• “Not our problem”
The issue…
• We all know high bandwidth links are not sufficient to provide high network performance.
• General users of the network, even technically proficient ones, can’t be expected to be network wizards nor have intimate knowledge of their end-to-end path…
• Problems arise continually and can be due to hardware, applications, hosts and misconfiguration.
• What are we to do to make the most impact in the least amount of time?
End to End Performance Issues
We require knowledge of the end-systems as well as the intervening network segments to evaluate the observed performance and diagnose problems
There are a number of pieces to the puzzle…
Major Impacts in E2E Performance
• What are the CPU, disks, interfaces and memory of the end hosts…and how do they perform?
• What is the network interface: type, firmware, parameters and expected performance, both long-term and based upon recent results?
• System state is critical: what is the CPU load and estimated bus loading?
• Final piece: what is the upcoming workload in the network, both locally and globally?
Adapting Existing Tools for E2E
• Many of the tools being developed for data-intensive Grids have to address similar issues in monitoring and planning.
• These tools should be looked at for how well they can meet the requirements of the End-to-End Initiative.
• MonaLisa is one example of an already deployed grid-monitoring package which may be a good match for the needs of E2E…
Start at the ends: Hosts
We must enable “data acquisition” from the hosts involved.
• These hosts represent the logical dividing point between the “network” and the “user”
• Many problems are related to host configs, TCP/IP stacks, OS version, NICs, firmware, application design, etc.
PROBLEMS: Hosts run many different OS’s on many different hardware platforms…how to generically capture needed info while minimizing user involvement?
First pass at Host info gathering…
We need a system which can dynamically download a data gathering application which can run on most systems…
JAVA seems to be the most likely candidate.
• Pervasive
• Can be cryptographically signed
• Permissions can be fine grained
• Runs on MANY OS’s
What to “acquire” from each host?
Stable Info
• Operating system and version (e.g. RedHat V8.0 or WindowsXP SP1)
• Processor details
• NIC info (firmware, brand, type)
• Memory info
• TCP stack parameters
Dynamic Info
• Interrupts/sec
• CPU usage
• NIC bandwidth, errors, queue lengths
• Memory usage
• Bus usage
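As a rough illustration of the stable/dynamic split, here is a minimal Java sketch that collects only what the standard library exposes. The class name HostInfoSketch and the map keys are hypothetical, not part of any E2E design. Items such as NIC firmware, TCP stack parameters, interrupts/sec and bus usage are not available through portable Java calls and would presumably need OS-specific code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: gathers the subset of host info that portable Java exposes.
final class HostInfoSketch {

    /** "Stable" details: values that rarely change between runs. */
    static Map<String, String> stableInfo() {
        Map<String, String> info = new LinkedHashMap<>();
        info.put("os.name", System.getProperty("os.name"));
        info.put("os.version", System.getProperty("os.version"));
        info.put("os.arch", System.getProperty("os.arch"));
        info.put("cpu.count", Integer.toString(Runtime.getRuntime().availableProcessors()));
        info.put("nic.names", String.join(",", nicNames()));
        return info;
    }

    /** "Dynamic" details: a snapshot that should be re-read during each sub-test. */
    static Map<String, String> dynamicInfo() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        Runtime rt = Runtime.getRuntime();
        Map<String, String> info = new LinkedHashMap<>();
        // getSystemLoadAverage() returns a negative value where the OS does not provide it.
        info.put("load.average", Double.toString(os.getSystemLoadAverage()));
        // JVM heap in use, not whole-host memory usage.
        info.put("jvm.memory.used", Long.toString(rt.totalMemory() - rt.freeMemory()));
        return info;
    }

    /** Names of the network interfaces visible to the JVM (empty if they cannot be listed). */
    static List<String> nicNames() {
        List<String> names = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics != null) {
                for (NetworkInterface nic : Collections.list(nics)) {
                    names.add(nic.getName());
                }
            }
        } catch (SocketException e) {
            // Leave the list empty; the caller still gets the other fields.
        }
        return names;
    }
}
```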
Standards are critical..
• Whatever we do for host data acquisition we should ensure the output is in some “standard” format.
• The GGF Network Measurement group is just now grappling with measurement profiles and data schema. We should plan to use this and contribute to its development.
• Host information exists in CIM, DMTF and others…let’s pick something capable of storing what we need and move on.
Host applets
• Having a system accessible through the web and supporting Linux and Windows would give us the broadest initial coverage.
• First-time users connect to an E2E server or peer-to-peer system and download a signed Java applet to their host.
Starting the Applets
• The user allows the applet to start (security signing)
• The applet starts and creates a GUID for this host and records, in standard format, the “stable” host details.
• Each new invocation of the applet will verify the currency of the stable information
• The applet can provide the GUID and host details to a registration server with or without “anonymization” of identifying details
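One possible way to implement the GUID and the currency check is sketched below. It assumes the stable details arrive as a key/value map and that "verifying currency" can be done by comparing a hash of those details. Both are illustrative assumptions, and HostRecordSketch is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

// Hypothetical sketch: a host record holding a GUID and a fingerprint of the stable details.
final class HostRecordSketch {
    final String guid;
    final String fingerprint;

    private HostRecordSketch(String guid, String fingerprint) {
        this.guid = guid;
        this.fingerprint = fingerprint;
    }

    /** First run: create a random GUID and remember a fingerprint of the stable details. */
    static HostRecordSketch register(Map<String, String> stable) {
        return new HostRecordSketch(UUID.randomUUID().toString(), fingerprint(stable));
    }

    /** Later runs: the stored record is current only if the stable details are unchanged. */
    boolean isCurrent(Map<String, String> stableNow) {
        return fingerprint.equals(fingerprint(stableNow));
    }

    /** SHA-256 over the sorted key=value lines, rendered as lowercase hex. */
    static String fingerprint(Map<String, String> stable) {
        StringBuilder text = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(stable).entrySet()) {
            text.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(text.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Sorting the keys before hashing keeps the fingerprint independent of the order in which details were collected.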
User Interface
Once the info is locally (and optionally remotely) stored, the user can be presented with a user interface:
• Register host
• Test path
  • Client
  • Server
• Log Problem
• Search Database
Testing the path
• A user with a problem could initiate path testing (in conjunction with a remote user or a PMP)
• User specifies “client” or “server” mode and partner IP information
• The test could be a defined set of measurements:
  • Ping (reachability/RTT)
  • Traceroute (both forward and backward)
  • One-way loss (each direction)
  • Iperf (bandwidth EACH way, measured simultaneously)
Path testing
• The series of tests is run by a Java application. Missing components are downloaded from servers.
• Bandwidth testing is done both ways, simultaneously, to find duplex problems on the path
• Dynamic host information is recorded at both ends during each sub-test
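To illustrate why simultaneous two-way testing can expose duplex problems, here is a small heuristic sketch. It assumes a one-direction-at-a-time baseline measurement is also available, which the slides do not specify, and the 0.5 threshold is an arbitrary placeholder. The idea is that a healthy full-duplex path should carry roughly the same rate in a direction whether or not the reverse direction is busy, so a sharp drop under simultaneous load is worth flagging.

```java
// Hypothetical heuristic, not a defined E2E test: compares throughput measured alone
// with throughput measured while the opposite direction is also loaded.
final class DuplexCheckSketch {

    /** Placeholder threshold: flag when simultaneous throughput falls below half the baseline. */
    static final double DEFAULT_MIN_RATIO = 0.5;

    /**
     * @param aloneMbps        throughput in one direction with the reverse direction idle
     * @param simultaneousMbps throughput in the same direction while the reverse is loaded
     * @param minRatio         smallest simultaneous/alone ratio still considered healthy
     * @return true if the drop is large enough to suspect a duplex problem
     */
    static boolean suspectDuplexProblem(double aloneMbps, double simultaneousMbps, double minRatio) {
        if (aloneMbps <= 0) {
            return false; // no usable baseline, so nothing can be concluded
        }
        return simultaneousMbps / aloneMbps < minRatio;
    }
}
```

For example, `DuplexCheckSketch.suspectDuplexProblem(940.0, 120.0, DuplexCheckSketch.DEFAULT_MIN_RATIO)` flags the path, while a drop from 940 to 900 Mbps does not.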
Test results
• Test results would be saved locally and a summary given to the user.
• A Java analysis applet could parse the info, looking for common problems
• Results could be “logged” with a central service
• Logged events could be further analyzed by central servers with access to current network details.
• Users could be referred to the most likely problem domain with current contact information provided by the central server
Advantages of such a system
• Creating such a service could bootstrap the effort.
• First step toward improving user experience is determining what is limiting performance
• Database of test results is enormously important:
  • Sets “baseline” for various hosts, locations, applications, etc.
  • Provides problem frequency data so we can focus on fixing the most pervasive or restrictive problems first
  • Allows analysis of components: NIC vs OS, Firmware X vs Y, etc.
• Will require host applets (OS-specific), host-specific flavors of applications, central servers and a distributed database
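The "problem frequency" idea can be sketched in a few lines of Java. The sketch assumes each logged result has already been reduced to a problem label; the labels in the usage note and the class name ProblemFrequencySketch are made up for illustration and are not real data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rank logged problem labels so the most frequent come first.
final class ProblemFrequencySketch {

    /** Returns (label, count) pairs, highest count first; ties are ordered by label. */
    static List<Map.Entry<String, Integer>> rank(List<String> loggedProblems) {
        Map<String, Integer> counts = new HashMap<>();
        for (String problem : loggedProblems) {
            counts.merge(problem, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(counts.entrySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return ranked;
    }
}
```

Given the illustrative labels "duplex mismatch", "small TCP buffers", "duplex mismatch", the first entry returned is "duplex mismatch" with a count of 2.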
Some Goals
• Put the “wizard” knowledge into the applets
• Enable ordinary users to perform state of the art testing
• Provide a reference set of network testing applications by host type for users
• Define a network measurements database for the network users community
• Interoperate with PMP stations in the network
• Instrument applications to automatically provide data to system
HENP Efforts
The HENP Sponsored Interest Group is also focused on end-to-end issues.
http://www.internet2.edu/henp
• We have a list of 9 goals related to networking, many of which are also related to the issues the E2E initiative is trying to address
• HENP can help build the infrastructure and serve as a test case for the E2E Pipes effort
• MonaLisa is a deployed application which provides a measurement framework and could be adapted for the E2E effort
Monitors and Beacons at Michigan
• We have an effort at Michigan to instrument our gigabit backbone and selected sites with “Monitor & Beacon” boxes.
• Each station has dual gigabit adapters and fast RAID disks for real-time traffic capture and analysis
• We are extending the GARA GT2.2 code to provide fully authorized functionality, both for individual “one-off” testing and scheduled testing via our web portal
• Some of this effort may be usable to further the E2E goals
Applets Needed
• Host “stable” data acquisition
• Host “dynamic” data acquisition
• Network “measurement”
• Network testing results “logger”
• Network measurement “analysis”
• Network problem “logger”
Future Developments
• Host network “tuner”
• “Finger pointer” (working with PMPs and databases)
• “Search” (find info on your NIC, OS, bandwidth history, etc.)
• “Visualization” (dynamic plotting of network data)
Conclusion
• We need to get some capability deployed which can make a difference in the user’s experience
• HENP has a very strong interest in solving this problem and is willing to put in resources to help get things done
• Starting from the host is a logical first step which has the potential to make the biggest impact