TRANSCRIPT
Internet2 Members Meeting
20 April 2004
Arlington, VA
Mohan Ramamurthy
Unidata Program Center
UCAR Office of Programs
Boulder, CO
CRAFT and Unidata: A Natural Partnership
Unidata
A Program in UCAR*
Unidata Mission Statement:
Provide data, tools, and community leadership for enhanced Earth-system education and research.
At the Unidata Program Center, we
• Facilitate [Real-time] Data Access
• Provide Tools
• Support Faculty and Staff
• Build and Advocate for a Community
*University Corporation for Atmospheric Research (UCAR) is a nonprofit consortium of 68 universities committed to advancing our understanding of atmosphere and other Earth systems.
Internet Data Distribution (IDD)
[Diagram: data sources (radar, model, satellite) feeding a network of LDM nodes connected over the Internet]
More than 150 sites participate in the Unidata Internet Data Distribution (IDD) system.
Internet2 Traffic (Week of 4/5/04)
    Application   Octets (share)   Octets (volume)
1)  Iperf         21.43%           102.3 TB
2)  NNTP           8.88%           42.42 TB
3)  HTTP           7.80%           37.24 TB
4)  FTP            3.29%           15.73 TB
5)  BitTorrent     8.14%           38.87 TB
6)  Shoutcast      2.73%           13.04 TB
7)  LDM            2.02%           9.639 TB
Unidata IDD/LDM accounts for more Internet2 traffic than any other advanced application.
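Each row of the traffic table gives both a share and a volume, so every row implies a total for the week. A quick check, using only the figures listed above, that the rows agree with one another:

```python
# (application, share of octets in percent, volume in TB), copied from the table
rows = [
    ("Iperf", 21.43, 102.3),
    ("NNTP", 8.88, 42.42),
    ("HTTP", 7.80, 37.24),
    ("FTP", 3.29, 15.73),
    ("BitTorrent", 8.14, 38.87),
    ("Shoutcast", 2.73, 13.04),
    ("LDM", 2.02, 9.639),
]


def implied_total_tb(share_percent, volume_tb):
    """Total weekly traffic implied by one row: volume divided by fractional share."""
    return volume_tb / (share_percent / 100.0)


totals = {app: implied_total_tb(share, vol) for app, share, vol in rows}
for app, total in totals.items():
    print(f"{app:<11s} implies a weekly total of about {total:6.1f} TB")
```

All seven rows imply a weekly total a little under 480 TB, so the percentages and volumes are mutually consistent.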
The Unidata LDM
The Local Data Manager (LDM)
• is a collection of cooperating programs that select, capture, manage, and distribute arbitrary data products.
• is designed for event-driven data distribution, and is currently used in the Internet Data Distribution (IDD).
• includes network client and server programs and their shared protocols.
• supports flexible, site-specific configuration, multiple sources of data products, and user-customizable actions on received data products.
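The idea of event-driven, user-customizable actions on received products can be sketched as a small pattern-action table. This is an illustration only, not the LDM's own code or configuration syntax: the class name, the rule patterns, and the product identifiers are all hypothetical.

```python
import re


class ActionTable:
    """Hypothetical pattern-action table: run an action when a product arrives."""

    def __init__(self):
        self._rules = []  # list of (compiled regex, callable) pairs

    def add_rule(self, pattern, action):
        """Register an action for product identifiers matching `pattern`."""
        self._rules.append((re.compile(pattern), action))

    def on_product(self, product_id, data):
        """Event handler: fire every matching action, return how many fired."""
        fired = 0
        for pattern, action in self._rules:
            if pattern.search(product_id):
                action(product_id, data)
                fired += 1
        return fired


# Site-specific configuration (made-up identifiers and actions).
filed = {}
table = ActionTable()
table.add_rule(r"^RADAR/", lambda pid, data: filed.setdefault(pid, data))
table.add_rule(r"\.grib$", lambda pid, data: print(f"decode {pid} ({len(data)} bytes)"))

# A product arrival triggers only the rules that match its identifier.
n = table.on_product("RADAR/KFTG/N0R", b"\x00\x01")
```

Nothing polls for data here: the handler runs once per arriving product, which is what "event-driven" means in this context.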
Data-Feed Topology
[Diagram: a source LDM feeding a tree of relay-node and leaf-node LDMs over primary and secondary feeds]
Real-time data distribution via IDD/LDM6
The Unidata community now extends internationally, across several continents.
Technology Transfer: Operational LDM Use in the NWS
Recently, the Korea Meteorological Administration has started using the LDM for some of its internal data distribution to and from nearly 40 weather service offices.
IDD Topologies
Lightning data
Radar data
Satellite data
Surface/Upper-air data
SuomiNet and LDM
A network of GPS receivers to provide real-time atmospheric precipitable water vapor measurements and other geodetic and meteorological information
SuomiNet collects data from 100+ GPS receivers distributed throughout the world.
The observations are sent to Boulder, CO for processing and analysis and then redistributed to the community using the LDM.
Generic LDM Installation
[Diagram: data flow in a generic LDM installation. Components: ingester, product queue, LDM server, sending LDM, receiving LDM, pqact, decoder, execution]
LDM Specifics
• Uses registered port 388
• Uses TCP and ONC RPC
• Actions driven by configuration file
• Arbitrary data of arbitrary size (0 to 2^32 - 1 bytes)
• One process per downstream (upstream) LDM
• Metadata
  • Size of data in bytes
  • Creation time
  • Creation hostname
  • Feed-type (32-bit integer)
  • Product identifier (255-byte string)
  • MD5 checksum
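The metadata fields listed above can be pictured as a small record attached to each data product. The sketch below is not the LDM's actual data structure: the names `ProductInfo` and `describe` are made up, and computing the MD5 checksum over the product bytes alone is an assumption.

```python
import hashlib
import socket
import time
from dataclasses import dataclass

MAX_PRODUCT_SIZE = 2**32 - 1  # largest product size stated on the slide
MAX_IDENT_BYTES = 255         # product identifier limit stated on the slide


@dataclass(frozen=True)
class ProductInfo:
    size: int         # size of data in bytes
    created: float    # creation time (seconds since the epoch)
    origin: str       # creation hostname
    feedtype: int     # 32-bit integer feed-type
    ident: str        # product identifier
    signature: bytes  # MD5 checksum (16 bytes)


def describe(data, feedtype, ident, origin=""):
    """Build the metadata record for one product, enforcing the stated limits."""
    if len(data) > MAX_PRODUCT_SIZE:
        raise ValueError("product exceeds 2^32 - 1 bytes")
    if not 0 <= feedtype < 2**32:
        raise ValueError("feed-type must fit in 32 bits")
    if len(ident.encode("utf-8")) > MAX_IDENT_BYTES:
        raise ValueError("product identifier exceeds 255 bytes")
    return ProductInfo(
        size=len(data),
        created=time.time(),
        origin=origin or socket.gethostname(),
        feedtype=feedtype,
        ident=ident,
        signature=hashlib.md5(data).digest(),  # assumption: checksum covers the data only
    )


info = describe(b"hello", feedtype=1, ident="TEST/PRODUCT", origin="host.example")
print(info.size, info.origin, info.signature.hex())
```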
LDM Strengths
• Robust operation in the face of network congestion and outages
• Efficient
  • Event driven
  • Capable of transmitting 2 TB/d from a well-connected upstream host
• Highly user-configurable
  • Authentication of feed requests
  • Requested products
  • Processing of received products
• Proven technology: operational since 1994
LDM Limitations
• Current implementation is UNIX-only
• Data-product availability is limited by size of product-queue on upstream LDM
• Upstream sites are fixed for duration of LDM session (i.e., static routing)
• Performance over unreliable connection is limited by inherent receive-timeouts of ONC RPC
• Adaptability is limited by single-threaded, multiple-process implementation
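The product-queue limitation can be illustrated with a fixed-capacity buffer: once the queue is full, older products have to be dropped, so a downstream site that falls too far behind can no longer request them. The sketch below assumes oldest-first eviction and uses made-up names; it is not the LDM product-queue implementation.

```python
from collections import OrderedDict


class BoundedProductQueue:
    """Toy fixed-size queue; oldest products are evicted first (an assumption)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self._products = OrderedDict()  # identifier -> data, oldest first
        self._used = 0

    def insert(self, ident, data):
        if len(data) > self.capacity_bytes:
            raise ValueError("product larger than the whole queue")
        if ident in self._products:
            # Replace an existing entry and release its space first.
            self._used -= len(self._products.pop(ident))
        while self._used + len(data) > self.capacity_bytes:
            # Make room by dropping the oldest product.
            _, old = self._products.popitem(last=False)
            self._used -= len(old)
        self._products[ident] = data
        self._used += len(data)

    def available(self):
        """Identifiers a downstream site could still request, oldest first."""
        return list(self._products)


queue = BoundedProductQueue(capacity_bytes=10)
for name in ("a", "b", "c"):
    queue.insert(name, b"1234")
print(queue.available())  # product "a" has been evicted to make room for "c"
```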
LDM-6 Upgrade
• Use TCP guaranteed delivery instead of RPCs
• User-selectable product “chunking”
• Real-time statistics gathering and display
• Code cleanup for greater efficiency
LDM-6 Performance
Model Data: Highest Volume Datastream
[Plots: U. Albany, U. Illinois, U. Washington, U. Utah]
NLDM Benefits Over LDM
• Flooding algorithm:
  • automated routing through redundancy
• Virtually unlimited number of hierarchically structured newsgroups:
  • finer granularity of product categorization and subscription
• Cross posting:
  • multiple “views” of same data products
• Multiple numbers and types of storage buffers
• Dynamic construction and destruction of connections
• Backlog handling
• Protocol supports both push and pull transmission
NLDM: News Server Technology to Relay Data in Near Real Time
NLDM Results
• INN used “out of the box” with one modification:
  • Message ID generation and handling modified to use product signatures
    • Default approach is to create message ID based on host name
    • Doesn't allow duplicate detection based on product content
• Robust delivery with latencies comparable to current LDM
  • May be improved further with additional tuning
• Local management functionality comparable to current LDM
• Robust automated, dynamic routing
• Automated connection management
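Why the message-ID change matters for flooding can be shown with a toy duplicate filter. The sketch below is not INN code and does not reproduce the actual modification: it assumes the product signature is an MD5 digest of the product bytes, and the ID formats, the `nldm.invalid` domain, and the host names are hypothetical.

```python
import hashlib
import itertools

_sequence = itertools.count(1)


def host_based_message_id(hostname):
    """Default-style ID: unique per injecting host, unrelated to the content."""
    return f"<{next(_sequence)}@{hostname}>"


def signature_based_message_id(data, domain="nldm.invalid"):
    """Content-derived ID: the same product bytes always give the same ID."""
    return f"<{hashlib.md5(data).hexdigest()}@{domain}>"


class DuplicateFilter:
    """Accept a message only the first time its ID is seen."""

    def __init__(self):
        self._seen = set()

    def accept(self, message_id):
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True


product = b"same product, injected at two hosts"

by_host = DuplicateFilter()
kept_by_host = [by_host.accept(host_based_message_id(h))
                for h in ("a.example", "b.example")]

by_signature = DuplicateFilter()
kept_by_signature = [by_signature.accept(signature_based_message_id(product))
                     for _ in range(2)]

print(kept_by_host)       # both copies are kept: the IDs differ
print(kept_by_signature)  # the second copy is rejected: the IDs match
```

With content-derived IDs, a product that arrives over redundant paths is stored and relayed only once, which is what makes flooding practical.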
NLDM Routing Statistics
CONDUIT, Boulder to D.C.
● Isabel hit Washington, D.C.
Two-day window, 30-second bin size
[Plots: average latencies and maximum latencies for three paths: Boulder → U.Oregon.2 → D.C.; Boulder → U.Oregon.1 → D.C.; direct path Boulder → D.C.]
MeteoForum
MeteoForum Network
[Map labels: AMPATH, CLARA; Buenos Aires, Belem, San Jose, Rio de Janeiro, Caracas, Bridgetown]
MeteoForum is a joint project between Unidata and COMET, two programs in UCAR
IDD-Brazil
• Connects to Unidata IDD
• Current Participants
  • Unidata
  • University of Miami
  • UFRJ (Brazil)
  • UFPA (Brazil)
  • CPTEC/INPE (Brazil)
  • USP (Brazil)
• Infrastructure
  • Internet2
  • AMPATH (NSF)
  • FIU
  • Global Crossing
  • RNP
  • ANSP
IDD-Brazil - Stress Testing
• IDD relay of all open feeds to UFRJ
  • Sustained 1.6 GB/hr over 10-day period at end of December
  • Peak rates routinely over 2.3 GB/hr over same period
  • Typical latencies range from 1 to several seconds
  • Impact on relay machine negligible
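To put the stress-test rates in network terms, the hourly volumes can be converted to average line rates. The conversion below assumes 1 GB = 10^9 bytes; the slide does not say whether decimal or binary gigabytes were meant.

```python
def gb_per_hour_to_mbit_per_s(gb_per_hour, bytes_per_gb=10**9):
    """Convert a volume rate in GB/hr to an average line rate in Mbit/s."""
    bits_per_hour = gb_per_hour * bytes_per_gb * 8
    return bits_per_hour / 3600 / 1e6


sustained = gb_per_hour_to_mbit_per_s(1.6)  # sustained rate from the slide
peak = gb_per_hour_to_mbit_per_s(2.3)       # routine peak rate from the slide
total_gb_10_days = 1.6 * 24 * 10            # volume moved at the sustained rate

print(f"sustained: about {sustained:.2f} Mbit/s")
print(f"peak:      about {peak:.2f} Mbit/s")
print(f"10 days:   about {total_gb_10_days:.0f} GB")
```

Under that assumption the sustained rate is roughly 3.6 Mbit/s, the routine peaks roughly 5.1 Mbit/s, and the ten-day total roughly 384 GB.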
Visualization Software
GEMPAK, IDV, McIDAS
Shaping the Future of Data Use in the Geosciences
We are moving from an era of data provision towards one in which data services and related web services are important.
Multidisciplinary integration and synthesis are emphasized.
[Diagram labels: user applications (e.g., McIDAS, IDV, LAS, IDL, MatLab); DLESE (Digital Library for Earth-System Education); hydrology data; geophysical data; satellite images; OpenDAP, ADDE, and FTP protocols; DL interchange protocol; discovery; IDD]
[Diagram labels: People; Documents; Data; Catalog Generation Tools; Analysis and Visualization Tools; Data Services; Discovery and Publication Tools; Discovery and Publication Services; Data Catalog Services]
Thematic Real-time Environmental Distributed Data Servers (THREDDS)
• Combines IDD “push” with several forms of “pull” and DL discovery
• About 25 data providers are partners in THREDDS
• To make it possible to publish, locate, analyze, visualize, and integrate a variety of environmental data
• Connecting People with Documents and Data
THREDDS Middleware