Collaborative Peer-to-Peer Information Sharing
Content Query System
Tom Lendacky, IBM Corporation
Overview
  Distributed Peer-to-Peer Information Sharing System
  Communities
    Collection of related data (e.g. defect data, music, sales)
    Optional password authentication to control data provider(s)
  Content Sources
    Collection of specific data (e.g. Linux defect data, rock, regional sales)
  XML Messages
  Client Request Interface and Control Interface
  "Pluggable" Query Engines
    Allows for any type of information to be returned
    Defined APIs and Input/Output formats (XML)
  Browser User Interface
    Apache/PHP based UI
    XML results translated to HTML
      XML allows for any type of UI
Background
  Initially developed to:
    Provide easier way to access Linux defect data
      Currently must go to each web site to query defect data
    Improve service and support
      Quicker response
    Reduce duplicate effort
      Problem could already be known with a patch in process
      Problem could already be fixed with patch available
  But it has many more uses than that...
[Figure: Hosts A through G, grouped into Community "A" and Community "B"]
Community
  Collection of related information
  Password authentication (optional)
    Controls who can provide information
  Content Sources (specific information)
    Multiple Content Sources supported
      Can specify the same Content Source multiple times (e.g. different information sources)
      Can specify multiple Content Sources
Community...
  Uses TCP/IP Sockets for connections between Hosts
  Configurable maximum connections limit
  Configurable AutoConnect capability
    Examines (Announce) messages to find new Hosts to connect to
  Configurable Retry capability
    Repeated attempt to connect to "Startup" connections that are not connected
  "Startup" connections
    Attempt to connect to specific Hosts during CQS startup
  Connection "Listener"
    Listens for requests from Hosts to connect to the Community
  Dedicated connections
    A connection is only used for the Community to which it connected
  Access control
    Based upon IP address
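The IP-address-based access control listed above could be sketched roughly as follows. The allow-list format (single addresses or CIDR networks) and the name `is_allowed` are assumptions made for illustration; the slides do not say how CQS configures or evaluates its access rules.

```python
import ipaddress

def is_allowed(peer_ip, allowed):
    """Return True if peer_ip falls inside any allowed address or network."""
    addr = ipaddress.ip_address(peer_ip)
    for entry in allowed:
        # A bare address becomes a single-host network; strict=False also
        # tolerates entries whose host bits are set, such as "10.0.0.5/8".
        if addr in ipaddress.ip_network(entry, strict=False):
            return True
    return False

# Hypothetical allow-list using documentation-only address ranges.
ALLOWED = ["192.0.2.0/24", "198.51.100.7"]

print(is_allowed("192.0.2.15", ALLOWED))
print(is_allowed("203.0.113.9", ALLOWED))
```

A check like this would run in the connection "Listener" before the handshake begins, so that unknown peers are rejected without any further processing.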
Community...
  Handshake used to join Community
    Community Name: Verify Community to Join
    Host UUID: Verify connection to/from a unique Host
      Only 1 Connection per Host allowed
      Uses Host UUID (as opposed to IP address) to allow connections from multiple Hosts on a single machine
    Authentication: SHA-1 message digest value for newly generated UUID and password
      Password does not flow in the clear
  XML messages over connections
    4 byte header (length of message that follows)
    XML message
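The handshake digest and the length-prefixed framing described above might look like the sketch below. Several details are assumptions, because the slides do not specify them: that the UUID text and the password are simply concatenated before hashing, that the 4-byte header is an unsigned big-endian integer, and the helper names `auth_digest`, `frame`, and `unframe`.

```python
import hashlib
import struct
import uuid

def auth_digest(challenge_uuid, password):
    """SHA-1 over a freshly generated UUID plus the shared password.

    ASSUMPTION: the inputs are combined as UUID text followed by the password.
    """
    data = str(challenge_uuid).encode("ascii") + password.encode("utf-8")
    return hashlib.sha1(data).hexdigest()

def frame(xml_text):
    """Prefix an XML message with a 4-byte length header.

    ASSUMPTION: the length is an unsigned integer in network byte order.
    """
    body = xml_text.encode("utf-8")
    return struct.pack("!I", len(body)) + body

def unframe(buffer):
    """Split one framed message off the front of buffer.

    Returns (xml_text, remaining_bytes), or (None, buffer) if the buffer
    does not yet hold a complete message.
    """
    if len(buffer) < 4:
        return None, buffer
    (length,) = struct.unpack("!I", buffer[:4])
    if len(buffer) < 4 + length:
        return None, buffer
    return buffer[4:4 + length].decode("utf-8"), buffer[4 + length:]

challenge = uuid.uuid4()
# Both sides compute the digest; only the digest crosses the wire.
print(auth_digest(challenge, "secret") == auth_digest(challenge, "secret"))
print(unframe(frame("<MESSAGE />")))
```

Because a new UUID goes into every digest, a captured digest cannot simply be replayed in a later handshake, which is consistent with the note that the password does not flow in the clear.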
[Figure: Hosts A, B, C, and D in Community "A"]
Content Source
  Part of a Community
  Collection of specific information
  Query input is standardized
  Uses a Query Engine to interface with the information
  Host announces Content Sources
    ANNOUNCE message
      Name and URI of input form
      Informs other Hosts of Content Sources in the Community
[Figure labels: Content Sources announced by the Hosts: "A"; "A" and "B"; "B" and "C"; "B"]
Content Source...
  Supports multiple Content Sources
    Both identically or uniquely named
      <CONTENT-SOURCE name="A" module="libA1.so" data="db1"... />
      <CONTENT-SOURCE name="A" module="libA2.so" data="db2"... />
      <CONTENT-SOURCE name="A" module="libA2.so" data="db3"... />
      <CONTENT-SOURCE name="B" module="libB1.so" data="db4"... />
  Identically named Content Sources are processed sequentially
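One way to picture how identically named Content Sources end up being processed one after another is to group the configuration entries by name while preserving their order. In this sketch the `<CONFIG>` wrapper element and the function name are hypothetical, and only the attributes shown on the slide are included; the attributes elided on the slide are left out.

```python
import xml.etree.ElementTree as ET
from collections import OrderedDict

# Hypothetical configuration fragment; <CONFIG> is an assumed wrapper element.
CONFIG = """
<CONFIG>
  <CONTENT-SOURCE name="A" module="libA1.so" data="db1" />
  <CONTENT-SOURCE name="A" module="libA2.so" data="db2" />
  <CONTENT-SOURCE name="A" module="libA2.so" data="db3" />
  <CONTENT-SOURCE name="B" module="libB1.so" data="db4" />
</CONFIG>
"""

def group_content_sources(config_xml):
    """Map each Content Source name to its (module, data) entries in file order."""
    groups = OrderedDict()
    for node in ET.fromstring(config_xml).iter("CONTENT-SOURCE"):
        entry = (node.get("module"), node.get("data"))
        groups.setdefault(node.get("name"), []).append(entry)
    return groups

groups = group_content_sources(CONFIG)
# A query against "A" would visit these entries one after another.
for module, data in groups["A"]:
    print(module, data)
```

Note that the same module can appear more than once under a name, here with a different `data` value each time.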
XML Messages
  Defined format
    <MESSAGE uuid= ttl= hops= reply=... >
      <MESSAGE-HOST... />
      <message-name... ></message-name>
    </MESSAGE>
  Attributes on <MESSAGE> tag identify message and purpose
    UUID: Identifies the message, used to avoid processing duplicate requests
    TTL/HOPS: Indicates how many more times the message can be forwarded and how many times it already has been
    REPLY: Indicates whether this message is a reply (no=forward to all connected Hosts, yes=return to sender)
  Allows new messages to be created without having to upgrade the Host
    Host processes messages that contain a message-name that is recognized
    Host simply forwards/returns messages that are not recognized
  Three defined messages currently
    ANNOUNCE, CONTENT-SOURCES, QUERY-CONTENT
    Reply defined as message-name-REPLY
  Defined XML query language for QUERY-CONTENT
    Both input and output
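The forwarding rules carried by the `<MESSAGE>` attributes can be sketched as a small routing function. This is only an illustration of the rules as stated on the slide: the action names, the attribute values in the sample message, and the simplification that every reply is just returned toward its sender are assumptions, not the CQS implementation.

```python
import xml.etree.ElementTree as ET

KNOWN_MESSAGES = {"ANNOUNCE", "CONTENT-SOURCES", "QUERY-CONTENT"}

def route(message_xml, seen_uuids):
    """Return the list of actions a Host takes for one incoming message."""
    msg = ET.fromstring(message_xml)
    if msg.get("reply") == "yes":
        return ["return-to-sender"]
    if msg.get("uuid") in seen_uuids:
        return ["drop-duplicate"]      # request already handled once
    seen_uuids.add(msg.get("uuid"))
    actions = []
    names = [child.tag for child in msg if child.tag != "MESSAGE-HOST"]
    if any(name in KNOWN_MESSAGES for name in names):
        actions.append("process")      # recognized message-name
    if int(msg.get("ttl", "0")) > 0:
        actions.append("forward")      # unrecognized messages still travel on
    return actions

def forwarded_copy(message_xml):
    """Copy of the message with ttl decremented and hops incremented."""
    msg = ET.fromstring(message_xml)
    msg.set("ttl", str(int(msg.get("ttl")) - 1))
    msg.set("hops", str(int(msg.get("hops")) + 1))
    return ET.tostring(msg, encoding="unicode")

# Hypothetical message; the attribute values are made up for the example.
SAMPLE = ('<MESSAGE uuid="m-1" ttl="2" hops="0" reply="no">'
          '<MESSAGE-HOST /><QUERY-CONTENT /></MESSAGE>')

seen = set()
print(route(SAMPLE, seen))   # first sighting
print(route(SAMPLE, seen))   # same UUID again
```

Keeping the "process" and "forward" decisions separate is what lets a Host pass along message types it does not understand, as the slide describes.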
Request Interface
  Used by a Client User Interface
    Obtain information from a Host and request a Host to perform an action
  Uses TCP/IP Sockets for request initiation
    One request per socket connection
  Access control
    Based upon IP address
  Four defined requests currently
    Get Community names
    Get Content Source names within a Community
    Get Content Source URI for Content Source within a Community
    Start a query for a Content Source within a Community
  Query results returned in XML format
Control Interface
  Provides some administration information and control
  Uses TCP/IP Sockets for control initiation
    One control request per socket connection
  Access control
    Based upon IP address
  Six defined controls currently
    Get Community names
    Get Community information
    Get Community connections (active connections)
    Get ContentSource names within a Community (locally defined only)
    Get ContentSource configuration information within a Community
    Shutdown CQS
Query Engines
  "Pluggable" interface between Host and information
  Specified on the Content Source configuration statement
    <CONTENT-SOURCE name= uri= module= data= map= />
      MODULE: the path/name of the module to load for this Content Source
      DATA: Query Engine defined value - supplied to the Query Engine during initialization
        Possible use as configuration input
      MAP: Query Engine defined value - supplied to the Query Engine during initialization
        Possible use as configuration input
  Must support defined API set
    Five APIs
      Startup / Termination related (invoked only once):
        CQELoad: Invoked during CQS startup just after the Query Engine has been loaded
        CQEUnload: Invoked during CQS termination just before the Query Engine is unloaded
      Query related (invoked as a series of calls each time a query request is received):
        CQECreate: Invoked to create a new instance of an object that will be responsible for performing the query
        CQEQuery: Invoked to perform the query
        CQEDestroy: Invoked to clean up / destroy the instance of the object that performed the query
  Defined XML format for query language
    Input and output
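The calling order of the five Query Engine entry points can be modelled with a toy engine. The real engines are loadable modules with their own signatures, which the slides do not show, so the Python method names, their parameters, and the idea that `data` and `map` arrive at load time are all assumptions used only to illustrate the lifecycle.

```python
class EchoQueryEngine:
    """Toy engine whose methods mirror the five CQE entry points."""

    def __init__(self):
        self.calls = []      # records the order in which the Host calls us
        self.data = None

    def cqe_load(self, data, map_value):
        self.calls.append("CQELoad")
        self.data = data     # ASSUMPTION: DATA/MAP are handed over here

    def cqe_create(self):
        self.calls.append("CQECreate")
        return {"source": self.data}   # per-query instance object

    def cqe_query(self, instance, query):
        self.calls.append("CQEQuery")
        return "%s: %s" % (instance["source"], query)

    def cqe_destroy(self, instance):
        self.calls.append("CQEDestroy")
        instance.clear()

    def cqe_unload(self):
        self.calls.append("CQEUnload")

def run_host(engine, queries):
    """Drive one engine through startup, a series of queries, and termination."""
    engine.cqe_load(data="db1", map_value=None)
    results = []
    try:
        for query in queries:
            instance = engine.cqe_create()
            try:
                results.append(engine.cqe_query(instance, query))
            finally:
                engine.cqe_destroy(instance)   # always paired with create
    finally:
        engine.cqe_unload()                    # once, at termination
    return results

engine = EchoQueryEngine()
print(run_host(engine, ["first query", "second query"]))
print(engine.calls)
```

Creating a fresh instance per query keeps per-query state out of the engine itself, which is presumably why the create/query/destroy calls are separated from load/unload.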
Browser User Interface
  Apache/PHP based
  Defined set of URIs to access the CQS system
    http://hostname/.../CQS
      URI to display the startup webpage
    http://hostname/.../CQS/Communities
      URI to display a webpage of a list of Communities that the Host has joined
    http://hostname/.../CQS/ContentSources/Community
      URI to display a webpage of a list of Content Sources within the specified Community
    http://hostname/.../CQS/ContentSource/Community/ContentSource/Input
      URI to display a webpage of the input form page to submit the query (refer to documentation on how the form must be designed)
    http://hostname/.../CQS/ContentSource/Community/ContentSource/Query
      URI to initiate the Query request. Results are displayed in table format as they are returned by the participating CQS systems
  Translates XML query results into HTML for presentation on browsers
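The translation of XML query results into an HTML table could be sketched as below. The actual UI is Apache/PHP based and the real result schema is not shown in the slides, so the `<RESULTS>`/`<ROW>`/`<FIELD>` layout, the sample data, and the function name are purely hypothetical; Python is used here only to illustrate the idea.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical result document; the element and attribute names are assumed.
RESULTS = """
<RESULTS>
  <ROW><FIELD name="id">101</FIELD><FIELD name="summary">Boot hangs &lt;early&gt;</FIELD></ROW>
  <ROW><FIELD name="id">102</FIELD><FIELD name="summary">Patch available</FIELD></ROW>
</RESULTS>
"""

def results_to_html(results_xml):
    """Render each ROW as a table row, using the first ROW's field names as headers."""
    rows = ET.fromstring(results_xml).findall("ROW")
    if not rows:
        return "<table></table>"
    headers = [field.get("name") for field in rows[0].findall("FIELD")]
    lines = ["<table>"]
    lines.append("<tr>" + "".join(
        "<th>%s</th>" % html.escape(name) for name in headers) + "</tr>")
    for row in rows:
        # Escape cell text so result data cannot inject markup into the page.
        cells = "".join("<td>%s</td>" % html.escape(field.text or "")
                        for field in row.findall("FIELD"))
        lines.append("<tr>" + cells + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)

print(results_to_html(RESULTS))
```

Because the Host only returns XML, a different front end could render the same results in another form without any change on the Host side.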
Project Information
  Open Source
  Hosted on DeveloperWorks
    http://oss.software.ibm.com/developerworks/projects/cqs
  Project HomePage
    http://oss.software.ibm.com/developerworks/opensource/cqs/index.html
      Documentation
      Developer Information
      Build Information