Integrated Collaborative Information Systems
Ahmet E. Topcu [email protected]
Advisor: Prof Dr. Geoffrey Fox
Outline
- Introduction
- Motivation
- Research Issues
- Architecture
- Measurements and Analysis
- Conclusions
- Contributions
- Future Works
Introduction
- Efforts toward collaboration and sharing between users and communities in the Web 2.0 domain
- Web 2.0
  - Represents new web-based services
  - Provides rich and lightweight online tools
  - Provides reusable services and data
  - Updates software and data frequently and rapidly
  - Provides interactive user interfaces
  - Provides an architecture for easy user contribution
Web 2.0 Examples
- Blogs (blogger.com, GoogleBlog)
- Wikis (Wikipedia, WikiWikiWeb)
- Social networking tools (MySpace, LinkedIn)
- Social bookmarking tools (del.icio.us, YouTube)
- Scientific research tools (CiteULike, Connotea, and Bibsonomy)
- Domain-specific academic search tools (CiteSeer, Google Scholar, Windows Live Academic)
Motivation
- There are numerous annotation and search tools. Each has different capabilities and incompletely defined metadata.
- Need to exploit large sets of data sources from various tools
- Integrate the major annotation and search tools so that they can be used with additional functionality for scientific research
- No easy way to keep the information found with web search tools
- Utilize the best capabilities of the tools
Motivation II
- Necessities for integration
  - Need for a common data format
  - No easy way to find all publications
    - Example: A search in Google Scholar for the publications of our research lab (Community Grids Lab, CGL) returns only about 20% of the total CGL publications.
  - The wealth of information contained in numerous fields remains largely outside the scope of the tools
  - What happens if the tool you choose is not adopted or, worse, just disappears?
    - Example: Windows Live Academic (WLA)
Motivation Scenario: Collection of Information using Search Tools
- The search tools have two main roles in the usage scenarios of our system:
  - They will be used to seed the creation of a community (e.g., the papers of a research group, the papers on a chemical compound, etc.). These seeds will then be expanded and refined by our community-building tools and linked with the annotation tools.
  - They will be used to extract the citation count of scientific papers.
Motivation Scenario II: Collection of Information using Search Tools
- Extract information from the search domain
  - Example: using a heuristic method for Google Scholar
- Extract information to build metadata that includes the search key
- This model can be used for various search tools
  - Collect metadata for published scholarly papers.
  - Build communities implied by the co-authors of papers.
  - Search information through the populated metadata.
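One way to read "communities implied by the co-authors of papers" is as connected components of the co-authorship graph. The slides do not say how ICIS actually forms communities, so the following is only a minimal sketch under that assumption; the function name `coauthor_communities` and the paper records are hypothetical.

```python
from collections import defaultdict


def coauthor_communities(papers):
    """Group authors into communities: two authors share a community
    when a chain of co-authored papers links them (assumed definition)."""
    parent = {}

    def find(author):
        parent.setdefault(author, author)
        while parent[author] != author:
            parent[author] = parent[parent[author]]  # path halving
            author = parent[author]
        return author

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for paper in papers:
        authors = paper["authors"]
        for author in authors:
            find(author)  # register single-author papers too
        for other in authors[1:]:
            union(authors[0], other)

    groups = defaultdict(set)
    for author in parent:
        groups[find(author)].add(author)
    # Sort members and communities so the result is deterministic.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical metadata records, as a search tool might seed them.
papers = [
    {"title": "Paper A", "authors": ["Author 1", "Author 2"]},
    {"title": "Paper B", "authors": ["Author 2", "Author 3"]},
    {"title": "Paper C", "authors": ["Author 4"]},
]
communities = coauthor_communities(papers)
```

Here Author 1 and Author 3 never wrote a paper together, but both wrote with Author 2, so all three land in one community while Author 4 stays alone.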
Research Issues
- Integration
  - Building a model to integrate community tools and add value to existing systems
    - natural collection of related documents
    - easily support more metadata
    - support tagging
- Scalability
  - Investigate system behavior as the message rate per second increases
- Flexibility and Extensibility
  - Easy mechanism for adding and removing services
  - Ease of integrating annotation and search tools
Architecture Principles
- Community-centric platform of services
- Integration of dynamic publication and search tools into Cyberinfrastructure-based scholarly research
- Integration of such scientific research by defining metadata, using various URLs, and mapping them
- Services that aggregate information from a variety of sources (i.e., “mash-up” tools) and provide added value to communities of researchers
- Do not build new tagging or search systems. Reuse the tools and add value to existing systems
- Easier to link together all related information through a common Digital Entity (DE)
Digital Entity (DE) Definition
Digital Entity (DE): Collection of metadata fields (Field1, Field2, Field3, ..., FieldN-1, FieldN)
Metadata fields: title, author, url, abstract, address, booktitle, annote, chapter, edition, crossref, howpublished, institution, journal, month, bibtext_key, number, note, organization, pages, school, publisher, series, type, volume, year, version, tags, article_id, issue, date, short_title, research_notes, alternate_journal, issn, elec_resource_num, label, reviewed_item, accession_number, call_number, image, link_to_pdf, access_date, last_modifed_date, caption, language, translated_author, comments, conference_location, conference_name, conference_url, book_url, extended_url1, doi, extended_url2, bookmarking_website, number_cited, citation_id
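Since a DE is described as a collection of metadata fields, it can be pictured as a simple record type. The sketch below is an illustration only, not the ICIS implementation: it models a handful of the fields listed above as typed attributes and keeps the rest in a catch-all `extra` dictionary. The class name `DigitalEntity`, the `extra` attribute, and the `to_record` helper are assumptions.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class DigitalEntity:
    """A small subset of the DE metadata fields (illustrative only)."""
    title: str
    author: str = ""
    url: str = ""
    year: str = ""
    doi: str = ""
    number_cited: int = 0
    tags: list = field(default_factory=list)
    # Any other field from the list (journal, pages, issn, ...) goes here.
    extra: dict = field(default_factory=dict)

    def to_record(self):
        """Flatten to one dict, dropping fields that are empty strings or lists."""
        record = asdict(self)
        record.update(record.pop("extra"))
        return {k: v for k, v in record.items() if v not in ("", [], None)}


# Hypothetical example values.
de = DigitalEntity(title="Example paper", author="A. Author", year="2008",
                   tags=["grid"], extra={"journal": "Example Journal"})
record = de.to_record()
```

A flat record like this is one plausible common format for exchanging entries between tools that each expose only part of the field set.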
Integrated Collaborative Information Systems (ICIS) Architecture
- Tools: External web tools providing services to clients.
- Clients: Users of the ICIS.
- Gateways:
  - Channels between tools and ICIS
  - Channels between clients and ICIS
- Services: Collaborative environments for users to utilize the ICIS system functionalities.
[Figure: ICIS architecture. Labels: Clients, Gateways, Services, Gateways, Tools, connected over HTTP]
Integrated Collaborative Information Systems (ICIS) Architecture Components
- Tools: external web tools that provide services to clients
- Integration Manager: hosts the information service, provides communication between tools and clients, and is responsible for integration operations in the system
- Filter: performs two-way data filtering
- Permission Handler: checks the permissions of existing Digital Entities (DEs) or builds a new permission token for new DEs
- Data Manager: provides a mechanism to extract data from a repository and insert data into a repository
- Storage: maintains user data and permissions in the database
Integrated Collaborative Information Systems (ICIS)
[Figure: detailed ICIS architecture. Labels: Tools (Tool-1 Google Scholar, Tool-2 Windows Live Academic, Tool-3 Del.icio.us, ..., Tool-N CiteSeer); Gateway 1 to Gateway N; HTTP/SOAP; Integration Manager (ToolGateway, Information Service, ClientGateway, Pull Service, Push Service); Filter (Filter Handler, Filter Processor); Permission Handler (Token checker, Token builder); Data Manager (Extracter Service, Inserter Service); Controllers; Storage (Database); Client]
[Figure: Integration Manager. Labels: ToolGateway (Pull Service, RequestHandler, MetadataBuilder, InformationHandler); ClientGateway (PresentationService, InformationService); WSDL ServicePoint; HTTP ServicePoint; Web Tools (WSDL, HTTP); Client; XML]
Integrated Collaborative Information Systems (ICIS) Services
- Local Search Service
- Tagging Service
- Access Control Service
- Web Page Search Service
- Digital Entity Service
- Event Service
- Web Search Tool Service
- Configuration Service
- Authorization Service
- Upload/Download Service
- Consistency Service
- Feed Generator Service
Summary: Architecture
- Build an integration architecture
- We do not reinvent existing tools
- Use existing features of tools
- Supports tagging services
- Provides common metadata
- Allows the use of consistent data
- Provides common resolution of filters
- Supports authorization of users
Use Case: Collection of Metadata from Web Pages
- Collect
  - Digital Entities in web pages using HTTP methods.
- Analyze
  - Use a heuristic methodology to extract the metadata fields of the Digital Entities for publications.
- Build
  - RSS objects using collected Digital Entities.
  - New tags using collected Digital Entities.
- Compare
  - Collected Digital Entities from web pages with the existing Digital Entities in the ICIS repository. If they are:
    - different: Store new Digital Entities in the ICIS repository.
    - same: Option to update tags and other fields for collected DEs.
- Share
  - New Digital Entities with other tools using the ICIS repository.
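The Compare step can be sketched as a small merge routine. The slides do not state how two DEs are judged to be the same, so the identity rule here (DOI if present, otherwise a normalized title) is an assumption, as are the names `de_key` and `merge_collected` and the use of a plain dictionary as the repository. The sketch also always takes the "update" option for matching DEs, which the slides describe as optional.

```python
def de_key(de):
    """Identity heuristic (an assumption): DOI if present, else normalized title."""
    doi = de.get("doi", "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", " ".join(de.get("title", "").lower().split()))


def merge_collected(repository, collected):
    """Store new DEs; for DEs already present, merge tags and fill empty fields."""
    stored, updated = [], []
    for de in collected:
        key = de_key(de)
        existing = repository.get(key)
        if existing is None:
            # "different": store the new DE in the repository.
            repository[key] = dict(de)
            stored.append(key)
            continue
        # "same": update tags and fill in fields the stored DE is missing.
        tags = sorted(set(existing.get("tags", [])) | set(de.get("tags", [])))
        existing["tags"] = tags
        for name, value in de.items():
            if name != "tags" and value and not existing.get(name):
                existing[name] = value
        updated.append(key)
    return stored, updated


# Hypothetical DEs collected in two passes.
repository = {}
merge_collected(repository, [{"title": "Example Paper", "tags": ["grid"]}])
stored, updated = merge_collected(repository, [
    {"title": "example   paper", "tags": ["web2.0"], "year": "2008"},
    {"title": "Another Paper", "doi": "10.0000/example"},
])
```

In the second pass the first DE matches the stored one by normalized title, so its tag and year are merged in, while the second DE is new and is stored under its DOI.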
Security Model
- Security in Web 2.0 can be limited.
- We implemented a simple but more powerful security model around local tools that wrap Web 2.0 systems.
- We used an access-control matrix model to provide security for our information system.
  - Supports multiple groups and multiple users for each Digital Entity (DE).
- Similar to the UNIX file system
  - The Unix RWX bits correspond to the Read, Write, and Execute operations for each file and directory.
  - In our system, a DE corresponds to the file element and a folder corresponds to the directory element.
  - For each DE and folder, there are three types of access rights defined in the system: Read, Write, and Delete.
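An access-control matrix with Read, Write, and Delete rights can be sketched as a table keyed by (subject, object), where a subject is a user or a group and an object is a DE or a folder. This is a minimal illustration rather than the ICIS code; the names `Right` and `AccessMatrix`, the object identifiers, and the rule that a user inherits the rights of any group they belong to are assumptions. It also assumes user and group names do not collide.

```python
from enum import Flag


class Right(Flag):
    NONE = 0
    READ = 1
    WRITE = 2
    DELETE = 4


class AccessMatrix:
    """Rows are subjects (users or groups); columns are objects (DEs or folders)."""

    def __init__(self):
        self._cells = {}   # (subject, object_id) -> Right
        self._groups = {}  # group name -> set of user names

    def add_member(self, group, user):
        self._groups.setdefault(group, set()).add(user)

    def grant(self, subject, object_id, rights):
        current = self._cells.get((subject, object_id), Right.NONE)
        self._cells[(subject, object_id)] = current | rights

    def allowed(self, user, object_id, right):
        """True if the user, or any group the user belongs to, holds the right."""
        subjects = [user] + [g for g, members in self._groups.items()
                             if user in members]
        return any(right in self._cells.get((s, object_id), Right.NONE)
                   for s in subjects)


# Hypothetical users, group, and DE identifier.
acl = AccessMatrix()
acl.add_member("cgl", "alice")
acl.grant("cgl", "de-1", Right.READ)     # group-level right
acl.grant("alice", "de-1", Right.WRITE)  # user-level right
```

Unlike the three fixed Unix permission classes, a matrix of this kind can hold entries for any number of users and groups per DE, which matches the "multiple groups and multiple users" point above.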
Security Model II
We have a security model that supports:
- Levels of authorization
  - Roles are defined as Super Administrator (SA), Group Administrator (GA), and User.
  - The system allows more than one SA.
  - An existing SA can add other SAs to the system.
  - An SA can assign any User to become a GA, and remove a GA from being group administrator.
  - Each group should have at least one GA.
  - A GA adds/removes Users from the group.
  - Users can allow other Users and groups to share their resources.
- User profile
  - Share user profiles between sites.
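The role rules above can be written down as a few guarded operations. The sketch below is only one possible reading: the class `Roles`, the `RoleError` exception, and the choice to make a new GA a member of the group are assumptions that the slides do not confirm.

```python
class RoleError(Exception):
    """Raised when an actor lacks the role required for an operation."""


class Roles:
    def __init__(self, first_sa):
        self.super_admins = {first_sa}
        self.group_admins = {}  # group -> set of GAs
        self.members = {}       # group -> set of Users

    def add_super_admin(self, actor, user):
        self._require_sa(actor)  # only an existing SA can add another SA
        self.super_admins.add(user)

    def assign_group_admin(self, actor, group, user):
        self._require_sa(actor)
        self.group_admins.setdefault(group, set()).add(user)
        self.members.setdefault(group, set()).add(user)  # assumption

    def remove_group_admin(self, actor, group, user):
        self._require_sa(actor)
        admins = self.group_admins.get(group, set())
        if admins == {user}:
            raise RoleError("each group should keep at least one GA")
        admins.discard(user)

    def add_user(self, actor, group, user):
        self._require_ga(actor, group)
        self.members.setdefault(group, set()).add(user)

    def remove_user(self, actor, group, user):
        self._require_ga(actor, group)
        self.members.get(group, set()).discard(user)

    def _require_sa(self, actor):
        if actor not in self.super_admins:
            raise RoleError(f"{actor} is not a Super Administrator")

    def _require_ga(self, actor, group):
        if actor not in self.group_admins.get(group, set()):
            raise RoleError(f"{actor} is not a Group Administrator of {group}")


# Hypothetical users and group.
roles = Roles("root")
roles.add_super_admin("root", "root2")
roles.assign_group_admin("root2", "cgl", "alice")
roles.add_user("alice", "cgl", "bob")
```

An ordinary User such as "bob" cannot add members, and removing "alice" as the only GA of "cgl" is rejected, which enforces the "at least one GA" rule.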
Security Model III
User, Group, DE, and Folder relations
[Figure: relations among User, Group, DE, and Folder, with edges labeled "has" and "access"]
Benchmarks and Environments
- Message rate scalability investigation
  - Search operation
    - Using database access
    - Using memory utilization
- Test environments
  - Apache Axis version 1.2
  - Apache Tomcat Server version 5.0.28
  - Java 2 Runtime Environment, Standard Edition (build 1.5.0_12-b04)
  - The maximum heap size of the Java Virtual Machine (JVM) is 1024 MB
  - 1 Gbit/sec network bandwidth
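The two search variants being compared can be pictured with a toy harness: one path queries a database for every request, the other scans a copy of the repository held in memory. This sketch is not the benchmark used in the talk. It uses Python's `sqlite3` as a stand-in database, runs in a single thread, and leaves out the Axis/Tomcat web service layer entirely; all function names and the generated data are hypothetical.

```python
import sqlite3
import time


def build_database(entities):
    """Load (id, title) pairs into an SQLite database (stand-in for the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE de (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO de (id, title) VALUES (?, ?)", entities)
    conn.commit()
    return conn


def search_database(conn, word):
    """Search by issuing one SQL query per request."""
    rows = conn.execute(
        "SELECT id FROM de WHERE title LIKE ? ORDER BY id", (f"%{word}%",))
    return [row[0] for row in rows]


def build_memory_index(entities):
    """Keep lowercased titles in a plain list held in memory."""
    return [(de_id, title.lower()) for de_id, title in entities]


def search_memory(index, word):
    """Search by scanning the in-memory list."""
    word = word.lower()
    return [de_id for de_id, title in index if word in title]


def requests_per_second(search, requests):
    """Run every request once and report the achieved rate."""
    start = time.perf_counter()
    for word in requests:
        search(word)
    elapsed = time.perf_counter() - start
    return len(requests) / elapsed if elapsed > 0 else float("inf")


# Hypothetical repository contents and request stream.
entities = [(i, f"Paper {i} on topic{i % 10}") for i in range(1000)]
conn = build_database(entities)
index = build_memory_index(entities)
requests = [f"topic{i % 10}" for i in range(200)]
db_rate = requests_per_second(lambda w: search_database(conn, w), requests)
mem_rate = requests_per_second(lambda w: search_memory(index, w), requests)
```

Both paths return the same matches for plain ASCII words; only the measured rates differ, and those depend on the machine and the data, so no figures are shown here.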
Integrated Collaborative Information Systems (ICIS) Framework
- Search the local repository using database access with an increasing message rate
[Figure: search requests sent over WSDL to the ICIS Framework by an increasing number of single-threaded clients (message rate = number of users); the framework searches using database access]
Message rate scalability result (Search using Database)
Integrated Collaborative Information Systems (ICIS) Framework II
- Search the local repository using memory with an increasing message rate
[Figure: search requests sent over WSDL to the ICIS Framework by an increasing number of single-threaded clients (message rate = number of users); the framework searches using memory utilization]
Message rate scalability result (Search using Memory Utilization)
Contribution
- System Research
  - Providing an architecture and model for integration of collaborative systems
  - Integration and interoperability of annotation tools, search tools, and web search tools
  - User collaboration and resource sharing
  - Providing benchmarks to evaluate the scalability of the prototype system
Contribution II
- System Research
  - Increasing performance and scalability using memory utilization
  - Providing flexibility by allowing integration of different tools that share common metadata
  - Service mechanism that is easy to add to and extend
  - Supporting authorization and an event-based mechanism
  - Implementing a more powerful access control mechanism
- System Software
  - An ICIS infrastructure of Internet Documentation and Integration of Metadata (IDIOM) systems
Future Works
- Apply the Integrated Collaborative Information Systems (ICIS) Framework to other application domains such as streaming collaboration systems
- Integrate other collaboration and search tools into the ICIS Framework
  - CiteSeerX
- Use distributed storage instead of a single storage
- Expand our approaches to open-access scientific databases such as PubMed, PubChem, and Science.gov
Thanks! Questions?