
Page 1: A Framework for Community based Distributed and Semantically Annotated Course-ware Development, Sharing and Quality Management for Higher Technical education

A Framework for Community based Distributed and Semantically Annotated Course-ware Development, Sharing and Quality Management for Higher Technical Education over Publish/subscribe P2P Overlay

Department of Computer Science and Engineering

Motilal Nehru National Institute of Technology Allahabad

and

Applied Artificial Intelligence Group

Centre for Development of Advanced Computing, Pune


©2010 CSED, MNNIT, Allahabad and AAIG, CDAC Pune 2

Higher Technical Education: Observations

Engineering institutions: approx. 2,500
Annual output: approx. 400,000
Computer Science graduates: approx. 300,000
Growth rate: 20% expected (NASSCOM)
Employable output: only 25% (McKinsey Global)


Higher Technical Education: Observations

M.Tech. output: 20,000
Ph.D. output:
  Engineering: fewer than 1,000
  Basic Sciences: around 5,000


Higher Technical Education: Observations

Number of researchers (2007-08):
  India: about 154,800
  China: 1,423,000
  US: 1,571,000


Higher Technical Education: Observations

Needs: order-of-magnitude growth in quantity and quality
Rapid, large-scale growth of:
  Student enrollment
  Institutes/universities
  Research scholars
Total quality management of:
  Outputs: publications, patents, personnel
  Resources: courseware, training material, labs and evaluation services


Impact of Internet

Highly scalable, anywhere/anytime access
Very large volume of:
  Courseware
  Research papers
  Training materials
No positive impact on the quality of education
Points to a disconnect between needs and availability


Possible Reasons for Disconnects

Resources are targeted at specific groups
  May not be suitable for academically, linguistically and culturally different groups of users
Disproportionately large effort required to search
  Lack of semantic annotation
Lack of quality assessment and indicators


Learning Methodologies

Traditional classroom teaching, with or without ICT
  Face-to-face interaction with teacher and peers
  Valuable learning experience
  Peer interaction dominant
E-learning:
  Unsupervised: no interaction, learners work in isolation
  Supervised: limited interaction
  Static resources: very limited support for the evolving, heterogeneous needs of learners


E-Learning Infrastructure

Content delivery: client/server mode
  Dedicated servers in a LAN environment
  Through portals on the WWW
Communication paradigm
  Request/reply
  Synchronous
  Coupled
Scalability: limited
Fault tolerance: limited


Latent Knowledge Resources

Every institution has a large number of hosts.
Each host contains valuable knowledge resources.
Latent: search engines cannot list them
Reason:
  Hosts do not have public IP addresses
  Hosts are not servers
  Hosts are hidden behind proxy/NAT


Sharing Latent Knowledge Resources

Interest-based cooperative sharing is desirable
Difficulties:
  Heterogeneity of interest
  Dynamic interest evolution
  Rendezvous of availability and interest
  Hosts are widely distributed


Sharing Latent Knowledge Resources

Visibility of interests and contents
  Resource owners declare availability
  Interested users submit their interests
Dynamic evolution of interest-based communities


Our Vision

Decentralized and autonomous middleware
  Highly scalable
  Fault-tolerant
  Minimal management and maintenance overhead
Support dynamic evolution of interest-based communities for:
  Collaborative generation of:
    Content
    Meta-data
    Domain ontology
  Seamless sharing of resources
  Peer interaction


Our Vision

Semantic searching based on:
  Meta-data
  Domain ontology
Quality assessment of resources by community
Behavioral mining


Challenges

Heterogeneity
  Users: interest and content
  Hosts: uptime, memory, CPU, bandwidth
Scalability and interoperability
Hosts without public IP
Management of dynamics: content, user groups and their behaviors
Absence of domain ontology and meta-data


Requirements

Communication paradigm to support:
  Scalability
  Decoupling: time, space and synchronization
  Anonymity
Network infrastructure to support:
  Peer-to-peer interaction
  Dynamic evolution of interest-based communities
  Interoperability
  Seamless dynamic leaving and joining of nodes


Decoupling:

Between providers and consumers
Increases scalability
  No dependencies
  No coordination or synchronization
Creates highly dynamic, decentralized systems


Dimensions of Decoupling:

Three dimensions:
  Space: no need to hold references to, or even know, each other
  Time: no need to be available at the same time
  Synchronization (flow): control flow is not blocked by the interaction
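
The three dimensions can be illustrated with a toy in-memory event service. This is only a sketch: the class `EventService`, its method names and the sample topics are hypothetical and are not part of the proposed framework.

```python
from collections import defaultdict, deque


class EventService:
    """Toy in-memory event service (hypothetical, for illustration only)."""

    def __init__(self):
        self.subscribers = defaultdict(set)   # topic -> subscriber ids
        self.mailboxes = defaultdict(deque)   # subscriber id -> pending events

    def subscribe(self, subscriber_id, topic):
        self.subscribers[topic].add(subscriber_id)

    def publish(self, topic, event):
        # Space decoupling: the publisher names a topic, never a subscriber.
        # Synchronization decoupling: publish() only enqueues and returns;
        # it does not wait for any subscriber to consume the event.
        for subscriber_id in self.subscribers[topic]:
            self.mailboxes[subscriber_id].append(event)

    def poll(self, subscriber_id):
        # Time decoupling: events wait in the mailbox until the subscriber
        # comes online and asks for them.
        box = self.mailboxes[subscriber_id]
        events = list(box)
        box.clear()
        return events


service = EventService()
service.subscribe("learner-1", "operating-systems")
service.publish("operating-systems", "lecture-notes-v1")
service.publish("compilers", "parsing-slides")   # nobody subscribed: dropped
pending = service.poll("learner-1")
```

A request/reply client, by contrast, would need the server's address (space), a live server (time) and a blocking call (synchronization).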


Publish/Subscribe

Paradigm for scalable distributed applications
Provides:
  Decoupling
  Anonymity
  Asynchrony


Publish/Subscribe: High Level View


Publish/Subscribe: Subscription Model

Topic (subject)-based
Content-based
Type-based
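
A minimal sketch of how the three subscription models differ in what they test. The event classes `Courseware` and `Quiz` and their fields are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Courseware:          # hypothetical event type
    topic: str
    level: str
    pages: int


@dataclass
class Quiz:                # hypothetical event type
    topic: str


def topic_match(event, topic):
    # Topic-based: compare a single, pre-agreed subject label.
    return event.topic == topic


def content_match(event, predicate):
    # Content-based: evaluate an arbitrary predicate over event attributes.
    return predicate(event)


def type_match(event, event_type):
    # Type-based: filter on the kind (class) of the event.
    return isinstance(event, event_type)


events = [
    Courseware("databases", "undergraduate", 40),
    Courseware("networks", "postgraduate", 120),
    Quiz("databases"),
]

by_topic = [e for e in events if topic_match(e, "databases")]
by_content = [e for e in events
              if content_match(e, lambda ev: getattr(ev, "pages", 0) > 100)]
by_type = [e for e in events if type_match(e, Quiz)]
```

Topic-based matching is the cheapest to route, while content-based matching is the most expressive and the most expensive to evaluate.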


Implementation of Event Service

Centralized implementation
  Event matching is easy
  No scalability
  No fault tolerance
Distributed implementation
  Set of nodes designated as brokers
  Improved scalability and fault tolerance
  Routing and matching of events is difficult


Implementation of Event Service

Role-based implementation
  Every node can take any role based on context:
    Broker
    Publisher
    Subscriber
  Highly scalable and fault tolerant


Role based Implementation: Challenges

Management of scalability and fault-tolerance
  Application-layer overlay
  Hierarchy
  Informed/uninformed leaving
Routing of publications and subscriptions
  Location of rendezvous
  Life span


Role based Implementation: Challenges

Role assignment
  Designated (fixed role)
  Dynamic
Matching
  Content-based
  Type-based
Notification
  Service guarantee (at least once, at most once, etc.)
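
The difference between the two service guarantees named above can be sketched with a scripted lossy link. Everything here (`ScriptedLink`, the two send functions, the outcome script) is a hypothetical illustration, not the framework's notification service.

```python
class ScriptedLink:
    """Hypothetical lossy link: each send consumes one (delivered, acked) outcome."""

    def __init__(self, script):
        self.script = list(script)
        self.received = []            # what the subscriber actually got

    def send(self, event):
        delivered, acked = self.script.pop(0)
        if delivered:
            self.received.append(event)
        # The sender only learns of success if the acknowledgement survives.
        return delivered and acked


def send_at_most_once(link, event):
    # One attempt, no retry: the event may be lost but is never duplicated.
    link.send(event)


def send_at_least_once(link, event, max_attempts=5):
    # Retry until acknowledged: the event arrives, possibly more than once.
    for attempt in range(1, max_attempts + 1):
        if link.send(event):
            return attempt
    raise RuntimeError("no acknowledgement received")


lossy = ScriptedLink([(False, False)])
send_at_most_once(lossy, "notice")

retrying = ScriptedLink([(False, False), (True, False), (True, True)])
attempts = send_at_least_once(retrying, "notice")
```

In the second run the notification is delivered twice because the first acknowledgement is lost, which is why at-least-once delivery usually requires subscribers to tolerate duplicates.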


Current Network Infrastructure


Current Network Infrastructure

Within Institute/Organization:
  Nodes are assigned private IPs
  Grouped in IP-based subnets
  Physically connected with each other through layer-2 and layer-3 switches
  Not visible to the outside world
  Connect to the outside world through NAT/proxy


Our Network Architecture

Within LAN of Institute/Organization
Nodes having the same interest:
  Are not aware of each other
  May be physically distant
Some virtualization is required
  Formation of interest-based virtual rings
  Virtual rings are formed using virtual (e.g. TCP) links
  A virtual ring is termed an overlay
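
A minimal sketch of an interest-based virtual ring, assuming nodes are ordered by a numeric identifier and each node keeps a virtual link to its successor. The class `VirtualRing` and the identifiers are hypothetical; the slides do not specify how ring membership is maintained.

```python
import bisect


class VirtualRing:
    """Toy logical ring of node ids; links are virtual, not physical."""

    def __init__(self):
        self.ids = []                      # node ids kept in sorted order

    def join(self, node_id):
        if node_id not in self.ids:
            bisect.insort(self.ids, node_id)

    def leave(self, node_id):
        # An informed leave: the ring closes the gap immediately.
        self.ids.remove(node_id)

    def successor(self, node_id):
        # The virtual link of a node points to the next id, wrapping around.
        i = self.ids.index(node_id)
        return self.ids[(i + 1) % len(self.ids)]


ring = VirtualRing()
for node_id in (40, 10, 30):
    ring.join(node_id)

before = ring.successor(10)
wrap = ring.successor(40)
ring.leave(30)
after = ring.successor(10)
```

Physical distance plays no role here: neighbours on the ring are determined by identifier order only.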


Our Network Architecture

Within LAN of Institute/Organization


Our Network Architecture

Node visibility
  Nodes hidden behind proxy/NAT
  Virtual rings of the same interest may be behind different proxies/NATs
    Isolated rings
    Resource sharing not possible: invisibility
    Have to come under one umbrella


Our Network Architecture

A virtual ring of proxies is formed too.
This makes it a 2-tier overlay.


Our Network Architecture

Dynamic community evolution
  Abstraction over the 2-tier overlay
  Isolated rings form communities
  Virtual interest-based proximity: nodes may be physically far apart


Our Overall Network Architecture


Pub/Sub on our Network Architecture:

Every node acts as: publisher, subscriber, broker
Rendezvous-point-based matching
  Distributed Hash Table (DHT)
Nodes:
  Majority are short-lived and have minimal capabilities
  A small percentage:
    Remain up for long periods
    Have relatively better storage, bandwidth and memory
    Are termed super nodes
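
Rendezvous-point matching over a DHT can be sketched as follows: publisher and subscriber independently hash the same topic and therefore reach the same node. The 16-bit key space, the use of SHA-1 and the successor rule are assumptions borrowed from common DHT designs, not details given in the slides.

```python
import bisect
import hashlib


def key_of(text, bits=16):
    """Map a string to a point in a small, hypothetical identifier space."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


def rendezvous_node(node_ids, topic):
    """Return the first node id at or after the topic key, wrapping around."""
    ids = sorted(node_ids)
    i = bisect.bisect_left(ids, key_of(topic))
    return ids[i % len(ids)]


node_ids = [key_of(name) for name in ("peer-a", "peer-b", "peer-c", "peer-d")]

# The publisher and the subscriber never talk to each other; both derive
# the meeting point from the topic alone.
publisher_view = rendezvous_node(node_ids, "graph-theory")
subscriber_view = rendezvous_node(list(reversed(node_ids)), "graph-theory")
```

Because the meeting point depends only on the topic and the current node set, matching needs no central index.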


Super Nodes

Candidate super nodes:
  May get elected dynamically
  Proxy nodes
  GARUDA nodes / NKN nodes
May act as brokers for popular content (temporal locality)
  Hot content is automatically cached
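
One way to read "hot content is automatically cached" is a popularity threshold at the super node. The sketch below assumes such a rule; the class `SuperNodeCache`, the threshold of three requests and the `fetch` callback are hypothetical.

```python
from collections import Counter


class SuperNodeCache:
    """Toy cache that keeps a copy of content once it has become popular."""

    def __init__(self, hot_threshold=3):
        self.hits = Counter()           # request counts per content id
        self.cache = {}                 # content id -> cached data
        self.hot_threshold = hot_threshold

    def request(self, content_id, fetch):
        if content_id in self.cache:
            return self.cache[content_id], "cache"
        self.hits[content_id] += 1
        data = fetch(content_id)        # go to the owning peer
        if self.hits[content_id] >= self.hot_threshold:
            self.cache[content_id] = data
        return data, "origin"


def fetch_from_owner(content_id):
    # Stand-in for retrieving the resource from the peer that holds it.
    return "contents of " + content_id


node = SuperNodeCache()
sources = [node.request("os-notes", fetch_from_owner)[1] for _ in range(4)]
```

The first three requests go to the owning peer; from the fourth onward the super node answers from its own copy.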


Finding Content

Push/pull model
Subscription instead of searching
  Learner need not make search effort
  Learner subscribes for content
  System provides matching publications


Finding Content

Semantic support
  Publication with/without meta-data
  Subscription with/without meta-data
Knowledge resources enriched with meta-data
Use of domain-specific ontology
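
A small sketch of how a domain ontology can widen matching: a subscription to a general concept also matches publications annotated with a more specific one. The concept names and the child-to-parent table are invented for the example.

```python
# Hypothetical "is-a" links: child concept -> parent concept.
ONTOLOGY = {
    "quicksort": "sorting",
    "sorting": "algorithms",
    "graph-search": "algorithms",
    "algorithms": "computer-science",
}


def ancestors(concept):
    """Walk up the is-a chain and collect every broader concept."""
    chain = []
    while concept in ONTOLOGY:
        concept = ONTOLOGY[concept]
        chain.append(concept)
    return chain


def matches(subscription_concept, publication_concept):
    # A publication matches if it is annotated with the subscribed concept
    # itself or with any concept narrower than it.
    return (publication_concept == subscription_concept
            or subscription_concept in ancestors(publication_concept))


general_hit = matches("algorithms", "quicksort")
sibling_miss = matches("sorting", "graph-search")
```

Without the ontology, the subscription "algorithms" would miss a resource whose meta-data only says "quicksort".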


Meta-data

Meta-data can be created in a distributed manner by:
  Content creator
  Some designated meta-data expert from the community
  Automatic or semi-automatic means
Meta-data: published/subscribed, stored and retrieved like any other knowledge resource


Ontology

Distributed ontology creation by some experts from the community
Published/subscribed, stored and retrieved like any other knowledge resource


Our Universal Client

Every node will run a generic client application

Universal client provides an interface for:
  Joining, leaving: virtual ring maintenance
  Fault tolerance: replication, caching
  Publishing, subscribing content
  Event brokering
  Meta-data creation
  Ontology creation
  Behavior mining and quality assessment


Our Software Architecture


Layer 1: Distributed and Federated Database

It contains:
  Meta-data base
  Ontology base
  Knowledge resource base
  Access log
  Base for user profiles


Layer 1: Distributed and Federated Database

It also contains:
  Publication base
  Subscription base
  Base for event brokering


Layer 2: Publish/Subscribe, Overlay Layer

It has three sub-layers:
  Sub-layer 1: Overlay sub-layer
  Sub-layer 2: Community Management sub-layer
  Sub-layer 3: Publish/Subscribe sub-layer


Layer 3: Service Layer

Provides services for:
  Distributed ontology creation
  Metadata harvesting
  Inference engine
  Multilingual subscription/publication support


An Example Demonstration

Layer 3 of our Software Architecture
Presentation by C-DAC


Design Challenges and Trade-offs

Overlay architecture: structured/unstructured/hybrid
Unstructured
  Stateless, minimum maintenance cost
  Flooding instead of routing, bandwidth wastage
Structured
  Stateful, maintenance required
  No flooding, saves bandwidth
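
The bandwidth trade-off can be made concrete by counting messages on a 16-node example. This is a rough sketch under stated assumptions: the unstructured overlay is a graph in which every node knows four neighbours, and the structured overlay uses Chord-like power-of-two jumps on a fully populated ring. Neither topology is prescribed by the slides.

```python
def flood_messages(adjacency, source):
    """Count messages when every reached node forwards to all its neighbours."""
    seen = {source}
    frontier = [source]
    messages = 0
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbour in adjacency[node]:
                messages += 1                  # one message per forwarded copy
                if neighbour not in seen:
                    seen.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return messages


def route_hops(source, target, size):
    """Greedy routing on a full ring of `size` ids using power-of-two jumps."""
    hops = 0
    current = source
    while current != target:
        distance = (target - current) % size
        jump = 1 << (distance.bit_length() - 1)   # largest power of two <= distance
        current = (current + jump) % size
        hops += 1
    return hops


SIZE = 16
adjacency = {i: [(i + d) % SIZE for d in (1, 2, SIZE - 1, SIZE - 2)]
             for i in range(SIZE)}

flooded = flood_messages(adjacency, source=0)
routed = route_hops(0, 15, SIZE)
```

The structured lookup costs a handful of hops, but each node has to maintain its jump table as peers join and leave, which is the maintenance cost noted above.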


Design Challenges and Trade-offs

Implementation of event service
Purely distributed
  Every node can be a broker
  High scalability
  Higher cost of event management, routing and matching
Partially distributed
  Only proxies as brokers
  Scalability is reduced
  Lower cost of event management, routing and matching


Simulation

To evaluate design alternatives:
  Role:
    Assignment vs. acquisition
    Static vs. dynamic
  Utilization of skewness in subscription
    Replication of hot content
  Service guarantee
  Life span of knowledge resources
  Informed and uninformed leaving


Strengths: MNNIT

Implicit Invocation Systems and Semantic Web
  Group of faculty members and research scholars (PhD, MTech) engaged in:
    Large-scale publish/subscribe for dynamic topologies
    Automatic meta-data extraction and generation
Networking and Distributed Computing
  Group of faculty members and research scholars (PhD, MTech) engaged in:
    Peer-to-peer computing
    Cloud computing


Strengths: CDAC

Expertise in:
  Multi-lingual searching
  Meta-data extraction and generation
  Domain-specific ontology creation
  Behavioral mining


Proof of Concept

Demonstration of:
  Publish/subscribe over P2P overlay
  Scalability in terms of participating nodes, by simulation


Proof of Concept: MNNIT Responsibilities

Design and implementation of a 2-tier peer-to-peer overlay involving nodes at:
  CDAC: one proxy and 10 nodes behind it
  MNNIT: one proxy and 10 nodes behind it
Demonstration of scalability of the infrastructure in terms of participating nodes, by simulation
Design and implementation of a publish/subscribe interface over this P2P overlay
Integration of modules developed by CDAC, including the GUI


Proof of Concept: CDAC Responsibilities

Creation of ontology for two or three core computer science domains and for the subscription/publication meta-data structure, compliant with semantic web standards
Distributed database management of meta-data, ontology, subscriptions, publications, access pattern log and user profiles
GUI for search, publish and subscribe


Thanks