efficient monitoring of web resources

Efficient monitoring of Web resources Avigdor Gal (joint work with Haggai Roitman and Louiqa Raschid) IFIP 2.6 meeting 24/6/2009, Nicosia, Cyprus


Page 1: Efficient monitoring of Web resources

Efficient monitoring of Web resources

Avigdor Gal

(joint work with Haggai Roitman and Louiqa Raschid)

IFIP 2.6 meeting, 24/6/2009, Nicosia, Cyprus

Page 2: Efficient monitoring of Web resources

Profile-Based Online Data Delivery

Data delivery: the delivery of data of interest (specified in profiles) from servers (data providers) to clients (data consumers).
• Push vs. pull
• Server capabilities vs. client requirements

Profiles: specify what, when, and how data should be delivered, and its delivery value.

Online: decisions about what and when to deliver are usually made without complete knowledge of the whole “stream” of future requirements or capabilities in the system, while considering the sources’ dynamic behavior.

Page 3: Efficient monitoring of Web resources

Example: Monitoring RSS Feeds

[Figure: users and feeds (CNN Top Stories, BBC breaking news, Yahoo! news, Google Alerts), connected by pull and push]

Other example applications:

• E-Commerce & E-Markets

• Grid

• Mashups & Portals

• Continuous Queries (CQ)

• Cache Management

• …

Page 4: Efficient monitoring of Web resources

Research Goals

• Proposed a generic model for profile-based online data delivery.
  • Allows negotiating over the dynamic nature of resources and using time-based constraints.
  • Considers both server capabilities and user requirements.
  • Allows the generation of a hybrid push-pull solution.

• Handles various data delivery aspects:
  • Dual approach for targeted data delivery.
  • Hybrid push-pull framework and data delivery solution.
  • Capturing data delivery tradeoffs.
  • Complex data delivery under bandwidth constraints.

Page 5: Efficient monitoring of Web resources

Related Work

Push
• Systems: BlackBerry, JMS, Google Alerts
• Web Caching & Synchronization
• Publish/Subscribe
• Stream processing/CQ & CEP
• Broadcast systems
• CDNs (e.g., RSS aggregation)

Pull
• Update Models
• Web Crawling/Monitoring (WIC)
• Sensor-nets
• Grid
• Web Services
• Mashups
• Web Caching (LR-Profiles, Prefetching)
• PDCM

Hybrid: Pop-Pap, Data Gerrymandering, Ajax, RSS

Page 6: Efficient monitoring of Web resources

Data delivery model: Data and Architecture

Page 7: Efficient monitoring of Web resources

ProMo Proxy - Overview

Page 8: Efficient monitoring of Web resources

Data delivery model: Profiles

We propose a novel profile model based on execution intervals.

Execution interval (EI): an association of a time interval with some resource.
• Complex execution intervals can also be specified.
• Can be specified either explicitly or implicitly (using EI-patterns that are further derived using an update model).
• Have some unique properties that affect scheduling.

Profiles: a set of execution intervals, together with:
• Notification rules that associate utility values with execution intervals.
• Profile owner role (either a client or a server).
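The profile model described on this slide can be pictured as a small data structure. The sketch below is only an illustration: the class and field names (`ExecutionInterval`, `NotificationRule`, `Profile`, `owner`, `role`) and the use of integer chronons with inclusive bounds are assumptions made for the example, not the authors' definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ExecutionInterval:
    """Associates a time interval with a resource (bounds assumed inclusive)."""
    resource: str
    start: int
    end: int

    def contains(self, t: int) -> bool:
        return self.start <= t <= self.end


@dataclass
class NotificationRule:
    """Attaches a utility value to one execution interval."""
    interval: ExecutionInterval
    utility: float


@dataclass
class Profile:
    """A set of execution intervals with utilities and an owner role."""
    owner: str
    role: str  # assumed to be "client" or "server"
    rules: List[NotificationRule] = field(default_factory=list)

    def intervals(self) -> List[ExecutionInterval]:
        return [rule.interval for rule in self.rules]

    def total_utility(self) -> float:
        return sum(rule.utility for rule in self.rules)


# Hypothetical client profile over two feeds.
profile = Profile(
    owner="alice",
    role="client",
    rules=[
        NotificationRule(ExecutionInterval("cnn", 0, 5), utility=1.0),
        NotificationRule(ExecutionInterval("bbc", 3, 8), utility=2.0),
    ],
)
```

Because servers and clients would share the same structure, only the `role` field distinguishes a requirement from a capability in this sketch.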

Page 9: Efficient monitoring of Web resources

Execution intervals - Example

Page 10: Efficient monitoring of Web resources

Example Client Profile

[Figure: example client profile, annotated with: profile, owner, role, notification rules, complex-EI pattern, local and global utilities]

Page 11: Efficient monitoring of Web resources

Schedules, Constraints, and Data Delivery Metrics

Schedule: a mapping $S : R \times T \to \{\text{pull}, \text{push}, \text{none}\}$.

Constrained schedules: limited budget for different data delivery tasks (e.g., “politeness” constraints or an upper bound on parallel monitoring/listening tasks).

Data delivery metrics:
• Completeness (max)
• Data latency (min)
• System resource utilization (probes) (min)
• Execution time (min)
• Gained utility (satisfiability) (strict)

Data delivery objectives and performance evaluation are based on these metrics.
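To make the schedule mapping and two of the metrics concrete, here is a minimal sketch. It assumes (these are illustration choices, not definitions from the slides) that a schedule is a dictionary keyed by (resource, chronon), that an EI is captured when any push or pull on its resource falls inside it, and that only pulls count as probes.

```python
from typing import Dict, List, Tuple

# A schedule maps (resource, chronon) to "pull", "push", or "none".
Schedule = Dict[Tuple[str, int], str]
# An execution interval as (resource, start, end), bounds inclusive.
EI = Tuple[str, int, int]


def captured(schedule: Schedule, ei: EI) -> bool:
    """True if some delivery action on the EI's resource falls inside it."""
    resource, start, end = ei
    return any(
        schedule.get((resource, t), "none") != "none"
        for t in range(start, end + 1)
    )


def completeness(schedule: Schedule, eis: List[EI]) -> float:
    """Fraction of execution intervals captured by the schedule."""
    if not eis:
        return 1.0
    return sum(captured(schedule, ei) for ei in eis) / len(eis)


def probes(schedule: Schedule) -> int:
    """Number of pull actions (assumed here to be the resource cost)."""
    return sum(1 for action in schedule.values() if action == "pull")


schedule = {("cnn", 2): "pull", ("bbc", 5): "push"}
eis = [("cnn", 1, 3), ("bbc", 4, 6), ("yahoo", 0, 2), ("cnn", 5, 7)]
print(completeness(schedule, eis), probes(schedule))  # 0.5 1
```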

Page 12: Efficient monitoring of Web resources

A Dual Approach for Targeted Data Delivery

Instead of maximizing utility under a (strict) system resource constraint, minimize system resource utilization while (strictly) satisfying (all) user profiles.

Main motivation: dynamic allocation according to user profiles may benefit both objectives.
• We propose an optimal static algorithm, SUP, for the dual problem.
• Under some conditions, SUP is even optimal for both objectives!
• We further present adaptive versions of SUP, fbSUP and fbSUP(λ), that handle non-static situations using feedback.
• Overall, results show that the dual approach can dominate the traditional approach and has good utility/budget performance in the non-static case.
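The dual objective (use as few probes as possible while capturing every execution interval) can be illustrated on a single resource with the textbook interval-stabbing greedy. This is not the SUP algorithm, whose details are not given in these slides; the function name `min_probes` and the (start, end) tuple encoding are assumptions for the sketch.

```python
from typing import List, Tuple


def min_probes(intervals: List[Tuple[int, int]]) -> List[int]:
    """Pick the fewest probe times so that every interval contains one.

    Classic greedy: scan intervals by increasing end time and probe at the
    end of the first interval that the latest probe does not already cover.
    """
    probe_times: List[int] = []
    last = None
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if last is None or start > last:
            last = end
            probe_times.append(end)
    return probe_times


# Four EIs on one resource; overlapping EIs share a probe.
print(min_probes([(1, 4), (2, 6), (5, 8), (9, 10)]))  # [4, 8, 10]
```

In this example the probe at time 4 captures both (1, 4) and (2, 6), which is the kind of saving that intra-resource overlap makes possible.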

Page 13: Efficient monitoring of Web resources

ProMo: Hybrid Framework for Online Data Delivery

Idea: mediate between clients and servers while considering both client requirements and server capabilities.

Solution: use the same profile structure for both servers and clients. As a result:
• Matching clients and servers becomes easy.
• Hybrid schedules are easy to generate.

We provide a taxonomy of server capabilities and data delivery patterns.

The algorithm supports various capability patterns (e.g., pull-only, push-only, hybrid, push-filter, and conditional-pull).
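One way to picture why a shared profile structure makes matching easy is the sketch below. It is a hypothetical matching rule, not the ProMo algorithm: a client EI is served by push when some server EI on the same resource lies entirely inside it, and otherwise falls back to a pull at the end of the client EI. The function name `hybrid_schedule` and the tuple encoding are assumptions.

```python
from typing import Dict, List, Tuple

EI = Tuple[str, int, int]  # (resource, start, end), bounds inclusive


def hybrid_schedule(
    client_eis: List[EI], server_eis: List[EI]
) -> Dict[Tuple[str, int], str]:
    """Build a push-pull schedule from client requirements and server offers."""
    schedule: Dict[Tuple[str, int], str] = {}
    for resource, c_start, c_end in client_eis:
        # Look for a server capability nested inside the client requirement.
        match = next(
            (
                (s_start, s_end)
                for r, s_start, s_end in server_eis
                if r == resource and c_start <= s_start and s_end <= c_end
            ),
            None,
        )
        if match is not None:
            # Assumed: the push has arrived by the end of the server interval.
            schedule[(resource, match[1])] = "push"
        else:
            # No usable server capability: probe at the last possible chronon.
            schedule[(resource, c_end)] = "pull"
    return schedule


clients = [("cnn", 0, 5), ("bbc", 2, 4)]
servers = [("cnn", 1, 3)]
print(hybrid_schedule(clients, servers))
# {('cnn', 3): 'push', ('bbc', 4): 'pull'}
```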

Page 14: Efficient monitoring of Web resources

Capturing Approximate Data Delivery Tradeoffs (“The Proxy Dilemma”)

• Completeness: more alternatives for drivers.

• Delay: more time to react.

• High completeness may result in delayed delivery, leaving less time to react.

• Low delay may result in missing updates, leaving fewer alternatives to consider.
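Since neither completeness nor delay can be improved without possibly hurting the other, candidate schedules can be compared by Pareto dominance. The following sketch assumes each schedule is summarized as a (completeness, delay) pair, with higher completeness and lower delay preferred; the function name `pareto_front` and the sample numbers are invented for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (completeness, delay)


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates.

    q dominates p when q is at least as complete and at most as delayed,
    and differs from p (so it is strictly better in at least one metric).
    """
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] >= p[0] and q[1] <= p[1] for q in points
        )
        if not dominated:
            front.append(p)
    return front


candidates = [(0.9, 5), (0.6, 2), (0.5, 4), (0.9, 7)]
print(pareto_front(candidates))  # [(0.9, 5), (0.6, 2)]
```

Here (0.5, 4) loses to (0.6, 2) on both metrics and (0.9, 7) loses to (0.9, 5) on delay, so only two tradeoff points remain.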

Page 15: Efficient monitoring of Web resources

Bandwidth Constrained Complex Profile Satisfaction

Example 1: Arbitrage Monitoring

Example 2: Mashups

Page 16: Efficient monitoring of Web resources

Future Work

Dual approach:
• Consider more constrained settings (e.g., a lower bound on gained utility).
• Adaptive switching between the OptMon1 and OptMon2 solutions.
• A more general probabilistic adaptive framework.
• Non-uniform probing costs.

ProMo hybrid push-pull:
• Consider a constrained setting with ProMo (e.g., minimization of push-pull costs, politeness constraints).
• Develop a more refined server commitment model and server selection and ranking techniques.

Page 17: Efficient monitoring of Web resources

Future Work (cont.)

Tradeoffs:
• Find an offline approximation for the general case.
• Find online policies with competitive guarantees for the general case.
• Use Pareto sets as a design tool for online policies.
• Private-profile-based tradeoffs.
• Consider complex profiles.

Complex:
• General cost-benefit model (e.g., consider utility gain vs. monitoring costs).
• Use other complex profile semantics (e.g., OR, SUBSET).
• Develop update models for complex monitoring.

Page 18: Efficient monitoring of Web resources

Backup Slides

Page 19: Efficient monitoring of Web resources

Schedules (cont.)

[Figure: schedule diagram with labels: delay, T_j, r_i]

Page 20: Efficient monitoring of Web resources

Model: Feasible Schedules

Page 21: Efficient monitoring of Web resources

Execution Intervals – Properties and Effects on Scheduling

Inter-resource overlap
• Directly affects the probing congestion.

Intra-resource overlap
• Allows more than a single EI to be captured by a single probe.

Rank
• Affects the difficulty of satisfying a single client requirement or finding a suitable server capability (thus, some pull will be required).
• Can cause a skew in resource access patterns.

Explicit vs. implicit
• Implicit EIs may require update models to derive explicit EIs and therefore introduce noise into the model.

Utility
• Affects the relative importance of capturing an EI.

Page 22: Efficient monitoring of Web resources

SUP – Dual Optimality

[Figure: two panels, each labeled “max clique”]

Page 23: Efficient monitoring of Web resources

fbSUP (SUP with feedback)

Page 24: Efficient monitoring of Web resources

fbSUP(λ)

Page 25: Efficient monitoring of Web resources

SUP vs. TTL & WIC

Static case - FPN(1.0):

• #probes(SUP) = 2,462

• max #probes(WIC) > 65,000

• max #probes(TTL) > 65,000

Dynamic case - Poisson:

• #probes(SUP) = 3,904

• max #probes(WIC) > 20,000

• max #probes(TTL) > 7,000

Page 26: Efficient monitoring of Web resources

fbSUP vs. fbSUP(λ)

• Both adaptive versions improve on SUP (with a moderate probe budget increase).

• fbSUP(λ) improves even for X=1.

• fbSUP(λ) is the dominant version.

• For X<4, fbSUP(λ) requires slightly more budget than fbSUP.

• For X≥4, fbSUP(λ) completely dominates fbSUP.

Page 27: Efficient monitoring of Web resources

ProMo – Server Capabilities vs. Data Delivery Patterns

Page 28: Efficient monitoring of Web resources

ProMo Middleware - Example

Page 29: Efficient monitoring of Web resources

Pareto Sets and Approximation

Page 30: Efficient monitoring of Web resources

Efficient Offline Optimal Solution (case with no intra-resource overlaps)

Page 31: Efficient monitoring of Web resources

Offline Optimal Algorithm Correctness

• From the algorithm construction:

• Pareto optimality: By induction, let Sj be the jth schedule that is added to S:

SPS

Sj S’S

Page 32: Efficient monitoring of Web resources

Efficient Online Policies (case with no intra-resource overlaps)

• Look Ahead:

• Look Back:

Policy | Ordering | Completeness | Latency      | Tradeoff
LA     |          | Optimal      | ?            | ?
LB     |          | 2-approx.    | 4-approx.    | 4-approx.
LAB    |          | Optimal      | Less than LA | ?
LBA    |          | 2-approx.    | Optimal      | 2-approx.

Page 33: Efficient monitoring of Web resources

LA (EDF) Optimal completeness (no intra-resource overlap)

[Figure: case 1 and case 2, each showing resources r_i and r_i' at time T_j]

(preserve)

• gain.

• another resource r_i' was selected (but not by LA): apply Lemma 42 (preserve or gain).

case 1:

(gain)

case 2:

• preserve.

• another resource r_i' was selected by LA (but not by S^): apply Lemma 42 (preserve or gain).
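The slide title labels LA as EDF (earliest deadline first). A minimal sketch of such a policy is shown below, under assumptions that are not stated on the slides: time is a sequence of integer chronons, at most `budget` probes may be issued per chronon, there is no intra-resource overlap (so each probe captures exactly one EI), and ties keep their input order. The function name `edf_schedule` is hypothetical.

```python
from typing import Dict, List, Tuple

EI = Tuple[str, int, int]  # (resource, start, end), bounds inclusive


def edf_schedule(
    eis: List[EI], horizon: int, budget: int
) -> Tuple[Dict[int, List[str]], int]:
    """At each chronon, probe the active EIs whose deadlines come first."""
    pending = list(eis)
    plan: Dict[int, List[str]] = {}
    captured = 0
    for t in range(horizon):
        # EIs that are open now, ordered by deadline (end of the interval).
        active = sorted(
            (ei for ei in pending if ei[1] <= t <= ei[2]),
            key=lambda ei: ei[2],
        )
        chosen = active[:budget]
        if chosen:
            plan[t] = [ei[0] for ei in chosen]
        for ei in chosen:
            pending.remove(ei)
            captured += 1
    return plan, captured


plan, captured = edf_schedule(
    [("a", 0, 1), ("b", 0, 0), ("c", 1, 1)], horizon=2, budget=1
)
print(plan, captured)  # {0: ['b'], 1: ['a']} 2
```

With one probe per chronon and two chronons, at most two of the three EIs can be captured, so the result matches the best achievable completeness in this small instance.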

Page 34: Efficient monitoring of Web resources

LAB Dominates LA

The proof follows from Lemma 42 (completeness preservation) and Lemma 47: each local change from LA to LAB would result in less delay.

The proof follows from the definition of LAB potential: and ordering operator: .

Page 35: Efficient monitoring of Web resources

LB Tradeoff 4-Approximation (no intra-resource overlap)

• Completeness 2-approximation:

Basic idea: given any schedule S and LA, if we change S into LA, each change might improve the performance by at most 1, so the completeness of S is at least half that of LA.

• Latency 4-approximation:

• T_k^s: the k-th first time that OPT didn’t probe but LB did (and let T_k^f be the last such time), and T_k^{s'}, T_k^{f'} when both did.

• Best case: LB and OPT act the same (black circles).

• Worst case (triangles): OPT has: while LB has at most:

• Thus we get approximation ratio = 4

Whenever the EIs have uniform width W, LB is a 2-approximation.

Page 36: Efficient monitoring of Web resources

LBW Tradeoff 2-Approximation (no intra-resource overlap)

Completeness 2-approximation: same as in LB.

Optimal latency: LB’s greedy “delay traps”.

[Figure: delay trap; Case 1 and Case 2, each comparing S and S'_j; S' and S]

Page 37: Efficient monitoring of Web resources

Online policies vs. Optimal Pareto Set

Page 38: Efficient monitoring of Web resources

Online policies vs. Optimal Pareto Set: Runtime Scalability Analysis

Page 39: Efficient monitoring of Web resources

Online policies: Budget Impact

Page 40: Efficient monitoring of Web resources

Online policies: Workload Impact (no intra-resource overlap)

Page 41: Efficient monitoring of Web resources

Online policies: Workload Impact (with intra-resource overlap)

Page 42: Efficient monitoring of Web resources

Proposed Offline Approximation

• As A we use the Bar-Yehuda et al. algorithm for scheduling split-intervals.

• C=1: A provides a 2k-approx., so we get a (2k+2)-approx.

• C>1: A provides a (2k+1)-approx., so we get a (2k+3)-approx.

• Drawbacks: the transformation may be quite expensive, and A doesn’t scale (it requires an LP solution for the fractional version of the problem).

Page 43: Efficient monitoring of Web resources

Greedy Online Policies

Policy | Ordering | Property (no intra-resource overlaps)
S-EDF  |          | Optimal for simple profiles
MRSF   |          | l-competitive where
M-EDF  |          | Similar to MRSF for problem instances with profiles

Page 44: Efficient monitoring of Web resources

MRSF: l-Competitive (case with no intra-resource overlap)

“Good guys”

“Bad guys”

• Pick good guys: gain 3 + 2 (length(I))

• Pick bad guys: gain 1

• Ratio = length(I)) at the worst case every CEI has equal length comp. ratio:

Page 45: Efficient monitoring of Web resources

Online Policies vs. Offline approx.

• For rank(P)=1, both WIC and EDF are optimal.

• For any rank(P), the worst-case optimal upper bound is OPTrank(1)/rank(k).

• Simple policies (i.e., WIC, EDF) do not fit problems with complex profiles.

• Here COMP_MRSF ≥ COMP_off ≥ OPT/2k

• The offline policy doesn’t scale.

• Online policies scale quite well.