Ubiquitous Web Caching
Wenzheng Gu
Ph.D. Defense
CISE Department, University of Florida
November 25, 2003
Outline
Introduction: Overview, Challenges, Contributions, Related Work
Extended-ICP: Protocol Design, Emulation and Analysis
Adaptation and Negotiation with Caching: Priority Fidelity Markup Language, Partiality Adaptation, Versioning Negotiation
Ubiquitous Computing
Trends on Wireless and Internet Growth
Web Caching
[Figure: Web caching. Labels: Client, Proxy Server, Remote Server, HTTP Request, Response, Cache, Cache Hit.]
Benefits of Web Caching
Reduces network bandwidth usage
Lessens user-perceived delays
Lightens loads on origin servers
Internet Caching Protocol (ICP)
Two types of relationship: parent and sibling
[Figure: ICP cache hierarchy with Parent 1 and Parent 2 above Child 1, Child 2, and Child 3; links are labeled parent or sibling.]
The Impact of Mobility on Web Access and Web Caching (1/2)
Currently, there is no mobile Web caching protocol.
Changing network:
  By leaving the home network, mobile users are disconnected from their home cache servers.
  By returning home or visiting other networks, users are disconnected from the cache servers they just visited.
  Hence, users experience degraded performance while mobile and upon their return.
Changing devices:
  Users lose client-cached objects, favorites, and cookies.
  Users lose their personal calendar and contact information.
The Impact of Mobility on Web Access and Web Caching (2/2)
Heterogeneity of Devices
Wide Variety of Web Contents
Lack of Automated User Intent
Wireless Network Limitations
  Low Bandwidth
  Disconnection/Handoff
  Address Migration
Lack of Context Awareness
Lack of Security
Contributions
Extended ICP Protocol (x-ICP)
  Support for mobility in Web caching
  Experimentally demonstrated and quantified the benefits of x-ICP in terms of cache hit rate
Adaptation mechanisms to cope with device heterogeneity and Web content variety
  Adaptive Web content
  Adaptive client-side and server-side algorithms
  Experimentally demonstrated the benefits of our adaptive mechanism
Architecture of Ubiquitous Web Caching
[Figure: client devices (IBM-compatible PC, laptop computer, Power Mac G4, handheld computer, cell phone), a proxy, and Web servers; links are labeled HTTP, X-ICP, and PFML.]
Mobile IP Routing [PER98]
[Figure: Mobile IP routing. Labels: Home Network with Home Proxy, Foreign Network with Foreign Proxy, Internet Host, and a Mobile Client shown in both networks.]
Addressing Device Heterogeneity: CC/PP [CCP99]
CC/PP stands for Composite Capabilities/Preference Profiles.
CC/PP describes and manages software and hardware profiles that include:
  information on the user agent's capabilities
  the user's specified preferences within the user agent's set of options
Content Negotiation [HOL98]
[Figure: four message-sequence diagrams, titled Server-driven Negotiation, Agent-driven Negotiation, Transparent Negotiation, and Versioning Negotiation, between a client and either an origin server or a cache server. Each starts with a container page request and response, followed by an embedded object request (with Accept headers in some panels). Variant selection is based on the variant list together with, depending on the panel, CC/PP, properties, Accept headers, or client info. In one panel the variant list is returned to the client, which then requests the object URI of a specific version. Each exchange ends with the selected object.]
Content Adaptation [SMI98]
Overview of X-ICP
A Web caching protocol to support mobile users
Automatically connects the user's Foreign Proxy with his Home Proxy when the user changes the point of attachment on the network
Delivers the user's profile
If the network situation permits, delivers cached objects from the Home Proxy instead of from the origin Web site
Collects all downloaded objects and stores them on the Home Proxy while the user is on the move, so that the contents continue to be available upon the user's return
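The delivery and duplication behaviour described above can be sketched as a lookup order: Foreign Proxy cache first, then Home Proxy, then the origin site. This is only an illustrative sketch; the `Proxy` class, `fetch_from_origin`, and the `home_reachable` flag are hypothetical names, not part of x-ICP as specified in the slides.

```python
class Proxy:
    """Toy proxy that keeps cached objects in a dict (URL -> body)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}


def fetch_from_origin(url):
    # Hypothetical stand-in for an HTTP request to the origin Web site.
    return f"<body of {url}>"


def xicp_get(url, foreign, home, home_reachable=True):
    """Return (body, source) for a request from a visiting mobile user."""
    # 1. Local hit on the Foreign Proxy.
    if url in foreign.cache:
        return foreign.cache[url], "foreign"
    # 2. Ask the Home Proxy if the network situation permits.
    if home_reachable and url in home.cache:
        body, source = home.cache[url], "home"
    else:
        # 3. Fall back to the origin Web site.
        body, source = fetch_from_origin(url), "origin"
    foreign.cache[url] = body
    if home_reachable:
        # Duplicate the object so it is available upon the user's return.
        home.cache[url] = body
    return body, source
```

A real deployment would also need cache expiry and consistency checks, which this sketch leaves out.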
X-ICP Infrastructure
[Figure: X-ICP infrastructure. Labels: Home, Internet; legend: Cache Exchange, Motion, Connection.]
Modules of X-ICP Proxy Server
[Figure: modules of the X-ICP proxy server: Server Side Process (to Web), Client Side Process (from client), Storage Manager, Node Monitor (exchanges with other Node Monitors), and Cache Copier (exchanges with other Cache Copiers).]
X-ICP Processes
Proxy and X-ICP Services Discovery
Mobile Node Registration
Web Object Delivery
Cache Contents Duplication
Emulation Environment
Proxy logs from the CISE department at UF are used as trace data. The URL field is the main field used to measure the cache hit rate.
3 of the 5 subnets on the CISE network, with different populations, were chosen. Traces were kept running for 25 days each.
ICP was implemented to query Aswan and/or Cairo in order to determine where each object comes from.
Mobility is emulated by clearing the cache every day, so the compulsory miss rate is higher.
Aswan and Cairo are configured as siblings in Squid.
[Figure: a client on the LAN, the Home Proxy (Aswan), and the Foreign Proxy (Cairo), linked by ICP and connected to the Internet. Platform labels: Solaris 6/Squid 2.4 and Solaris 8/Squid 2.4.]
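As a rough illustration of how a hit rate can be measured from the URL field of a trace, and of why clearing the cache every day raises the compulsory miss count, the sketch below replays a trace through an unbounded cache. It is not the emulation code used in the study; the `(day, url)` trace format, the unbounded cache, and the assumption that every object is cacheable are simplifications made here.

```python
def hit_rate(trace, clear_daily=False):
    """Replay (day, url) requests through an unbounded cache.

    Returns hits / requests. With clear_daily=True the cache is emptied
    at every day boundary, which emulates a user who moves each day.
    """
    cache = set()
    hits = total = 0
    current_day = None
    for day, url in trace:
        if clear_daily and day != current_day:
            cache.clear()
        current_day = day
        total += 1
        if url in cache:
            hits += 1
        else:
            cache.add(url)  # compulsory miss: first request for this URL
    return hits / total if total else 0.0


trace = [(1, "a"), (1, "a"), (2, "a"), (2, "b")]
print(hit_rate(trace))                    # cache kept across days
print(hit_rate(trace, clear_daily=True))  # cache cleared every day
```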
Emulation Results (1/2)
[Figure: hit ratio (0 to 0.7) by day (1 to 17) for the Home Proxy and the Foreign Proxy.]
Emulation Results (2/2)
[Figure: hit ratio (0 to 0.9) by day (1 to 17) for the Home Proxy and the Foreign Proxy.]
Emulation Conclusion
With x-ICP deployed, an additional hit rate of 21% can be achieved. This is the hit rate on the Home Proxy for requests from users attached to a Foreign Proxy.
Definitions
X-ICP Speedup = (Execution time for entire task without x-ICP) / (Execution time for entire task using x-ICP when possible)
Cf: cache hit rate on a Foreign Proxy
Ch: cache hit rate on a Home Proxy
Do: round-trip delay between Foreign Proxy and Origin Server
Dh: round-trip delay between Foreign Proxy and Home Proxy
[Figure: the Foreign Proxy (Cf) reaches the Origin Site with delay Do and the Home Proxy (Ch) with delay Dh.]
Performance Analysis of x-ICP
\[ \text{Speedup} = \frac{(1 - C_f)\, D_o}{(1 - C_f)\,[\, D_h + (1 - C_h)\, D_o \,]} \]
\[ \text{Speedup} = \frac{D_o}{D_h + (1 - C_h)\, D_o} \]
Performance Analysis of x-ICP
Let Do = 65 ms, based on Cottrell's study "Internet Monitoring at SLAC" [COT00].
Let Ch = 21%, from our emulation study based on CISE Web caching logs.
\[ \text{Speedup} = \frac{65}{D_h + 0.79 \times 65} \]
[Figure: round-trip time (ms, 0 to 16) plotted against Speedup (1 to 1.25), annotated with 13.65 and the linear fit Y = 0.024X + 5.8887.]
A geographic distance of up to 201 miles between the two proxy servers is allowed with x-ICP deployed.
RTT < 2 ms on the campus network.
The speedup is 1.22 with x-ICP deployed.
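The numbers on this slide can be reproduced with a few lines of arithmetic, assuming Speedup = Do / (Dh + (1 - Ch) Do). In the sketch below the function names are illustrative. The unit of X in the fit Y = 0.024X + 5.8887 is not stated on the slide; treating X as kilometers is an assumption made here because it reproduces the quoted 201 miles.

```python
KM_PER_MILE = 1.609344


def speedup(d_o, d_h, c_h):
    """Speedup = Do / (Dh + (1 - Ch) * Do)."""
    return d_o / (d_h + (1.0 - c_h) * d_o)


def break_even_dh(d_o, c_h):
    """Largest Dh for which Speedup >= 1, i.e. Dh = Ch * Do."""
    return c_h * d_o


def distance_for_rtt(rtt_ms, slope=0.024, intercept=5.8887):
    """Invert the fit Y = slope * X + intercept (X assumed to be km)."""
    return (rtt_ms - intercept) / slope


d_o, c_h = 65.0, 0.21
print(round(speedup(d_o, 2.0, c_h), 2))        # campus network, Dh = 2 ms
limit = break_even_dh(d_o, c_h)                # about 13.65 ms
print(round(distance_for_rtt(limit) / KM_PER_MILE))  # distance in miles
```

Setting Speedup = 1 in the formula gives Dh = Ch * Do, which is where the 13.65 ms annotation comes from.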
Analysis Results on x-ICP
Sensitivity Analysis – Do
[Figure: Speedup (0 to 1.4) versus average regional RTT within North America (0 to 100 ms), with curves for Dh = 2 and Dh = 13.65 and a marker at Do = 65.]
Generally speaking, the average regional RTT value does not have a significant impact on the speedup.
Sensitivity Analysis - Ch
[Figure: Speedup (0 to 35) versus cache hit rate on the Home Proxy, Ch (0 to 1), with curves for Dh = 2 and Dh = 13.65 and a marker at Ch = 21%.]
The increase in speedup is negligible when the cache hit rate is small.
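The two sensitivity curves can be regenerated by sweeping one parameter at a time through Speedup = Do / (Dh + (1 - Ch) Do). The sketch below is illustrative only; the sweep helper and the sample grid values are choices made here, not taken from the slides.

```python
def speedup(d_o, d_h, c_h):
    """Speedup = Do / (Dh + (1 - Ch) * Do)."""
    return d_o / (d_h + (1.0 - c_h) * d_o)


def sweep(values, fn):
    """Evaluate fn at each value and return (value, result) pairs."""
    return [(v, fn(v)) for v in values]


# Sensitivity to Ch with Do fixed at 65 ms.
for d_h in (2.0, 13.65):
    rows = sweep([0.0, 0.21, 0.5, 0.9, 1.0], lambda c: speedup(65.0, d_h, c))
    print(f"Dh={d_h}:", [(c, round(s, 2)) for c, s in rows])

# Sensitivity to Do with Ch fixed at 21%.
for d_h in (2.0, 13.65):
    rows = sweep([10.0, 40.0, 65.0, 100.0], lambda d: speedup(d, d_h, 0.21))
    print(f"Dh={d_h}:", [(d, round(s, 2)) for d, s in rows])
```

As Do grows, the speedup approaches 1 / (1 - Ch), which bounds how much a slower origin path can help when Ch is fixed.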
Evaluation on X-ICP
With x-ICP deployed, a 21% cache hit rate can be achieved on the Home Proxy.
With that 21% hit rate:
  a speedup of 1.22 can be gained on a campus-wide high-speed network;
  the two proxy servers can be up to about 201 miles apart in the current Internet environment.
Summary on X-ICP
X-ICP extends the ICP caching protocol to support mobility.
X-ICP reduces the user's response time.
Under x-ICP, the user's profile follows the user while mobile. This provides a seamless Web experience.
Content Service Overlay Networks
[Figure: layered view with clients and an origin server on the packet network, a content network overlay with an edge node, and a content services network overlay with servers.]
Content Types
XHTML pages Content files
Video Image Text Audio
Content Delivery to Heterogeneous Devices
Existing Approaches Content Adaptation Content Negotiation
Our Approach—Partiality Fidelity Markup Language
Take advantage of the index page Insert two types of new tags as metadata
Priority Tag Fidelity Tag
The Hierarchy of PFML Elements
[Figure: element tree. Labels from top to bottom: PFML, Priority, Fidelity, Choice, then Img, Script, and Embed; Other HTML Tags also appear in the tree.]
The Document Type Definition of PFML
<?xml version="1.0"?>
<!DOCTYPE PFML SYSTEM "PFML.dtd">
<!ELEMENT PFML (Priority*)>
<!ELEMENT Priority ANY>
<!ATTLIST Priority name CDATA #IMPLIED>
<!ATTLIST Priority value (0|1|2|3|4|5|6|7|8|9) '9'>
<!ATTLIST Priority fixed (Y|N) 'Y'>
<!ELEMENT Fidelity (choice*)>
<!ELEMENT choice (img* | script* | embed*)>
<!ATTLIST choice
    sourceQuality CDATA '1'
    type CDATA #IMPLIED
    charset CDATA #IMPLIED
    language CDATA #IMPLIED
    feature CDATA #IMPLIED
    … … >
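Processing on PFML
[Figure: a page with segments of priority "9", "5", and "0" and an embedded object Foo.gif is processed in two steps; the result keeps the "9" and "5" segments, and Foo.png appears in place of Foo.gif.]
1. Partiality Adaptation
2. Versioning Negotiation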
Priority Tag
The Priority tag is used to divide a Web page into several portions.
Advantages:
  Enables users of different devices to share the same index page, which increases the cache hit rate and reduces traffic and response time
  Reduces the number of embedded objects to download, which saves bandwidth
Example of Priority Tag
<?xml version="1.0"?>
<Priority value='9' fixed='Y'>
<HTML>
<!-- Foo's personal Web site. -->
<HEAD> <TITLE> Foo's Home </TITLE> </HEAD>
<BODY>
<!-- self-introduction -->
<P> I am … </P>
</Priority>
<Priority value='5' fixed='N'>
<!-- Personal picture -->
<IMG SRC="Foo.gif" BORDER…>
<!-- My interests -->
<P> I like sports and music… </P>
<!-- friends' link -->
<P>Foo1 < A HREF = HTTP://…></P>
<P>Foo2 < A HREF = HTTP://…></P>
</Priority>
<Priority value='9' fixed='Y'>
<!-- contact information -->
<P> Phone #: (123)456-7890 </P>
</BODY>
</HTML>
</Priority>
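One way a proxy or client agent might use Priority tags is to keep only the segments whose priority value meets a device-specific minimum. The sketch below is an assumption-laden illustration: the selection rule (`value >= min_priority`), the function names, and the regex-based parsing are all choices made here, and the slides do not say how the `fixed` attribute affects selection, so it is parsed but not interpreted.

```python
import re

# Matches one <Priority value='d' fixed='Y|N'> ... </Priority> block.
SEGMENT = re.compile(
    r"""<Priority\s+value=['"](\d)['"]\s+fixed=['"]([YN])['"]\s*>(.*?)</Priority>""",
    re.DOTALL,
)


def split_segments(page):
    """Return a list of (value, fixed, body) tuples in document order."""
    return [(int(v), f, body.strip()) for v, f, body in SEGMENT.findall(page)]


def select_segments(page, min_priority):
    """Keep segment bodies with value >= min_priority (assumed rule)."""
    return [body for value, _fixed, body in split_segments(page)
            if value >= min_priority]


PAGE = """
<Priority value='9' fixed='Y'><P> I am ... </P></Priority>
<Priority value='5' fixed='N'><IMG SRC="Foo.gif"></Priority>
<Priority value='9' fixed='Y'><P> Phone #: (123)456-7890 </P></Priority>
"""

print(select_segments(PAGE, 9))  # e.g. a cell phone keeping only top priority
print(len(select_segments(PAGE, 5)))  # e.g. a laptop keeping everything
```

A regular expression is used instead of an XML parser because the Priority blocks in the slide example interleave with the HTML element boundaries.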
The Adaptive Priority Decision Algorithm
Page Segment Priority Decision Algorithm
Agent priority Decision Algorithm
Algorithm Complexity
Maintained in O(log n)
Inserted or Deleted in O(log n)
Constructed in O(n)
Web Caching in Partiality Adaptation
A mobile device can take advantage of the copy of a Web page previously downloaded by some other device, for example a desktop, in a caching hierarchy.
A device with more capabilities can use the partial copy of a Web page downloaded previously by a smaller device, and send it to the user directly.
The user community is bigger, so the cache hit rate could be higher.
Experiments on Priority Tags
[Figure: test setup with a laptop computer, a PDA, and a cell phone connected to a proxy server, which reaches a Web server over the Internet.]
Experiment One on Partiality Adaptation (1/2)
[Figure: bar chart of time (ms, logarithmic scale from 1 to 100000) for CNN and Google pages on Laptop, PDA, and Phone; series: Remote, Extracted, Cached.]
Experiment One on Partiality Adaptation (2/2)
When the wireless network speed is above 11 Mbps, downloading a 50k Web page is 9 times faster in the extracted case than from the origin site.
Questions:
  What if the wireless network is not fast enough?
  What if the Web page is not big enough?
More experimentation is needed.
Experiment Two on Partiality Adaptation (1/2)
[Figure: downloading time (ms, 0 to 20000) versus page size (bytes, 0 to 30000); series: Remote, Extracted.]
Experiment Two on Partiality Adaptation (2/2)
Simulation data (average download size: 4843 bytes)
  With Priority Tags: 5910 ms
  Without Priority Tags: 6857 ms
According to our simulation, using Priority Tags can reduce response time by about 1 second for cellular phone users browsing the Internet.
Fidelity Tag
Fidelity tags are mainly used for content negotiation.
They allow the Web server to insert the object lists and their attributes into a Web page at the place where the corresponding Web object is embedded.
Advantages:
  Lets the user make the decision, which eliminates fetching and parsing the CC/PP file
  Reduces the number of round trips
Example of Fidelity Tag
<Fidelity>
  <choice sourceQuality="1" type="img/gif">
    <img src="/images/foo.gif" width="276" height="110" />
  </choice>
  <choice sourceQuality="0.6" type="img/png">
    <img src="/images/foo.png" width="76" height="30" />
  </choice>
  <choice> foo </choice>
</Fidelity>
<Fidelity>
  <choice sourceQuality="0.9" type="text/html" language="en">
    <doc src="/document/paper.html.en" />
  </choice>
  <choice sourceQuality="0.7" type="text/html" language="fr">
    <doc src="/document/paper.html.fr" />
  </choice>
  <choice sourceQuality="1.0" type="application/postscript" language="en">
    <doc src="/document/paper.ps.en" />
  </choice>
</Fidelity>
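A client-side selection over a Fidelity block could look like the sketch below, which multiplies each choice's sourceQuality by a per-type quality the client assigns and falls back to the text-only choice when nothing is acceptable. The function name, the `type_quality` dictionary, and the fallback rule are assumptions for illustration; the type strings are copied from the slide example as written.

```python
import xml.etree.ElementTree as ET

FIDELITY = """<Fidelity>
  <choice sourceQuality="1" type="img/gif">
    <img src="/images/foo.gif" width="276" height="110" />
  </choice>
  <choice sourceQuality="0.6" type="img/png">
    <img src="/images/foo.png" width="76" height="30" />
  </choice>
  <choice> foo </choice>
</Fidelity>"""


def pick_source(fidelity_xml, type_quality):
    """Return the src of the best choice, or the text fallback.

    type_quality maps a type string to the client's quality factor
    in [0, 1]; unknown types count as 0 (not acceptable).
    """
    root = ET.fromstring(fidelity_xml)
    best_src, best_q, fallback = None, 0.0, None
    for choice in root.findall("choice"):
        if len(choice) == 0:
            # A choice without child elements is treated as plain text.
            fallback = (choice.text or "").strip()
            continue
        qs = float(choice.get("sourceQuality", "1"))  # DTD default is '1'
        qt = type_quality.get(choice.get("type"), 0.0)
        if qs * qt > best_q:
            best_q = qs * qt
            best_src = choice[0].get("src")
    return best_src if best_src is not None else fallback


print(pick_source(FIDELITY, {"img/gif": 1.0, "img/png": 1.0}))
print(pick_source(FIDELITY, {"img/png": 1.0}))
print(pick_source(FIDELITY, {}))
```

Because the variant list travels inside the page, this decision needs no extra round trip to fetch a device profile.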
Experiment on Fidelity Tags
[Figure: test setup. Labels: i95cl phone, Nextel tower, Internet, Apache Web Server, CCPP, PAVN.]
Total Round-trip Time
[Figure: round-trip time (ms, 0 to 12000) versus object size (bytes); series: PAVN, CCPP.]
Time Measured on the Server Side
[Figure: horizontal bar chart of time (seconds, 0 to 1.6) for CCPP and PAVN; series: total, system, user.]
Evaluation on Fidelity Tags
We saved about 1 second on the server side by using PAVN instead of the CC/PP module.
We saved about 0.8 seconds of total round-trip time with PAVN compared with CC/PP in our implementation.
Summary on PFML
With the simple Priority and Fidelity metadata and the associated algorithms, we give a better solution to the following problems:
  Heterogeneity of Devices
  Wide Variety of Web Contents
  Lack of Automated User Intent
  Low Bandwidth on Wireless Network
Conclusion
Designed a mobile Web caching protocol
  Delivers Web contents from a nearby proxy
  Delivers the user's personal profile
Designed adaptive PFML and associated algorithms, with
  Content adaptation
  Content negotiation
Future Work
To model the user's behavior while mobile:
  How far away is the foreign proxy
  How long is the user's linger time
  What devices are being used
  The speed of movement
  How frequently the user moves
This will help to deploy proxies and to determine the functionalities of proxies.
Publications
“Extended Internet Caching Protocol: A Foundation for Building Web Caching to Nomadic Users,” ACM Symposium on Applied Computing, Melbourne, FL, January 2003.
“Ubiquitous Web Caching,” submitted to Wireless Communication and Mobile Computing Magazine, John Wiley and Sons, 2003.
“Adaptive Content Delivery with XML,” to submit, ITC Specialist Seminar on Performance Evaluation of Wireless and Mobile Systems Antwerp, Belgium, August 2004
Questions
Thanks!
In the core: Flash, HTML, WML, DHTML, ASP, PHP, JPEG, PNG, GIF, MPEG1, Real, Windows Media, Quicktime, MPEG4
At the end: Desktop, PDA, Palmtop, Integrated Chip, Embedded Devices, WAP Phone, Laptop
Mobile IP Tunneling
Scenario on X-ICP Registration
Hops Detection(1/4)
Deploying x-ICP on different networks can bring more overhead.
If the two proxy servers are too far away from each other, x-ICP sibling configuration shouldn’t take place.
Hops, RTT, or physical distance of the two servers should be detected.
Sibling Proxy Configuration (2/4)
On the Foreign Proxy (proxyE):
  cache_host proxy4.Net1 sibling http-port icp-port
On the Home Proxy (proxy4):
  acl ProxyE src ProxyE.Net2
  http_access allow ProxyE
  icp_access allow ProxyE
Register with Node Monitor (3/4)
[Figure: the Node Monitor keeps a ForeignUserList of Foreign User 1 to Foreign User n, each with a Care_of_Address and an object list of URL 1 to URL n.]
User Profile Delivery (4/4)
Bookmarks History links Contact information Cookies …
RTT vs. Distance (1/3) (courtesy of Stanford Linear Accelerator Center)
[Figure: delay (ms, 0 to 18) by time period of a day (1 to 6); series: hop 1 to hop 7.]
RTT vs. Distance (2/3)
This is a traceroute-like simulation conducted in our lab.
Requests are made to randomly selected sites among the top 100 Web sites.
The round-trip time for the first 7 hops (routers) is collected.
The first 6 hops are on campus.
It shows that RTT < 2 ms on the campus backbone.
Page Segment Priority Value Decision Algorithm (1/2)
# Nc: total number of clicks; incremented upon each click
# Ns: total number of segments of a page
# t: a function to calculate a specific threshold with parameters
# Pi: priority value for segment i
# Ti: the time stamp at which Pi was generated
# Tnow: the current time
# Ci: total number of clicks on segment i
# Ci': total number of clicks on segment i sent from client agent
# executed on each access
for each segment i {
    Ci ← Ci + Ci';
    Nc ← Nc + Ci';
}
Page Segment Priority Value Decision Algorithm (2/2)
# executed periodically
for each segment i {
    # priority value increment
    if ( Ci > t(Ci, Pi, Nc, Ns) and Pi < 9 ) {
        Pi ← Pi + 1;
        Ti ← Tnow;
    }
    # priority value decrement
    # expired means the segment hasn't been touched for a period
    else if ( Ti expired and Pi > 0 ) {
        Pi ← Pi - 1;
        Ti ← Tnow;
    }
}
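The two phases of the page segment algorithm (accumulate clicks on each access, then adjust priorities periodically) can be sketched in Python as below. The slides do not define the threshold function t or the expiry period, so the even-share threshold, the `ttl` parameter, and the initial priority used here are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    priority: int        # Pi, kept within 0..9
    clicks: int = 0      # Ci
    stamp: float = 0.0   # Ti


class Page:
    def __init__(self, n_segments, ttl, initial_priority=5):
        self.segments = [Segment(initial_priority) for _ in range(n_segments)]
        self.total_clicks = 0  # Nc
        self.ttl = ttl         # hypothetical expiry period for Ti

    def threshold(self):
        # Hypothetical t(Ci, Pi, Nc, Ns): an even share of all clicks.
        return self.total_clicks / len(self.segments)

    def on_access(self, reported_clicks):
        """Executed on each access; reported_clicks[i] is Ci'."""
        for seg, c in zip(self.segments, reported_clicks):
            seg.clicks += c
            self.total_clicks += c

    def periodic(self, now):
        """Executed periodically; raises or lowers each Pi by one."""
        limit = self.threshold()
        for seg in self.segments:
            if seg.clicks > limit and seg.priority < 9:
                seg.priority += 1
                seg.stamp = now
            elif now - seg.stamp > self.ttl and seg.priority > 0:
                seg.priority -= 1
                seg.stamp = now


page = Page(n_segments=2, ttl=10.0)
page.on_access([3, 0])
page.periodic(now=100.0)
print([s.priority for s in page.segments])
```

Passing the current time in explicitly keeps the sketch deterministic; a server would typically read it from a clock.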
Client Agent Priority Value Decision Algorithm (1/2)
# Nj,c: total number of clicks on a page; incremented upon each click
# Nj,s: total number of segments of a page
# Np: total number of pages
# t: a function to calculate a specific threshold with parameters
# Pj,i: priority value for segment i
# Pj,c: priority value for a page
# Tj,c: the time stamp at which Pj,c was generated
# Vk: the total number of pages having priority k, where 0 <= k <= 9
# Pa: priority value for a client agent
# Ta: the time stamp at which Pa was generated
# Cj,i: total number of clicks on segment i, page j
# upon each click
if (new page)
    Np ← Np + 1;
    Initialize new (Cj,i)s to 0;
    Initialize new Pj,c to 0;
Cj,i ← Cj,i + 1;
Nj,c ← Nj,c + 1;
Client Agent Priority Value Decision Algorithm (2/2)
# at the idle time
for each page j {
    Pj,c' ← Pj,c;
    # change priority of page
    for each segment i
        if ( Cj,i > t(Nj,c, Nj,s) and Pj,c > Pj,i )
            Pj,c ← Pj,i;
            Tj,c ← Tnow;
        else if ( Tj,c expired )
            if ( Pj,c < 9 )
                Pj,c ← Pj,c + 1;
                Tj,c ← Tnow;
    # change priority of agent
    if ( Pj,c <> Pj,c' )
        k ← Pj,c'; Vk ← Vk - 1;
        k ← Pj,c; Vk ← Vk + 1;
        if ( Vk > t(Np) and Pa > k )
            Pa ← k;
            Ta ← Tnow;
        else if ( Ta expires and Pa < 9 )
            Pa ← Pa + 1;
            Ta ← Tnow;
}
RVSA details:
The overall quality Q of a variant is the value
  Q = round5( qs * qt * qc * ql * qf )
  qs: the source quality factor in the variant description
  qt: the media type quality factor
  qc: the charset quality factor
  ql: the language quality factor
  qf: the features quality factor
Example of RVSA
Variant list:
  {"paper.html.en" 0.9 {type text/html} {language en}},
  {"paper.html.fr" 0.7 {type text/html} {language fr}},
  {"paper.ps.en" 1.0 {type application/postscript} {language en}}
Request Accept headers:
  Accept: text/html;q=1.0, */*;q=0.8
  Accept-Language: en;q=1.0, fr;q=0.5
Computations: round5( qs * qt * qc * ql * qf ) = Q
  paper.html.en: 0.9 * 1.0 * 1.0 * 1.0 * 1.0 = 0.90000
  paper.html.fr: 0.7 * 1.0 * 1.0 * 0.5 * 1.0 = 0.35000
  paper.ps.en:   1.0 * 0.8 * 1.0 * 1.0 * 1.0 = 0.80000
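The computation above can be checked with a short script. This is a simplified sketch of the quality calculation only, not a full RVSA implementation: the dictionaries, the `*/*` fallback lookup, and the function names are illustrative, and charset and feature factors are taken as 1.0 as in the example.

```python
def round5(x):
    """Round to five decimal places, as in Q = round5(qs*qt*qc*ql*qf)."""
    return round(x, 5)


def overall_quality(qs, qt=1.0, qc=1.0, ql=1.0, qf=1.0):
    return round5(qs * qt * qc * ql * qf)


# Variant list: name -> (source quality, media type, language).
VARIANTS = {
    "paper.html.en": (0.9, "text/html", "en"),
    "paper.html.fr": (0.7, "text/html", "fr"),
    "paper.ps.en": (1.0, "application/postscript", "en"),
}

# Quality values taken from the request's Accept headers.
ACCEPT = {"text/html": 1.0, "*/*": 0.8}
ACCEPT_LANGUAGE = {"en": 1.0, "fr": 0.5}


def rank(variants, accept, accept_language):
    """Return {name: Q}; types not listed fall back to the */* entry."""
    scores = {}
    for name, (qs, media_type, language) in variants.items():
        qt = accept.get(media_type, accept.get("*/*", 0.0))
        ql = accept_language.get(language, 0.0)
        scores[name] = overall_quality(qs, qt=qt, ql=ql)
    return scores


scores = rank(VARIANTS, ACCEPT, ACCEPT_LANGUAGE)
best = max(scores, key=scores.get)
print(scores, best)
```

With these inputs the English HTML variant has the highest overall quality, matching the computation in the example.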