Post on 22-Dec-2015
Prefetching for Visual Data Exploration
Punit R. Doshi, Elke A. Rundensteiner, Matthew O. Ward
Computer Science Department
Worcester Polytechnic Institute
Support: NSF grants IIS-9732897, EIA-9729878, and IIS-0119276.
Overview
• Why visually explore data?
– Fact: Increasing data set sizes
– Need: Efficient techniques for exploring the data
– Possible solution: Interactive Data Visualization -- humans can detect certain patterns better and faster than data mining tools
• Why cache and prefetch?
– Interactive data visualization tools do not scale well
– Interactive real-time response needed
– Caching and prefetching improve response time.
• Goal: Propose and evaluate prefetching for visualization tools
Example Visual Exploration Tool: XmdvTool
[Figure panels: Data Hierarchy, Flat Display, Hierarchical Display]
Example Visual Exploration Tool: XmdvTool
[Figure: Roll-Up and Drill-Down; panels: Structure-Based Brush1 with Parallel Coordinates (linked with Brush1), Structure-Based Brush2 with Parallel Coordinates (linked with Brush2)]
Characteristics of a Visualization Environment
Characteristics that can be exploited for caching and prefetching:
• Locality of exploration
• Contiguity of user movements
• Idle time due to user viewing display
[Figure: navigation operations: move left/right, move up/down]
Overview of Semantic Caching
• Purpose
– reduce response time and network traffic
• Issues
– visual query cannot directly translate into object IDs
– high-level cache specification to avoid complete scans
• Semantic Caching: queries are cached rather than objects
– minimize cost of cache lookup
– dynamically adapt cached queries to patterns of queries
[Figure: architecture diagram with labels GUI, cache, DB, client machine, server machine]
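The idea of caching queries rather than objects can be sketched as follows. This is a minimal illustration, not XmdvTool's implementation: it assumes a visual query can be reduced to a one-dimensional half-open range, and the names `SemanticCache`, `fetch`, `missing`, and `query` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Range = Tuple[int, int]  # half-open interval [lo, hi); a stand-in for a visual query


class SemanticCache:
    """Remembers which query ranges are cached, not just which objects."""

    def __init__(self, fetch: Callable[[int, int], Dict[int, object]]):
        self.fetch = fetch              # hypothetical database access function
        self.ranges: List[Range] = []   # disjoint, sorted ranges already cached
        self.data: Dict[int, object] = {}

    def missing(self, lo: int, hi: int) -> List[Range]:
        """Return the parts of [lo, hi) not covered by any cached range."""
        gaps, cursor = [], lo
        for c_lo, c_hi in self.ranges:
            if c_hi <= cursor:
                continue                # cached range lies before the query
            if c_lo >= hi:
                break                   # cached range lies after the query
            if c_lo > cursor:
                gaps.append((cursor, c_lo))
            cursor = max(cursor, c_hi)
        if cursor < hi:
            gaps.append((cursor, hi))
        return gaps

    def query(self, lo: int, hi: int) -> Dict[int, object]:
        """Answer a query, fetching only the uncovered remainder."""
        for g_lo, g_hi in self.missing(lo, hi):
            self.data.update(self.fetch(g_lo, g_hi))
            self._add(g_lo, g_hi)
        return {k: v for k, v in self.data.items() if lo <= k < hi}

    def _add(self, lo: int, hi: int) -> None:
        """Insert a range and merge overlapping or touching neighbours."""
        merged: List[Range] = []
        for c_lo, c_hi in sorted(self.ranges + [(lo, hi)]):
            if merged and c_lo <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], c_hi))
            else:
                merged.append((c_lo, c_hi))
        self.ranges = merged
```

Because lookup compares a new query against a short list of cached ranges, it never has to scan the cached objects to decide what is missing; only the uncovered remainder goes to the database.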
Effectiveness of Caching
In XmdvTool, caching reduced response time by 85%
[Figure: bar chart of response time (seconds, axis 0 to 200) for four caching configurations: client OFF / server OFF, client OFF / server ON, client ON / server OFF, client ON / server ON]
Prefetching can further improve response time.
Prefetching
• Locality of exploration
• Contiguity of user movements
• Idle time due to user viewing display
User’s next request can be predicted with high accuracy
[Figure: timeline with labels: new user query, fetching, idle time, prefetching (time to prefetch), cache, DB]
Prefetching Strategies
• Localized Speculative Strategies
– Random Strategy [figure labels: 1/4, 1/4, 1/4, 1/4]
– Direction Strategy [figure labels: (m-1), m, (m+1)]
• Data Set Driven Strategy
– Focus Strategy [figure labels: hot regions, current navigation window]
• Vector Strategies
– Mean Strategy [figure labels: m(n-2), m(n-1), m(n), m(n+1)]
– Exponential Weight Average Strategy [figure labels: m(n-2), m(n-1), m(n), m(n+1)]
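The slides name the strategies but do not give their formulas, so the following is only a sketch of how four of them might predict the next move. It assumes a move is a 2-D step of the navigation window and that the Mean and EWA strategies average past moves; the function names and the parameters `k` and `alpha` are illustrative, not taken from XmdvTool.

```python
import random
from typing import List, Optional, Sequence, Tuple

Move = Tuple[float, float]  # (dx, dy) step of the navigation window (assumed)

DIRECTIONS: List[Move] = [(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0)]


def random_strategy(rng: random.Random) -> Move:
    """Pick one of the four directions, each with probability 1/4."""
    return rng.choice(DIRECTIONS)


def direction_strategy(moves: Sequence[Move]) -> Optional[Move]:
    """Assume the user keeps moving the way they moved last."""
    return moves[-1] if moves else None


def mean_strategy(moves: Sequence[Move], k: int = 3) -> Optional[Move]:
    """Predict the unweighted mean of the last k moves."""
    recent = moves[-k:]
    if not recent:
        return None
    n = len(recent)
    return (sum(dx for dx, _ in recent) / n, sum(dy for _, dy in recent) / n)


def ewa_strategy(moves: Sequence[Move], alpha: float = 0.5) -> Optional[Move]:
    """Predict an exponentially weighted average; recent moves count more."""
    if not moves:
        return None
    ex, ey = moves[0]
    for dx, dy in moves[1:]:
        ex = alpha * dx + (1 - alpha) * ex
        ey = alpha * dy + (1 - alpha) * ey
    return (ex, ey)
```

The Focus strategy is omitted because it depends on hot-region information about the data set that the slides do not describe in enough detail to sketch.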
XmdvTool Implementation
Used:
– C/C++
– TCL/TK
– OpenGL
– Oracle 8i
– Pro*C
[Figure: architecture diagram with an off-line process and an on-line process; component labels: User, GUI, MinMax Labeling, Schema Info, Flat Data, Hierarchical Data, Loader, Rewriter, Translator, Estimator, Exploration Variables, Queries, Buffer, Cache, DB, Prefetcher (library: Random, Direction, Focus, EWA, Mean)]
Evaluation of Prefetching Strategies
• Setup:
– Testbed: XmdvTool freeware system for n-dimensional exploration
– User Traces:
  • Synthetic user traces with varying # of hot regions, % directionality, average delay between user requests
  • Real user traces collected by a user study
• Study effect of different navigation patterns:
– # hot regions
– erratic vs. directional
– delay between user requests
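The slides do not say how the synthetic traces were generated. One plausible reading of "% directionality" is sketched below: at each step the trace repeats the previous move with the given probability and otherwise picks a direction uniformly at random. The names `synthetic_trace`, `keep_direction_pct`, and `direction_hit_rate` are hypothetical.

```python
import random
from typing import List, Tuple

Move = Tuple[int, int]
DIRECTIONS: List[Move] = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def synthetic_trace(n_moves: int, keep_direction_pct: float, seed: int = 0) -> List[Move]:
    """Generate n_moves (>= 1) moves with an assumed 'keep direction' percentage."""
    rng = random.Random(seed)
    trace = [rng.choice(DIRECTIONS)]
    while len(trace) < n_moves:
        if rng.random() * 100 < keep_direction_pct:
            trace.append(trace[-1])                 # keep going the same way
        else:
            trace.append(rng.choice(DIRECTIONS))    # erratic step
    return trace


def direction_hit_rate(trace: List[Move]) -> float:
    """Fraction of moves (after the first) that repeat the previous move."""
    hits = sum(1 for prev, nxt in zip(trace, trace[1:]) if prev == nxt)
    return hits / (len(trace) - 1)
```

Under this model a fully directional trace (100%) is predicted perfectly by repeating the last move, while a fully erratic trace (0%) repeats the previous move only by chance.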
Focus strategy best as # hot regions increases
Prefetching improves response time
[Figure: normalized latency (0 to 1) vs. number of hot regions (1 to 5) for No Prefetch, Random, Direction, Focus, Mean, EWA]
Random Strategy – best for erratic traces.
Direction Strategy – best for directional traces.
[Figure: normalized latency (0 to 1) vs. 'Keep Direction' factor (0 to 100) for No Prefetch, Random, Direction, Focus, Mean, EWA]
Prefetcher performance improves and plateaus as delay between user operations increases.
Prefetcher performance improved up to 28%.
Recall: Caching improved response time by 85% over no caching.
[Figure: percentage improvement (%, axis 0 to 30) vs. delay between user operations (seconds, 0 to 8)]
What Can We Conclude?
• Focus: hot region calculation overhead
• Mean and EWA: offer more than needed
• Direction: simple, no prior knowledge required
NOTE:
• Our experiments on real user traces show that real users are highly directional
If only one strategy can be chosen, select Directional Prefetching.
Related Work
• Integrated visualization-database systems -- Tioga, IDEA, DEVise
[have not used caching and prefetching]
• Prefetching research -- mostly on (1) web prefetching, (2) prefetching for memory caches by OS, (3) I/O prefetching.
[no prefetching research for visualization apps]
Contributions
• Identified key characteristics of visualization tools exploitable for optimizing data access performance
• Developed, implemented and tested prefetching strategies in XmdvTool
• Shown that caching coupled with prefetching at client-side improves data access performance
– Caching reduces response time by 85% over no-caching.
– Prefetching further improves response time by up to 28% over no-prefetching.
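To see how the two figures might combine, here is a small arithmetic sketch. It assumes the 28% is measured relative to the cached, no-prefetch configuration and is a best case (the evaluation slide says "up to 28%"); the slides do not report a combined number, and `remaining_fraction` is a hypothetical helper.

```python
def remaining_fraction(*reductions_pct: float) -> float:
    """Fraction of the baseline response time left after successive reductions."""
    frac = 1.0
    for r in reductions_pct:
        frac *= 1.0 - r / 100.0
    return frac


# Caching leaves 15% of the baseline; prefetching then leaves 72% of that.
combined = remaining_fraction(85, 28)   # 0.15 * 0.72, about 0.108
overall_reduction_pct = (1.0 - combined) * 100.0
```

Under that assumption, caching plus prefetching would leave roughly 11% of the uncached response time in the best case.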
Future Work
No single prefetcher works best for all types of user navigation patterns
Adaptive Prefetching (preliminary results show that this further improves response time and reduces prediction errors, at a minimal overhead cost).
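The slides do not describe how the adaptive prefetcher chooses among strategies. One simple possibility, shown here purely as an assumption, is to score every strategy on a sliding window of recent moves and use whichever has predicted best lately. `AdaptivePrefetcher`, `window`, and the predictor signature are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Sequence, Tuple

Move = Tuple[int, int]
Predictor = Callable[[Sequence[Move]], Optional[Move]]


class AdaptivePrefetcher:
    """Delegates to the predictor with the most hits in a recent window."""

    def __init__(self, predictors: Dict[str, Predictor], window: int = 10):
        self.predictors = predictors
        self.scores: Dict[str, Deque[int]] = {
            name: deque(maxlen=window) for name in predictors
        }
        self.history: List[Move] = []

    def best(self) -> str:
        """Name of the current best predictor; ties go to the first registered."""
        return max(self.predictors, key=lambda name: sum(self.scores[name]))

    def predict(self) -> Optional[Move]:
        """Prediction of the currently best-scoring predictor."""
        return self.predictors[self.best()](self.history)

    def observe(self, move: Move) -> None:
        """Score each predictor against the actual move, then record the move."""
        for name, predictor in self.predictors.items():
            hit = predictor(self.history) == move
            self.scores[name].append(1 if hit else 0)
        self.history.append(move)
```

The window length trades responsiveness for stability: a short window switches strategies quickly when the user's navigation pattern changes, at the cost of reacting to noise.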
Thank You
XmdvTool Homepage:
http://davis.wpi.edu/~xmdv
Code is free for research and education.
Contact author: [email protected]