Research Overview Gagan Agrawal Associate Professor

Upload: lorin-watkins

Post on 18-Jan-2018



TRANSCRIPT

Page 1: Research Overview Gagan Agrawal Associate Professor

Research Overview

Gagan Agrawal Associate Professor

Page 2: Personnel Involved

Ph.D. students: Liang Chen, Wei Du, Ruoming Jin, Feng Li (jointly with Joel Saltz), Xiaogang Li

Master's (thesis) student: Ge Yang

Undergraduate student: Leo Glimcher

Faculty collaborations: Joel Saltz, Tahsin Kurc, Umit Catalyurek, Srini Parthasarathy, Raghu Machiraju

Page 3: An Overall Vision

Our world will be full of distributed and dynamic data sources:
  High-speed networking (Grid computing)
  Sensor networks, mobile systems, embedded devices

Processing this information involves many challenges:
  A lot of data, distributed
  Often, continuous data streams (can’t store all data; real-time processing constraint)
  Complex interplay of communication and computational costs
  Application programmers want more transparency

Page 4: Research Projects

Compilers: compiling XQuery (query language for XML data), compiling for a distributed heterogeneous (grid) environment, parallelizing scientific data-intensive and data mining codes

Middleware and runtime support: FREERIDE (Framework for Rapid Implementation of Datamining Engines), ongoing work on distributed processing of data streams

Data mining and OLAP algorithms: mining for streaming data, parallel and scalable mining algorithms, OLAP algorithms

Page 5: Compiling Data Intensive Applications for a Grid Environment

Page 6: Compiling XQuery

Vision:
  XML has become an accepted standard for distribution of datasets
  XQuery is the well-accepted high-level query language for querying and processing XML datasets

Compiling complex data-intensive reduction operations written in XQuery:
  Reductions written using recursion
  Data-centric execution strategies
  Using XML Schemas to describe the datasets
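The contrast between a reduction written with recursion and a data-centric, single-pass evaluation of the same reduction can be illustrated with a small sketch. This is Python rather than XQuery, and it is not the compiler's actual output; the XML document, the element name `r`, the attribute `v`, and both function names are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical dataset; element and attribute names are illustrative only.
DOC = """<readings>
  <r site="a" v="3"/><r site="b" v="5"/><r site="a" v="4"/>
</readings>"""


def total_recursive(items):
    """Sum the 'v' attributes recursively, in the style a functional
    query language without mutable loop variables encourages."""
    if not items:
        return 0
    return int(items[0].get("v")) + total_recursive(items[1:])


def total_one_pass(items):
    """The same reduction as one pass over the data with an accumulator."""
    acc = 0
    for item in items:
        acc += int(item.get("v"))
    return acc


items = ET.fromstring(DOC).findall("r")
recursive_total = total_recursive(items)
one_pass_total = total_one_pass(items)
```

Both functions compute the same value; the second visits each element once and keeps only the accumulator, which is the shape a data-centric strategy aims for on large datasets.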

Page 7: System Support for Data Mining in a Parallel Environment

[Diagram; labels: Clusters of SMPs, Data Parallel Java, Compiler Techniques, MPI + POSIX Threads + File I/O, FREERIDE (middleware), Runtime Techniques]

Page 8: Distributed Processing of Data Streams

Processing continuous data streams arising from distributed sources

A number of system and algorithmic challenges:
  Real-time requirement on processing rate: tradeoffs between accuracy of analysis and efficiency
  Placement of data: obviously want to process an individual stream close to the source of data
  Feedback-based control of accuracy: cannot allow any computational or communication stage to become the bottleneck
  Performance modeling: impact of output size, level of sampling, etc. on performance

Recently started work in this area ….
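One way to picture feedback-based control of accuracy is a stage that lowers its sampling rate when its input backlog grows and raises it again when there is slack. The sketch below is an assumption-laden illustration, not the scheme used in this work: the multiplicative update rule, the function names `adjust_sampling_rate` and `simulate`, and all numeric parameters are hypothetical.

```python
def adjust_sampling_rate(rate, backlog, target,
                         step=0.1, min_rate=0.01, max_rate=1.0):
    """Return a new sampling rate given the current backlog.

    Hypothetical rule: shrink the rate by `step` when the backlog exceeds
    the target, grow it by `step` when below, and clamp to the allowed range.
    """
    if backlog > target:
        rate *= 1.0 - step
    elif backlog < target:
        rate *= 1.0 + step
    return max(min_rate, min(max_rate, rate))


def simulate(ticks, controlled, arrivals=100, capacity=60, target=50):
    """Simulate one processing stage and return its final backlog.

    Each tick, `arrivals` items reach the stage, a fraction `rate` of them
    is admitted, and at most `capacity` queued items are processed.
    """
    rate, backlog = 1.0, 0
    for _ in range(ticks):
        backlog += int(arrivals * rate)
        backlog -= min(backlog, capacity)
        if controlled:
            rate = adjust_sampling_rate(rate, backlog, target)
    return backlog


uncontrolled_backlog = simulate(50, controlled=False)
controlled_backlog = simulate(50, controlled=True)
```

With arrivals above capacity, the uncontrolled stage accumulates backlog on every tick, while the controlled stage trades accuracy (a lower sampling rate) for a bounded queue.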

Page 9: Algorithms for Mining and OLAP

Decision tree construction for streaming data: new one-pass algorithm with statistical accuracy bound

Parallel and scalable decision tree construction: use sampling, but without losing accuracy

Data cube construction:
  Parallel algorithms with optimal communication volume
  Tiling-based algorithms for scaling output sizes
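For readers unfamiliar with the term, a data cube is the set of aggregates over every subset of a dataset's dimensions. The sketch below is a naive, sequential illustration of that definition only; it is not the parallel or tiling-based algorithms named on this slide. The function name `data_cube`, the sample rows, and the column names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def data_cube(rows, dims, measure):
    """Compute SUM(measure) grouped by every subset of `dims`.

    Returns a dict mapping each tuple of dimension names to a dict from
    group-key tuples to sums. Naive: one full scan of `rows` per subset.
    """
    cube = {}
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):
            agg = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in group)
                agg[key] += row[measure]
            cube[group] = dict(agg)
    return cube


# Hypothetical sample data.
rows = [
    {"item": "pen", "store": "east", "sales": 5},
    {"item": "pen", "store": "west", "sales": 3},
    {"item": "ink", "store": "east", "sales": 2},
]
cube = data_cube(rows, ("item", "store"), "sales")
```

With d dimensions the cube holds 2^d group-bys, so the output can be far larger than the input, which is why communication volume and output size are the concerns the slide lists.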