data science research center - overview and mission

Marcel Worring University of Amsterdam An Introduction


Post on 20-Aug-2015


Marcel Worring, University of Amsterdam

An Introduction

Outline

1. Goals of the DSRC
2. Embedding and organization
3. Realizing the DSRC
4. The roadmap
5. How to get involved

The goal

• Become a leading center on data science by developing the new data science discipline

Leveraging our scientific excellence

Leveraging our tools and infrastructure

Reaching out

Educating talents


Our contributions to data science

[Diagram]
• Data
• Understand and decide; Analyze and model; Store and process
• Reasoning; Knowledge representation; Multimedia Retrieval; Modeling and simulation; Machine Learning; Information Retrieval; Decision Theory; Business Analytics; Visual Analytics; Distributed Processing; Large Scale Databases; Software Eng.; System / Network Eng.
• Security; Privacy; Provenance

Goals in data science research

[Diagram]
• Data
• Understand and decide; Analyze and model; Store and process
• Reasoning; Knowledge representation; Multimedia Retrieval; Modeling and simulation; Machine Learning; Information Retrieval; Decision Theory; Business Analytics; Visual Analytics; Distributed Processing; Large Scale Databases; Software Eng.; System / Network Eng.
• Goals: Speed, Efficiency, Scalability; Insight, Impact; Precision, Recall; Model fit; Conformance

Data Science

• Characteristics
– All are connected
– All driven by data and its use
– A holistic approach is needed
• Our answer
– The Data Science Research Center

Research in data science

[Diagram]
• Data
• Understand and decide; Analyze and model; Store and process
• Reasoning; Knowledge representation; Multimedia Retrieval; Modeling and simulation; Machine Learning; Information Retrieval; Decision Theory; Business Analytics; Visual Analytics; Distributed Processing; Large Scale Databases; Software Eng.; System / Network Eng.
• Data Science Research Center

Embedding and relations

DSRC

Informatics Institute UvA

Department of Computer Science VU

E-science center SURFsara

CWI

ILLC

HvA

Department of mathematics

CLHC Forensic Science

Network Institute

Center for Digital Humanities

Center for content, creation and technology

Amsterdam Business School

Organization

Daily management team

Management board

Leading researchers

Marcel Worring, Paul Groth, Sanne Veenenbos

Max Welling, Henri Bal, Bert Bredeweg, Ger Koole

Sander Klous, Arnold Smeulders, Maarten de Rijke, Peter Boncz, Frank van Harmelen, Jacopo Urbani, Cees de Laat

Realization: four aims

Research: a platform for research in data science connecting people and methodologies.

Infrastructure: a data-driven infrastructure for experimenting with realistic complex data sets.

Valorization: a channel between scientific research and third party applications.

Education: data-science curricula with realistic data experimentation throughout the program.

Research

• Focus on research with a holistic view on data science
– Connecting the different disciplines
– From data to domain impact
• Start
– Seed projects: small projects bringing together two or more ICT disciplines
• Workshops
– Domain workshops: with all stakeholders define research topics leading to data science project proposals

Infrastructure

“In a sense, the physical and technical infrastructure becomes invisible and the data themselves become the infrastructure – a valuable asset, on which science, technology, the economy and society can advance.”

[“Riding the wave” EU High Level Expert Group]

Shared large scale data sources

Common tools and code bases

Transparent access to distributed computing infrastructure

Shared domain driven tasks

Valorization

• Joint full projects
– Within the DSRC
– With industry / governmental organizations
• Small-scale projects
– From data and problem to solution with quick turnaround
• Competitions
– From data and problem to innovative solutions worked on by a number of teams
• Spin-offs and startups

Education

• Infrastructure yields platform for education in
– Informatics
  • Information Science, Artificial Intelligence, Software Engineering, Computer Science, Business Analytics
– Domain specific courses
  • E.g. Minor Data Science for X (your favorite discipline)
– Commercial courses
• The objective of DSRC
– to introduce a full data science program
  • With hands-on experience
    – On real data and real problems or innovations

Finance

[Diagram]
• UvA Faculty
• Research Cluster
• VU
• Matching Funds
• AFS profiling funds
• Projects

Domains

• Digital humanities
• Computational social science
• Digital forensics and security
• City technology
• Physical sciences
• Life sciences
• Business analytics
• ...

Roadmap

[Timeline: Year 1, Year 2, Year 3 and 4]
• Seed projects
• Domain workshops
• Invited talks
• Basic transparent infrastructure
• Infrastructure design
• Data acquisition
• Start of projects
• Embedding infrastructure
• Courses
• Minor(s) data science
• Domain workshops
• Project acquisition
• Start of new PhD projects
• RDA
• PIRE
• Fully functional
• Data science program
• Self-sustained
• Holistic data science

How to get involved

• As a data science researcher
– See what we can offer and what you can offer
– Define one of the seed projects
– Participate/propose workshops
– Acknowledge DSRC in publications
– Bring in (existing or new) projects
• Contact us via
– [email protected]

How to get involved

• As a data-driven application holder
– Participate in the workshops and events
– See how you can share your data
– See whether we can develop joint projects
– Link domain knowledge and data science research
• Contact us via
– [email protected]


Overview of the talks

[Diagram]
• Data
• Understand and decide; Analyze and model; Store and process
• Reasoning; Knowledge representation; Multimedia Retrieval; Modeling and simulation; Machine Learning; Information Retrieval; Decision Theory; Business Analytics; Visual Analytics; Distributed Processing; Large Scale Databases; Software Eng.; System / Network Eng.
• Speakers: Henri Bal; Maarten de Rijke; Cees de Laat; Max Welling; Peter Boncz

See some of the demos

Thanks for your attention

[email protected]

www.dsrc.nl