Highlights from Day 3* in the Big Data House * ±1

Upload: dale-turner

Post on 28-Dec-2015

TRANSCRIPT

Page 1: Highlights from Day 3* in the Big Data House * ±1

Highlights from Day 3* in the Big Data House

* ±1

Page 2

Wednesday’s theme

• It's not just the scale and volume of data that characterises data-intensive research, but also the complexity within and across datasets

• May be in one discipline or across many

Page 3

My motivation: understanding the scholarly data ecosystem

• Data collections are growing in number, volume and complexity

• Overall there is growing heterogeneity

• The scholarly process seems to be making people more and more expert in smaller and smaller areas

• Grand challenges need researchers to cut across the silos:
  – Data
  – Technology
  – Community
  – Funding

Page 4

Before

• I know people want to do data integration – linkage – different info about same thing/place/person/time

• e.g. Google Maps

• e.g. Longitudinal studies

• I wanted to know what it really means, inside and across disciplines

• NAR 1000+ databases

• e.g. Climate change
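As a minimal sketch of what linking "different info about same thing/place/person/time" can look like in practice, the Python below merges records from two sources on a shared key. The function name `link_records`, the field names and the toy values are all hypothetical illustrations, not anything presented at the workshop; real linkage usually also has to cope with keys that do not match exactly.

```python
from collections import defaultdict


def link_records(*datasets, key="place"):
    """Group records from several datasets by the value of a shared key field."""
    linked = defaultdict(dict)
    for records in datasets:
        for record in records:
            # Merge every non-key field into the entry for this key value.
            linked[record[key]].update(
                {field: value for field, value in record.items() if field != key}
            )
    return dict(linked)


# Hypothetical toy data: two sources that describe the same places.
survey = [
    {"place": "Norwich", "responses": 120},
    {"place": "Ipswich", "responses": 95},
]
census = [{"place": "Norwich", "population": 140000}]

linked = link_records(survey, census)
print(linked)
```

Places that appear in only one source are kept with whatever fields that source provides, which is one simple way to see where the datasets fail to overlap.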

Page 5

http://isabel-drost.de/hadoop/slides/fosdem2010.pdf

MapReduce

Where is it applicable?
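For readers unfamiliar with the MapReduce pattern named on this slide, a minimal single-process sketch in Python follows. It is a word-count illustration of the map, shuffle and reduce steps, not Hadoop code and not taken from the linked slides; all function names here are illustrative assumptions.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values collected for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data house", "big data"])
print(counts)
```

The pattern tends to fit when each input can be mapped independently and the per-key results can be combined without reference to other keys; in a real cluster the map and reduce calls would run in parallel on different machines.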

Page 6

http://www.maptube.org/lookeast

July, August, September 2008

6,902 responses

BBC Look East: Anti-Social Behaviour

Mike Batty

Page 7

Ideas on the future of social science research data

• Enduring challenges of documentation for replication, and coordination

• More and more comparative analysis

• Harmonisation and standardisation

• Data linkage and data enhancement

• Models for complex multiprocess systems

• Fluency – increasing uptake by more users

17/MAR/2010 DIR workshop: Handling Social Science Data

Paul Lambert

Page 8

Andrey Rzhetsky

Page 9

Linked Open Data

Page 10

Linked data

• Lightweight

• Doesn’t mandate a technology

• Small investment, potential big return

• Sometimes misunderstood
  – Hugh Glaser didn’t use the O-word or the I-word

• Well positioned for effect in the ecosystem

• I’m worried about handling data that changes over time

• “Publish and be damned” can be a cultural obstacle

Page 11

What we didn’t discuss enough (or I wasn’t in the room)

• Provenance working across silos

• Map-Reduce

• Arts and humanities

• ...

Page 12

SysMO summary

• Providing an environment where every data-driven researcher will thrive

• Reality is messy.
  – Extreme Technology Determinism vs Voluntarist Sociocultural shaping

• Extreme and continuous partnership with users.
  – Act Local Think Global

• Agile development environment facilitated stream of features to tackle pain points.
  – Leverage other e-Laboratories, Maintaining scientists’ buy-in.

• Socio-Political Axis dominates the Technical Axis.
  – Collaboration evolutions, Confidence in exchange.

Carole Goble

Page 13

Socio-technical perspective strong

• Carole’s talk:
  – Reputation, incentives, sharing

• New forms of data for digital social research
  – Loyalty cards
  – Traffic cameras
  – Smart electricity meters
  – Facebook

• Privacy vs. inference

• Sociology of digital entities?

• Social simulation

• Crowd sourcing and citizen-sensing

• Citation

Page 14

Digging into Data

• Structural Analysis of Large Amounts of Music Information (University of Illinois, Urbana-Champaign; University of Southampton; McGill University)

• Digging Into the Enlightenment: Mapping the Republic of Letters (University of Oklahoma; University of Oxford; Stanford University)

• Data Mining with Criminal Intent (George Mason University; University of Alberta; University of Hertfordshire)

• Towards Dynamic Variorum Editions (Mount Allison University; Imperial College, London; Tufts University)

• Digging into Image Data to Answer Authorship Related Questions (Michigan State University; University of Illinois, Urbana-Champaign; University of Sheffield)

• Harvesting Speech Datasets for Linguistic Research on the Web (McGill University; Cornell University)

• Railroads and the Making of Modern America – Tools for Spatio-Temporal Correlation, Analysis, and Visualization (University of Portsmouth; University of Nebraska-Lincoln)

• Mining a Year of Speech (University of Oxford; University of Pennsylvania)

Page 15

Thanks to everyone!