#DataTalk How to Use Data for Good

Upload: experianus

Post on 21-Apr-2017

Category: Data & Analytics


TRANSCRIPT

Page 1: How to Use Data for Good

#DataTalk How to Use Data for Good

Page 2: How to Use Data for Good

Join our #DataTalk on Thursdays at 5 p.m. ET

This week, we talked with DataKind; Real Impact Analytics; Elissa Redmiles, a Data Science for Social Good Summer Fellow at the University of Chicago; Nick Eng, Data Scientist at the Center for Data Science and Public Policy at the University of Chicago; and Kevin Chen, Chief Scientist at the Experian Data Lab.

Check out the resources and tweets from this chat:

ex.pn/dataforgood

Page 3: How to Use Data for Good

What is a “data for good” project?

Page 4: How to Use Data for Good

Real Impact Analytics @RIAnalytics

Data for good is the use of big data to help policymakers and aid workers foster the social public good and maximize impact.

ex.pn/datatalk #DataTalk

Page 5: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

Data for good projects use data and data science to help nonprofits better achieve their mission and assist their target audience.

Page 6: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

Projects that use data to improve society as a whole, rather than any single individual. To be more specific: communities/cities.

Page 7: How to Use Data for Good

DataKind @DataKind

Data and subject matter experts working together to use data to address humanitarian challenges. Collaboration is key!

Page 8: How to Use Data for Good

Kevin Chen, Chief Data Scientist, Experian Data Lab @kevincchen

Projects that improve social equality by leveraging public and private data.

Page 9: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

Using data to help the underprivileged, especially those who might not know how data can be used as a tool.

Page 10: How to Use Data for Good

What are some favorite examples of how data has been used for good?

Page 11: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

One of the projects that drew me to @DataSciFellows was the #NurseFamilyPartnership project, which used data science to predict people in need.

Page 12: How to Use Data for Good

DataKind @DataKind

Using anonymous mobile location data in aggregated ways to identify mobility patterns and design better public transportation.

Page 13: How to Use Data for Good

Litterati @Litterati

We leverage data to get smarter about our litter patterns.

Page 14: How to Use Data for Good

Our favorite D4G example is the use of telecom data with the Global Pulse UN team in Uganda to detect and tackle food crises and prioritize actions against poverty. More specifically, we have developed a mapping of income inequality and income shocks in Africa using changes in pre-paid patterns.

Real Impact Analytics @RIAnalytics

Page 15: How to Use Data for Good

DataKind @DataKind

Another neat one from Data Science Bowl: convolutional neural nets to predict ocean health.

Page 16: How to Use Data for Good

A second powerful example of D4G is the use of telecom mobility data to identify, prevent and treat contagious diseases such as Ebola, malaria and cholera. We have been able to identify micro-communities as well as mobility patterns. This helps identify key routes to block and assess the potential impact on the spread of a disease.

Real Impact Analytics @RIAnalytics

Page 17: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

Beyond predictive models and confidential datasets, products like clearstreets.org simplify our lives using open data.

Page 18: How to Use Data for Good

DataKind @DataKind

@DataKindUK volunteers mapped public data to help @SSChospices find children in need of hospice care.

Page 19: How to Use Data for Good

Melissa Correia @melissacorreia

Child welfare agencies are using sophisticated analyses to improve outcomes for kids in foster care.

Page 20: How to Use Data for Good

What challenges do organizations face when working on data philanthropy projects?

Page 21: How to Use Data for Good

DataKind @DataKind

One challenge is defining a clear question upfront for the project that will help an organization maximize impact.

Page 22: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

Organizational culture is very important. Having good data to analyze and resources directed toward analysis are key.

Page 23: How to Use Data for Good

Finding a balance between retaining proprietary knowledge of data or technology and applying it to data for good projects can be hard.

Kevin Chen, Chief Data Scientist, Experian Data Lab @kevincchen

Page 24: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

Implementation! Fancy models or cool visualizations are only step one. Making these tools part of the day-to-day is step two.

Page 25: How to Use Data for Good

We can see 3 types of challenges: (i) design of the tools/apps; (ii) access to data; (iii) alignment of the ecosystem.

Real Impact Analytics @RIAnalytics

Page 26: How to Use Data for Good

The operational challenge is mostly to repackage research insights to generate real impact on the decisions of aid workers in the field. Many tools are not simple enough for daily field use, are not actionable enough, and have usually not been designed around an actual worker’s needs.

Real Impact Analytics @RIAnalytics

Page 27: How to Use Data for Good

The technical challenge is to be able to connect to relevant data sources, whether external data sources (e.g. WHO, World Bank) or telecom data sources.

Real Impact Analytics @RIAnalytics

Page 28: How to Use Data for Good

The legal and regulatory challenge is to syndicate our approach with local regulators and secure the data handling process, in terms of privacy, anonymization of data or remote access. All data must remain on the telecom operator’s premises within the country. This last challenge can be partly addressed by securing a sustainable ecosystem involving all parties.

Real Impact Analytics @RIAnalytics

Page 29: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

And figuring out what exactly the problem is, and framing it. We don’t always know the domain. We need your help and feedback.

Page 30: How to Use Data for Good

IoT Channel @IoTchannel

Key challenge is to maintain protection of user/client info and data without it being compromised/leaked.

Page 31: How to Use Data for Good

DataKind @DataKind

There is also the challenge (and fun) of prepping and cleaning data before you dive in.

Page 32: How to Use Data for Good

What type of data can be used for data for good projects?

Page 33: How to Use Data for Good

DataKind @DataKind

Time series, text, audio, geo, etc. We need to make sure privacy is preserved and it doesn’t promote discrimination.

Page 34: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

Many different formats are usable: database data, Excel data, and CSV data are all easily processable, but text and web data work, too.

Page 35: How to Use Data for Good

Real Impact Analytics @RIAnalytics

Telecom data are unique in emerging markets, as they are collected systematically, locally and in real time. These data can be complemented by 2 data sources: (i) external or public databases, such as occurrences of a specific disease in a specific location; (ii) additional / ad-hoc data which are collected through a mobile application. The most important limitation is the possibility of re-identifying individual people from the shared insights or tools. This would dramatically undermine the scaling up of Data for Good.

Page 36: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

Real good can be done with access to internal data without releasing this data publicly.

Page 37: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

Open data and APIs are a great start. Check out the new CitySDK from the Census.

Page 38: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

We focus more on internal vs. external data and on data completeness than on formats.

Page 39: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

And when structured data isn’t available, you can get creative to make your own data (e.g. scraping websites).

Page 40: How to Use Data for Good

DataKind @DataKind

Totally agree with Nick Eng on getting creative with scraping websites, and on not forgetting about data sources like satellite imagery.

Page 42: How to Use Data for Good

What are some best practices for using data for good?

Page 43: How to Use Data for Good

Kevin Chen, Chief Data Scientist, Experian Data Lab @kevincchen

Garbage in, garbage out. Validate and carefully examine the data.
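
As a rough illustration of what "validate and carefully examine" can look like in practice, the sketch below flags missing required fields and out-of-range numeric values before any analysis runs. The function name `validate_rows`, the field names, and the age range are hypothetical choices made for this example; they are not from the chat.

```python
def validate_rows(rows, required, ranges):
    """Return a list of (row_index, field, problem) tuples.

    `required` lists fields that must be present and non-empty.
    `ranges` maps a field name to an inclusive (low, high) pair.
    Both are assumptions of this sketch, not a standard schema format.
    """
    problems = []
    for i, row in enumerate(rows):
        # Flag required fields that are absent, None, or empty strings.
        for field in required:
            if row.get(field) in (None, ""):
                problems.append((i, field, "missing"))
        # Flag numeric values that fall outside the allowed range.
        for field, (low, high) in ranges.items():
            value = row.get(field)
            if isinstance(value, (int, float)) and not (low <= value <= high):
                problems.append((i, field, "out of range"))
    return problems


# Hypothetical records: one clean, one with a missing age, one implausible.
rows = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 212},
]
problems = validate_rows(rows, required=["id", "age"], ranges={"age": (0, 120)})
for problem in problems:
    print(problem)
```

Reporting problems rather than silently dropping rows keeps the decision about how to handle bad records with the people who know the domain.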

Page 44: How to Use Data for Good

Real Impact Analytics @RIAnalytics

The best D4G solutions provide action-oriented insights to end-users, which are supported by science and easily accessible by mobile.

Page 45: How to Use Data for Good

Real Impact Analytics @RIAnalytics

We need to understand the actual needs of the potential users, assess the correlation between available data and possible actions and outcomes, and adapt apps and algorithms accordingly.

Page 46: How to Use Data for Good

Real Impact Analytics @RIAnalytics

We need to be able to refresh and operationalize the tools offering mobile access to insights; we need to technically secure access to the data and ensure privacy; and we need to be able to measure impact and correct algorithms accordingly. Overall, trust is one of the key overarching success factors, as it allows a smooth decision flow and maximizes impact. Therefore, we need to build strong partnerships with international institutions to ensure global impact and scalability of our actions.

Page 47: How to Use Data for Good

EWD Rozier @PrarieScience

The biggest step for using data for good is finding a committed, involved partner who will help transition to practice.

Page 48: How to Use Data for Good

DataKind @DataKind

Love this guide from @engrnroom. Great read on how to practice responsible development data.

Page 49: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

When doing a project, make sure it’s a constant partnership with your other stakeholders (e.g. nonprofits).

Page 50: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

Talk to SMEs and find the domain knowledge you don’t have. Data is only half the puzzle.

Page 51: How to Use Data for Good

What are ways to use data for good, while protecting privacy?

Page 52: How to Use Data for Good

EWD Rozier @PrarieScience

Right now, it’s very ad-hoc; to move forward we need new data privacy solutions, which allow proofs of privacy preservation.

Page 53: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

@DataSciFellows keeps data secure while letting the code for processing the data be open source.

Page 54: How to Use Data for Good

Kevin Chen, Chief Data Scientist, Experian Data Lab @kevincchen

Use the data in aggregates (e.g. finding activity patterns of city dwellers using aggregated mobile phone activity).
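
A minimal sketch of the aggregation idea, assuming event records with hypothetical `area` and `hour` fields: individual events are reduced to counts per group, and the user identifier never appears in the output. Dropping groups below a minimum size (`min_count`) is an extra safeguard added for this example, not something stated in the chat.

```python
from collections import Counter


def aggregate_counts(events, min_count=5):
    """Count events per (area, hour) and suppress small groups.

    The threshold `min_count` is an illustrative assumption; a suitable
    value depends on the dataset and the applicable privacy rules.
    """
    counts = Counter((e["area"], e["hour"]) for e in events)
    # Keep only groups large enough that no single person stands out.
    return {key: n for key, n in counts.items() if n >= min_count}


# Hypothetical data: six users active downtown at 8:00, one in a suburb at 3:00.
events = (
    [{"user": f"u{i}", "area": "downtown", "hour": 8} for i in range(6)]
    + [{"user": "u99", "area": "suburb", "hour": 3}]
)
summary = aggregate_counts(events)
print(summary)
```

The single suburban event is suppressed because publishing a group of one would effectively publish that person's activity.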

Page 55: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

We keep data science code open source so that other nonprofit organizations can use these resources to process their own data.

Page 56: How to Use Data for Good

EWD Rozier @PrarieScience

We’ve been working on solutions for homomorphisms for database operations to create a privacy-aware kernel for data science.

Page 57: How to Use Data for Good

DataKind @DataKind

Shouting out @CrisisTextLine: they provide personalized care to those in crisis via text messages while protecting privacy.

Page 58: How to Use Data for Good

EWD Rozier @PrarieScience

The hard part about privacy-preserving operations is the current limits on performable homomorphisms.

Page 59: How to Use Data for Good

Kevin Chen, Chief Data Scientist, Experian Data Lab @kevincchen

A few ways: add noise to the data, bucket the values (e.g. age), or use a coarser level of information (e.g. zip3 vs. zip5) when possible.
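
The three techniques mentioned here can be sketched in a few lines. The helper names, the ten-year bucket width, and the Gaussian noise scale below are illustrative assumptions, and this kind of ad-hoc masking does not by itself provide a formal privacy guarantee.

```python
import random


def bucket_age(age, width=10):
    """Map an exact age to a range label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def coarsen_zip(zip5):
    """Keep only the first three digits of a five-digit ZIP code (zip3)."""
    return zip5[:3]


def add_noise(value, scale, rng):
    """Perturb a numeric value with zero-mean Gaussian noise."""
    return value + rng.gauss(0.0, scale)


# Hypothetical record; a seeded generator keeps the example repeatable.
rng = random.Random(0)
record = {"age": 37, "zip": "60637", "income": 52000.0}
masked = {
    "age_band": bucket_age(record["age"]),
    "zip3": coarsen_zip(record["zip"]),
    "income_noisy": add_noise(record["income"], 1000.0, rng),
}
print(masked)
```

Bucketing and coarsening reduce how precisely a record pins down a person, while noise makes individual values unreliable but leaves averages over many records roughly intact.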

Page 60: How to Use Data for Good

What type of data philanthropy would you like to see happen?

Page 61: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

We need more public info showcasing the impact of using data science for good.

Page 62: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

And more scalable ways to help nonprofits determine how data can help them.

Page 63: How to Use Data for Good

I would really like to see projects that use data to help understand, prevent, intervene in, and treat cancers.

Kevin Chen, Chief Data Scientist, Experian Data Lab @kevincchen

Page 64: How to Use Data for Good

EWD Rozier @PrarieScience

Biggest plausible projects I want to see are cities becoming more data savvy like Chicago. Public access democratizes science.

Page 65: How to Use Data for Good

Real Impact Analytics @RIAnalytics

We would like to co-design a sustainable, scalable and open ecosystem of mobile anti-poverty apps together with other developers, NGOs, international agencies and philanthropists. There will be different types of apps required, such as apps supporting NGOs in disaster relief or a sudden outbreak of a contagious disease, or apps supporting ministries or public authorities in their decision-making, optimizing the targeting and impact of public policies. Most emerging countries lack data about their populations.

Page 66: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

I also think it’s important for more corporations like Experian and IBM to raise awareness of #Data4Good projects.

Page 67: How to Use Data for Good

What are ways to support organizations and data scientists working in data philanthropy?

Page 68: How to Use Data for Good

Real Impact Analytics @RIAnalytics

Philanthropists can best support D4G by joining the dialogue with app developers and end-users on the public questions to address. There is a clear need to fund specific apps and an operational platform to ensure that Data for Good becomes not only a science but also fosters operational impact. Securing such a platform with a first set of apps will generate spillovers and positive dynamics among the communities of developers, NGOs and public institutions.

Page 69: How to Use Data for Good

EWD Rozier @PrarieScience

Focus on open source tools. I will be controversial: we need to move away from prototyping in Python to a performable ecosystem.

Page 70: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

Agree with @PrarieScience. @NSF funding for outcomes-based #DataScience projects is key, especially for training data scientists.

Page 71: How to Use Data for Good

Corporations can provide funding and recognition to their data scientists to encourage participation in data philanthropy projects.

Kevin Chen, Chief Data Scientist, Experian Data Lab @kevincchen

Page 72: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

Maybe start by finding your local #Data4Good community. Strength in numbers!

Page 73: How to Use Data for Good

DataKind @DataKind

Funders can play a big role supporting nonprofits to expand the use of data beyond reporting.

Page 74: How to Use Data for Good

Any final tips for data scientists who want to use data for good?

Page 75: How to Use Data for Good

Real Impact Analytics @RIAnalytics

Our main tip is to collaborate, as an operational open ecosystem is critical to realize our shared vision of a healthy and poverty-free world. Data for Good is at the crossroads of multiple skill sets, such as data science, software development, algorithm design, epidemiology, traffic modelling, field work involving poor communities in emerging markets, and telecom regulation. There is no chance one organization could offer all of these internally. Data for Good needs to offer both operational tools and scientific insights.

Page 76: How to Use Data for Good

Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago @eredmil1

Don’t be discouraged by imperfect data!

Page 77: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

Data will always be messy. Especially from nonprofits.

Page 78: How to Use Data for Good

Kevin Chen, Chief Data Scientist, Experian Data Lab @kevincchen

Let the data speak. Interpret the results objectively.

Page 79: How to Use Data for Good

Nick Eng, Data Scientist, Center for Data Science at the University of Chicago @nick_eng

Start simple. Simple projects can sometimes make the biggest impact.

Page 80: How to Use Data for Good

Join our #DataTalk on Twitter on Thursdays at 5 p.m. ET.

experian.com/datatalk