Stone Soup Data Collection w/ CycleTracks

SAN FRANCISCO COUNTY TRANSPORTATION AUTHORITY. Building the Technology Pot for the Stone Soup Method of Data Collection: Facilitating Cooperation in the Face of Scarcity. Elizabeth A. Sall. Transportation Research Board Annual Meeting in Washington, D.C., Tuesday, January 15th, 2013.

Category: Technology


DESCRIPTION

This presentation was given at TRB 2013 to illustrate the use of technology to aggregate data from an opt-in smartphone app, CycleTracks.

TRANSCRIPT

Page 1: Stone Soup Data Collection w/ CycleTracks

SAN FRANCISCO COUNTY TRANSPORTATION AUTHORITY

Building the Technology Pot for the Stone Soup Method of Data Collection: Facilitating Cooperation in the Face of Scarcity

Elizabeth A. Sall

Transportation Research Board Annual Meeting in Washington, D.C.

Tuesday January 15th, 2013

Page 2: Stone Soup Data Collection w/ CycleTracks

I AM A TRAVEL MODELER.

But travel models need… DATA.

Caveat: I am not a data collection or surveying expert.

Page 3: Stone Soup Data Collection w/ CycleTracks

So what am I going to talk about?

Story: We needed data for something we had never seen collected before. And we didn’t have much money or time.

…so we built this app called CycleTracks

Page 4: Stone Soup Data Collection w/ CycleTracks

Route Choice Data Collection Choices Considered

Method                       Cost per Record  Cost per Respondent  Respondent LOE  Data Precision  Data Quality  RP or SP
Web-based stated preference  $                $                    High (column unclear in source)               SP
CATI route recall            $$$              $$$                  High            Low             Low           RP
Personal GPS                 $                $$                   Med             High            Med           RP
Bicycle GPS                  $                $$                   Med             High            High          RP
Smart phone                  $                $                    Low             Med             Med           RP

Page 5: Stone Soup Data Collection w/ CycleTracks

CycleTracks: from coder to cyclist

iTunes Store / Android Market

Publicity! Advertising! Stickers!

Page 6: Stone Soup Data Collection w/ CycleTracks

CycleTracks Data: from cyclist to analyst

Amazon EC2 server running Apache

Diagram labels: JSON, PHP, MySQL, PHP
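
The upload path named on this slide (phone app, JSON, PHP, MySQL) can be sketched roughly as below. This is only an illustration: it is written in Python rather than the PHP the slide names, the standard-library sqlite3 module stands in for MySQL, and the payload field names (user_id, purpose, points, t, lat, lon) and the table layout are hypothetical, not taken from the CycleTracks code.

```python
import json
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trip (id INTEGER PRIMARY KEY, user_id TEXT, purpose TEXT)"
    )
    conn.execute(
        "CREATE TABLE coord (trip_id INTEGER, recorded TEXT, lat REAL, lon REAL)"
    )
    return conn


def store_trip(conn, payload_json):
    """Parse one uploaded trip (JSON text) and store it; return the new trip id."""
    trip = json.loads(payload_json)
    cur = conn.execute(
        "INSERT INTO trip (user_id, purpose) VALUES (?, ?)",
        (trip["user_id"], trip["purpose"]),
    )
    trip_id = cur.lastrowid
    # One row per GPS fix, linked back to the trip.
    conn.executemany(
        "INSERT INTO coord (trip_id, recorded, lat, lon) VALUES (?, ?, ?, ?)",
        [(trip_id, p["t"], p["lat"], p["lon"]) for p in trip["points"]],
    )
    conn.commit()
    return trip_id


# Hypothetical payload, shaped the way a phone app might post it.
example = json.dumps({
    "user_id": "abc123",
    "purpose": "Commute",
    "points": [
        {"t": "2012-06-01T08:00:00", "lat": 37.7750, "lon": -122.4190},
        {"t": "2012-06-01T08:00:05", "lat": 37.7752, "lon": -122.4187},
    ],
})

conn = make_db()
trip_id = store_trip(conn, example)
n_points = conn.execute(
    "SELECT COUNT(*) FROM coord WHERE trip_id = ?", (trip_id,)
).fetchone()[0]
print(trip_id, n_points)
```

Keeping trips and GPS fixes in separate tables lets the analyst side query trip attributes without loading every coordinate.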

Page 7: Stone Soup Data Collection w/ CycleTracks

Bay Area Participants (if they noted their home ZIP)

                            CycleTracks (N=366)   BATS (N=153)
Age (mean)                  34                    33
Gender (female)             20%                   36%
Cycling frequency                                 N/A
  Daily                     48%
  Several times/week        36%
  Several times/month       13%
  Less than once a month    3%

Page 8: Stone Soup Data Collection w/ CycleTracks

Data Quality: some good, some bad

Page 9: Stone Soup Data Collection w/ CycleTracks

Urban Canyon Effect

Downtown vs. Haight Ashbury

Page 10: Stone Soup Data Collection w/ CycleTracks

GPS Signal at Beginning of Trip

Page 11: Stone Soup Data Collection w/ CycleTracks

Not on a Bike
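
One simple way to flag traces like this is a speed plausibility check. The sketch below is an assumption for illustration, not the mode detection the presentation used: the function names and the 1.5 and 12 m/s thresholds are hypothetical, and a real classifier would look at more than median speed.

```python
import math
from statistics import median


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def median_speed_mps(points):
    """points: list of (seconds, lat, lon) tuples sorted by time."""
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt > 0:
            speeds.append(haversine_m(la0, lo0, la1, lo1) / dt)
    return median(speeds) if speeds else 0.0


def plausibly_bike(points, min_mps=1.5, max_mps=12.0):
    """True if the median speed falls inside a (hypothetical) cycling range."""
    return min_mps <= median_speed_mps(points) <= max_mps


# Synthetic traces: roughly 111 m between fixes, at 20 s and 2 s intervals.
bike_like = [(20 * i, 37.77 + 0.001 * i, -122.42) for i in range(6)]
car_like = [(2 * i, 37.77 + 0.001 * i, -122.42) for i in range(6)]
print(plausibly_bike(bike_like), plausibly_bike(car_like))
```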

Page 12: Stone Soup Data Collection w/ CycleTracks

Post Processing Warranted

5,178 traces, 497 users

Gaussian smoothing

Activity & mode detection

Map matching

(Schüssler & Axhausen 2009)

3,034 bike stages, 366 users

~60% of submitted data useful
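
The first post-processing step named on this slide, Gaussian smoothing, can be illustrated with a small kernel smoother over time. This is a sketch under assumptions: the function name, the sigma bandwidth, and the sample values are hypothetical, and it is not necessarily the exact formulation in the cited paper or in the SFCTA scripts.

```python
import math


def gaussian_smooth(times, values, sigma=10.0):
    """Smooth `values` sampled at `times` (seconds) with a Gaussian kernel.

    Each output is a weighted average of all inputs, with weights that fall
    off with time distance; `sigma` is the kernel bandwidth in seconds.
    """
    smoothed = []
    for t_i in times:
        weights = [
            math.exp(-((t_j - t_i) ** 2) / (2 * sigma ** 2)) for t_j in times
        ]
        total = sum(weights)
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return smoothed


# Latitude series with one noisy jump in the middle (made-up values).
times = [0, 5, 10, 15, 20]
lats = [37.7700, 37.7701, 37.7710, 37.7703, 37.7704]
smoothed_lats = gaussian_smooth(times, lats, sigma=5.0)
print([round(v, 5) for v in smoothed_lats])
```

The same function would be applied separately to latitude and longitude; a wider sigma removes more jitter but also rounds off real turns.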

Page 13: Stone Soup Data Collection w/ CycleTracks

Unintended Benefit: Scalability

It works anywhere you can get a satellite signal

Database and cloud server highly scalable

Web interface for data minimizes human resources

Data cleaning open-source

Cost for data: keeping the server on, and promotion

Page 14: Stone Soup Data Collection w/ CycleTracks

Where do people use CycleTracks?

*Based on optional home ZIP field, NOT TRIP LOCATION

Agencies using CycleTracks:
1. San Francisco
2. Monterey Bay, CA
3. Austin, TX
4. Seattle, WA
5. Fort Collins, CO
6. Twin Cities, MN
7. Raleigh, NC
8. Salt Lake City, UT

Page 15: Stone Soup Data Collection w/ CycleTracks

Where? Where it’s advertised the most.

Place            Users*   Trips*
San Francisco    665**    11,458
Austin           276      2,950
Fort Collins     126      1,560
Seattle          108      1,175
Minneapolis      67       1,326
Oakland          26       127
Saint Paul       23       449
San Jose         22       70
Santa Cruz       17       254
Berkeley         14       127

*Based on optional home ZIP field, NOT TRIP LOCATION
**Compared to 153 cyclists in the 2000 HH Travel Survey

Page 16: Stone Soup Data Collection w/ CycleTracks

When did new users submit first trip?

• New user registrations correlate directly with publicity

[Chart: CycleTracks New Users' First Trip Submissions, by month from 11-2009 to 12-2012; y-axis 0 to 200; series: San Francisco, Monterey, Austin, Seattle, Twin Cities, Fort Collins]

Page 17: Stone Soup Data Collection w/ CycleTracks

Many just try it out, but half use it for a while

Users by Duration of Use

Duration of use   Share of users
One day           41%
Day - week        11%
Week - month      16%
1 - 3 months      15%
3 - 6 months      8%
6 - 9 months      3%
9 - 12 months     3%
Over a year       3%

Page 18: Stone Soup Data Collection w/ CycleTracks

Broad Spectrum of Users

• Half of users submitted > 5 trips
• 10% of users submitted > 20 trips
• 40 users submitted > 100 trips (max = 685)

[Chart: Users by Trips Submitted; x-axis bins: 1, 2-5, 6-10, 11-15, 16-20, 21-700 trips; y-axis 0 to 800 users; bar labels as extracted: 31%, 20%, 8%, 5%, 3%, 10%]

Page 19: Stone Soup Data Collection w/ CycleTracks

Capturing Infrequent Cyclists

• 20% (500+) of users are infrequent cyclists (10% of trips)

Trips and Users by Cycling Frequency

Cycling frequency          Users   Trips
Less than once a month     187     577
Several times per month    411     2,036
Several times per week     852     9,122
Daily                      853     13,506

Page 20: Stone Soup Data Collection w/ CycleTracks

All Open Source

• GPL3 License
• Code on GitHub
• Fork us!

www.github.com/sfcta

Page 21: Stone Soup Data Collection w/ CycleTracks

Bonus Benefit: Transferability

+750 users, +8,500 trips

Rolling their own from our code:
• AggieTracks: ~35 users
• Cville Bike mApper: ~120 users / 1,500 trips
• Cycle Atlanta: ~400 users / 4,500 trips
• Cycle Lane: ~200 users / 2,500 trips
• NuStats PaceLogger

Page 22: Stone Soup Data Collection w/ CycleTracks

Combined Reach: ~44,000 trips

http://goo.gl/maps/DuqGh

Page 23: Stone Soup Data Collection w/ CycleTracks

Issues - Bias

• Tradeoff between bias and quantity
  • But bias can be dealt with if quantity is high enough.
• Which biases are acceptable, and when?
  • E.g., does income affect how averse you are to biking up hills (vs. biking around them)?
• What biases can we undo with technology?
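
One common way to "deal with" a known bias when the sample is large enough is post-stratification weighting. The sketch below is an assumption for illustration, not a method the presentation says was applied: it reuses the gender shares from the Bay Area participant comparison (20% female in CycleTracks, 36% in BATS) and treats BATS as the target distribution purely as an example. The function name is hypothetical.

```python
def poststratification_weights(sample_shares, target_shares):
    """Weight per group = target share / sample share."""
    return {g: target_shares[g] / sample_shares[g] for g in sample_shares}


# Shares from the participant comparison slide; using BATS as the target
# population is an assumption made only for this example.
sample = {"female": 0.20, "male": 0.80}
target = {"female": 0.36, "male": 0.64}

weights = poststratification_weights(sample, target)

# After weighting, the sample reproduces the target shares.
weighted = {g: sample[g] * weights[g] for g in sample}
print({g: round(w, 2) for g, w in weights.items()})
print({g: round(s, 2) for g, s in weighted.items()})
```

Weights like these only correct for the attributes that were recorded; they cannot fix a bias in something the app never asked about.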

Page 24: Stone Soup Data Collection w/ CycleTracks

Issues - Bias

Does…

People who answer surveys x People who have smartphones = People who answer surveys over their smartphones?

…if so, this looks pretty good.

[Chart: NHTS Sample Rate, Smartphone Ownership, and Effective Probable Response Rate by Race/Ethnicity (White, Black, Hispanic); y-axis 0% to 140%]

Sources: 2009 NHTS (NHTS Sample/1,000 population); Pew 2011 Smartphone Survey; Census
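
The relationship posed on this slide (survey response rate times smartphone ownership) can be written out as a short calculation. The sketch assumes the two rates are independent and that both are expressed as indexes relative to a population average of 1.0; the group names and numbers are hypothetical placeholders, NOT the NHTS or Pew figures behind the chart.

```python
def effective_response_rate(sample_rate, smartphone_ownership):
    """Multiply the two rates group by group (assumes they are independent)."""
    return {g: sample_rate[g] * smartphone_ownership[g] for g in sample_rate}


# Hypothetical indexed rates (1.0 = population average), for illustration only.
sample_rate = {"group_a": 1.10, "group_b": 0.80}
ownership = {"group_a": 0.90, "group_b": 1.20}

effective = effective_response_rate(sample_rate, ownership)
print({g: round(v, 2) for g, v in effective.items()})
```

In this made-up case a group that is under-sampled by traditional surveys but over-indexes on smartphone ownership ends up close to parity, which is the effect the slide is asking about.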

Page 25: Stone Soup Data Collection w/ CycleTracks

Issues - Bias

…and this looks even better.

[Chart: NHTS Sample Rate, Smartphone Ownership, and Effective Probable Response Rate by Age Group (18-24, 24-34, 35-44, 45-54, 55-64, 65+); y-axis 0% to 250%]

Sources: 2009 NHTS (NHTS Sample/1,000 population); Pew 2011 Smartphone Survey; Census

Page 26: Stone Soup Data Collection w/ CycleTracks

Issues - Bias

…but looks like income exacerbates the divide.

[Chart: NHTS Sample Rate, Smartphone Ownership, and Effective Probable Response Rate by Total Household Income (<$10k, $10k-<$20k, $20k-<$30k, $30k-<$40k, $40k-<$50k, $50k-<$75k, $75k-<$100k, $100k+); y-axis 0% to 300%]

Sources: 2009 NHTS (NHTS Sample/1,000 population); Pew 2011 Smartphone Survey; Census

Page 27: Stone Soup Data Collection w/ CycleTracks

Issues - Recruitment

• Recruitment can be difficult
  • Small publicity campaigns --> small datasets
  • Areas most successful in recruiting users had large publicity campaigns
• App needs to have value itself:
  • Monetary value
  • Feel like ‘they are helping’ something they care about
  • Fun (at least not painful) to use

Page 28: Stone Soup Data Collection w/ CycleTracks

Cliffs Notes

All we did was build a little phone app:
• Very tiny investment (<$20,000 total) for CycleTracks
• Yielded 35,000+ records
• Open source policy has afforded 8,500 more and counting
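
The figures on this slide imply a rough upper bound on cost per record, worked out below. The only inputs are the slide's own numbers; because the investment is stated as "less than" and the record count as "more than", the true cost per record is lower than what this prints.

```python
investment_usd = 20_000   # slide says "<$20,000 total", so this is an upper bound
records = 35_000          # slide says "35,000+", so this is a lower bound
records_from_forks = 8_500

cost_per_record = investment_usd / records
cost_including_forks = investment_usd / (records + records_from_forks)

print(f"{cost_per_record:.2f}")       # 0.57
print(f"{cost_including_forks:.2f}")  # 0.46
```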

Page 29: Stone Soup Data Collection w/ CycleTracks

Lessons Learned

• Think about ways to get data to come to you.
• Reach far with small levels of investment.
• Be open. Open-source works!
• Set aside real money to:
  • Maintain and grow the app and associated scripts
  • Advertise what we have done with it / develop a community
• Develop the app under your Apple Developer ID
  • Changing is painful
• Use an API interface rather than have the app hard-coded to a database
  • More flexible in case others want to contribute data
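
The last lesson, putting an API between the app and the database, might look something like the sketch below. It is a hypothetical illustration, not the CycleTracks server code: the handler name, the required fields, and the in-memory store are all assumptions. The point is that the app only sends JSON to a handler, and the storage backend can change without touching the app.

```python
import json

REQUIRED_FIELDS = ("user_id", "points")  # hypothetical minimum payload


class InMemoryStore:
    """Stand-in backend; a database-backed class could expose the same method."""

    def __init__(self):
        self.trips = []

    def save_trip(self, trip):
        self.trips.append(trip)
        return len(self.trips)  # simple sequential trip id


def handle_upload(body, store):
    """Validate a JSON upload and hand it to any store with save_trip().

    Returns (http_status, response_dict).
    """
    try:
        trip = json.loads(body)
    except json.JSONDecodeError:
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(trip, dict):
        return 400, {"error": "body must be a JSON object"}
    missing = [f for f in REQUIRED_FIELDS if f not in trip]
    if missing:
        return 400, {"error": "missing fields: " + ", ".join(missing)}
    return 201, {"trip_id": store.save_trip(trip)}


store = InMemoryStore()
ok = handle_upload('{"user_id": "abc123", "points": []}', store)
bad = handle_upload('{"user_id": "abc123"}', store)
print(ok, bad)
```

Because other apps only need to match the JSON contract, a fork or a third-party tool could contribute data to the same endpoint without knowing the table layout.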

Page 30: Stone Soup Data Collection w/ CycleTracks

Thanks!

Credits: Lisa Zorn, Billy Charlton, Matt Paul

elizabeth [at] sfcta [dot] org

www.sfcta.org/modeling

www.sfcta.org/cycletracks

http://github.com/sfcta

Page 31: Stone Soup Data Collection w/ CycleTracks

Another story: We had a lot of unorganized data collected by a zillion projects, agencies, consultants…and wanted to make sense of it.

…so we built this app called CountDracula

Page 32: Stone Soup Data Collection w/ CycleTracks

CountDracula

https://github.com/sfcta/CountDracula