Page 1: Chicago Big Data Executive Summit June 12, 2013

Chicago Big Data Executive Summit, June 12, 2013

Generating Big Value from Big Data

Page 2

Using Data to Derive Value

• Lessons Learned:
– Data size is relative to an organization’s ability to make use of it
– Assumptions and bias can get in the way
– The best insights are actionable

Page 3

Speaker Introduction

R. Brendan Aldrich
Executive Director, Data Warehousing
City Colleges of Chicago

• 18 years in Information Technology

• 13 years running data warehouse, business intelligence and analytics teams for global high-volume data companies such as The Walt Disney Company, Travelers Insurance and Demand Media

• Currently building a data democracy at the City Colleges of Chicago

• TDWI and AERA membership

Page 4

• Colleges:
– Richard J. Daley College
– Kennedy-King College
– Malcolm X College
– Olive-Harvey College
– Harry S Truman College
– Harold Washington College
– Wilbur Wright College

• Satellites:
– Lakeview Learning Center
– Dawson Technical Institute
– West Side Learning Center
– South Chicago Learning Center
– Arturo Velasquez Institute
– Humboldt Park Vocational Education Center

• Culinary
– The French Pastry School
– Washburne Culinary Institute
  • Parrot Cage Restaurant
  • Sikia Banquet Room

• Broadcast
– WYCC TV (Channel 20)
– WKKC FM 89.9

…as well as five child development centers, the Center for Distance Learning and the Workforce Institute

The City Colleges of Chicago is the largest community college district in the state of Illinois and one of the largest in the country, with more than 5,800 administrators, staff and faculty educating over 120,000 students annually at facilities located within the city of Chicago.

Page 5

The Origin of Big Data

John Mashey, chief scientist at Silicon Graphics until 2000, gave hundreds of talks to small groups in the mid-to-late 1990s using the term “Big Data” to describe how the boundaries of computing keep advancing. [1]

Page 6

Gartner Group

2001: Doug Laney first uses “Volume, Velocity & Variety” to describe Big Data [2]

2012: Gartner updates the definition to:

“Big data are high volume, high velocity and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process automation”

Page 7

Datafication is Driving Big Data

Datafication: Creating new data that didn’t previously exist in digital form

The more you know about your customer, the better you can differentiate yourself from your competitors.

Page 8

Disney’s MagicBands [3]

• Customer Value:
– Disney’s MagicBands will allow park guests to access the park, sign up for ride waitlists (FastPass), interact with characters, purchase items, find lost parents, etc.

• Company Value:
– What type of guest are you and how do you route through the park (rides, concessions, shows, purchases, etc.)
– Route optimization, scheduling, ride balancing
– Know your customer

• Worldwide
– 121.4 million guests (2011)

• Florida
– 17.1 million guests (2011)

Page 9

Getting to Big Value
(or… Don’t Miss the Trees for the Forest)

1. Gathering vs. Understanding

2. Assumptions

3. Bias

Page 10

Barrier #1: Gathering vs. Understanding

“Big Data is not defined by its data management challenges, but by the organization’s capabilities in analyzing the data, deriving intelligence from it, and leveraging it to make forward-looking decisions.” [4]

- Isaac Sacolick, VP Technology at McGraw-Hill Construction

Page 11

Yes… The Volumes are Big

Page 12

The “Understanding” Market Takes Off

http://www.bigdatalandscape.com/

Page 13

Value Derived from Human Interaction

“Data and data sets are not objective; they are creations of human design. We give numbers their voice, draw inferences from them, and define their meaning through our interpretations.” [5]

- Kate Crawford, Principal Researcher @ Microsoft Research

Page 14

What Does Your Data Weigh?

• Light Data
– Easily quantifiable measures and facts

• Mid-Weight Data
– Interesting data; trends; patterns

• Heavy Data
– Rich, meaningful, verified, and actionable data

Data classification based on the value being derived from the data

Page 15

Barrier #2: Assumptions

People inherently make assumptions… which can lead you to find what you expect as opposed to the marketable anomalies

Page 16

Netflix

• DVD rental and video streaming company with over 33 million subscribers (29 million streaming) in 40 countries

• Big Data Stats:
– More than 50 Cassandra clusters with over 750 nodes
– More than 50,000 reads & 100,000 writes per second

• Claims 75% of its subscribers are influenced by what it suggests they will like. [6]

Page 17

House of Cards

• Netflix’s data indicated that the same subscribers who loved the original BBC production of “House of Cards” also loved movies starring Kevin Spacey or directed by David Fincher. [7]

• Netflix has committed $100 million to create two 13-episode seasons.

Page 18

Were they Right?

• From a data standpoint, it’s hard to know since Netflix doesn’t release viewership numbers.

• But how else could we evaluate?
– Facebook likes: 206k
– Twitter: 34,706 Followers
– Mainstream Culture
  • Magazine Covers?
  • Talk shows?
  • What do you hear?

• What could we conclude?

Page 19

Barrier #3: Bias

“Hidden biases in both the collection and analysis stages present considerable risks, and are as important to the big-data equation as the numbers themselves.” [5]

- Kate Crawford, Principal Researcher @ Microsoft Research

Page 20

Classification of Bias [8]

• Cognitive
– Misunderstanding of the probabilities.

• Selection
– Most available, convenient and/or cost-effective as opposed to most relevant.

• Sampling
– Most relevant to a subset that may not hold true in the wider population.

• Modeling
– Biased assumptions drive selection of wrong variables.

• Funding
– Assumptions, interpretations, data and applications skewed to favor funding party.

• Representation
– Larger data sets do not ensure that the data is representative.
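
The representation point, that a larger data set is not automatically a more representative one, can be illustrated with a small simulation. The sketch below is not from the presentation: the population, the 30% app-usage rate and the spending figures are all invented for illustration. It compares a large data set collected only from app users against a small random sample of the whole population.

```python
import random
import statistics


def simulate(seed=0):
    """Compare a large biased data set with a small random sample.

    All numbers are hypothetical: 100,000 customers, 30% of whom use an app,
    with app users assumed to spend more on average than non-users.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(100_000):
        uses_app = rng.random() < 0.30
        spend = rng.gauss(80, 10) if uses_app else rng.gauss(50, 10)
        population.append((uses_app, spend))

    # The quantity we actually want: average spend across everyone.
    true_mean = statistics.fmean([spend for _, spend in population])

    # Large but unrepresentative: only app users generate data.
    app_only = [spend for uses_app, spend in population if uses_app]
    big_biased_mean = statistics.fmean(app_only)

    # Small but representative: 500 customers drawn at random.
    random_sample = rng.sample(population, 500)
    small_random_mean = statistics.fmean([spend for _, spend in random_sample])

    return true_mean, len(app_only), big_biased_mean, small_random_mean


if __name__ == "__main__":
    true_mean, n_app, biased, small = simulate()
    print(f"True average spend:             {true_mean:.1f}")
    print(f"App-only data ({n_app} rows):  {biased:.1f}")
    print(f"Random sample (500 rows):       {small:.1f}")
```

Under these assumptions the app-only data set is many times larger than the random sample, yet its average sits far from the population average, while the small random sample lands close to it.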

Page 21

Accounting for Bias [9]

• Know your Enemy
– Be aware of biases that may affect your analysis. Document them as part of your results.

• Make use of Subject Matter Experts
– Validate your results with domain experts and use them to test your findings and algorithms.

• Continuous Exploration
– Don’t settle for satisfactory! Investigate the anomalies and explore the data outside of your focus.
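
As one possible starting point for "investigate the anomalies", the sketch below flags values that sit unusually far from the median of a series. The technique (a modified z-score based on the median absolute deviation, with the conventional 0.6745 constant and 3.5 cutoff) is an assumption chosen for illustration and is not taken from the presentation; the function name and the sample figures are hypothetical.

```python
import statistics


def flag_anomalies(values, threshold=3.5):
    """Return (index, value) pairs whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not distort the baseline.
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # More than half the values are identical; the score is undefined.
        return []
    return [
        (i, v)
        for i, v in enumerate(values)
        if 0.6745 * abs(v - median) / mad > threshold
    ]


if __name__ == "__main__":
    # Hypothetical daily counts with one unusual day.
    daily_counts = [102, 98, 105, 99, 101, 97, 240, 103, 100]
    print(flag_anomalies(daily_counts))
```

A flagged value is only a prompt for a question. Whether it is an error, a one-off event or a marketable anomaly is something a subject matter expert still has to judge.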

Page 22

Generating Big Value

• Big Data is quantitative

• Deriving meaningful insights requires people

• Managing assumptions and bias increases value

• Insights identified can be acted upon

• Insights acted upon must be continually reviewed

Anything Else?

Page 23

Rise of the Data Democracy

“Humans are not an important part of utilizing new data, they are the single most important part of the process.” [10]

- Bryce Maddock, CEO of TaskUs.com

Page 24

McKinsey: Systemic Barriers for Education [11]

Page 25

Building a Data Democracy: Enable Everyone with Access

• The right data must be available in all areas of the organization.

• Access to and use of data will create positive and lasting change.

• All City Colleges of Chicago employees will be able to use this platform to obtain data and/or run reports.

Only part of this challenge is licensing cost! Organizational acceptance, tool selection, bandwidth, data comprehension and accessible training are critical!

Page 26

Building a Data Democracy: Breaking Down the Silos

Page 27

Building a Data Democracy: One Size Does Not Fit All

Reports … User-Created Dashboards … and Interactive Analytics for all users

A unified data warehouse and web-based interface for accessing and interacting with data

Page 28

Building a Data Democracy: Increase Data Comprehension & Skills

Integrated Data Dictionary and Online Training

By integrating necessary reference and training information directly into the analytics website, we enable our employees to know with certainty what their data means and how to use it effectively.

Page 29

Takeaways

• Generating Big Value from Big Data:

– Datafication is driving differentiation in the marketplace
  • Collect the data that drives your business

– The value in Big Data is derived from human insight
  • How much does your data weigh?

– Be aware of Assumptions and Bias in your approach
  • Evaluate what does and doesn’t benefit your analysis

– Enable everyone with the right data to succeed
  • Data democracy

Page 30

APPENDIX

Page 31

References

• Articles

[1] Steve Lohr, The New York Times, “The Origins of ‘Big Data’: An Etymological Detective Story”, 2/1/13, http://bits.blogs.nytimes.com/2013/02/01/the-origins-of-big-data-an-etymological-detective-story/

[2] Doug Laney, Blog, “Deja VVVu: Others Claiming Gartner’s Construct for Big Data”, 1/14/12, http://blogs.gartner.com/doug-laney/deja-vvvue-others-claiming-gartners-volume-velocity-variety-construct-for-big-data/

[3] Jules Polonetsky, LinkedIn Post, “Magic Lessons for Retailers”, 5/31/13, http://www.linkedin.com/today/post/article/20130531031125-258347-magic-lessons-for-retailers

[4] Isaac Sacolick, Blog, “What is Big Data The Real Challenges Beyond Volume, Velocity and Variety”, 12/11/12, http://blogs.starcio.com/2012/12/what-is-big-data-real-challenges-beyond.html

[5] Kate Crawford, Blog, Harvard Business Review, “The Hidden Biases in Big Data”, 4/1/13, http://blogs.hbr.org/cs/2013/04/the_hidden_biases_in_big_data.html

[6] Andrew Leonard, Salon, “How Netflix is turning viewers into puppets”, 2/1/13, http://www.salon.com/2013/02/01/how_netflix_is_turning_viewers_into_puppets/

[7] Mary McNamara, Los Angeles Times, “Netflix’s ‘House of Cards’ looks, but doesn’t sound, like a hit”, 4/27/13, http://articles.latimes.com/2013/apr/27/entertainment/la-et-st-house-of-cards-netflix-20130427

[8] James Kobielus, IBM Big Data Hub, “Data Scientist: Bias, Backlash and Brutal Self-Criticism”, 5/16/13, http://www.ibmbigdatahub.com/blog/data-scientist-bias-backlash-and-brutal-self-criticism

[9] Haowen Chan and Robin Morris, GigaOm, “Careful: Your big data analytics may be polluted by data scientist bias”, 5/4/13, http://gigaom.com/2013/05/04/careful-your-big-data-analytics-may-be-polluted-by-data-scientist-bias/

[10] Bryce Maddock, Blog, “People and Big Data: Separately Good, Together Great”, 9/26/12, http://www.huffingtonpost.com/bryce-maddock/big-data_b_1908358.html

[11] James Manyika, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh and Angela Hung Byers, McKinsey Global Institute, “Big data: The next frontier for innovation, competition and productivity”, 5/11, http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation

• Infographics

IBM Big Data Hub, Infographic, “Tuning Into Big Data As The Buzz Gets Louder”, 9/26/12, http://www.ibmbigdatahub.com/infographic/tuning-big-data-buzz-gets-louder

Mushroom Networks, Infographic, “Landscape of Big Data”, 2013, http://www.mushroomnetworks.com/infographics/landscape-of-big-data

Graeme Noseworthy, Infographic, “The Flood of Big Data”, 4/24/12, http://analyzingmedia.com/2012/infographic-big-flood-of-big-data-in-digital-marketing/