Webinar on Big Data Challenges : Presented by Raj Kasturi


Upload: discuss-agile

Post on 23-Jan-2018


Page 1: Webinar on Big Data Challenges : Presented by Raj Kasturi


Is Scrum a good fit for solving big data challenges?

Speaker – Raj Kasturi

September 19th, 2017

10:00 to 11:00 AM EST, 7:30 PM to 8:30 PM IST

Special Thanks to:

Page 2: Webinar on Big Data Challenges : Presented by Raj Kasturi


• 25+ years of IT experience, with 8+ years of enterprise-level Agile experience

• Agile experience as an Agile Coach, Scrum Trainer, Scrum Master

• Leading and helping large-scale Agile project transitions

• Adjunct faculty at Pennsylvania State University, Pennsylvania, USA

• 18+ years of teaching experience in Technology, Project Management and 8 years of teaching Scrum, Agile courses

• Started my career as a programmer; worked as App. Dev. Manager

• Speaker, volunteer at agile conferences, user groups

• Servant Leader – Agile World, User Group, Scrum Alliance

My Website/Blog: http://agilekingdom.com/

@AgileRaj

https://www.linkedin.com/in/rajkasturi/

Raj Kasturi, MBA

Page 3: Webinar on Big Data Challenges : Presented by Raj Kasturi


Agenda

What is big data?

The three V’s of big data

Big Data Trends of 2017

Agile Spectrum

Big data complexity and empirical process control theory

Scrum and Big Data

Summary

Page 4: Webinar on Big Data Challenges : Presented by Raj Kasturi


What is big data?

▪ The term big data was coined in the late 1990s

▪ Big data is different from regular data

▪ Billions of data sets and their interaction

▪ Traditional RDBMS is for regular data

▪ RDBMS cannot handle big data

▪ Requires a new technological approach for handling and processing

▪ New data platforms to meet storage and performance requirements

Page 5: Webinar on Big Data Challenges : Presented by Raj Kasturi


The 3 V’s of big data

Volume

Variety

Velocity

Are these three factors required to drive the need?

Page 6: Webinar on Big Data Challenges : Presented by Raj Kasturi


Add Value

▪ Do we have a fourth V?

▪ Aggregate to provide value

Value

Page 7: Webinar on Big Data Challenges : Presented by Raj Kasturi


Google’s flu tracker

▪ Knowing the what, rather than the why, was good enough

▪ 2009 H1N1 flu epidemic

▪ Real-time flu tracker “Google Flu Trends”

▪ Flu sufferers search Google before visiting a clinic

▪ Search queries gave optimized, accurate and real-time data

▪ Data was far more effective than the CDC’s because of its size

▪ 3 billion searches a day

▪ Large servers and clever algorithms to sort data

Page 8: Webinar on Big Data Challenges : Presented by Raj Kasturi


Who uses it?

▪ Financial Services

▪ Telecommunications

▪ Energy

▪ Government

▪ Retail

▪ And many more…

Page 9: Webinar on Big Data Challenges : Presented by Raj Kasturi


Complexity

Big Data

Page 10: Webinar on Big Data Challenges : Presented by Raj Kasturi


Agile Spectrum

Page 11: Webinar on Big Data Challenges : Presented by Raj Kasturi


Process

[Diagram: Input → Process → Output. The process may have internal processes.]

Page 12: Webinar on Big Data Challenges : Presented by Raj Kasturi


Defined Process

[Diagram: Input → Process → Output. The process may have internal processes.]

Composition known

Characteristics well defined

• Sequential/Series of steps

• Underlying process well understood

• Results repeatable/predictable

• Command & Control approach

• Pre-defined variations are acceptable

Page 13: Webinar on Big Data Challenges : Presented by Raj Kasturi


Empiricism

Page 14: Webinar on Big Data Challenges : Presented by Raj Kasturi

Transparency: Significant aspects of the process must be visible to those responsible for the outcome

Inspection: Frequently inspect and remove any unacceptable variations

Adaptation: Adjust and control the process, improve

[Diagram: Inputs → Black Box → Outputs]

Needs frequent measurement

Problem cannot be fully understood or defined

Solution evolves as information becomes known

Protect the black box by not adding anything new!
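The inspect-and-adapt loop around a black box can be sketched in a few lines of Python. This is only an illustration and is not taken from the webinar: the `black_box` function, the target, the tolerance, and the proportional adjustment rule (`gain`) are all hypothetical assumptions chosen to keep the example small.

```python
from typing import Callable, List, Tuple


def inspect_and_adapt(
    process: Callable[[float], float],
    setting: float,
    target: float,
    tolerance: float,
    gain: float = 0.25,
    max_iterations: int = 20,
) -> Tuple[float, List[float]]:
    """Run a black-box process repeatedly, inspecting each output and
    adapting the input setting until the output is within tolerance."""
    history: List[float] = []
    for _ in range(max_iterations):
        output = process(setting)      # run the black box without looking inside
        history.append(output)         # transparency: every result stays visible
        deviation = target - output    # inspection: measure the variation
        if abs(deviation) <= tolerance:
            break                      # variation is acceptable, stop adjusting
        setting += gain * deviation    # adaptation: adjust the input (assumed rule)
    return setting, history


def black_box(x: float) -> float:
    # Hypothetical process; the control loop only sees inputs and outputs.
    return 2.0 * x + 1.0


final_setting, outputs = inspect_and_adapt(
    black_box, setting=0.0, target=11.0, tolerance=0.01
)
```

The loop never needs a model of the process. It only measures frequently and corrects, which is the point of empirical process control when the problem cannot be fully defined up front.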

Page 15: Webinar on Big Data Challenges : Presented by Raj Kasturi


Page 16: Webinar on Big Data Challenges : Presented by Raj Kasturi


Page 17: Webinar on Big Data Challenges : Presented by Raj Kasturi


Hadoop’s distributed file system (HDFS)

Source: Managing big data workflows for dummies

MapReduce - think of it as a framework that processes and reduces raw big data into regular-size, tagged datasets that are much easier to work with.
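The map, shuffle, and reduce steps behind that description can be illustrated with a word count. The sketch below is plain single-process Python and does not use Hadoop or any of its APIs. The function names and the sample records are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (key, value) pair for every word in one raw record."""
    for word in record.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def map_reduce(records: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases over a collection of raw records."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical raw input, standing in for much larger data.
raw_records = ["big data needs scrum", "scrum needs big teams", "data data data"]
word_counts = map_reduce(raw_records)
```

In a real cluster the map and reduce calls would run in parallel on many machines, with the shuffle moving data between them. The sketch only shows how raw records turn into a small keyed dataset.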

Page 18: Webinar on Big Data Challenges : Presented by Raj Kasturi


Popular platforms and tools

➢ Pig

➢ Apache Hive

➢ Apache Sqoop

➢ In-memory databases

➢ NoSQL databases

➢ Massively Parallel Processing (MPP)

➢ Cassandra

➢ Hadoop

➢ Plotly

➢ Bokeh

➢ Neo4j

➢ Cloudera

➢ OpenRefine

➢ Storm

Page 19: Webinar on Big Data Challenges : Presented by Raj Kasturi


Scrum and Big Data

➢ Scrum’s ability to measure work output – Velocity

➢ Knowledge is based on the ability to measure a given phenomenon

➢ Once we measure it, we can start to manipulate the input and determine if we’ve improved something by the resulting output. Inspect & Adapt concept

➢ We have discussed empiricism, and Scrum is based on empirical process control

➢ Continuous improvement
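As a rough illustration of measuring work output, velocity can be computed as the story points completed per sprint and then compared across sprints. The sprint data, the function names, and the three-sprint averaging window below are hypothetical and are not taken from the webinar.

```python
from statistics import mean
from typing import Dict, List


def sprint_velocity(completed_points: List[int]) -> int:
    """Velocity of one sprint: total story points of the items finished in it."""
    return sum(completed_points)


def rolling_velocity(velocities: List[int], window: int = 3) -> float:
    """Average velocity over the most recent `window` sprints."""
    return mean(velocities[-window:])


# Hypothetical data: story points of the items completed in each sprint.
sprints: Dict[str, List[int]] = {
    "Sprint 1": [3, 5, 2],
    "Sprint 2": [5, 5, 3],
    "Sprint 3": [8, 3, 5],
    "Sprint 4": [8, 5, 5],
}

velocities = [sprint_velocity(points) for points in sprints.values()]
average_velocity = rolling_velocity(velocities)

# Inspect: did the measured output improve after the team changed its inputs?
improved = velocities[-1] > velocities[0]
```

A team would look at these numbers at each sprint boundary, change one input (for example team focus or backlog refinement), and check the next measurement to see whether the change helped.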

Page 20: Webinar on Big Data Challenges : Presented by Raj Kasturi


Top 10 Big Data Trends 2017

1. Big data becomes fast and approachable: Options expand to speed up Hadoop

2. Big data no longer just Hadoop: Purpose-built tools for Hadoop become obsolete

3. Organizations leverage data lakes from the get-go to drive value

4. Architectures mature to reject one-size-fits-all frameworks

5. Variety, not volume or velocity, drives big-data investments

Page 21: Webinar on Big Data Challenges : Presented by Raj Kasturi


Top 10 Big Data Trends 2017

6. Spark and machine learning light up big data

7. The convergence of IoT, cloud, and big data creates new opportunities for self-service analytics

8. Self-service data prep becomes mainstream as end users begin to shape big data

9. Big data grows up: Hadoop adds to enterprise standards

10. Rise of metadata catalogs helps people find analysis-worthy big data

Page 22: Webinar on Big Data Challenges : Presented by Raj Kasturi


Summary

Scrum is good for work that involves:

A fair degree of complexity

A need for innovation

A need for invention

Product differentiation

Productivity

Faster launch to market

I say that Big Data needs all of the above.

Page 23: Webinar on Big Data Challenges : Presented by Raj Kasturi


Attributions

1. Scrum Guide 2016, http://www.scrumguides.org/scrum-guide.html

2. JJ Sutherland, https://www.scruminc.com/scrum-big-data-2/

3. Joe Goldberg and Lillian Pierson, PE, Managing Big Data Workflows For Dummies, BMC Software Special Edition

4. Tableau, Top 10 Big Data Trends for 2017