Big Data: Stephen Head, Senior Manager, IT Risk Advisory Services


Post on 17-Dec-2015


Big Data


Stephen Head

Senior Manager, IT Risk Advisory Services


Our Time Today

• Attributes of Big Data
• Uses of Big Data
• Types of Big Data
• Big Data Tools
• Big Data Controls
• Privacy Risks
• Security and Governance
• Questions

[Slide graphic labels: Big Data, Tools, Improved Decision Making]

Attributes of Big Data


Data – How Big is Big?

1 Kilobyte = 1,000 bytes

1 Megabyte = 1,000,000 bytes

1 Gigabyte = 1,000,000,000 bytes

1 Terabyte = 1,000,000,000,000 bytes

1 Petabyte = 1,000,000,000,000,000 bytes

1 Exabyte = 1,000,000,000,000,000,000 bytes

1 Zettabyte = 1,000,000,000,000,000,000,000 bytes

1 Yottabyte = 1,000,000,000,000,000,000,000,000 bytes
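The chart above follows the decimal convention, where each unit is 1,000 times the previous one. As a minimal sketch (the helper names `bytes_in` and `humanize` are illustrative assumptions, not part of the presentation), the same ladder can be generated and used to place a byte count on the chart:

```python
# Decimal storage units as listed on the slide: each step is a factor of 1,000.
UNITS = ["Kilobyte", "Megabyte", "Gigabyte", "Terabyte",
         "Petabyte", "Exabyte", "Zettabyte", "Yottabyte"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one `unit` (decimal convention)."""
    return 1000 ** (UNITS.index(unit) + 1)


def humanize(n_bytes: float) -> str:
    """Express a byte count in the largest unit that keeps the value >= 1."""
    for unit in reversed(UNITS):
        size = bytes_in(unit)
        if n_bytes >= size:
            return f"{n_bytes / size:g} {unit}s"
    return f"{n_bytes:g} bytes"


if __name__ == "__main__":
    # Reproduce the chart, one line per unit.
    for unit in UNITS:
        print(f"1 {unit} = {bytes_in(unit):,} bytes")
```

Note that some tools report sizes in binary units (powers of 1,024) instead, so figures from different sources may not line up exactly.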


Where Do Humans Fit on the Chart?

The human brain consists of about one billion neurons. Each neuron forms about 1,000 connections to other neurons, amounting to more than a trillion connections.

If each neuron could only help store a single memory, running out of space would be a problem. You might have only a few gigabytes of storage space, similar to the space in an iPod or a USB flash drive.

Yet neurons combine so that each one helps with many memories at a time, exponentially increasing the brain’s memory storage capacity to something closer to around 2.5 petabytes.

Source: Scientific American, May/June 2010


Where Do Humans Fit on the Chart?

Researchers have been able to encode a draft of an entire book into DNA. The 5.27 MB file contains 53,246 words, 11 JPG images, as well as a JavaScript program, making this the largest piece of non-biological data ever stored in DNA. The scientists published their findings in the journal Science.

In theory, two bits of data could be incorporated per nucleotide, implying that each gram of DNA could store 455 exabytes of data (1 exabyte is 1 million terabytes), which outstrips inorganic storage devices like flash memory, hard disks, and even quantum-computing methods.

Source: Science Tech Daily, August 21, 2012
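To put the quoted figures side by side, here is a rough back-of-the-envelope sketch. It uses only numbers from the two slides above (5.27 MB, two bits per nucleotide, 455 exabytes per gram, about 2.5 petabytes for the brain) and assumes decimal units. The function names are illustrative, and the calculation ignores any encoding or error-correction overhead a real system would need.

```python
# Figures quoted on the slides; decimal units (1 MB = 10**6 bytes, 1 EB = 10**18 bytes).
FILE_BYTES = 5.27e6            # size of the encoded book draft
BITS_PER_NUCLEOTIDE = 2        # theoretical density quoted on the slide
DNA_BYTES_PER_GRAM = 455e18    # 455 exabytes per gram (theoretical)
BRAIN_BYTES = 2.5e15           # about 2.5 petabytes


def nucleotides_needed(n_bytes: float,
                       bits_per_nucleotide: int = BITS_PER_NUCLEOTIDE) -> float:
    """Nucleotides required to hold n_bytes at the given theoretical density."""
    return n_bytes * 8 / bits_per_nucleotide


def grams_of_dna(n_bytes: float,
                 bytes_per_gram: float = DNA_BYTES_PER_GRAM) -> float:
    """Mass of DNA needed to hold n_bytes at the quoted theoretical capacity."""
    return n_bytes / bytes_per_gram


if __name__ == "__main__":
    print(f"Nucleotides for the book: {nucleotides_needed(FILE_BYTES):,.0f}")
    print(f"Grams of DNA for 2.5 PB:  {grams_of_dna(BRAIN_BYTES):.2e}")
```

Under these assumptions the book needs roughly 21 million nucleotides, and a brain-sized 2.5 petabytes would fit in a few micrograms of DNA.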


Definition of Big Data

Big data is where the data volume, acquisition velocity, or data representation limits the ability to perform effective analysis using traditional relational approaches or requires the use of significant horizontal scaling for efficient processing. - NIST


What Attributes Define Big Data?

Gartner defines Big Data using three Vs: volume, velocity, and variety.

Uses of Big Data


• Timely insights from the vast amounts of data. This includes those already stored in company databases, from external third-party sources, the Internet, social media and remote sensors.

• Real-time monitoring and forecasting of events that impact either business performance or operation.

• Identifying significant information that can improve decision quality.

• Mitigating risk by optimizing the complex decisions of unplanned events more rapidly.

Source: McKinsey Global Institute


Sources of Big Data

Data Analytics for Information Security. © 2012 Information Security Forum Limited. All rights reserved.


Potential Value of Big Data

• $300 billion potential annual value to US health care.

• €250 billion potential annual value to Europe’s public sector administration.

• $600 billion potential annual consumer surplus from using personal location data.

• 60% potential increase in retailers’ operating margins.

Source: McKinsey Global Institute

Types of Big Data


Type 1: This is where a non-relational data representation is required for effective analysis.

Type 2: This is where horizontal scalability is required for efficient processing.

Type 3: This is where a non-relational data representation processed with a horizontally scalable solution is required for both effective analysis and efficient processing.

Source: NIST
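The three types reduce to two yes/no questions: does the analysis need a non-relational representation, and does the processing need horizontal scaling? A minimal sketch of that decision follows. The function name and the `None` return for workloads that need neither are assumptions made for illustration and are not part of the NIST wording.

```python
from typing import Optional


def nist_big_data_type(needs_non_relational: bool,
                       needs_horizontal_scaling: bool) -> Optional[int]:
    """Map the two criteria from the slide to Type 1, 2, or 3.

    Returns None when neither criterion applies, i.e. a traditional
    relational approach on a single system is assumed to be sufficient.
    """
    if needs_non_relational and needs_horizontal_scaling:
        return 3
    if needs_non_relational:
        return 1
    if needs_horizontal_scaling:
        return 2
    return None


if __name__ == "__main__":
    # Example: a graph-shaped dataset that still fits on one server.
    print(nist_big_data_type(needs_non_relational=True,
                             needs_horizontal_scaling=False))
```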

Big Data Tools


• MapReduce (originally a proprietary technology of Google, but now a term used generically) is a programming model for parallel operations across a practically unlimited number of processors.

• Hadoop is a popular open‐source programming platform and program library based on the same ideas.

• NoSQL (the name derived from “not Structured Query Language”) is a set of database technologies that relaxes many of the restrictions of traditional, “relational” databases and allows for better scalability across the many processors in one or more data centers.

• Berkeley Data Analytics Stack is an open‐source platform that is reported to outperform Hadoop and is being used by such companies as Foursquare, Yahoo, and Amazon Web Services.
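To make the MapReduce programming model concrete, here is a single-process sketch of the classic word-count example in plain Python. It is an illustration of the map, shuffle, and reduce steps only. It does not use Hadoop or any other framework, the function names are assumptions, and a real platform would distribute each phase across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return dict(reduce_phase(key, values)
                for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["big data tools", "Big Data controls"]))
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, the calls can run in parallel on separate processors, which is what lets the model scale horizontally.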


Big Data Controls


• Confidentiality
  – Regulated Data
  – Access Restricted
  – Encrypted

• Integrity
  – Reliable
  – Complete
  – Accurate

• Availability
  – Accessible
  – Resilient


COSO Principle 13

The organization obtains or generates and uses relevant, quality information to support the functioning of internal control.

Identifies Information Requirements: A process is in place to identify the information required and expected to support the functioning of the other components of internal control and the achievement of the entity’s objectives.

Captures Internal and External Sources of Data: Information systems capture internal and external sources of data.

Processes Relevant Data into Information: Information systems process and transform relevant data into information.

Maintains Quality throughout Processing: Information systems produce information that is timely, current, accurate, complete, accessible, protected, and verifiable and retained. Information is reviewed to assess its relevance in supporting the internal control components.

Source: COSO, Internal Control - Integrated Framework Executive Summary, USA, May 2013.

Privacy Risks


“We have the capacity to send every customer an ad booklet, specifically designed for them, that says, ‘Here’s everything you bought last week and a coupon for it,’” one executive told me. As his computers crawled through the data, he was able to identify about 25 products that, when analyzed together, allowed him to assign each shopper a “pregnancy prediction” score. More important, he could also estimate her due date to within a small window, so RETAILER could send coupons timed to very specific stages of her pregnancy.

“With the pregnancy products, though, we learned that some women react badly,” the executive said. “Then we started mixing in all these ads for things we knew pregnant women would never buy, so the baby ads looked random. That way, it looked like all the products were chosen by chance.”

“And we found out that as long as a pregnant woman thinks she hasn’t been spied on, she’ll use the coupons. She just assumes that everyone else on her block got the same mailer for diapers and cribs. As long as we don’t spook her, it works.”

Source: Charles Duhigg, “How Companies Learn Your Secrets,” New York Times, February 16, 2012.


Protecting Privacy

• Data anonymization/sanitization or deidentification

• Adequate, relevant, useful and current big data privacy policies, processes, procedures and supporting structures

• Senior management buy-in and evidence of continuous commitment to protect privacy

• Appropriate data destruction, comprehensive data management policy, clearly defined disposal ownership and accountability

• Compliance with legal and regulatory data requirements

Source: Privacy and Big Data, page 11. © 2013 ISACA. All rights reserved.
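As one illustration of the first control in the list above, the sketch below drops direct identifiers from a record and replaces the customer key with a keyed hash (HMAC-SHA256 from Python's standard library). The field names, the `deidentify` helper, and the sample record are hypothetical. Strictly speaking this is pseudonymization, not full anonymization: anyone holding the key can re-link records, and the remaining attributes may still allow re-identification, as the retailer example earlier suggests.

```python
import hashlib
import hmac

# Hypothetical field names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a value with a keyed SHA-256 digest (stable for a given key)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, secret_key: bytes,
               id_field: str = "customer_id") -> dict:
    """Drop direct identifiers and pseudonymize the record key."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if id_field in cleaned:
        cleaned[id_field] = pseudonymize(str(cleaned[id_field]), secret_key)
    return cleaned


if __name__ == "__main__":
    key = b"example-key-store-and-rotate-securely"  # placeholder, not a real secret
    sample = {"customer_id": 12345, "name": "Jane Doe",
              "email": "jane@example.com", "purchase": "unscented lotion"}
    print(deidentify(sample, key))
```

The same key always yields the same pseudonym, so analysts can still join records for one customer without seeing who that customer is.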

Security and Governance


Security

• Security must adopt a big data view… The age of big data has arrived in security management.

• We must collect data throughout the enterprise, not just logs.

• We must provide context and perform real time analysis.

Arthur Coviello, Chairman, RSA

Source: Economist; http://searchcloudsecurity.techtarget.com/news/2240111123/Coviello-talks-about-building-a-trusted-cloud-resilient-security.


Findings from the Information Security Forum

• Big data analytics is delivering value today

• Big data analytics has the potential to reduce cyber security risk and increase agility

• Despite its potential, big data analytics is not yet mature within information security

• Big data analytics is challenging, but manageable

• Existing big data analytics capabilities can be leveraged to improve information security


Information Security Uses of Big Data

• Monitoring security incidents and events

• Producing cyber intelligence

• Addressing phishing

• Keeping systems available

• Discovering a breach

• Identifying threat trends and evolution

• Detecting an embedded cyber attack

Data Analytics for Information Security. © 2012 Information Security Forum Limited. All rights reserved.


Key Governance Questions

1. Can we trust our sources of big data?

2. What information are we collecting that may expose the enterprise to legal and regulatory battles?

3. How will we protect our sources, our processes and our decisions from theft and corruption?

4. What policies are in place to ensure that employees keep stakeholder information confidential during and after employment?

5. What actions are we taking that create trends that can be exploited by our rivals?

Source: Privacy and Big Data, page 10. © 2013 ISACA. All rights reserved.


Summary

The potential value of Big Data to organizations is huge.

The tools to fully exploit Big Data are in varying stages of development.

The potential risks posed by Big Data are also significant.

As auditors, you are in an ideal position to help ensure that proper controls are put in place to mitigate these risks and realize the full potential offered by Big Data.


Questions?

Stephen Head

stephen.head@experis.com

704-953-6688