Introduction to Big Data Ricardo Campos Ethics and Privacy Mestrado EI-IC Análise e Processamento de Grandes Volumes de Dados Tomar, Portugal, 2016

Instituto Politécnico de Tomar

What is Information Retrieval?

Some of the slides used in this presentation were adapted from presentations found on the internet and from the reference bibliography:

AGENDA: What is this talk about?

1. User’s Life is the Web
2. Privacy
3. Data Protection
4. Ethics
5. Q&A

Photos – What they see

Tags – With whom

Videos – What they do

Statuses – State of mind

Likes – Opinions

Friends – Connections

Money transactions – PayPal

If I want to... I can know everything about you:

• Where you dine (geolocation combined with recommendation systems: Yelp, TripAdvisor)

• Your telephone number (https://sync.me)

• Where you sleep (geolocation)

• Your health (wearable technologies, e.g., a smart bracelet such as the Mi Band). Please consider reading this article: http://exameinformatica.sapo.pt/noticias/software/2016-01-15-Once-a-app-que-analisa-o-ritmo-cardiaco-para-encontrar-a-cara-metade

• Where you go running (Runtastic)

• Your political party (comments on the web)

• Your interests (searches on the web and, consequently, your location)

• Your salary (tax simulators)

If I want to... I can know everything about you:

• Your qualifications (LinkedIn)

• Your consumer profile (shopping on the web: everything you did, the websites you visited, which ones you left and where you went next, your mouse path, where and how many times you clicked, where you hesitated and how... web companies will know all of it)

• Your contacts (from your mobile phone)

• Your walking pace (did you know your mobile phone has an accelerometer?)

• Even your fantasies

And yet... we haven’t even talked about the cloud.

Pay cash for everything!

Never go online!

Don’t use a telephone!

Don’t use cards!

Don’t fill any prescriptions!

Never leave your house!

Big data does not come with a built-in perspective on what is right or wrong, or what is good or bad, in using it. Only by asking ethical questions and seeking answers to them can we ensure that big data is used correctly.

Big data can be used:

• to identify general trends and correlations;

• to directly affect individuals, when it is processed with that aim.

It is not the volume, velocity, variety or veracity that worries me, but the uses of the information. The uses of the data are not determined before collection.

It is no exaggeration to say that we are nothing more than a collection of data to most of the institutions—and many of the people—with whom we deal. Please consider reading this article: https://pplware.sapo.pt/truques-dicas/google-voice-ouve-grava-as-suas-conversas-nunca-as-apaga/

Big Data may also pose significant risks for the protection of personal data and the right to privacy:

• the sheer scale of data collection, tracking and profiling;

• the security of data;

• transparency, which implies giving individuals sufficient information;

• inaccuracy, discrimination, exclusion and economic imbalance;

• increased possibilities of government surveillance.

Performing analytics on datasets can reveal confidential information about organizations or individuals. Even analyzing separate datasets that contain seemingly benign data can reveal private information when the datasets are analyzed jointly.
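The joint-analysis risk described above can be sketched as a simple linkage on shared attributes. This is a minimal illustration, not a real attack tool: the two tables, the field names (`zip`, `birth_year`, `sex`) and the function `reidentify` are all hypothetical and invented for this example.

```python
from collections import defaultdict

# Hypothetical toy data. Neither table alone ties a name to a diagnosis.
medical = [
    {"zip": "2300", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "2300", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "2305", "birth_year": 1980, "sex": "F", "diagnosis": "hypertension"},
]
public_register = [
    {"name": "Ana", "zip": "2300", "birth_year": 1980, "sex": "F"},
    {"name": "Rui", "zip": "2300", "birth_year": 1975, "sex": "M"},
    {"name": "Eva", "zip": "2310", "birth_year": 1990, "sex": "F"},
]

# Attributes that are harmless on their own but identifying in combination.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def reidentify(released, public, keys=QUASI_IDENTIFIERS):
    """Map each name to a diagnosis when exactly one released row matches."""
    index = defaultdict(list)
    for row in released:
        index[tuple(row[k] for k in keys)].append(row)

    matches = {}
    for person in public:
        candidates = index.get(tuple(person[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match means re-identification
            matches[person["name"]] = candidates[0]["diagnosis"]
    return matches


print(reidentify(medical, public_register))
```

In this toy case two of the three people in the public register are matched to a diagnosis, even though the medical table contains no names.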

Addressing these privacy concerns requires an understanding of the nature of data being accumulated and relevant data privacy regulations, as well as special techniques for data tagging and anonymization.

For example, telemetry data, such as a car’s GPS log or smart meter readings, collected over an extended period of time can reveal an individual’s location and behavior.
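As a rough sketch of how a GPS log can give away behavior, the snippet below guesses a likely home location as the place most often visited at night. The log entries, the night-time window and the function name `likely_home` are assumptions made for illustration only; real telemetry analysis is considerably more involved.

```python
from collections import Counter
from datetime import datetime

# Hypothetical GPS log: (ISO timestamp, latitude, longitude).
gps_log = [
    ("2016-03-01T02:10:00", 39.6031, -8.4102),
    ("2016-03-01T09:30:00", 39.6005, -8.3921),
    ("2016-03-01T23:45:00", 39.6032, -8.4101),
    ("2016-03-02T03:20:00", 39.6029, -8.4099),
    ("2016-03-02T14:00:00", 39.6006, -8.3920),
    ("2016-03-02T22:30:00", 39.6412, -8.4533),
]


def likely_home(log, night_start=22, night_end=6, precision=3):
    """Return the most frequent night-time grid cell, or None for an empty log."""
    cells = Counter()
    for stamp, lat, lon in log:
        hour = datetime.fromisoformat(stamp).hour
        if hour >= night_start or hour < night_end:
            # Rounding groups nearby points into the same coarse cell.
            cells[(round(lat, precision), round(lon, precision))] += 1
    return cells.most_common(1)[0][0] if cells else None


print(likely_home(gps_log))
```

The same idea applied to daytime hours would point to a probable workplace, which is why long-running location logs are sensitive even without any name attached.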

Please consider reading this article: http://www.jornaleconomico.sapo.pt/noticias/caro-cliente-sei-si-do-voce-imagina-84757
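One family of anonymization techniques of the kind mentioned on this slide is generalization toward k-anonymity: attributes are coarsened until every combination is shared by at least k records. The sketch below is only an illustration under assumed field names (`zip`, `birth_year`, `sex`) and a hypothetical generalization rule; it is not a complete anonymization procedure.

```python
from collections import Counter


def raw_key(row):
    """Quasi-identifiers exactly as stored."""
    return (row["zip"], row["birth_year"], row["sex"])


def generalize(row):
    """Coarsen quasi-identifiers: keep two zip digits, bucket birth year by decade."""
    return (row["zip"][:2] + "**", row["birth_year"] // 10 * 10, row["sex"])


def k_anonymity(rows, key=raw_key):
    """Size of the smallest group sharing the same key (0 for an empty table)."""
    groups = Counter(key(r) for r in rows)
    return min(groups.values()) if groups else 0


# Hypothetical records.
rows = [
    {"zip": "2300", "birth_year": 1980, "sex": "F"},
    {"zip": "2305", "birth_year": 1984, "sex": "F"},
    {"zip": "2300", "birth_year": 1975, "sex": "M"},
    {"zip": "2310", "birth_year": 1971, "sex": "M"},
]

print(k_anonymity(rows))                  # every raw record is unique
print(k_anonymity(rows, key=generalize))  # coarsened records form larger groups
```

Generalization trades analytical detail for privacy, and k-anonymity alone does not protect against every inference, so it should be read as one tool among several.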

Check out this link:

https://en.wikipedia.org/wiki/AOL_search_data_leak

We have, as a society, only just begun to understand the implications of the age of big data. Consider the following:

• The social and economic impact of setting insurance rates based on browser or location history, e.g., visits to sites with information about chest pain or a detailed record of your vehicle’s GPS history (https://www.wired.com/2011/09/onstar-tracks-you/).

• Predicting criminal behavior through extrapolation from location, social network, and browsing data. Minority Report-style predictive policing is already in place in some major urban areas (see http://www.cbsnews.com/8301-18563_162-57412725/lapd-computer-program-prevents-crime-by-predicting-it/).

On February 16, 2012, the New York Times published an article about Target’s ability to identify when a customer is pregnant.

On April 20, 2011, two security researchers announced that iPhones were regularly recording the position of each device to a hidden file.

In August of 2011, Facebook faced criticism when it was thought to be exposing the names and phone numbers of everyone in the contacts on mobile devices that used the Contacts feature of the Facebook mobile application.

As you are likely considering how your organization would respond in similar situations, consider the fact that all these examples share one common factor: big-data technology.

For both individuals and organizations, four common elements define what can be considered a framework for big data ethics:

Identity: What is the relationship between our offline identity and our online identity?

Privacy: Who should control access to data?

Ownership: Who owns data, can rights to it be transferred, and what are the obligations of people who generate and use that data?

Reputation: How can we determine what data is trustworthy? Whether about ourselves, others, or anything else, big data exponentially increases the amount of information and ways we can interact with it. This phenomenon increases the complexity of managing how we are perceived and judged.

The advent of ‘big data’ poses enormous challenges for data protection, both for processors and for regulators.

The fundamental right to protection of personal data

The right to respect for private and family life, home and communications is laid down in Article 7 of the Charter of Fundamental Rights of the European Union. Article 8 of the Charter formulates the protection of personal data as a separate right.

Personal data as defined in Article 2(a) of Directive 95/46/EC means 'any information relating to an identified or identifiable natural person'. This includes any information which refers to the identity, characteristics or behavior of an individual or which is used to determine or to influence the way in which that person is treated or evaluated.

The 1995 EU Data Protection Directive, for example, limits the collection of personal data to the fulfilment of specific predefined purposes. It also requires the destruction of data once the purpose for which they were collected has been achieved. This provision prevents the accumulation of data, which is a necessary condition for data to become ‘big’.
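The two rules above, purpose limitation and destruction once the purpose is achieved, can be sketched by tagging each record with the purpose declared at collection time. This is a toy illustration and not a statement of what legal compliance requires; the `Record` type, the purpose labels and both helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    subject: str
    value: str
    purpose: str  # purpose declared when the data was collected


def may_use(record, requested_purpose):
    """Purpose limitation: allow a use only for the declared purpose."""
    return record.purpose == requested_purpose


def purge_fulfilled(records, fulfilled_purposes):
    """Destruction rule: drop records whose purpose has been achieved."""
    return [r for r in records if r.purpose not in fulfilled_purposes]


# Hypothetical records and purpose labels.
records = [
    Record("alice", "Rua A 1, Tomar", "order-delivery"),
    Record("alice", "alice@example.org", "newsletter"),
]

print(may_use(records[0], "marketing"))
remaining = purge_fulfilled(records, {"order-delivery"})
print([r.purpose for r in remaining])
```

Under such a scheme a delivery address could not be reused for marketing and would be removed once the order is delivered, which is exactly the kind of constraint that keeps data from accumulating.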

“The very same algorithms and analytical tools that Facebook uses to understand your interests and desires, and Amazon uses to calculate (and miscalculate) what else you might like to buy, can be used by government and private security companies alike to calculate (and miscalculate) whether you may be a threat, now or in the future. And it is precisely the “dual use” nature of this technology that makes it so hard to regulate”

What types of uses of big data raise the most public policy concerns?

Correlation of disparate data such as healthcare, financial, demographic and location data.

Tracking consumer behavior and sharing it with third parties without proper authorization, for targeting and other purposes.

Big data storage in the cloud across multiple geographic boundaries.

Lack of transparency: who has access to which data, which data is collected and for what reason.

“Just because it is accessible doesn’t make it ethical”

It may be unreasonable to ask researchers to obtain consent from every person who posts a tweet, but it is unethical for researchers to justify their actions as ethical simply because data is accessible.

Danah Boyd, Kate Crawford (2011) Six provocations for Big Data

The ethical dilemma of self-driving cars

https://www.youtube.com/watch?v=ixIoDYVfKA0

How would you feel about paying more for the same product than the person checking out in front of you?

The real challenge: are you willing to get better value and more innovation for some loss of privacy?

Since there is no way to stop the accumulation of Big Data, should its use be regulated by the Federal government?
