Ethical Aspects of Data Mining. Information Capability: what can you do? Information Responsibility: what should you do?

Upload: alban-sutton

Post on 27-Dec-2015


Ethical Aspects of Data Mining
• Information Capability: what can you do?
• Information Responsibility: what should you do?

McGraw-Hill/Irwin ©2007 The McGraw-Hill Companies, Inc. All rights reserved

Major Concerns About Huge Amounts of Data

1. Could be used for negative purposes

2. Errors in the data

3. Access to data not well controlled

4. Collected for one purpose, used for another (data mining)

Simson Garfinkel
• “Database Nation: The Death of Privacy in the 21st Century” (Sebastopol, CA: O’Reilly & Associates, 2000)
• Has interesting views on the right to privacy
• The need for governmental control to assure privacy
• The book relates a series of government projects proposing centralized data

1965 National Data Center
• Envisioned to combine records from:
  – Bureau of the Census
  – Bureau of Labor Statistics
  – Internal Revenue Service
  – Social Security Administration
• Motivation: cut costs
  – Would lead to more accurate statistics
  – Princeton Institute: single site may offer better information security
• Canceled: public pressure (56% opposed)

Credit Bureau Database
• 1960s: credit bureaus widely used by business
    • Loans not repaid
    • Overdue credit card payments
    • Multiple address changes to escape creditors
    • Might contain every phase of life
  – Consumers rarely knew of its existence
  – Policies forbade consumers seeing their files
• 1971: stopped by Congress (Fair Credit Reporting Act)

1971 Fair Credit Reporting Act
• Allowed computerization
  – But gave consumers rights to
      • Review
      • Challenge
      • Insert their own version
• Industry complained
• 1970s & 1980s: consolidation of credit reporting industry to basically 3 firms
  – Not only give credit reports
  – ALSO WILL COMPUTE CREDIT SCORE
  – WILL SELL DEMOGRAPHICS & INFORMATION FOR DATA MINING

Governmental Data Mining
• Early 1980s: Federal Government
  – Matching programs
      • Catch fraud & abuse
      • Erroneous data often penalized innocent people
• 1994: Communications Assistance for Law Enforcement Act
  – New powers for wiretapping digital communications
• 1996: States required to
  – Display Social Security numbers on driver’s licenses
  – Issue medical patients unique identifiers
  – Both discontinued due to citizen backlash

Clipper
• 1991 proposal
• Use encryption systems
• Focus: tracking sexually explicit information sent to minors
• Might have required Internet providers to deploy far-reaching monitoring & censoring
• Courts: unconstitutional

Lotus & Equifax
• 1990: CD-ROM product
  – Lotus Marketplace: Households
      • Names, addresses, demographic data
      • Every US household
      • Intent: small businesses could target-market like large firms
• 30,000 people wrote to delete their names
• Project canceled

Lexis-Nexis
• 1996: P-TRAK database
  – Published SSNs of most US residents
• Thousands called switchboard to complain
• After 11 days, Lexis-Nexis discontinued product

Social Security Administration
• 1997: informed US taxpayers that detailed tax history was available over the Internet
  – Security provisions
      • Required some personal information
• Tens of thousands complained
• Senate investigated
• Service shut down

American Airlines
• Yield Management
  – Identify the probability of last-minute cancellations to allow overbooking
  – Develop price schedules that maximize revenue
• Consumers would like to have similar tools
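
The overbooking idea under Yield Management can be illustrated with a simple binomial model. This is a minimal sketch under stated assumptions, not American Airlines' actual method: the seat count, show-up probability, and tolerated risk are made-up values, the function names are hypothetical, and ticket holders are assumed to show up independently of one another.

```python
from math import comb


def prob_oversold(tickets_sold: int, seats: int, p_show: float) -> float:
    """Probability that more passengers show up than there are seats,
    assuming each ticket holder shows up independently with probability p_show."""
    return sum(
        comb(tickets_sold, k) * p_show**k * (1 - p_show) ** (tickets_sold - k)
        for k in range(seats + 1, tickets_sold + 1)
    )


def max_tickets(seats: int, p_show: float, max_risk: float) -> int:
    """Largest number of tickets to sell while keeping the oversold
    probability at or below max_risk."""
    tickets = seats
    while prob_oversold(tickets + 1, seats, p_show) <= max_risk:
        tickets += 1
    return tickets


# Hypothetical flight: 100 seats, 90% show-up rate, 5% tolerated oversell risk.
limit = max_tickets(seats=100, p_show=0.9, max_risk=0.05)
print(limit, prob_oversold(limit, 100, 0.9))
```

The better the cancellation probability is estimated from passenger data, the closer the ticket limit can be pushed toward the point where revenue is maximized without frequent oversold flights.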

DANGER
• Drug Enforcement Administration
  – Demanded access to drug chain frequent-buyer inventories

Threat from inference

• If fewer than three organizations sell a product, published total sales could reveal an individual organization's sales

• Insurance information about traffic violations and insurance claims
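
The sales example is an inference threat: a total that looks harmless exposes individual values when too few parties contribute to it. A minimal sketch follows, with hypothetical firm names and sales figures. The suppression rule shown (withhold totals with fewer than three contributors) is a common safeguard assumed here for illustration, not something the slides prescribe.

```python
from typing import Dict, Optional

MIN_CONTRIBUTORS = 3  # assumed threshold, echoing the "fewer than three" case


def publish_total(sales_by_firm: Dict[str, float]) -> Optional[float]:
    """Return total sales, or None (suppressed) when too few firms contribute."""
    if len(sales_by_firm) < MIN_CONTRIBUTORS:
        return None
    return sum(sales_by_firm.values())


def infer_competitor(published_total: float, own_sales: float) -> float:
    """With exactly two contributors, one firm recovers the other's sales
    by subtracting its own figure from the published total."""
    return published_total - own_sales


# Hypothetical figures for a product sold by only two firms.
two_firms = {"FirmA": 120.0, "FirmB": 80.0}
unsafe_total = sum(two_firms.values())  # total published without any check
leaked = infer_competitor(unsafe_total, two_firms["FirmA"])
print(leaked)                    # FirmA has recovered FirmB's confidential figure
print(publish_total(two_firms))  # the safeguarded version withholds the total
```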

Contention
• IT threats:
  – Runaway marketing
  – Personal information sold as a commodity
  – Intelligent computing threats
• Even if some data is intended to be protected, neural networks could use it without explaining why

Scope of Error
• 1991 study:
  – Sample of 1,500 reports
  – 43% of the files had errors
• Credit database errors
  – Fewer than 1% of files had errors
  – But that denied credit to over 2 million

Fingerprints
• 90+ elements
  – Odds of duplication low
      • Garfinkel calls them “absolutely unique”
• 1987: FBI had 23 million cards on file
  – Scale too great to use for anything but confirmation (given name)
• Sources of error
  – Entering data
  – Swapped in police lab
  – Records modified to frame the accused

DNA
• Widely accepted as “absolutely unique”
  – Except identical twins, by definition (3/1000)
  – Determined by heritage
      • Communities with high in-breeding share more
• Concern about DNA databank mission creep
  – Neural network technology could inadvertently lead to use of information without anyone realizing it
  – Government needs protection for spies, defectors, witnesses

Patient Medical Records
• “Please Respect Patient Confidentiality”
• Insurance companies have an interest in open knowledge
  – They argue “lower premiums”
  – More likely “higher profits”
• DANGER: perfect DNA knowledge
  – Insurers select clients
  – Ultimately, control over birth, allowable marriage

Data Mining
• Polk: buys motor vehicle registrations (http://usa.polk.com/News/LatestNews/2006_0504_hybrids.htm)
  – Combines make & model with census data
  – Sells to marketers to determine
      • Income
      • Lifestyle
      • Likelihood of purchasing any given product
• 21st-century marketing more one-to-one
  – Aggressively seek personalized information
  – Segmentation

Microsoft®
• 1997: Internet Explorer 4
  – Active Desktop
  – Cookies: audience of one
• ERP web portals
  – Same principle
  – Customize desktop

Web Data Mining Issues
• Privacy
• New forms of discrimination
  – Weblining: classifications based on irrelevant profiling data that marketing companies and others collect on the Web
• Spiliopoulou identified three web mining applications
  – Data acquisition
  – Measurement of cost and quality
  – Assessment of user/owner satisfaction

Protection

1. Exercise anonymity

2. Publicize & litigate

3. Track them as they track you

4. Fight for new laws

Tools to support web mining

• Portals

• Site Trackers

• Profilers

• Search bots

• Deep linking, meta-tagging tricks, framing, in-line linking

Web Ethics
• Utilitarian view
  – Greatest good for the greatest number
• Rawlsian view
  – More individual protection
• Pragmatic view
  – Compromise

MovieFone
• Traditional movie tracking company
  – Depends on extensive interviews with a sample of moviegoers
  – Error ± 5%
• MovieFone
  – Sells advance tickets
  – Predicts with less error
      • Actually samples the market
  – Same as Amazon.com predicting book sales
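
To put the ± 5% interview error in perspective, the sketch below applies the standard margin-of-error formula for a sampled proportion. The slides do not state a confidence level or sampling design, so the 95% level, the worst-case proportion of 0.5, and simple random sampling are all assumptions made for illustration; the function names are hypothetical.

```python
from math import ceil, sqrt

Z_95 = 1.96  # assumed 95% confidence level; the slide does not state one


def margin_of_error(sample_size: int, proportion: float = 0.5) -> float:
    """Margin of error for an estimated proportion from a simple random sample."""
    return Z_95 * sqrt(proportion * (1 - proportion) / sample_size)


def required_sample_size(margin: float, proportion: float = 0.5) -> int:
    """Smallest sample size whose margin of error does not exceed `margin`."""
    return ceil(Z_95**2 * proportion * (1 - proportion) / margin**2)


n = required_sample_size(0.05)
print(n, round(margin_of_error(n), 4))
```

Under these assumptions, shrinking the error means interviewing many more people, whereas a seller that records actual advance purchases observes the market directly.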

Code of Fair Information Practices
Dept. of Health, Education, and Welfare, 1973

• No personal data record-keeping systems whose very existence is secret

• People can find out what information about them is in a record and how it is used

• People can prevent information about them obtained for one purpose from being used or made available for other purposes without that person’s consent

• Any organization creating, maintaining, using, or disseminating records of identifiable personal data must assure the reliability of the data for their intended use and must take precautions to prevent misuses of the data

Canadian Personal Information Protection and Electronic Documents Act (2000)
• C-6 (http://www.parl.gc.ca/36/2/parlbus/chambus/house/bills/summaries/c6-e.htm)
  – Data collected from the Web or other sources
• Applies to federally regulated Canadian businesses
  – Banks & insurance
• Extended in 2003 to businesses regulated by Canadian provinces

Code for the Protection of Personal Information

1. Accountability
2. Identify purposes
3. Consent
4. Limiting collection
5. Limited use, disclosure, retention
6. Accuracy
7. Safeguards
8. Openness
9. Individual access
10. Challenging compliance