TRANSCRIPT
Responsibility and accountability in algorithm mediated services
Ansgar Koene
Libel, Privacy, Data Protection and Online Legal Action - A Practitioner's Guide
25 November 2016
http://unbias.wp.horizon.ac.uk/
Algorithms in the news
E. Bakshy, S. Messing & L.A. Adamic, "Exposure to ideologically diverse news and opinion on Facebook", Science, 348, 1130-1132, 2015
Echo-chamber enhancement by NewsFeed algorithm
[Figure: proportion of content that is cross-cutting, among 10.1 million active US Facebook users]
Search engine manipulation effect could impact elections and distort competition
Experiments that manipulated the search rankings for information about political candidates for 4,556 undecided voters:
i. biased search rankings can shift the voting preferences of undecided voters by 20% or more
ii. the shift can be much higher in some demographic groups
iii. such rankings can be masked so that people show no awareness of the manipulation.
R. Epstein & R.E. Robertson “The search engine manipulation effect (SEME) and its possible impact on the outcome of elections”, PNAS, 112, E4512-21, 2015
• White House: Big Data: A Report on Algorithmic Systems, Opportunity, and Civil Rights
• Council of Europe: Committee of experts on Internet Intermediaries (MSI-NET)
• European Parliament: Algorithmic accountability and transparency in the digital age (Marietje Schaake MEP/ALDE)
• European Commission: eCommerce & Platforms launching a 2-year investigation into algorithms
• House of Lords Communications Committee inquiry “Children and the Internet” (ongoing)
• Commons Science and Technology Committee inquiry “Robotics and Artificial Intelligence” (2016) -> recommendation for standing Commission on AI
• HoL EU Internal Market Sub-Committee inquiry “Online platforms and the EU Digital Single Market” (2016)
Governmental inquiries
• Partnership on Artificial Intelligence to Benefit People and Society: consortium founded by Amazon, Facebook, Google, Microsoft, and IBM to establish best practices for artificial intelligence systems and to educate the public about AI.
• IEEE Global Initiative for Ethical Considerations in Artificial Intelligence and Autonomous Systems -> development of Standards on algorithmic bias, transparency, accountability
Industry response
• Similar to existing rights under the Data Protection Act
• Individuals have the right not to be subject to a decision when:
  – it is based on automated processing; and
  – it produces a legal effect or a similarly significant effect on the individual.
• You must ensure that individuals are able to:
  – obtain human intervention;
  – express their point of view; and
  – obtain an explanation of the decision and challenge it.
GDPR: Rights related to automated decision making and profiling
• The right does not apply if the decision:
  – is necessary for entering into or performance of a contract between you and the individual;
  – is authorised by law (eg for the purposes of fraud or tax evasion prevention); or
  – is based on explicit consent (Article 9(2)).
• Furthermore, the right does not apply when a decision does not have a legal or similarly significant effect on someone.
When processing personal data for profiling purposes, appropriate safeguards must be in place to:
• Ensure processing is fair and transparent by providing meaningful information about the logic involved, the significance and envisaged consequences.
• Use appropriate mathematical or statistical procedures.
• Implement appropriate technical and organisational measures to enable inaccuracies to be corrected and minimise the risk of errors.
• Secure personal data proportionate to the risk to the interests and rights of the individual and prevent discriminatory effects.
Automated decisions must not:
– concern a child; or
– be based on the processing of special categories of data unless:
  • you have the explicit consent of the individual; or
  • the processing is necessary for reasons of substantial public interest on the basis of EU / Member State law.
• A set of defined steps that if followed in the correct order will computationally process input (instructions and/or data) to produce a desired outcome. [Miyazaki 2012]
• From a programming perspective:
Algorithm = Logic + Control
logic is problem domain-specific and specifies what is to be done
control is the problem-solving strategy specifying how it should be done
• Problems have to be abstracted and structured into a set of instructions which can be coded.
What is an algorithm?
Calculate the number of ghost estates in Ireland using a database of all the properties in the country that details their occupancy and construction status.
1. Define what a ghost estate is in terms of:
   (a) how many houses grouped together make an estate?
   (b) what proportion of these houses have to be empty or under construction for that estate to be labelled a ghost estate?
2. Combine these rules into a formula -- "a ghost estate is 10 or more houses where over 50% are vacant or under-construction".
3. Write a program that searches and sifts the property database to find estates that meet the criteria and totals up the number.
• We could extend the algorithm to record coordinates of qualifying estates and use another set of algorithms to plot them onto a map.
• In this way lots of relatively simple algorithms are structured together to form large, often complex, recursive decision trees.
Example
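The three steps above can be sketched in code. This is a minimal illustration, assuming a hypothetical list of property records, each a pair of (estate identifier, vacant-or-under-construction flag); the real exercise would run against the national property database.

```python
# Sketch of the ghost-estate algorithm: group properties by estate,
# then apply the formula "10 or more houses, over 50% vacant or
# under-construction". Data layout here is hypothetical.
from collections import defaultdict

def count_ghost_estates(properties, min_houses=10, vacancy_threshold=0.5):
    """Count estates with at least min_houses where more than
    vacancy_threshold of the houses are vacant or under construction."""
    estates = defaultdict(list)
    for estate_id, is_vacant in properties:
        estates[estate_id].append(is_vacant)
    ghosts = 0
    for houses in estates.values():
        if len(houses) >= min_houses and sum(houses) / len(houses) > vacancy_threshold:
            ghosts += 1
    return ghosts

# Estate "A": 10 houses, 6 vacant -> ghost estate.
# Estate "B": only 3 houses -> too small to be an estate.
records = [("A", i < 6) for i in range(10)] + [("B", True) for _ in range(3)]
print(count_ghost_estates(records))  # 1
```

Note how every definitional choice on the slide (minimum estate size, vacancy threshold) becomes an explicit parameter: the "logic" is encoded in the formula, the "control" in the grouping-and-counting strategy.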
• Defining precisely what a task/problem is (logic).
• Breaking that down into a precise set of instructions, factoring in any contingencies, such as how the algorithm should perform under different conditions (control).
• “Explain it to something as stonily stupid as a computer” (Fuller 2008).
• Many tasks and problems are extremely difficult or impossible to translate into algorithms and end up being hugely oversimplified.
• Mistranslating the problem and/or solution will lead to erroneous outcomes and random uncertainties.
The challenge of translating a task/problem into an algorithm
• Algorithms are mostly presented “to be strictly rational concerns, marrying the certainties of mathematics with the objectivity of technology”.
• The complex set of decision making processes and practices, and the wider systems of thought, finance, politics, legal codes and regulations, materialities and infrastructures, institutions, inter-personal relations, that shape their production are not discussed.
• Algorithms are presented as objective, impartial, reliable, and legitimate
• In reality code is not purely abstract and mathematical; it has significant social, political, and aesthetic dimensions.
The myth of algorithms
• Algorithms are created through: trial and error, play, collaboration, discussion, and negotiation.
• They are teased into being: edited, revised, deleted and restarted, shared with others, passing through multiple iterations stretched out over time and space.
• They are always somewhat uncertain, provisional and messy, fragile accomplishments.
• Algorithmic systems are not standalone little boxes, but massive, networked ones with hundreds of hands reaching into them, tweaking and tuning, swapping out parts and experimenting with new arrangements.
Algorithm creation
• Companies' algorithms provide a competitive edge which they are reluctant to expose; non-disclosure agreements are commonly in place.
• They also want to limit the ability of users to game the algorithm to gain an unfair competitive edge.
• Many algorithms are designed to be reactive and mutable to inputs. E.g.: Facebook’s NewsFeed algorithm does not act from above in a static, fixed manner. Posts are ordered dependent on how one interacts with ‘friends’. The parameters are contextually weighted and fluid. In other cases, randomness might be built into an algorithm’s design meaning its outcomes can never be perfectly predicted.
The transparency challenge
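The contextual weighting the slide describes can be illustrated with a toy ranking function. This is purely a sketch of the general idea, not Facebook's actual algorithm: the scoring function, field names, and weights are all hypothetical.

```python
# Illustrative sketch of contextually weighted ranking: post scores
# depend on each viewer's interaction history (their per-author
# "affinity"), so the same posts rank differently for different users.

def rank_posts(posts, affinity):
    """Order posts by a score combining recency with the viewer's
    affinity for each author; affinity is per-user and mutable."""
    def score(post):
        return post["recency"] + 2.0 * affinity.get(post["author"], 0.0)
    return sorted(posts, key=score, reverse=True)

posts = [{"author": "alice", "recency": 0.9},
         {"author": "bob", "recency": 0.5}]

# Two users with different interaction histories see different orderings.
order_u1 = rank_posts(posts, {"bob": 0.8})     # bob's post ranks first
order_u2 = rank_posts(posts, {"alice": 0.8})   # alice's post ranks first
print(order_u1[0]["author"], order_u2[0]["author"])  # bob alice
```

Because the weights are fluid and user-dependent, there is no single fixed ordering to disclose, which is part of why "publish the algorithm" is a harder ask than it sounds.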
• Deconstructing and tracing how an algorithm is constructed in code and mutates over time is not straightforward.
• Code often takes the form of a "Big Ball of Mud": "[a] haphazardly structured, sprawling, sloppy, duct-tape and bailing wire, spaghetti code jungle".
Examining pseudo-code/source code
• Reverse engineering is the process of articulating the specifications of a system through a rigorous examination drawing on domain knowledge, observation, and deduction to unearth a model of how that system works.
• By examining what data is fed into an algorithm and what output is produced it is possible to start to reverse engineer how the recipe of the algorithm is composed (how it weights and preferences some criteria) and what it does.
Reverse engineering
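A minimal sketch of the input/output probing idea: vary one input of an opaque scoring function at a time and observe the change in output to infer the weighting it applies. The black box here is a hypothetical linear scorer standing in for any opaque ranking algorithm; real systems are non-linear and stateful, so in practice this only yields an approximate model.

```python
# Reverse engineering by probing: the analyst cannot read black_box's
# source, only call it with chosen inputs and observe outputs.

def black_box(recency, popularity):          # internals unknown to the analyst
    return 0.7 * recency + 0.3 * popularity

# Probe with controlled inputs: change one feature at a time.
base = black_box(0.0, 0.0)
recency_weight = black_box(1.0, 0.0) - base
popularity_weight = black_box(0.0, 1.0) - base

# The recovered recipe: recency is weighted more heavily than popularity.
print(recency_weight, popularity_weight)
```

The same strategy underlies algorithm audits: feed crafted profiles or queries into a platform and compare the outputs to learn how the recipe weights and preferences some criteria over others.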
Operationalizing “fairness” in algorithms
Source: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2515786
• We want a fair mapping f: CS -> DS, from the construct space (the attributes we actually care about) to the decision space.
• We do not know CS; we can only approximate it through observation.
• Thus we are dealing with f: OS -> DS, from the observed space to the decision space.
• Equality of outcomes:
  – [We're All Equal] assume that all groups are similar in CS; group differences in OS are due to observation bias.
• Equality of treatment:
  – [WYSIWYG] assume OS is a true representation of CS.
equality of outcomes vs. equality of treatment
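The contrast between the two worldviews can be made concrete with a toy decision log. This is a hedged illustration with hypothetical data: under equality of outcomes we compare group-level positive-decision rates; under equality of treatment we check that identical observed features receive identical decisions.

```python
# Hypothetical decision log: (group, observed_score, positive_decision).
decisions = [
    ("X", 0.9, True), ("X", 0.4, False), ("X", 0.8, True),
    ("Y", 0.9, True), ("Y", 0.5, False), ("Y", 0.3, False),
]

def positive_rate(group):
    """Fraction of positive decisions for one group (outcomes view)."""
    rows = [pos for g, _, pos in decisions if g == group]
    return sum(rows) / len(rows)

# Equality of outcomes: group rates differ (2/3 vs 1/3), which the
# "We're All Equal" view would flag as observation bias to correct.
print(positive_rate("X"), positive_rate("Y"))

# Equality of treatment: the WYSIWYG view only asks whether equal
# observed scores got equal decisions, regardless of group.
same_score_outcomes = {(s, pos) for _, s, pos in decisions if s == 0.9}
print(len(same_score_outcomes) == 1)  # True: both 0.9 scorers treated alike
```

The same log thus passes one fairness test and fails the other, which is the crux of the slide: "fair" depends on which assumption about CS you adopt.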
• Certification: test the system with representative data sets X and Y.
  – Problem: how to guarantee representative data in CS
Certifying disparate impact
Source: http://arxiv.org/abs/1609.07236
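One common certification test from the disparate-impact literature is the "80% rule": compare the positive-outcome rate of the disadvantaged group (X=0) to that of the advantaged group (X=1), and flag the system if the ratio falls below 0.8. The data below is illustrative.

```python
# Certifying disparate impact via the 80% rule, on hypothetical
# outcome lists (1 = positive decision, 0 = negative decision).

def disparate_impact(outcomes_x0, outcomes_x1):
    """Ratio of positive-outcome rates: X=0 group over X=1 group."""
    rate0 = sum(outcomes_x0) / len(outcomes_x0)
    rate1 = sum(outcomes_x1) / len(outcomes_x1)
    return rate0 / rate1

x0 = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]  # disadvantaged group: 20% positive
x1 = [1, 1, 0, 1, 0, 1, 0, 1, 1, 0]  # advantaged group: 60% positive
ratio = disparate_impact(x0, x1)
print(ratio < 0.8)  # True: this system fails the certification test
```

As the slide notes, the catch is in the test data: the certificate only speaks to the datasets you ran, and there is no guarantee those are representative of CS.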
• Assume bias in the CS -> OS mapping
• Perform a re-mapping such that the OS distributions of the X=1 and X=0 groups are the same
Removing disparate impact
[Figure: score distributions for the X=1 and X=0 groups, before and after re-mapping]
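The re-mapping idea can be sketched with within-group percentile ranks: replace each score by its rank inside its own group, so the X=1 and X=0 score distributions coincide while the ordering within each group is preserved. This is a simplified stand-in for the full repair procedure (which maps both groups onto a common median distribution); the scores below are hypothetical and assumed distinct.

```python
# Removing disparate impact by rank-preserving re-mapping.

def to_percentiles(scores):
    """Replace each (distinct) score by its within-group percentile
    rank in [0, 1], preserving the group's internal ordering."""
    order = sorted(scores)
    n = len(scores)
    return [order.index(s) / (n - 1) for s in scores]

x1_scores = [0.9, 0.7, 0.5, 0.3]   # advantaged group
x0_scores = [0.6, 0.4, 0.2, 0.1]   # disadvantaged group, shifted lower

# After re-mapping, both groups share the distribution {0, 1/3, 2/3, 1},
# so any threshold on the re-mapped score selects the same fraction
# of each group -- no disparate impact on the re-mapped data.
print(sorted(to_percentiles(x1_scores)) == sorted(to_percentiles(x0_scores)))  # True
```

The cost, as the slide's before/after picture suggests, is that raw scores lose their absolute meaning: fairness is bought by changing what the algorithm sees.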
• Fairness is fundamentally a societally defined construct (e.g. equality of outcomes vs equality of treatment)
  – Cultural differences between nations/jurisdictions
  – Cultural changes in time
• "Code is Law": algorithms, like laws, both operationalize and entrench spatio-temporal values
• Algorithms, like the law, must be:
  – transparent
  – adaptable to change (by a balanced process)
Problems
EPSRC funded UnBias project
http://unbias.wp.horizon.ac.uk/
• WP1: ‘Youth Juries’ workshops with “digital natives” to co-produce citizen education materials on filtering/recommendation algorithms
• WP2: Hackathons and double-blind testing to produce user-friendly open source tools for benchmarking and visualizing biases in algorithms
• WP3: Interviews and user observation to derive requirements for algorithms that satisfy subjective criteria of bias avoidance
• WP4: Broad stakeholder focus groups to develop policy briefs for an information and education governance framework
Project activities
[email protected]
@UnBias_algos
http://unbias.wp.horizon.ac.uk
Questions?
Source material:
Defining algorithms: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2515786
Mathematical definition of fairness: http://arxiv.org/abs/1609.07236
Certifying and removing disparate impact: http://arxiv.org/abs/1412.3756