ling/cse 472: introduction to computational linguistics · 2020-04-21 · hovy & spruitt 2016...

Ling/CSE 472: Introduction to Computational Linguistics 4/21 Societal Impact of NLP


Ling/CSE 472: Introduction to Computational Linguistics

4/21 Societal Impact of NLP

Overview

• Stakeholder-focused typology of risks of NLP & voice technology

• Value Scenarios

• Reading questions

Typology

• A systematic classification of phenomena, along one or more dimensions

• Helps to explore the space of possibilities

• Helps to understand relationships across categories

Previous work: Hovy & Spruit 2016

Hovy & Spruit 2016 “The Social Impact of Natural Language Processing”

• Survey of some types of issues

• Importantly raised awareness of the discussion within English-language NLP circles

• Introduced concepts of:

• Exclusion, Overgeneralization, Bias confirmation, Topic Overexposure, Dual use

• Illustrated with NLP-specific examples of negative impacts

• Not exhaustive, not a typology

Guiding principles: Sociolinguistics (e.g. Labov 1966, Eckert & Rickford 2001)

• Variation is the natural state of language

• Variation in pronunciation, word choice, grammatical structures

• Status as ‘standard’ language is a question of power, not anything inherent to the language variety itself

• Language varieties & features associated with marginalized groups tend to be stigmatized

• Meaning, including social meaning, is negotiated in language use

• Our social world is largely constructed through linguistic behavior

Guiding principles: Value sensitive design

• Value sensitive design (Friedman et al 2006, Friedman & Hendry 2019):

• Identify stakeholders

• Identify stakeholders’ values

• Design to support stakeholders’ values

Stakeholder-centered typology

Direct stakeholders      Indirect stakeholders
-------------------      ---------------------
By choice                Subject of query
Not by choice            Contributor to broad corpus
                         Subject of stereotypes

Direct stakeholders: By choice

• I choose to use this voice assistant, dictation software, machine translation system…

• … but it doesn’t work for my language or language variety

• Suggests that my language/language variety is inadequate

• Makes the product unusable for me

• … but the system doesn’t indicate how reliable it is

• Users reliant on machine translation/auto-captioning for important info left in the dark about what they might be missing
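One hedged illustration of the reliability point: if a captioning or translation system exposed per-word confidence scores (an assumption; many deployed systems do not), the interface could flag uncertain words instead of presenting all output as equally trustworthy. The function name `mark_uncertain`, the `[?...?]` markup, and the 0.6 threshold are hypothetical choices for this sketch, not taken from any real product.

```python
# Hypothetical threshold below which a word is shown as uncertain.
LOW_CONFIDENCE = 0.6


def mark_uncertain(segments, threshold=LOW_CONFIDENCE):
    """Return the text with low-confidence words wrapped in [?...?].

    `segments` is a list of (word, confidence) pairs, where confidence
    is assumed to be a number between 0 and 1 supplied by the system.
    """
    return " ".join(
        word if confidence >= threshold else f"[?{word}?]"
        for word, confidence in segments
    )


if __name__ == "__main__":
    # Invented example caption with invented confidence scores.
    caption = [("take", 0.95), ("two", 0.40), ("pills", 0.90)]
    print(mark_uncertain(caption))
```

Even a crude marker like this tells the reader where the output may be wrong, which matters most when the content is important and there is no second source to check against.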

Direct stakeholders: By choice

• I choose to use this information retrieval system…

• … but the presentation of the results juxtaposed against the question I have in mind obscures important

Direct stakeholders: Not by choice

• My screening interview was conducted by a virtual agent

• I can only access my account information via a virtual agent

• Access to a 911 system requires interaction with a virtual agent first

• … but it doesn’t work or doesn’t work well for my language variety

• I scored poorly on the interview, even though the content of my answers was good

• I can’t access my account information or 911

Direct stakeholders: Not by choice

• LM (language modeling) technology can now generate very real-sounding text, in English at least (Radford et al 2019)

• … but which is not grounded in any actual relationship to facts

• I mistake the text for statements made by a human publicly committing to them

• I become more distrustful of all text I see online

• Language models trained on ‘standard’ or ‘official’ sounding documents will sound ‘standard’ or ‘official’.

Indirect stakeholders: Subject of query

• Someone searched for me online

• … but the search triggered display of negative ads including my name because of stereotypes about my ethnic identity (Sweeney 2013)

• Someone searched for critics of the government

• … and found my blog post/tweet

• Someone put my words into an MT system

• … which got the translation wrong and led the police to arrest me (The Guardian, 24 Oct 2017; https://bit.ly/2zyEetp)

Indirect stakeholders: Subject of query

• Someone searched for me online

• … but the ethnicity associated with my name triggered display of negative ads including my name (Sweeney 2013)

• Someone searched for critics of the government

• … and found my blog post/tweet

• Someone put my words into an MT system

• … which got the translation wrong and led the police to arrest me (The Guardian, 24 Oct 2017; https://bit.ly/2zyEetp)

Indirect stakeholders: Subject of query

• Someone designed a system to classify people by identity characteristics according to linguistic features

• Information I thought I was presenting only in some venues is made available in others

Indirect stakeholders: Contributor to broad corpus

• ASR doesn't caption my words as well as others'

• My contributions are rendered invisible to search engines

• Language ID systems don’t identify my dialect

• Social-media based disease warning systems fail to work in my community (Jurgens et al 2017)

Indirect stakeholders: Subject of stereotypes

• Virtual assistants are gendered as female and ordered around

• Systems are built using general web text as a proxy for word meaning or world knowledge

• … but general web text reflects many types of bias (Bolukbasi et al 2016, Caliskan et al 2017, Gonen & Goldberg 2019)

• My restaurant’s positive reviews are underrated because of the name of the cuisine (Speer 2017)

• My resume is rejected because the screening system has learned that typically “masculine” hobbies correlate with getting hired

• My image search reflects stereotypes back to me
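A minimal sketch of how bias of this kind can be probed in word vectors, loosely in the spirit of the gender-direction analysis in Bolukbasi et al 2016. The 3-dimensional vectors below are invented for illustration and are not real embeddings; `gender_lean` is a hypothetical helper name. The idea is only that an occupation word which should be gender-neutral can still sit closer to one end of a "he minus she" direction when the vectors are learned from biased text.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy "embeddings", made up for this sketch only.
TOY_VECTORS = {
    "he": [1.0, 0.0, 0.2],
    "she": [-1.0, 0.0, 0.2],
    "engineer": [0.6, 0.8, 0.1],
    "nurse": [-0.6, 0.8, 0.1],
    "table": [0.0, 0.1, 1.0],
}


def gender_lean(word, vectors=TOY_VECTORS):
    """Cosine between a word vector and the he-minus-she direction.

    Positive values lean toward "he", negative toward "she",
    values near zero indicate no lean along this direction.
    """
    direction = [a - b for a, b in zip(vectors["he"], vectors["she"])]
    return cosine(vectors[word], direction)


if __name__ == "__main__":
    for word in ("engineer", "nurse", "table"):
        print(word, round(gender_lean(word), 3))
```

A downstream system such as a resume screener that consumes vectors like these inherits the lean without anyone having written a rule about gender.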

Whose job is this?

• Speech/language tech researchers & developers: build better systems, promote systems appropriately, educate the public

• Procurers: choose systems/training data that match use case, align task assigned to speech/language tech system with goals

• Consumers: understand speech/language tech system output as the result of pattern recognition, trained on some dataset somewhere

• Members of the public: learn about benefits and impacts of speech/language tech and advocate for appropriate policy

• Policy makers: consider impacts of pattern matching on progress towards equity, require disclosure of characteristics of training data

How can we empower people to do those jobs?

• Documentation of data sets and models trained on them (Thursday)

• Methodologies for thinking through how technology might interact with social systems (e.g. value sensitive design)

• identifying the people/communities likely to be impacted

• eliciting their input (e.g. Diverse Voices https://techpolicylab.uw.edu/project/diverse-voices/ )

• thinking through scenarios (e.g. http://www.envisioningcards.com/, value scenarios)

Value Scenarios

• Design Scenarios (Rosson & Carroll 2003): Tell the story of how the product, once developed, will be used. Focus on user, typically with happy outcomes.

• Value Scenarios (Nathan et al 2007): Tell the story of how the product, deployed pervasively over time, will impact society. Focus on both users and other stakeholders, imagine what could go wrong.

Value Scenarios: Elements

• Stakeholders

• Pervasiveness

• Time

• Systemic Effects

• Value Implications

Reading questions

• How does one weigh the possible negative outcomes against the intended positive outcomes of a technology and decide whether to produce the technology or not? Both scenarios in the reading seem extreme; is there a middle ground that explores both negative and positive effects?

• By looking at negative outcomes, the value scenario potentially presumes the technology is guilty of something, which can be bad for technological innovation.

Reading questions

• I'm curious how a value scenario would be formed. In the value scenario examples provided in the reading, there is a strong negative tone. Is this negative tone a deliberate decision meant to determine possible consequences? Perhaps a group developing a value scenario for a technology conducted user research and received a wide range of responses. In this case, would they try to focus on the more negative opinions and develop the value scenario from there, even if those opinions are held only by a small minority? Or would they try to take the responses and incorporate them as a whole?

• When talking about time (e.g. making predictions about 10-20 years from product deployment), I wonder what the usual time frame for making predictions is, because 10-20 years seems a bit unrealistic.

Reading questions

• Though, reading the Geminoid Jack scenario made me doubt the possibility of the value scenario. We're all in "quarantine" now and, as much as everyone else, I miss going out to physically see my friends and physically attend class. I can understand the validity of using a geminoid for Jack in the scenario and I understand the consequences it illustrated, but would such an invention actually "become massively popular" and make bank when we have other technology such as Zoom, messaging apps, and Uber Eats to do the tasks illustrated in the scenario as well?

Reading questions

• It was unclear to me who should write value scenarios. If it is the developers of some technology, what motivation would they have to write about their technology in a possibly negative light? If it is the policy makers, would they not be heavily biased toward a particular prediction about the future, depending on whether they support the technology or not? Even if multiple value scenarios are to be written, it seems like the author would hold much power in influencing what the reader thinks about the technology at hand.

• Are some of the limitations of scenario-based design discussed in this paper inevitable limitations of evaluating one's own designs? For example, the designers of a technology are always going to be biased towards considering the positive, intended uses of the technology.

Reading questions

• With this idea in consideration, what would motivate the developers to accurately and honestly identify negative impacts of their design, especially if it could completely hinder progress?

• When would a team consider value scenarios? Is this something you would consider initially and design around, or revisit multiple times during the design process to evaluate how changes to the initial design might affect stakeholders later on?

• Who creates these value scenarios and makes these evaluations on new technologies' potential impact? I imagine these are considerations that creators have, but maybe there are also some people who are more trained in making good, viable evaluations of the impact of new technology, perhaps if they have very strong knowledge of how society naturally changes.

Reading questions

• Is this actually being used in industry?

• I find it interesting that the two suggestions the authors give at the conclusion seemed kind of like... common sense to me? Every action that people take has consequences, regardless of whether they are overt or not. Technology is human-made, so that technology would carry its own consequences. Sure, it is extremely difficult to know exactly what humans will do with what is given to them (can be said for anything, really), but shouldn't designers always take into account what the product can do and not what it is created to do?

• This is a very subjective question, but how much onus is on the developers to ensure that a system like SafetyNet doesn't get used as a weapon against communities that are already discriminated against? There are certainly many things that could be done, like properly varying data, but ultimately the policy makers affect the decisions about how the tool can be used.

Reading questions

• I think a value scenario is a good way for people to engage in considering the long-term effects of the product. But how can value scenarios help people actually fix the problem? The main purpose of SBD is to use story-based scenarios to quickly identify key issues for fixes. The value scenario gives a possible future but fails to identify key issues for fixes. I think the main problem with the value scenario is that it negatively links an outcome to one factor, the last straw. In fact, the outcome is caused by multiple factors, such as lack of police resources, the continuing influence of the Federal Housing Administration, etc.

Reading questions

• What additional elements could be added to the Value Scenario alongside Pervasiveness, Stakeholders, Time, etc? Would it be a good idea to also consider a scenario where a technology is only used by a select group of people; i.e. the opposite of the Pervasiveness factor? I would think this could help envision if this technology would give the small group of people using it a disadvantage or advantage.

• In a value scenario, does "stakeholder" always have to refer to a person or group of people? Or could a stakeholder be something like a species of animals, for example, whose natural habitat is affected by some sort of emerging technology? The example scenarios in the paper were focused on humans, but it seems to me like it would be useful to go into in-depth analyses like these for technology that is intended to interact with non-humans.

Reading questions

• It was interesting to me that it was claimed that system effects are generally emergent, i.e. the implicitly unintended result of the composition of multiple local effects once a system is put into practice. How does this relate to modern NLP systems, some of which are not easy to split up into subroutines?