Localization and Translation Crowdsourcing: Challenges to theory and research

Katholieke Universiteit Leuven Digital Humanities

Sept. 10th, 2014

Miguel A. Jiménez Crespo [email protected]

1. INTRODUCTION

• The digital era: technology, the Internet and translation
• Defining crowdsourcing
  - Mapping crowdsourcing and related notions in TS
• Mapping crowdsourcing into TS research
  - Research paths and trajectories
    - Machine translation / computational linguistics
    - Industry research
    - Translation Studies research
  - Research questions and methods
    - Motivation and volunteers: a summary
• Future perspectives and research
• Conclusions

Miguel A. Jiménez-Crespo, Rutgers University

Digital era and translation

• The digital era permeates all aspects of our lives
• Web 2.0 and beyond:
  - democratizing, participatory, open, dynamic, user-driven
• Digital revolution in translation (e.g. Cronin 2013)
• Emergence of new translational phenomena
  - Localization, online crowdsourcing, HAMT

Digital era and translation

• Technology and translation theory
  - “the emergence of new technologies has transformed translation practice and is now exerting an impact on research and, as a consequence, on the theorization of translation.” (Munday 2012: 179)
• “Technological turn” (Cronin 2010; O’Hagan 2013; Malmkjaer 2013)
• Potential to reshape and re-conceptualize existing TS knowledge, models and paradigms.

Impact of technology

• New modalities
  - Videogame, web, software and smartphone app localization
• New practices and phenomena
  - Crowdsourcing
  - User-generated translations
  - Translation technology (TM, post-edited MT-TM, etc.)
  - Cloud-based collaborative translations
  - Etc.
• Impact on models
  - Translation as collaboration
  - Relation between source and target texts
    - “Sentence salad” (Bedard 2000), “collage translations” (Mossop 2005)
  - Professionalism and natural translation (Harris and Sherwood 1978)
  - Quality
  - Localization models (different from translation?) (Jiménez-Crespo 2013a)

Challenges in TS theory

• Individual character of translation
• Professionalism (Gouadec 2007) / natural translation (Harris and Sherwood 1978)
• Dynamic nature: source text, target text
• Translation as human-computer interaction (O’Brien 2012)
• Translation universals? (Jiménez-Crespo 2012)

2. Crowdsourcing – a definition?

Crowdsourcing - definition

• Compound: “crowd” + “outsourcing”
• Definition by Howe
  - “the act of taking a job traditionally performed by a designated agent […] and outsourcing it to an undefined, generally large group of people in the form of an open call” (Howe 2006: n.p.)
• “Collaborative translations on the web” (Shimohata et al. 2001)

Crowdsourcing - definition

• Elements common to definitions of crowdsourcing (Estellés 2012):
  (1) the crowd
  (2) the task at hand
  (3) the recompense obtained
  (4) the crowdsourcer or initiator of the crowdsourcing activity
  (5) what is obtained by them following the crowdsourcing process
  (6) the type of process
  (7) the call to participate
  (8) the medium

Defining crowdsourcing in TS

• Volunteer or community translations produced in some form of collaboration by a group of Internet users forming an online community (O’Hagan 2013)
• “collaborative efforts to translate content [...] either by enthusiastic amateurs [...] or by professional translators” (McDonough Dolmaya 2012: 169)

Translation Crowdsourcing

• Extends to translation the goal of the creator of the WWW
  - “the web is more a social creation than a technical one” (Berners-Lee 2000: 113)
• Renewed interest in the social, collaborative nature of translation (Risku and Dickinson 2009; Risku and Windhager 2013)
• Research combines the “sociological turn” (Wolf 2010; Angelelli 2011) and the “technological turn” (O’Hagan 2013).

Defining crowdsourcing in TS

• Terminological confusion:
  - “Translation crowdsourcing”
  - “User-generated” (O’Hagan 2009; Perrino 2009)
  - “Open” (Cronin 2010)
  - “Community”
  - “Volunteer” (Pym 2011)
  - “Hive” translation (Garcia 2009)
  - “Crowdtranslation” (Kageura et al. 2011), “CT3” (Ray and Kelly 2011)
• Synonyms among existing TS terms: collaborative, community, social, volunteer translation

Mapping TS concepts

Crowdsourcing types

Crowdsourcing types

• Communities can be structured:
  - Hierarchical structures within the community, selecting, depending on the task at hand, participants with different skill levels or characteristics
  - Members have different roles depending on hierarchy
    - e.g. professional translators have a higher chance of being editors or translation managers in TED (Camara forthcoming)
  - Evaluation criteria: pre-selection or during practice, e.g. Cucumis, Kiva

What types of texts are crowdsourced?

• Initially, web content (Shimohata et al. 2001; Murata et al. 2003) and other technology-related materials
• Today: anything.
  - Limited by volunteer interest in the cause or initiative
  - “Volunteers are in control” (O’Brien and Schäler 2010)
  - e.g. Cucumis

What is crowdsourced?

• Audiovisual texts
  - TV series (Orrego Carmona 2012)
  - Manga and anime (O’Hagan 2009)
  - User-generated videos (Baker 2014)
  - Medical information videos (Ludewid 2014)
• Texts requested by NGOs (Anastasiou and Schäler 2009; Petras 2011)
• Software
• MOOC courses (Beaven et al. 2013)
• Blogs (Grunwald 2011)
• SMS in times of crisis (Munro 2010; Shuderling 2013)
• Post-editing MT of any text type
  - To produce usable translations
  - To train MT engines (Zaidan and Callison-Burch 2012; Yan et al. 2013; Zbib et al. 2013)

Prototype approach to crowdsourcing definition

• Defining Web Localization
• Prototype approach (Jiménez-Crespo 2014)

Prototype approach to crowdsourcing definition

• Web-mediated environments (Estellés 2012)
• Open calls are initiated by companies and institutions
  - Also self-organized translation communities
• Web-based platforms
  - Facebook Translation, Minna no Hon’yaku, Transbey, Tradubi or the Rosetta Foundation
  - Process, roles and responsibilities such as manager, editor, translator, bilingual participant, etc.
• Participants are volunteers
  - Professionals and non-professionals
• Rewards mostly intrinsic in nature (help, personal or intellectual satisfaction, fun, etc.)
• Volunteers bring
  - Bilingual expertise and “natural translation” skills (Harris and Sherwood 1978)
  - “Professional translation competence” (PACTE 2005)

3. Research into Crowdsourcing

Research trends and perspectives

• MT and natural language processing
• Industry approaches
• Translation Studies
  - Theoretical (e.g. Cronin 2010; O’Hagan 2011; Fernández Costales 2012; Jiménez-Crespo 2011, 2013b)
  - Applied: training (O’Hagan 2008; Desjardins 2011; Babych et al. 2012)
  - Empirical

Research trends – MT, computational linguistics

• Trends of interest for TS
  - Development and testing of crowdsourcing models (Morera-Mesa and Filip 2013)
    - Specialized crowdsourcing platforms, non-profits or medical information (Laurenzi et al. 2013)
    - Platforms for training: MNH-TT (Babych et al. 2012)
  - Comparing crowdsourcing with MT (Anastasiou and Gupta 2011)
  - Crowdsourcing as a tool to obtain translations to feed and improve MT engines (Yan et al. 2013; Zbib et al. 2013; Zaidan and Callison-Burch 2011; Kumran et al. 2010; Callison-Burch 2009)

Industry research

• Descriptive (Desilets 2007; DePalma and Kelly 2008; Ray 2009; European Union 2011)
• Prescriptive accounts of best practices (Ray and Kelly 2011; Desilets and van der Meer 2011; Kelly, Ray and DePalma 2011)
  - Potential: research into implicit theoretical models, norms, etc.

TS research into crowdsourcing

• Faster than the usual slow pace at which TS has embraced the impact of technology (O’Hagan 2013)
• Description -> theorization -> applied -> hypothesis -> empirical studies
• Interdisciplinary
• Research questions
  - Motivation?
  - User profile?
  - Ethics (Drugan 2011)?
  - Quality (Jiménez-Crespo 2013b; Carreira Martínez 2011)?

Studies into motivation and volunteers’ profiles

• Perspective: sociology of translation (Wolf 2010)
  - Objective:
    - “the social role of the translators and the translators’ profession, translation as a social practice...” and “people and their observable actions” (Chesterman 2007: 173-174)
  - “Translator Studies” (Chesterman 2009)
    - Translators themselves and their sociological environment as the locus of research.
• Studies
  - TED talks (Camara forthcoming; Olohan 2014)
  - Wikipedia (McDonough Dolmaya 2012)
  - Rosetta Foundation NGO (O’Brien and Schäler 2010)
  - Facebook (Dombek 2013)
  - General (Fernández Costales 2012)

Methodologies

• Interactionist: online surveys, online interviews
• Documentary: analysis of blogs (Olohan 2014) and publications (Drugan 2011)
• Observational (Dombek 2013)
• “Netnography” (Dombek 2013)
  - written account resulting from fieldwork studying the cultures and communities that emerge from on-line, computer mediated, or Internet-based communications, where both the field work and the textual account are methodologically informed by the traditions and techniques of cultural anthropology (Kozinets 1998: 366)

Participants in studies

• 177 volunteers in TED talks (Camara forthcoming)
• 130 volunteers in the Rosetta Foundation NGO
• 75 English translation volunteers in Wikipedia
• 19 and 20 in the Polish Facebook volunteer community

Motivation – a summary

• Motivations are often divided into intrinsic and extrinsic (Frey 1997); the distinction is also used in studies of FOSS (Lakhani and Wolf 2005; Lakhani et al. 2007)
  - Intrinsic: personal enjoyment or a feeling of obligation to a specific community
  - Extrinsic: direct or indirect rewards, such as gaining more clients, getting presents or the potential to attract customers

Motivation – a summary

• Tier 1: intrinsic motivations
  - (1) Making information in other languages accessible to others
  - (2) Helping the organization with its mission, or a belief in the organization’s principles
  - (3) Intellectual reasons, related to what Shirky (2010) refers to as “cognitive surplus”

Motivation – A summary

• Tier 2: combination of intrinsic and extrinsic
  - (4) The desire to practice the second language or translation skills
  - (5) Professional motivations related to the need to gain translation experience or increase one’s reputation

Motivation – A summary

• Tier 3: intrinsic
  - (6) The desire to support lesser-known languages
  - (7) The satisfaction of completing something for the good of the community
  - (8) For fun
  - (9) To be part of a network

Motivation – A summary

• Studies identify a combination of intrinsic and extrinsic motivations
• If professionals participate (McDonough Dolmaya 2012), extrinsic motivations (reputation, attracting clients, etc.) are more significant.
• When asked what would motivate them more: feedback from translators and agencies (O’Brien and Schäler 2010)

Education profiles – a summary

• Depends on the initiative
  - TED and Wikipedia: 32 to 33% with full or partial translation training
  - Wikipedia: 6.7% with a full BA or MA degree
  - Polish Facebook community: mostly high school level, followed by elementary education and BA
• Certain initiatives have more prestige than others, attracting different profiles
• Differences with FOSS initiatives (± 60% trained)

Professional profiles – a summary

• Rosetta Foundation: 86% professional translators
• Translators: 12% in Wikipedia, 16.4% in TED
• Students: 37% in Wikipedia, 17.5% in TED
• Academics: 4% in Wikipedia, 8.5% in TED

Participants’ profiles

• Ages:
  - Facebook: mostly 16 to 25
  - TED: mainly 25 to 36 years old, followed by 16 to 25
    - 27.6% of participants from 36 to 61
• Participation in other translation crowdsourcing
  - 40% Wikipedia, 50% Facebook
• Time weekly
  - Mainly 2 to 5 hours a week

Conclusions

• The Internet and the WWW have caused a paradigm shift comparable to the invention of the printing press in the 15th century.
• Crowdsourcing keeps growing, attracting the attention of web users and researchers.
  - Only limits: volunteer interest and the cause itself (O’Brien and Schäler 2010). Volunteers rank initiatives, placing higher value on some than on others.
• Ability to disrupt the professional market: overestimated.
• Widening the scope of translation and the notion of translation quality
  - Translation: raw MT, human post-edited MT, crowdsourced, fully human, high-quality human translation
  - “Fit for purpose” quality model (Jiménez-Crespo 2013a)
• Research questions:
  - Ethics? Motivation? Quality?
  - Are these questions motivated by fear of the perceived disruptive nature of crowdsourcing? Researchers’ biases?

Conclusions

• Research into crowdsourcing: interdisciplinary
• Different perspectives of interest: MT, industry, TS
• Mainly framed within both “sociological approaches” and the “technological turn”. Translation turns overlap (Pym 2010)
• Variety of methodologies
  - Mainly “netnography” and online surveys and interviews
• Research questions:
  - Ethics? Motivation? Quality?
  - Are these questions motivated by fear of the perceived disruptive nature of crowdsourcing? Researchers’ biases?
• Most studied issue: motivation
  - Mostly intrinsic, since volunteers do not profit from the result
  - Combination of motivations
  - What is the motivation of professional translators? Is it similar?

Conclusions

• Potential for research from many other perspectives. For example:
  - Sociological: impact of crowdsourcing on society and on the translation market
    - Why are some initiatives more successful, and what types?
  - Ideological
  - Ethical: copyright? Economic benefit from crowdsourcing?
  - Cultural
  - Applied research
    - Using crowdsourcing platforms for training (O’Hagan 2008; Kageura et al. 2011)
    - Development of platforms and models with interested parties (Desilets 2010)

Conclusions

• Cognitive approaches
  - Translation as a “decision-making” and “problem-solving” process (Wilss 1994; Angelone and Shreve 2010)
  - How are web-based collaborative translations processed? Match with existing research into cognitive processes and CAT tools
  - Impact of platforms on the process
• Product-based approaches
  - Crowdsourced texts represent a distinctive textual population or “translation subset” (Chesterman 2004) to test “general tendencies of translation”
  - “Language of crowdsourced translation” (Jiménez-Crespo forthcoming)?

Thanks!

Miguel A. Jiménez Crespo

email: [email protected]