Cultural (Re-)formations: Structuring a Linked Data Ontology for Intersectional Identities

Susan Brown, University of Guelph, Canada
Abigel Lemak, University of Guelph, Canada
Colin Faulkner, University of Guelph, Canada
Kim Martin, University of Guelph, Canada
Rob Warren, Carleton University, Canada

Introduction

Cultural diversity has been an increasing source of debate within the digital humanities community. The concentration within the Debates in Digital Humanities series (Gold, 2012; Gold and Klein, 2016) of pieces reflecting the increasing prominence of matters related to race, gender, cultural diversity and difference is but one marker of the extent to which diversity matters. The Orlando Project in feminist literary history incorporated an intersectional understanding of identity categories from the outset (Brown, Clements and Grundy, 2006-2017). Translating Orlando’s Extensible Markup Language (XML) data into linked open data (LOD) to make it accessible, interoperable, and amenable to a range of analytical approaches (Simpson and Brown) requires an ontology that will serve both Orlando and the broader research community hosted by the Canadian Writing Research Collaboratory (CWRC). This paper outlines the CWRC ontology design and the challenges of shifting from semi-structured to structured data (Smith, 2016: 273).

Much work on digital diversity expresses skepticism of the ability of systematized knowledge structures to capture the performative, processual, and contingent nature of lived subjectivities. Tara McPherson (2012) stresses that “computers are themselves encoders of culture” and calls for more attention to be paid to the interconnectedness of the structures of code and the management of race socially: “Just as the relational database works by normalizing data—that is, by stripping it of meaningful, idiosyncratic context, creating a system of interchangeable equivalencies—our own scholarly practices tend to exist in relatively hermetically sealed boxes or nodes.” Scholars including Lisa Nakamura (2002: 120) and Moya Bailey (2011) see value in “messiness” as a way to push against and redefine the contours of a digital humanities scholarship that remains rooted in predominantly white epistemology.

At the same time, relegating representations of difference to narrative rather than structured data will produce gaps within big data that are both impoverishing for humanities inquiry and dangerous in their political implications (Lerman, 2013; Treviranus, 2014; “Use”; Brown and Simpson, 2013). Adriel Dean-Hall and Robert Warren (2013) have advocated approaches that respect the privacy and preferences of lived human subjects while improving the responsiveness of online systems to diversity and complexity. Within a LOD context, what are finally findable, processable, and reusable on the global graph are things, not strings, so the challenge is the extent to which nuance, context, and indeed messiness can be incorporated into a LOD ontology.

The Orlando Project (Brown et al., 2006-2017) charted a middle ground between narrative and structure for its bespoke XML tagset. The team struggled with the hierarchical nature of XML, particularly in relation to identity categories, torn between knowledge that readers would turn to Orlando to find writers associated with particular cultural identities and recognition that such categories are discursive rather than essential (Fuss, 2013). It devised a “Cultural Formation” tagset to depict identity as neither unitary nor immutable, and as much related to representational acts as to the lived experiences into which those representations blur. Precisely because they are constituted through discursive and social practices, vocabularies associated with subjectivities and identities can shift over time and place, and throughout an individual’s lifetime.



Cultural formation tagset

The Cultural Formation (CF) tagset recognizes categorization as endemic to social experience, while incorporating variation in terminology and contextualization of identity categories by employing tags at different discursive levels. CF tags describe the subject positions of individuals through 1) contextual tags that encode substantial discussions: class; language; nationality; race and ethnicity; religion; and sexuality; and 2) granular tags that describe, in a word or short phrase, class; ethnicity; gender; geographical heritage; language; nationality; national heritage; political affiliation; race or colour; religious denomination; and sexual identity. With the exception of gender and social class, the Orlando schema eschewed fixed attribute values for the granular tags, allowing the prose to employ the most appropriate language for the context. The structure was not entirely logical or parallel, and we are making the ontology more consistent. The granular tags possess attributes regarding forebears and whether a subject self-identified with a particular term. The tagset aimed to highlight the extent to which social classification is culturally produced and discursively embedded. Rather than disambiguating leaky cultural categories, it considered them as mutually constitutive with historically specific discursive frameworks, including our tagging structures.
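The two discursive levels described above can be pictured as a small data model. The following Python sketch is illustrative only: the tag names come from the lists in this section, but the class names, the attribute names `self_identified` and `forebear`, and the placeholder prose are assumptions made for the example and are not taken from the Orlando schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Tag names as listed in the text; the Python identifiers are illustrative.
CONTEXTUAL_TAGS = {
    "class", "language", "nationality",
    "race and ethnicity", "religion", "sexuality",
}
GRANULAR_TAGS = {
    "class", "ethnicity", "gender", "geographical heritage", "language",
    "nationality", "national heritage", "political affiliation",
    "race or colour", "religious denomination", "sexual identity",
}


@dataclass
class GranularTag:
    """A word or short phrase naming one identity category."""
    name: str                                # one of GRANULAR_TAGS
    text: str                                # free wording taken from the prose
    self_identified: Optional[bool] = None   # hypothetical attribute name
    forebear: Optional[str] = None           # hypothetical attribute name

    def __post_init__(self) -> None:
        if self.name not in GRANULAR_TAGS:
            raise ValueError(f"unknown granular tag: {self.name}")


@dataclass
class ContextualTag:
    """A substantial discussion that wraps zero or more granular tags."""
    name: str                                # one of CONTEXTUAL_TAGS
    prose: str
    granular: List[GranularTag] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.name not in CONTEXTUAL_TAGS:
            raise ValueError(f"unknown contextual tag: {self.name}")

    def terms(self, granular_name: str) -> List[str]:
        """Return the free-text wordings recorded for one granular tag."""
        return [g.text for g in self.granular if g.name == granular_name]


# Placeholder content: the prose string stands in for a real discussion.
example = ContextualTag(
    name="nationality",
    prose="(placeholder for a substantial discussion)",
    granular=[GranularTag("nationality", "English")],
)
```

Keeping `text` as a free string rather than an enumerated value mirrors the schema's decision to avoid fixed attribute values for most granular tags.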

CF encoding pointed users towards a framework for raising and debating complex matters for cultural investigation rather than standardized classifications, refusing to neatly group writers into distinct and fixed categories, since those categories were neither stable nor mutually exclusive (Algee-Hewitt, Porter, Walser, forthcoming). It can represent quite complex identities, as in the case of Anna Leonowens, the writer whose story of life as governess to the royal Siamese harem was popularized in The King and I. Partial markup for the first paragraph of her CF description is shown in Figure 1.

Figure 1: Adapted from Brown, Clements and Grundy, “Anna Leonowens”, Life tab, Show Markup option

The CF component of Orlando’s knowledge representation is thus crucial to its intersectional approach to identity (Brown et al., 2006). Creating a LOD ontology that is not self-referential, however, requires translating the strings or literal values from CF tags, to link Orlando’s semantic structures to other semantic web communities.

LOD ontology creation

An ontology “is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse” (Wikipedia, Ontology - Information Science). Using a standard ontology language such as OWL allows others to interact and exchange with a particular view of the world through a computational process of mediation. As a representation of that understanding, an ontology can be referenced, (dis)agreed with, extended, and used operationally. The coexistence of different representations provides the foundation for translations between LOD concepts.

Ontology creation in our case, as in many others, was driven by the idiosyncrasies and limitations of an existing dataset. The information architectures of application databases or XML stores are not always reconcilable to a consistent information system. The CF tagset represents a major challenge in that its structure was designed to eschew disambiguation. Even the major tags were difficult to relate within a concise ontology (Figure 2).

Figure 2: Schematic representation of the granular Cultural Formation tags from Orlando (Please note that these representations are simplified in order to make them legible to the reader.)

For example, nationality and national heritage are not employed as commensurate with citizenship, a well-defined legal concept related to an organized state. They can also be related to a geographical area, which may or may not coincide with a state. Finally, nationhood can reference socio-political constructs such as Lesbian Nation (Johnston, 1973; Ross, 1995; Munt, 1998) or disavowals of nationality such as Virginia Woolf’s (1938: 197), which Orlando quotes alongside assigning Woolf an English nationality, a contradiction that requires contextual evidence to make sense.

Linked into context

We decided to make all human-readable annotations within the dataset instances of contextual notes to which the ontological classes are directly tied (Figure 3).

Figure 3: Schematic representation of how the discursive context (note) links to the classificatory structure, and how classificatory labels relate to predicates and external ontologies. skos:narrower/broader relationships are also used, but omitted here to improve legibility.

Thus we model the discursive context within a Race[or]EthnicityContext class. The note instance links to instances of granular category labels, here RaceColour; it provides the provenance and the basis for links to source information. Linking to the provenance of the LOD is particularly important for disputed or contradictory information, as in our example. We are modeling the original Orlando narrative as a source document for our LOD provenance using the Web Annotation Data Model’s subproperty instances. We aim to link every triple to the prose from which it is derived; that prose provides provenance information and contains citations to the sources on which identity assertions are based.
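The linking pattern described above can be sketched with plain subject-predicate-object tuples. This is a minimal illustration under stated assumptions, not the CWRC serialization: `oa:Annotation`, `oa:hasBody`, and `oa:hasTarget` are terms from the Web Annotation vocabulary, but the way they are combined here, the `ex:` identifiers, the `ex:hasLabel` predicate, and the spelling of the context class are all hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def context_note_triples(note: str, context_class: str,
                         label: str, prose: str) -> List[Triple]:
    """Tie one context note to a category label and to its source prose."""
    annotation = note + "/annotation"           # hypothetical identifier scheme
    return [
        (note, "rdf:type", context_class),      # the note is a context instance
        (note, "ex:hasLabel", label),           # hypothetical linking predicate
        (annotation, "rdf:type", "oa:Annotation"),
        (annotation, "oa:hasBody", note),       # the note is the annotation body
        (annotation, "oa:hasTarget", prose),    # the prose passage it derives from
    ]


def prose_sources(note: str, triples: List[Triple]) -> List[str]:
    """Look up the prose passages from which a given note was derived."""
    annotations = {s for s, p, o in triples
                   if p == "oa:hasBody" and o == note}
    return sorted(o for s, p, o in triples
                  if p == "oa:hasTarget" and s in annotations)


# Illustrative identifiers only; none of these URIs come from the dataset.
triples = context_note_triples(
    note="ex:note-1",
    context_class="ex:RaceEthnicityContext",
    label="ex:RaceColourLabel",
    prose="ex:orlando-prose-1",
)
sources = prose_sources("ex:note-1", triples)
```

Routing provenance through a separate annotation node keeps the classificatory triple and its evidence distinct, so a disputed assertion can be traced back to its passage without altering the assertion itself.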

Relating cultural formations

Cultural formation for Orlando is understood primarily as representational, which is not to say that cultural formation is not real or that it has no material effects. The complex signifiers of cultural identities float across Orlando tags as cdata or free text in a semi-structured representation of cultural identities and categories. For the CWRC ontology, we strategized to relate this ontological perspective to that of external vocabularies without conflating our truth with theirs. Our architecture does not import other ontologies wholesale, but adopts components of major vocabularies such as BIBO, FOAF, and FRBR, and relates to large vocabularies in defined ways. As indicated in Figure 3, the instances of cwrc:whiteRaceColour and cwrc:whiteEthnicity within the CWRC ontology are subclasses of the cwrc:whiteLabel. This retains the ambiguity of terms such as “white” or “Jewish” precisely as labels that draw together particular types of identity categories, as well as subClasses of those labels. As indicated, those subClasses can be linked to terms in external vocabularies, but both internal and external terms are understood within the CWRC ontology as labels. Indeed, constructing this ontology has brought home to us the need for the LOD community to think through with greater care the relationship between representation and “reality” in LOD ontologies. A further complication is that identity categories are not only historically contingent but often also change over a particular individual’s lifetime. The Orlando dataset supports such nuance in only a few cases, so we have not started with this gnarly problem, but we aim to build into the ontology the capacity to represent such cultural formation dynamics in order to accommodate more temporally precise data.
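The label relationship described above can be illustrated with a small lookup. In this Python sketch the three `cwrc:` terms are the ones named in the text; the dictionary layout, the function names, and the `ext:placeholderTerm` entry are assumptions made for the example and do not correspond to any real external vocabulary term.

```python
from typing import Dict, List

# Subclass links named in the text: both terms sit under one shared label.
SUBCLASS_OF: Dict[str, str] = {
    "cwrc:whiteRaceColour": "cwrc:whiteLabel",
    "cwrc:whiteEthnicity": "cwrc:whiteLabel",
}

# Hypothetical external link: related to, not merged with, the internal term.
RELATED_EXTERNAL: Dict[str, List[str]] = {
    "cwrc:whiteRaceColour": ["ext:placeholderTerm"],
}


def ancestors(term: str) -> List[str]:
    """Walk upward through subclass links, nearest ancestor first."""
    chain: List[str] = []
    seen = set()
    while term in SUBCLASS_OF and term not in seen:
        seen.add(term)                  # guard against accidental cycles
        term = SUBCLASS_OF[term]
        chain.append(term)
    return chain


def grouped_under(label: str) -> List[str]:
    """List every term that a shared label draws together."""
    return sorted(t for t in SUBCLASS_OF if label in ancestors(t))


white_terms = grouped_under("cwrc:whiteLabel")
```

Because external terms are held in a separate mapping, they can be referenced from an internal term without being treated as equivalent to it.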

Conclusion

The CWRC ontology design avoids representing RDF extractions from Orlando data as positivist assertions, and yet produces machine-readable OWL/RDF-compliant graph structures. It allows references to, without endorsing, external ontological vocabularies that are nevertheless part of documenting intersectional cultural processes and identities.

We will present the CWRC ontology as built around the CF design described here, and we will demonstrate its implications through several practical examples. Figure 4 shows schematically the intersectionality of multiple identity categories associated with Leonowens, including the ways that instances are related by subclass relationships in accordance with OWL principles. This importantly allows us to reference components of other ontologies (here the Muninn Appearances ontology, Library of Congress Subject Headings, Getty Art and Architecture Thesaurus, and DBpedia) without adopting them wholesale.

Figure 4: Cultural Formation triples related to Anna Leonowens, with corresponding XML-encoded context notes

Figure 5 indicates the ability to see patterns and outliers related to different categorizations of Jewishness in a small subset of Orlando authors.

Figure 5: Subset of CF triples related to a subset of writers, with sample context annotations and external links; predicates linking individuals to subclasses are inferred (e.g. the edge between Elizabeth Sarah Gooch and cwrc:jewishReligion is hasReligion)

Our live presentation will demonstrate the ontology in action using the interactive HuViz (Humanities Visualizer) interface with a larger dataset.

Ontologies

• CWRC ontology: http://sparql.cwrc.ca/ontology/cwrc
• CWRC SPARQL endpoint: http://sparql.cwrc.ca/
• Orlando Biography schema containing Cultural Formation tagset: https://github.com/cwrc/CWRC-Schema/blob/master/schemas/orlando_biography.rng

Bibliography

Alexiev, V., Cobb, J., Garcia, G., and Harpring, P. (2016). Getty Art and Architecture Thesaurus. J. Paul Getty Trust. http://vocab.getty.edu/doc/queries

Algee-Hewitt, M., Porter, J. D. and Walser, H. (Forthcoming, 2017). “Representing race and ethnicity in American fiction: 1789-1964.”

Bailey, M. Z. (2011). “All the digital humanists are white, all the nerds are men, but some of us are brave.” Journal of Digital Humanities 1.1. http://journalofdigitalhumanities.org/1-1/all-the-digital-humanists-are-white-all-the-nerds-are-men-but-some-of-us-are-brave-by-moya-z-bailey/ (accessed 7 April 2017).

Brickley, D., and Miller, L. (2000-2014). FOAF Vocabulary Specification 0.99. http://xmlns.com/foaf/spec/

Brown, S., Clements, P., and Grundy, I. (eds.) (2006-2017). Orlando: Women’s Writing in the British Isles from the Beginnings to the Present. Cambridge: Cambridge University Press Online.

Brown, S., Clements, P., and Grundy, I. (2006). “Sorting things in: Feminist knowledge representation and changing modes of scholarly production.” Women’s Studies International Forum 29.3.

Brown, S., and Simpson, J. (2013, October). The curious identity of Michael Field and its implications for humanities research with the semantic web. In Big Data, 2013 IEEE International Conference on (pp. 77-85). IEEE.

Canadian Writing Research Collaboratory. (n.d.) http://cwrc.ca

D’Arcus, B., and Giasson, F. (2008-2013). Bibliographic Ontology Specification (BIBO). http://purl.org/ontology/bibo/ Structured Dynamics.

Davis, I., and Newman, R. (2005). Functional Requirements for Bibliographic Records (FRBR). http://purl.org/vocab/frbr/core#

DBpedia. (n.d.) http://wiki.dbpedia.org/

Dean-Hall, A. and Warren, R. (2013). “Sex, privacy, and ontologies.” SEXI. Rome, Italy. http://www.dbdump.org/~warren/publications/dean-hall:sexi:2013/dean-hall:sexi:2013.pdf (accessed 7 April 2017).

Fuss, D. (2013). Essentially Speaking: Feminism, Nature & Difference. New York: Routledge.

Gold, M. (ed.) (2012). Debates in the Digital Humanities. Minnesota: University of Minnesota Press.

Gold, M. K., and Klein, L. F. (eds.) (2016). Debates in the Digital Humanities 2016. Minnesota: University of Minnesota Press.

Johnston, J. (1973). Lesbian Nation: The Feminist Solution. New York: Simon and Schuster.

Lerman, J. (2013). “Big data and its exclusions.” 66 Stanford Law Review Online 55: 55-63. http://www.heinonline.org.subzero.lib.uoguelph.ca/HOL/Page?handle=hein.journals/slro66&start_page=55&collection=journals&id=66 (accessed 7 April 2017).

McPherson, T. (2012). “Why are the Digital Humanities so white? Or thinking the histories of race and computation.” In M. Gold (ed.). Debates in the Digital Humanities. Minnesota: University of Minnesota Press, pp. 139-160.

Muninn Project. (2012). “Appearances Ontology Specification - 0.1.” http://rdf.muninn-project.org/ontologies/appearances.html

Munt, S. (1998). “Sisters in exile: the lesbian nation.” New Frontiers of Space, Bodies and Gender. London: Routledge, pp. 3-19.

Nakamura, L. (2002). Cybertypes: Race, Ethnicity, and Identity on the Internet. London: Routledge.

Ross, B. (1995). The House that Jill Built: A Lesbian Nation in Formation. Toronto: University of Toronto Press.

Smith, J. (2016). “Working with the Semantic Web.” In C. Crompton, R. J. Lane, and R. Siemens (eds.). Doing Digital Humanities: Practice, training, research. London: Routledge, pp. 273-88.

Treviranus, J. (2014). “The value of the statistically insignificant.” Educause 49:1. http://er.educause.edu/articles/2014/1/the-value-of-the-statistically-insignificant (accessed 7 April 2017).

W3C. (2017). Web Annotation Data Model. 23 February 2017. https://www.w3.org/TR/annotation-model/ (accessed 7 April 2017).

Wikipedia contributors (2017). “Ontology (information science).” Wikipedia, The Free Encyclopedia. https://en.wikipedia.org/w/index.php?title=Ontology_(information_science)&oldid=772391479 (accessed 7 April 2017).

Woolf, V. (1938). Three Guineas. London: Hogarth Press.