Cultural (Re-)formations: Structuring a Linked Data Ontology for Intersectional Identities
[email protected], [email protected], [email protected], [email protected], [email protected], Canada
Introduction
Cultural diversity has been an increasing source of debate within the digital humanities community. The concentration within the Debates in Digital Humanities series (Gold, 2012; Gold and Klein, 2016) of pieces reflecting the increasing prominence of matters related to race, gender, cultural diversity and difference is but one marker of the extent to which diversity matters. The Orlando Project in feminist literary history incorporated an intersectional understanding of identity categories from the outset (Brown, Clements and Grundy, 2006-2017). Translating Orlando’s Extensible Markup Language (XML) data into linked open data (LOD) to make it accessible, interoperable, and amenable to a range of analytical approaches (Simpson and Brown) requires an ontology that will serve both Orlando and the broader research community hosted by the Canadian Writing Research Collaboratory (CWRC). This paper outlines the CWRC ontology design and the challenges of shifting from semi-structured to structured data (Smith, 2016: 273).
Much work on digital diversity expresses skepticism of the ability of systematized knowledge structures to capture the performative, processual, and contingent nature of lived subjectivities. Tara McPherson stresses that “computers are themselves encoders of culture” and calls for more attention to be paid to the interconnectedness of the structures of code and the management of race socially: "Just as the relational database works by normalizing data—that is, by stripping it of meaningful, idiosyncratic context, creating a system of interchangeable equivalencies—our own scholarly practices tend to exist in relatively hermetically sealed boxes or nodes." Scholars including Lisa Nakamura (2002: 120) and Moya Bailey (2011) see value in “messiness” as a way to push against and redefine the contours of a digital humanities scholarship that remains rooted in predominantly white epistemology.
At the same time, relegating representations of difference to narrative rather than structured data will produce gaps within big data that are both impoverishing for humanities inquiry and dangerous in their political implications (Lerman, 2013; Treviranus, 2014; “Use”; Brown and Simpson, 2013). Adriel Dean-Hall and Robert Warren (2013) have advocated approaches that respect the privacy and preferences of lived human subjects while improving the responsiveness of online systems to diversity and complexity. Within a LOD context, what are finally findable, processable, and reusable on the global graph are things, not strings, so the challenge is the extent to which nuance, context, and indeed messiness can be incorporated into a LOD ontology.
The Orlando Project (Brown et al., 2006-2017) charted a middle ground between narrative and structure for its bespoke XML tagset. The team struggled with the hierarchical nature of XML, particularly in relation to identity categories, torn between knowledge that readers would turn to Orlando to find writers associated with particular cultural identities and recognition that such categories are discursive rather than essential (Fuss, 2013). It devised a “Cultural Formation” tagset to depict identity as neither unitary nor immutable, and as much related to representational acts as to the lived experiences into which those representations blur. Precisely because constituted through discursive and social practices, vocabularies associated with subjectivities and identities can shift over time and place, and throughout an individual’s lifetime.
Cultural formation tagset
The Cultural Formation (CF) tagset recognizes categorization as endemic to social experience, while incorporating variation in terminology and contextualization of identity categories by employing tags at different discursive levels. CF tags describe the subject positions of individuals through 1) contextual tags that encode substantial discussions: class; language; nationality; race and ethnicity; religion; and sexuality; and 2) granular tags that describe, in a word or short phrase, class; ethnicity; gender; geographical heritage; language; nationality; national heritage; political affiliation; race or colour; religious denomination; and sexual identity. With the exception of gender and social class, the Orlando schema eschewed fixed attribute values for the granular tags, allowing the prose to employ the most appropriate language for the context. The structure was not entirely logical or parallel, and we are making the ontology more consistent. The granular tags possess attributes regarding forebears and whether a subject self-identified with a particular term. The tagset aimed to highlight the extent to which social classification is culturally produced and discursively embedded. Rather than disambiguating leaky cultural categories, it considered them as mutually constitutive with historically specific discursive frameworks, including our tagging structures.
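The two discursive levels described above can be illustrated with a small sketch. It is only an approximation under stated assumptions: the element names, attribute names, and sample sentence below are invented for illustration and are not taken from the Orlando schema or from Orlando data. The sketch shows how a granular tag might be read out of XML together with its attributes and the prose of the contextual tag that surrounds it.

```python
import xml.etree.ElementTree as ET

# Invented fragment: element names, attribute names, and content are
# illustrative assumptions, not the Orlando schema or real Orlando data.
SAMPLE = """\
<culturalFormation>
  <raceEthnicityContext>
    She was <raceColour selfDefined="no">white</raceColour>, with a
    <nationalHeritage forebear="father">Welsh</nationalHeritage> father.
  </raceEthnicityContext>
</culturalFormation>
"""


def granular_tags(xml_text):
    """Return one record per granular tag, keeping its surrounding prose."""
    root = ET.fromstring(xml_text)
    records = []
    for context in root:  # contextual tags hold the substantial discussion
        # Collapse whitespace so the prose reads as a single sentence.
        prose = " ".join("".join(context.itertext()).split())
        for tag in context:  # granular tags mark a word or short phrase
            records.append({
                "context": context.tag,
                "category": tag.tag,
                "term": tag.text,
                "attributes": dict(tag.attrib),
                "prose": prose,
            })
    return records


for record in granular_tags(SAMPLE):
    print(record["category"], "->", record["term"], record["attributes"])
```

Keeping the prose alongside each extracted term reflects the design goal stated above: the term is free text chosen for its context, so it is only interpretable together with that context.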
CF encoding pointed users towards a framework for raising and debating complex matters for cultural investigation rather than standardized classifications, refusing to neatly group writers into distinct and fixed categories, since those categories were neither stable nor mutually exclusive (Algee-Hewitt, Porter, Walser, forthcoming). It can represent quite complex identities, as in the case of Anna Leonowens, the writer whose story of life as governess to the royal Siamese harem was popularized in The King and I. Partial markup for the first paragraph of her CF description is shown in Figure 1.
Figure 1: Adapted from Brown, Clements and Grundy, “Anna Leonowens”, Life tab, Show Markup option
The CF component of Orlando’s knowledge representation is thus crucial to its intersectional approach to identity (Brown et al., 2006). Creating a LOD ontology that is not self-referential, however, requires translating the strings or literal values from CF tags in order to link Orlando’s semantic structures to other semantic web communities.
LOD ontology creation
An ontology “is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse” (Wikipedia, Ontology - Information Science). Using a standard ontology language such as OWL allows others to interact and exchange with a particular view of the world through a computational process of mediation. As a representation of that understanding, an ontology can be referenced, (dis)agreed with, extended, and used operationally. The coexistence of different representations provides the foundation for translations between LOD concepts.
Ontology creation in our case, as in many others, was driven by the idiosyncrasies and limitations of an existing dataset. The information architectures of application databases or XML stores are not always reconcilable to a consistent information system. The CF tagset represents a major challenge in that its structure was designed to eschew disambiguation. Even the major tags were difficult to relate within a concise ontology (Figure 2).
Figure 2: Schematic representation of the granular Cultural Formation tags from Orlando (Please note that these representations are simplified in order to make them legible to the reader.)
For example, nationality and national heritage are not employed as commensurate with citizenship, a well-defined legal concept related to an organized state. They can also be related to a geographical area, which may or may not coincide with a state. Finally, nationhood can reference socio-political constructs such as Lesbian Nation (Johnston, 1973; Ross, 1995; Munt, 1998) or disavowals of nationality such as Virginia Woolf’s (1938: 197), which Orlando quotes alongside assigning Woolf an English nationality, a contradiction that requires contextual evidence to make sense.
Linked into context
We decided to make all human-readable annotations within the dataset instances of contextual notes to which the ontological classes are directly tied (Figure 3).
Figure 3: Schematic representation of how the discursive context (note) links to the classificatory structure, and how classificatory labels relate to predicates and external ontologies. skos:narrower/broader relationships are also used, but omitted here to improve legibility.
Thus we model the discursive context within a Race[or]EthnicityContext class. The note instance links to instances of granular category labels, here RaceColour; it provides the provenance and the basis for links to source information. Linking to the provenance of the LOD is particularly important for disputed or contradictory information, as in our example. We are modeling the original Orlando narrative as a source document for our LOD provenance using the Web Annotation Data Model’s subproperty instances. We aim to link every triple to the prose from which it is derived, which provides provenance information and contains citations to the sources on which identity assertions are based.
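The aim of linking every triple to the prose from which it is derived can be sketched in a few lines. This is a minimal illustration and not the CWRC implementation: the class names, the `ex:` identifiers, and the rule of rejecting untraceable assertions are all assumptions introduced here, and the sketch does not use the Web Annotation Data Model vocabulary itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContextNote:
    """A contextual note: the prose an assertion is derived from."""
    note_id: str
    prose: str
    citations: tuple = ()


@dataclass
class Graph:
    """A toy triple store in which each triple carries a note identifier."""
    notes: dict = field(default_factory=dict)
    triples: list = field(default_factory=list)  # (s, p, o, note_id)

    def add_note(self, note):
        self.notes[note.note_id] = note

    def assert_triple(self, s, p, o, note_id):
        # Assumed policy: refuse assertions that cannot be traced to prose.
        if note_id not in self.notes:
            raise KeyError(f"unknown context note: {note_id}")
        self.triples.append((s, p, o, note_id))

    def provenance(self, s, p, o):
        """Return every note that supports the triple (s, p, o)."""
        return [self.notes[n] for (ts, tp, to, n) in self.triples
                if (ts, tp, to) == (s, p, o)]


# Hypothetical identifiers and placeholder prose, for illustration only.
g = Graph()
g.add_note(ContextNote("ex:note1", "Placeholder prose.", ("Placeholder source",)))
g.assert_triple("ex:writer1", "ex:hasRaceColour", "ex:someLabel", "ex:note1")
for note in g.provenance("ex:writer1", "ex:hasRaceColour", "ex:someLabel"):
    print(note.note_id, note.citations)
```

Because the same triple may be supported by more than one note, `provenance` returns a list, which is one way disputed or contradictory assertions could each keep their own supporting context.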
Relating cultural formations
Cultural formation for Orlando is understood primarily as representational, which is not to say that cultural formation is not real or that it has no material effects. The complex signifiers of cultural identities float across Orlando tags as cdata or free text in a semi-structured representation of cultural identities and categories. For the CWRC ontology, we strategized to relate this ontological perspective to that of external vocabularies without conflating our truth with theirs. Our architecture does not import other ontologies wholesale, but adopts components of major vocabularies such as BIBO, FOAF, and FRBR, and relates to large vocabularies in defined ways. As indicated in Figure 3, the instances of cwrc:whiteRaceColour and cwrc:whiteEthnicity within the CWRC ontology are subclasses of the cwrc:whiteLabel. This retains the ambiguity of terms such as “white” or “Jewish” precisely as labels that draw together particular types of identity categories, as well as subClasses of those labels. As indicated, those subClasses can be linked to terms in external vocabularies, but both internal and external terms are understood within the CWRC ontology as labels. Indeed, constructing this ontology has brought home to us the need for the LOD community to think through with greater care the relationship between representation and “reality” in LOD ontologies. A further complication is that identity categories are not only historically contingent but often also change over a particular individual’s lifetime. The Orlando dataset supports such nuance in only a few cases, so we have not started with this gnarly problem, but we aim to build into the ontology the capacity to represent such cultural formation dynamics in order to accommodate more temporally precise data.
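The label structure just described can be sketched as a small lookup. The two subclass links are the ones named in the text; everything else is an assumption for illustration, including the function names and the placeholder external term, and the sketch uses a plain dictionary where the ontology itself is expressed in OWL.

```python
# Subclass links named in the text above.
SUBCLASS_OF = {
    "cwrc:whiteRaceColour": "cwrc:whiteLabel",
    "cwrc:whiteEthnicity": "cwrc:whiteLabel",
}

# Hypothetical link to an outside vocabulary; the identifier is invented.
EXTERNAL_LINK = {
    "cwrc:whiteRaceColour": "ext:someExternalTerm",
}


def ancestors(term):
    """Return the chain of superclasses above `term`, nearest first."""
    chain = []
    seen = {term}
    while term in SUBCLASS_OF:
        term = SUBCLASS_OF[term]
        if term in seen:  # guard against an accidental cycle
            break
        seen.add(term)
        chain.append(term)
    return chain


def shared_labels(a, b):
    """Labels that draw two identity categories together."""
    return [label for label in ancestors(a) if label in ancestors(b)]


print(shared_labels("cwrc:whiteRaceColour", "cwrc:whiteEthnicity"))
```

The external mapping is kept in a separate table from the subclass hierarchy, which mirrors the stated intent of relating to outside vocabularies without conflating their terms with the internal labels.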
Conclusion
The CWRC ontology design avoids representing RDF extractions from Orlando data as positivist assertions, and yet produces machine-readable OWL/RDF-compliant graph structures. It allows references to, without endorsing, external ontological vocabularies that are nevertheless part of documenting intersectional cultural processes and identities.
We will present the CWRC ontology as built around the CF design described here, and we will demonstrate its implications through several practical examples. Figure 4 shows schematically the intersectionality of multiple identity categories associated with Leonowens, including the ways that instances are related by subclass relationships in accordance with OWL principles. This importantly allows us to reference components of other ontologies (here the Muninn Appearances ontology, Library of Congress Subject Headings, Getty Art and Architecture Thesaurus, and DBpedia) without adopting them wholesale.
Figure 4: Cultural Formation triples related to Anna Leonowens, with corresponding XML-encoded context notes
Figure 5 indicates the ability to see patterns and outliers related to different categorizations of Jewishness in a small subset of Orlando authors.
Figure 5: Subset of CF triples related to a subset of writers, with sample context annotations and external links; predicates linking individuals to subclasses are inferred (e.g. the edge between Elizabeth Sarah Gooch and cwrc:jewishReligion is hasReligion)
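One way to read the inferred predicates mentioned in the Figure 5 caption is as a lookup from the kind of class to the predicate. The sketch below is an assumption about how such inference could work, not a description of how the figure was produced: only the pairing of cwrc:jewishReligion with hasReligion comes from the caption, while the other table rows and the suffix-matching rule are invented by analogy.

```python
# Suffix-to-predicate table. Only the Religion -> hasReligion pairing comes
# from the Figure 5 caption; the other rows are assumed by analogy.
PREDICATE_FOR_SUFFIX = {
    "Religion": "hasReligion",
    "Ethnicity": "hasEthnicity",    # assumed
    "RaceColour": "hasRaceColour",  # assumed
}


def infer_predicate(class_name):
    """Guess the predicate for an edge from the class name's suffix."""
    local = class_name.split(":", 1)[-1]  # drop the namespace prefix
    for suffix, predicate in PREDICATE_FOR_SUFFIX.items():
        if local.endswith(suffix):
            return predicate
    return None  # no rule applies, so the edge stays unlabelled


print(infer_predicate("cwrc:jewishReligion"))
```

Returning `None` for unmatched classes, rather than a default predicate, avoids asserting a relationship the data does not support.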
Our live presentation will demonstrate the ontology in action using the interactive HuViz (Humanities Visualizer) interface with a larger dataset.
Ontologies
• CWRC ontology: http://sparql.cwrc.ca/ontology/cwrc
• CWRC sparql endpoint: http://sparql.cwrc.ca/
• Orlando Biography schema containing Cultural Formation tagset: https://github.com/cwrc/CWRC-Schema/blob/master/schemas/orlando_biography.rng
Bibliography
Alexiev, V., Cobb, J., Garcia, G., and Harpring, P. (2016). Getty Art and Architecture Thesaurus. J. Paul Getty Trust. http://vocab.getty.edu/doc/queries
Algee-Hewitt, M., Porter, J. D. and Walser, H. (Forthcoming, 2017). “Representing race and ethnicity in American fiction: 1789-1964.”
Bailey, M. Z. (2011). "All the digital humanists are white, all the nerds are men, but some of us are brave." Journal of Digital Humanities 1.1. http://journalofdigitalhumanities.org/1-1/all-the-digital-humanists-are-white-all-the-nerds-are-men-but-some-of-us-are-brave-by-moya-z-bailey/ (accessed 7 April 2017)
Brickley, D., and Miller, L. (2000-2014). FOAF Vocabulary Specification 0.99. http://xmlns.com/foaf/spec/
Brown, S., Clements, P., and Grundy, I. (eds.) (2006-2017). Orlando: Women’s Writing in the British Isles from the Beginnings to the Present. Cambridge: Cambridge University Press Online.
Brown, S., Clements, P., and Grundy, I. (2006). "Sorting things in: Feminist knowledge representation and changing modes of scholarly production." Women's Studies International Forum 29.3.
Brown, S., & Simpson, J. (2013, October). The curious identity of Michael Field and its implications for humanities research with the semantic web. In Big Data, 2013 IEEE International Conference on (pp. 77-85). IEEE.
Canadian Writing Research Collaboratory. (n.d.) http://cwrc.ca
D’Arcus, B., and Giasson, F. (2008-2013). Bibliographic Ontology Specification (BIBO). http://purl.org/ontology/bibo/ Structured Dynamics.
Davis, I., and Newman, R. (2005). Functional Requirement for Bibliographic Records (FRBR) http://purl.org/vocab/frbr/core#
DBpedia. (n.d.) http://wiki.dbpedia.org/
Dean-Hall, A. and Warren, R. (2013). “Sex, privacy, and ontologies.” SEXI. Rome, Italy. http://www.dbdump.org/~warren/publications/dean-hall:sexi:2013/dean-hall:sexi:2013.pdf (accessed 7 April 2017).
Fuss, D. (2013). Essentially Speaking: Feminism, Nature & Difference. New York: Routledge.
Gold, M. (ed.) (2012). Debates in the Digital Humanities. Minnesota: University of Minnesota Press.
Gold, M. K., and Klein, L. F. (eds.) (2016). Debates in the Digital Humanities 2016. Minnesota: University of Minnesota Press.
Johnston, J. (1973). Lesbian Nation: The Feminist Solution. New York: Simon and Schuster.
Lerman, J. (2013). “Big data and its exclusions.” 66 Stanford Law Review Online 55: 55-63. http://www.heinonline.org.subzero.lib.uoguelph.ca/HOL/Page?handle=hein.journals/slro66&start_page=55&collection=journals&id=66 (accessed April 7, 2017).
McPherson, T. (2012). "Why are the Digital Humanities so white? Or thinking the histories of race and computation." In M. Gold (ed). Debates in the Digital Humanities. Minnesota: University of Minnesota Press, pp. 139-160.
Muninn Project. “Appearances Ontology Specification - 0.1.” 2012. http://rdf.muninn-project.org/ontologies/appearances.html
Munt, S. (1998). "Sisters in exile: the lesbian nation." New Frontiers of Space, Bodies and Gender. London: Routledge, pp. 3-19.
Nakamura, L. (2002). Cybertypes: Race, Ethnicity, and Identity on the Internet. London: Routledge.
Ross, B. (1995). The House that Jill Built: A Lesbian Nation in Formation. Toronto: University of Toronto Press.
Smith, J. (2016). “Working with the Semantic Web.” In C. Crompton, R. J. Lane, and R. Siemens (ed.). Doing Digital Humanities: Practice, training, research. London: Routledge, pp. 273-88.
Treviranus, J. (2014). “The value of the statistically insignificant.” Educause 49:1. http://er.educause.edu/articles/2014/1/the-value-of-the-statistically-insignificant (accessed 7 April 2017).
W3C. (2017). Web Annotation Data Model. 23 February 2017. https://www.w3.org/TR/annotation-model/ (accessed: 7 April 2017).
Wikipedia contributors (2017). "Ontology (information science)," Wikipedia, The Free Encyclopedia. https://en.wikipedia.org/w/index.php?title=Ontology_(information_science)&oldid=772391479 (accessed April 7, 2017).
Woolf, V. (1938). Three Guineas. London: Hogarth Press.