FINAL PROJECT: WORD GALAXY Lucas Silva & Connor Ameres

Post on 03-Jul-2020


Page 1: FINAL PROJECT: WORD GALAXYcutler/classes/... · from the ground up using Meteor.js. However, after demo day and many issues with dependencies, we decided to built upon an existing



Motivation and Audience

The motivation for our project stemmed from the potential to deliver an application that would visualize relationships between words in an intuitive manner. In our research, we found that the visualizations we were envisioning for our project were not present in many of the popular dictionaries and thesauri online. Thus began our research into both the relationships between words and the ways in which we could represent them.

After reviewing previous course material, we found the development of force-directed graphs of interest. Graphs offered us the most intuitive visualization for conveying the relationships between words. We were also intrigued by the ways in which we might be able to apply various clustering algorithms to group words, and by whether these groupings would reveal any information about the words graphed.

Lastly, we wanted to create a project that would utilize a large amount of data. In our previous classwork, we had worked with rather small datasets. For our final project, we wanted to work with a dataset that would push the limits of an online application accessible to nearly anyone, regardless of their platform of choice. Graphing all words is no trivial task, as there are more than 1,000,000 words in the English dictionary, and we would later find that words were highly interconnected. This drew us to the creation of this final project, because we had little prior experience in working with graphs of this size and presenting them in a web application.

The target audience of our application is anyone who is a visual learner. Naturally, this audience includes anyone who is interested in learning words and the relationships that they have. The application would specifically benefit those in the field of academia, most likely students in high school or above, as the application set out to contain all words, including those that are explicit. We also anticipate that users will be able to use the application with no experience with graph theory.

Visualization Goals

Our application must be interactive, web-based, and intuitive. It must be web-based so that it is easily accessible to everyone on any platform, regardless of their computer hardware. It must be intuitive because inexperienced users must be able to use it. It must be interactive to enable an exploratory mode of learning and to grab the attention of the users.

We must also provide a way to search for a particular word. The graph structure is inherently dense and large, so it would be very difficult to find a particular node or word by browsing alone. In order to be competitive with alternative online dictionaries, we must provide a search mechanism that moves to a specific node in the space or alerts the user that the word is not in the graph.


Lastly, the application must be fast. The application should load quickly, even on slower networks, as a delay would likely result in frustration or in a user bouncing from our application. The application must also have high frame rates, around 60 frames per second, as lower frame rates will result in users having a difficult time navigating the graph and tracing paths through the exploratory space.

Research Question

We set out to answer whether or not we could create a visualization that is more intuitive than many of the available online dictionaries for understanding the relationships between words.

Expected Results & Hypothesis

Using a graph of hyponym trees generated from Princeton’s WordNet API [4], we hypothesize that our site will have greater average session times when compared to more traditional dictionary sites, as it is designed to be more immersive. We expect that, given a survey, the majority of users will find our project more exploratory and visually appealing than traditional sites.

Prior Work / Research Papers

The first draft of this project attempted to use the ForceAtlas2 algorithm designed for Gephi by Jacomy et al. [1], as it is known for displaying communities very well. We used this algorithm for our prototyping using Gephi, and the paper helped us determine the appropriate constants for an optimal layout. However, we could not find a 3D implementation of the algorithm that fit our needs, so we decided to abandon it and use a more generic force-directed layout implemented by Kashcha [2]. In the future, we may move to ForceAtlas2 by implementing a JavaScript version.
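To make the idea of a generic force-directed layout concrete, the following is a minimal Python sketch of a single iteration in 3D. It is neither ForceAtlas2 [1] nor the ngraph implementation [2]; the function name `layout_step` and every constant (spring length, spring stiffness, repulsion, gravity, time step) are illustrative assumptions rather than values used in the project.

```python
import math


def layout_step(pos, edges, spring_length=1.0, spring_k=0.1,
                repulsion=1.0, gravity=0.01, dt=1.0):
    """Return new 3D positions after one force-directed iteration.

    pos:   dict mapping node -> (x, y, z)
    edges: iterable of (node_a, node_b) pairs
    All constants are illustrative, not values from the project.
    """
    force = {n: [0.0, 0.0, 0.0] for n in pos}
    nodes = list(pos)

    # Pairwise repulsion: every node pushes every other node away.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            delta = [pa - pb for pa, pb in zip(pos[a], pos[b])]
            dist = max(math.sqrt(sum(d * d for d in delta)), 1e-9)
            strength = repulsion / (dist * dist)
            for k in range(3):
                push = strength * delta[k] / dist
                force[a][k] += push
                force[b][k] -= push

    # Spring attraction: each edge pulls its endpoints toward spring_length.
    for a, b in edges:
        delta = [pb - pa for pa, pb in zip(pos[a], pos[b])]
        dist = max(math.sqrt(sum(d * d for d in delta)), 1e-9)
        strength = spring_k * (dist - spring_length)
        for k in range(3):
            pull = strength * delta[k] / dist
            force[a][k] += pull
            force[b][k] -= pull

    # Gravity: a weak pull toward the origin keeps the graph from drifting.
    for n in nodes:
        for k in range(3):
            force[n][k] -= gravity * pos[n][k]

    return {n: tuple(p + dt * f for p, f in zip(pos[n], force[n]))
            for n in nodes}


def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
```

Calling `layout_step` repeatedly until the positions stop changing yields a static layout. Raising the gravity or spring constants in this sketch pulls the graph tighter, while raising the repulsion spreads it out.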

From the beginning of our design, we focused on the concept of overview and detail described by Jakobsen et al. [3]. We wanted users to be able to explore the relations of all words in an overview context, while also being able to specify a particular word and gain more detail about it. The ability to specify a word is a crucial feature if our project is to compete with traditional dictionaries and thesauruses.

In order to gather the data we needed, we used Princeton’s WordNet [4], which organizes words into synsets. It then provides semantic relationships between synsets, such as hyponyms, troponyms, meronyms, etc. After reading this paper, we decided that hypernyms and hyponyms best represent the relations between words, as synonyms can then be defined as words belonging to synsets that share the same hypernym. Using only these two relations, we believe the user would be able to gather enough semantic relationships between synsets to make our project useful.

At first we attempted to build our project from scratch, and our prototype was built from the ground up using Meteor.js. However, after demo day and many issues with dependencies, we decided to build upon an existing project [5] found on GitHub that had many of the features we were looking for. Since the project is under an open source MIT license, we decided to fork that project and build on top of it. This project was originally used to visualize package managers, like npm. This gave us a great starting point to render the exploratory 3D graph, after which we added our own 2D view and other modifications.

Visualization Design Evolution

Storyboarding

Initially we sketched out a 2D, force-directed graph in a storyboarding tool called Sketch. This tool allowed us to create a drawing of what our 2D view of the final design would end up looking like. Figure 1 shows what we believed the initial graph would look like upon loading the application. The sketch showed that we would support exploration of the graph through zooming.

Figure 1 also shows what the visualization would look like after searching for a particular word. When searching for a particular word, the visualization would remove nodes that aren’t directly connected to the word of interest. The visualization would also move the searched node to the center of the visible screen.

Figure 1: Left image shows initial 2D, force-directed graph. Right shows searching for a word.


Gephi

We construct our graph with a Python script that gathers word relations from WordNet, but before attempting to render it in the browser we wanted a preview of what the result would look like. For this purpose, we used Gephi, a popular open source graph visualization software. Gephi contains many different force layout algorithms built in, but for our prototyping we used the ForceAtlas2 algorithm.

This was helpful because when using words instead of synsets for nodes, we immediately noticed the very high average valence and interconnectivity of the nodes. This was a problem because nodes were being obscured by edges and the graph was very cluttered; see Figure 2 for the obscured graph with 1,000 words. Knowing this before actually doing the rendering work was very helpful, as we were able to switch to visualizing hyponym trees of synsets instead of words. This cut down our average valence, unobscured the nodes, and made the graph more aesthetically pleasing while still encoding enough semantic relationships to make it meaningful.

Figure 2: Left, zoomed out view of Gephi graph of Google's top 1,000 words. Right, zoomed in view of same graph.
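The drop in average valence when moving from words to synsets can be illustrated with a small Python sketch. The toy synsets, lemmas, and hyponym links below are made up for illustration and are not taken from WordNet, and the way the word-level graph is built (every lemma of a hypernym linked to every lemma of its hyponym) is an assumption about one plausible construction, not necessarily the one used in the project.

```python
from itertools import product

# Hypothetical toy data: synset id -> lemmas (synonyms) in that synset.
LEMMAS = {
    "greeting": ["greeting", "salutation"],
    "hello": ["hello", "hi", "howdy"],
    "farewell": ["farewell", "goodbye", "bye"],
}
# Hypothetical hyponym links: (hypernym synset, hyponym synset).
HYPONYMS = [("greeting", "hello"), ("greeting", "farewell")]


def average_degree(edges):
    """Average number of edges per node (the 'valence' in the text)."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sum(degree.values()) / len(degree)


def synset_edges():
    """One edge per hyponym relation between synsets."""
    return list(HYPONYMS)


def word_edges():
    """Every lemma of the hypernym linked to every lemma of the hyponym."""
    edges = set()
    for parent, child in HYPONYMS:
        edges.update(product(LEMMAS[parent], LEMMAS[child]))
    return sorted(edges)
```

On this toy data the synset graph has 2 edges over 3 nodes, while the word graph has 12 edges over 8 nodes, so the average degree more than doubles even for three small synsets.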


Demo Day Prototype

At demo day, we had a working prototype that we built from scratch using open source ngraph modules. The prototype was able to render a 3D graph using a force-directed graph layout, but quickly became very cluttered when edges were visualized. Hiding node edges made the graph less cluttered at the expense of removing the semantic relations between words (see Figure 3). We were able to successfully store node positions and edge information in binary files, which were much smaller than their ASCII counterparts. This improved load time, but the rendering time for our graph was still very long. The prototype would take around 10 seconds to render, during which just a black screen was displayed to the user.
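As a rough illustration of why binary files can be smaller than text, the Python sketch below packs node positions as fixed-width integers and compares the result with a JSON encoding of the same data. The file layout (consecutive little-endian 32-bit integers, three per node), the function names, and the sample coordinates are assumptions for illustration; the source does not describe the project's actual format.

```python
import json
import struct


def pack_positions(positions):
    """Pack (x, y, z) integer triples as consecutive little-endian int32s."""
    flat = [coord for triple in positions for coord in triple]
    return struct.pack(f"<{len(flat)}i", *flat)


def unpack_positions(blob):
    """Inverse of pack_positions: bytes -> list of (x, y, z) tuples."""
    count = len(blob) // 4
    flat = struct.unpack(f"<{count}i", blob)
    return [tuple(flat[i:i + 3]) for i in range(0, count, 3)]


# Hypothetical layout output for three nodes.
positions = [(-1250, 3417, 982), (20483, -7719, 15), (-96, 48211, -30054)]
binary = pack_positions(positions)
text = json.dumps(positions).encode("ascii")
```

Each node costs exactly 12 bytes in the packed form regardless of how many digits its coordinates have, whereas the text form also spends bytes on brackets, commas, spaces, and signs.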

Figure 3: Left, cluttered graph with edges toggled on. Right, cluttered graph with edges invisible.

We also had working search functionality at demo day, where the user would enter a search term and the application would render a 2D graph. The 3D graph would be minimized to the bottom right, and focus would be switched to the 2D version. The root of the 2D graph would be the search term, and all words connected to that term would branch off from the root. Node labels were not yet implemented, and the force layout algorithm would run continuously in an infinite loop, which made the graph keep moving and rearranging itself. See Figure 4 for this search functionality.

Figure 4: Search functionality

Final Design

During the final iteration of our design, we pivoted to utilize a different visualization project. This project was pm [5]. Firstly, we adapted the word graph that we had previously generated to work with the existing visualization. We added word definitions, changed the relations to be hyponyms and hypernyms, and re-implemented search in the original project. Our new search functionality searches through a synset’s lemmas, which are its synonyms, instead of the synset itself. This is better because users can search for more natural terms, and we get higher quality matches. For example, ‘hi’ now matches the synset ‘hello’, whereas before it would not. We also added color to nodes based on part of speech, selected from a qualitative color scheme taken from ColorBrewer 2.0. With the addition of color, we also added a legend to the bottom right of the scene, as can be seen in Figure 7.

Figure 5: Final design, zoomed out view.

In addressing feedback regarding the difficulty of navigating the application, we added a navigation control overlay which can be toggled using the scroll wheel or laptop trackpad. The overlay displays all the pertinent navigational controls, many of which were taken from the original pm project, as seen in Figure 6.

Lastly, in order to provide the ability to focus on a word and its direct relationships, we created a 2D detailed view which is enabled by double clicking on a word. The 3D graph will smoothly transition to the top right of the scene, and a 2D force-directed graph will be rendered in the center of the scene. The root node of the new graph becomes the selected word, and its direct relationships are connected. The user can then toggle whether to display hypernyms or hyponyms by clicking either the root or the info box in the bottom left corner. Clicking on a leaf node makes that node the new root, and changes the graph accordingly. This view can be seen in Figure 8.

Figure 6: Navigation controls overlay, displayed when page first loads.

Figure 7: Hovering mouse on a node displays the tooltip.

Figure 8: Force-directed 2D graph displayed after double clicking on a node.
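The behavior of the 2D detail view (a root word, its direct relations, a toggle between hypernyms and hyponyms, and re-rooting on a leaf) can be sketched in Python as follows. The project's view was built in JavaScript with D3, so this is only a language-neutral illustration of the data flow; the relation map `HYPONYM_MAP` and the helper names are hypothetical.

```python
# Hypothetical toy relations: hypernym synset -> list of hyponym synsets.
HYPONYM_MAP = {
    "animal": ["dog", "cat"],
    "dog": ["puppy", "hound"],
    "cat": ["kitten"],
}


def hypernyms_of(node):
    """Invert the hyponym map to find the direct parents of a node."""
    return sorted(p for p, children in HYPONYM_MAP.items() if node in children)


def detail_view(root, relation="hyponyms"):
    """Return the root plus edges to its direct neighbours for one relation."""
    if relation == "hyponyms":
        leaves = list(HYPONYM_MAP.get(root, []))
    elif relation == "hypernyms":
        leaves = hypernyms_of(root)
    else:
        raise ValueError(f"unknown relation: {relation}")
    return {"root": root, "edges": [(root, leaf) for leaf in leaves]}


def toggle(relation):
    """Switch between the two relations, as clicking the root does."""
    return "hypernyms" if relation == "hyponyms" else "hyponyms"
```

Re-rooting on a clicked leaf is then just another call, for example `detail_view("puppy", "hypernyms")`, which rebuilds the small graph around the new root.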


Future Work

In a future implementation we would like to be able to filter synsets based on the age group of the user. We would have a user enter their date of birth in a dialog box during the introduction. This detail would be stored with localStorage in the user’s browser.

We would also compress the data files that are served from the server. If the files are compressed, we would achieve approximately 75% compression on the labels file that contains all node information. This compression would result in lower latency for navigation and coloring of nodes.

Improving the search performance is also desired, because the application can perform slowly when shorter words are searched. When a user enters a search term, we find all nodes whose lemmas (synonyms) are longer than the entered term. Then, we look through all of those lemmas and try to find ones that match the search term. A match is considered valid if the lemma starts with or ends with the entered term, which is why we first prune shorter lemmas, as they cannot match. This means that for shorter search terms, the search space is larger because fewer lemmas are pruned. Shorter terms also match more nodes, because more lemmas are likely to start with the term.
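The lemma search described above can be sketched in a few lines of Python. The toy index and the function name `search` are hypothetical, and the sketch keeps lemmas that are at least as long as the term (rather than strictly longer) so that an exact match such as ‘hi’ survives pruning; that reading of the pruning rule is an assumption.

```python
# Hypothetical toy index: synset name -> lemmas (synonyms) for that synset.
SYNSET_LEMMAS = {
    "hello": ["hello", "hi", "howdy", "hullo"],
    "high": ["high", "tall"],
    "chi": ["chi", "khi"],
    "dog": ["dog", "domestic_dog"],
}


def search(term, index=SYNSET_LEMMAS):
    """Return synsets with a lemma that starts or ends with the term."""
    term = term.lower()
    matches = []
    for synset, lemmas in index.items():
        # Prune lemmas shorter than the term: they cannot contain it.
        candidates = [lemma for lemma in lemmas if len(lemma) >= len(term)]
        if any(lemma.startswith(term) or lemma.endswith(term)
               for lemma in candidates):
            matches.append(synset)
    return matches
```

With this toy index, a short term such as `"hi"` survives pruning against almost every lemma and matches several synsets, while a longer term such as `"howdy"` prunes most lemmas up front and matches only one, which mirrors the slowdown on short search terms.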

Feedback on Project

Not Implemented

Our project received great feedback from the class during demo day. However, we weren’t able to implement everything that our classmates had suggested due to time constraints. One of the first great ideas that was suggested to us was to “create a 2D rendering of a mini-map for location in 3D graph”. We really liked this idea, because we would have been able to reuse UI elements from our application to render the mini-map, while continuing the trend of providing a detail and overview of the exploration of the graph of words.

Another great idea was to “color words based on their country or language of origin”. This idea would have been incorporated if we had the time to research a dataset to pull the country or language of origin from. Coloring the nodes would have been trivial.

Lastly, we had difficulty with our use of clustering algorithms, so we were unable to implement the suggestion to “color nodes by the clustering of related words”. Without the use or development of a clustering algorithm, the application wouldn’t be able to color nodes by their clustering in a way that the user could intuitively explore.

Implemented

We were able to implement nearly all suggestions that didn’t require another data source or the addition of a clustering algorithm. Many students claimed that we should “display a tutorial for users to become familiar with the application”. We attributed these claims to students having only a vague idea of what our application was used for once we told them. However, they didn’t feel comfortable sitting down with it without a background lesson or additional help.

Students also claimed that “the velocity of the camera was difficult to control”. While our nodes were still unlabeled during the demo day, users had a difficult time returning to specific nodes in the graph and performing tasks. This was primarily due to navigational issues with the camera’s velocity.

Lastly, students claimed that we should “use autocomplete to suggest words to the user during search”. During this stage of development, the user received no feedback when typing a word into search. The user could only interact with the application after they had typed in a word and pressed the enter key. The user also wasn’t presented with any feedback when a word was not present in the graph, so users thought autocomplete would help to combat these other issues.

Contributions

The work was primarily divided into client-facing work and work dedicated to initially generating the graph. Connor worked on the storyboarding with Sketch to initially design a UI for the application. After the storyboard was completed, he worked to generate the graph and word information using NetworkX and WordNet in Python so that it could later be laid out in JavaScript. Lastly, Connor worked on implementing the 2D force-directed graph in the visualization. This was created using D3.

Lucas worked on the prototyping using Gephi to test graph layouts, and implemented the Meteor prototype that was shown during demo day. He worked with the layout algorithm in JavaScript. This involved tuning the gravity and spring constants to get an aesthetically pleasing graph. Lucas also worked on modifying the pm project to generate the 3D graph using our generated layout, and re-implemented search in order to search by lemmas instead of synsets.

Technical Implementation Details / Challenges

The difficulty that we haven’t yet covered is the lack of 3D graph tools. We specifically had difficulties finding a ForceAtlas2 algorithm that would work for our platform. We were also unable to find an algorithm that would provide comparable clustering.

Future Work

Both members will likely continue to work on the project, due to the great feedback we received during our final presentation. We were notified on social media that users were interested in using our visualization long term. We will be working on the tasks that we were not able to implement from the demo day feedback.


Project Links:
Video Demo: https://www.youtube.com/watch?v=56is1Xv97y4
Project: http://lusilva.github.io/word-galaxy

References

[1] Jacomy, Mathieu, et al. "ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software." PloS one 9.6 (2014): e98679.

[2] Kashcha, Andrei. "Anvaka/ngraph.offline.layout." GitHub. N.p., n.d. Web. 09 June 2016. <https://github.com/anvaka/ngraph.offline.layout>.

[3] Rønne Jakobsen, Mikkel, and Kasper Hornbæk. "Sizing up visualizations: effects of display size in focus+context, overview+detail, and zooming interfaces." Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2011.

[4] Miller, George A. "WordNet: a lexical database for English." Communications of the ACM 38.11 (1995): 39-41.

[5] Kashcha, Andrei. "Anvaka/pm." GitHub. N.p., n.d. Web. 02 May 2016.