dependency parsing 2 - university of maryland · data-driven dependency parsing goal:learn a good...
TRANSCRIPT
DependencyParsing2CMSC723/LING723/INST725
MarineCarpuat
Figcredits:Joakim Nivre,DanJurafsky &JamesMartin
DependencyParsing
• Formalizingdependencytrees
• Transition-baseddependencyparsing• Shift-reduceparsing• Transitionsystem• Oracle• Learning/predictingparsingactions
Data-drivendependencyparsing
Goal: learnagoodpredictorofdependencygraphsInput:sentenceOutput:dependencygraph/treeG=(V,A)
Canbeframedasastructuredpredictiontask- verylargeoutputspace- withinterdependentlabels
2dominantapproaches:transition-basedparsingandgraph-basedparsing
Transition-baseddependencyparsing
• Buildsonshift-reduceparsing[Aho &Ullman,1927]
• Configuration• Stack• Inputbuffer ofwords• Setofdependencyrelations
• Goalofparsing• findafinalconfigurationwhere• allwordsaccountedfor• Relationsformdependencytree
Transitionoperators
• Transitions:produceanewconfigurationgivencurrentconfiguration
• Parsingisthetaskof• Findingasequenceoftransitions• Thatleadsfromstartstatetodesiredgoalstate
• Startstate• StackinitializedwithROOTnode• Inputbufferinitializedwithwordsinsentence• Dependencyrelationset=empty
• Endstate• Stackandwordlistsareempty• Setofdependencyrelations=finalparse
ArcStandardTransitionSystem
• Defines3transitionoperators[Covington,2001;Nivre 2003]• LEFT-ARC:• createhead-dependentrel.betweenwordattopofstackand2nd word(undertop)• remove2nd wordfromstack
• RIGHT-ARC:• Createhead-dependentrel.betweenwordon2nd wordonstackandwordontop• Removewordattopofstack
• SHIFT• Removewordatheadofinputbuffer• Pushitonthestack
Arcstandardtransitionsystems
• Preconditions• ROOTcannothaveincomingarcs• LEFT-ARCcannotbeappliedwhenROOTisthe2nd elementinstack• LEFT-ARCandRIGHT-ARCrequire2elementsinstacktobeapplied
Transition-basedDependencyParser
• Assumeanoracle
• Parsingcomplexity• Linearinsentencelength!
• Greedyalgorithm• UnlikeViterbiforPOStagging
Transition-BasedParsingIllustrated
Wheretowegetanoracle?
• Multiclassclassificationproblem• Input:currentparsingstate(e.g.,currentandpreviousconfigurations)• Output:onetransitionamongallpossibletransitions• Q:sizeofoutputspace?
• Supervisedclassifierscanbeused• E.g.,perceptron
• Openquestions• Whataregoodfeaturesforthistask?• Wheredowegettrainingexamples?
GeneratingTrainingExamples
• Whatwehaveinatreebank • Whatweneedtotrainanoracle• Pairsofconfigurationsandpredictedparsingaction
Generatingtrainingexamples
• Approach:simulateparsingtogeneratereferencetree
• Given• Acurrentconfig withstackS,dependencyrelationsRc• Areferenceparse(V,Rp)
• Do
Let’stryitout
Features
• Configurationconsistofstack,buffer,currentsetofrelations
• Typicalfeatures• Featuresfocusontoplevelofstack• Usewordforms,POS,andtheirlocationinstackandbuffer
Featuresexample
• Givenconfiguration • Exampleofusefulfeatures
Featuresexample
Researchhighlight:Dependencyparsingwithstack-LSTMs• FromDyeretal.2015:http://www.aclweb.org/anthology/P15-1033
• Idea• Insteadofhand-craftedfeature• Predictnexttransitionusingrecurrentneuralnetworkstolearnrepresentationofstack,buffer,sequenceoftransitions
Researchhighlight:Dependencyparsingwithstack-LSTMs
Researchhighlight:Dependencyparsingwithstack-LSTMs
AlternateTransitionSystems
Note:Adifferentwayofwritingarc-standardtransitionsystem
Aweaknessofarc-standardparsing
Rightdependentscannotbeattachedtotheirheaduntilalltheirdependentshavebeenattached
ArcEagerParsing• LEFT-ARC:
• Createhead-dependentrel.betweenwordatfrontofbufferandwordattopofstack
• popthestack• RIGHT-ARC:
• Createhead-dependentrel.betweenwordontopofstackandwordatfrontofbuffer
• Shiftbufferheadtostack• SHIFT
• Removewordatheadofinputbuffer• Pushitonthestack
• REDUCE• Popthestack
ArcEagerParsingExample
Trees&Forests
• Adependencyforest(here)isadependencygraphsatisfying• Root• Single-Head• Acyclicity• butnot Connectedness
Propertiesofthistransition-basedparsingalgorithm
- Correctness- Foreverycompletetransitionsequence,theresultinggraphisaprojectivedependencyforest(soundness)
- ForeveryprojectivedependencyforestG,thereisatransitionsequencethatgeneratesG(completeness)
- Trick:forestcanbeturnedintotreebyaddinglinkstoROOT0
Dealingwithnon-projectivity
Projectivity• Arc fromheadtodependentisprojective• Ifthereisapathfromheadtoeverywordbetweenheadanddependent
• Dependencytree isprojective• Ifallarcsareprojective• Orequivalently,ifitcanbedrawnwithnocrossingedges
• Projectivetreesmakecomputationeasier• Butmosttheoreticalframeworksdonotassumeprojectivity• Needtocapturelong-distancedependencies,freewordorder
Arc-standardparsingcan’tproducenon-projectivetrees
Howfrequentarenon-projectivestructures?
• StatisticsfromCoNLL sharedtask• NPD=nonprojectivedependencies• NPS=nonprojectivesentences
Howtodealwithnon-projectivity?(1)changethetransitionsystem
• Addnewtransitions• Thatapplyto2nd wordofthestack• Topwordofstackistreatedascontext
[Attardi 2006]
Howtodealwithnon-projectivity?(2)pseudo-projectiveparsing
Solution:• “projectivize”anon-projectivetreebycreatingnewprojectivearcs• Thatcanbetransformedbackintonon-projectivearcsinapost-processingstep
Howtodealwithnon-projectivity?(2)pseudo-projectiveparsing
Solution:• “projectivize”anon-projectivetreebycreatingnewprojectivearcs• Thatcanbetransformedbackintonon-projectivearcsinapost-processingstep
Graph-basedparsing
Graphconceptsrefresher
DirectedSpanningTrees
MaximumSpanningTree
• Assumewehaveanarcfactoredmodeli.e.weightofgraphcanbefactoredassumorproductofweightsofitsarcs
• Chu-Liu-Edmondsalgorithmcanfindthemaximumspanningtreeforus!• Greedyrecursivealgorithm• Naïveimplementation:O(n^3)
Chu-Liu-Edmondsillustrated
Chu-Liu-Edmondsillustrated
Chu-Liu-Edmondsillustrated
Chu-Liu-Edmondsillustrated
Chu-Liu-Edmondsillustrated
Arcweightsaslinearclassifiers
Exampleofclassifierfeatures
HowtoscoreagraphGusingfeatures?
Arc-factoredmodelassumption
Bydefinitionofarcweightsaslinearclassifiers
Howcanwelearntheclassifierfromdata?
DependencyParsing:whatyoushouldknow• Formalizingdependencytrees
• Transition-baseddependencyparsing• Shift-reduceparsing• Transitionsystem:arcstandard,arceager• Oracle• Learning/predictingparsingactions
• Graph-baseddependencyparsing
• Aflexibleframeworkthatallowsmanyextensions• RNNsvsfeatureengineering,non-projectivity
Extension:dynamicoracle
Problemwithstandardclassifier-basedoracle:- Itis“static”
- ie tiedtooptimalconfig sequencethatproducesgoldtree
- Whatiftherearemultiplesequencesforasinglegoldtree?- Howcanwerecoveriftheparserdeviatesfromgoldsequence?
Onesolution:“dynamicoracle”[Goldberg&Nivre 2012]
SeealsoLocallyOptimalLearningtoSearch[Changetal.ICML2015]
Extension:dynamicoracleProblemwithstandard
See[Goldberg&Nivre 2012]fordetails