WHAT’S NEXT? TARGET CONCEPT IDENTIFICATION AND SEQUENCING
Lee Becker¹, Rodney Nielsen¹,², Ifeyinwa Okoye¹, Tamara Sumner¹ and Wayne Ward¹,²
¹ Center for Computational Language and EducAtion Research (CLEAR), University of Colorado at Boulder
² Boulder Language Technologies
2010.06.18



TRANSCRIPT

Page 1:

WHAT’S NEXT? TARGET CONCEPT IDENTIFICATION AND SEQUENCING

Lee Becker¹, Rodney Nielsen¹,², Ifeyinwa Okoye¹, Tamara Sumner¹ and Wayne Ward¹,²

¹ Center for Computational Language and EducAtion Research (CLEAR), University of Colorado at Boulder
² Boulder Language Technologies

2010.06.18

Page 2:

Goals:

• Introduce Target Concept Identification (TCI), potentially the most important QG-related task
• Encourage discussion related to TCI
• Define a TCI-based shared task
• Illustrate viability via baseline and straw-man systems
• Challenge the QG community to consider TCI

Page 3:

Overview

Define the Target Concept Identification and Sequencing tasks

Describe component and baseline systems

Discuss the utility of these subtasks in the context of the full Question Generation task

Final Thoughts

Page 4:

QG as a Dialogue Process

Question Generation is much more than surface form realization:
• It depends not only on the text or knowledge source
• It also depends on the context of all previous interactions

Page 5:

The Stages of Question Generation

What to talk about next?
• e.g., “Direction of flow” or “Series circuits”

How to ask it?
• Definition Question
• Prediction Question
• Hypothesis Question

Final natural language output:
• “What will happen to the flow of electricity if you flip the battery around?”

Page 6:

Target Concept Identification

Out of the limitless number of concepts related to the current dialogue, which one should be used to construct the question?

• Inputs: knowledge sources; dialogue context / interaction history
• Output: the next target concept
• Subtasks (a minimal interface sketch follows this slide):
  • Key Concept Identification
  • Concept Relation Identification and Classification
  • Concept Sequencing
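To make the task’s input/output contract concrete, here is a minimal interface sketch in Python. It is not from the slides; the type names and the trivial selection rule are hypothetical stand-ins for real TCI logic.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Concept:
    """A span of (possibly paraphrased) text from a knowledge source."""
    concept_id: int
    text: str

@dataclass
class DialogueContext:
    """Interaction history: concepts already covered, prior turns, etc."""
    covered: List[Concept] = field(default_factory=list)
    turns: List[str] = field(default_factory=list)

def next_target_concept(knowledge: List[Concept],
                        context: DialogueContext) -> Optional[Concept]:
    """Output of TCI: the next target concept. This stand-in simply
    returns the first concept not yet covered in the dialogue."""
    covered_ids = {c.concept_id for c in context.covered}
    for concept in knowledge:
        if concept.concept_id not in covered_ids:
            return concept
    return None  # all concepts covered
```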

Page 7:

Key Concept Identification

• Goal: extract important concepts from a knowledge source (plain text, structured databases, etc.)
• We want not just the concepts, but the concepts most critical to learning
• Preferably, identify core versus supporting concepts

Page 8:

Key Concept Identification: CLICK

CLICK (Customized Learning Service for Concept Knowledge) [Gu et al. 2008]
• Personalized learning system
• Utilizes Key Concept Identification to:
  • assess the learner’s work
  • recommend digital library resources to help the learner remedy diagnosed deficiencies
• Driven by concept maps:
  • expert concept map
  • automatically derived concept maps

Page 9:

Key Concept Identification: CLICK: Building a gold standard concept map

Source data:
• 20 digital library resources
• Textbook-like web text, collectively considered to contain all the information a high school graduate should know about earthquakes and plate tectonics


Page 10:

Key Concept Identification: CLICK: Building a gold standard concept map

Experts were asked to extract and potentially paraphrase spans of text (concepts) from each resource. Examples:
• Concept 19: Mantle convection is the process that carries heat from the core and up to the crust and drives the plumes of magma that come up to the surface and makes islands like Hawaii.
• Concept 21: asthenosphere is hot, soft, flowing rock
• Concept 176: The Theory of Plate tectonics
• Concept 224: a plate is a large, rigid slab of solid rock


Page 11:

Key Concept Identification: CLICK: Building a gold standard concept map

• Experts linked and labeled concepts (i.e., built a map) for each of the 20 resources
• Open-ended label vocabulary:
  • Discourse-style relations: elaborates, cause, defines, evidence, etc.
  • Domain-specific relations: technique, type of, indicates, etc.
• The 10 most frequent labels account for 64% of labels

Page 12:

Key Concept Identification: CLICK: Building a gold standard concept map

• Experts individually combined the 20 resource maps to span the whole domain
• Experts collaboratively combined their individual resource maps to create a final concept map

Page 13:

Key Concept Identification: CLICK: Automated Approach


Page 14:

Key Concept Identification: Concept Extraction

• COGENT system [De la Chica 2008]
  • Built on MEAD [Radev et al. 2004], a multi-document summarizer
  • Supplemented with additional features to tie into educational goals
• Run on the 20 digital library resources used to construct the expert concept map
• Extracted concepts evaluated against expert map concepts (a sketch of this style of evaluation follows below):
  • ROUGE-L F-measure: 0.6001
  • Cosine similarity: 0.8325
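The slide reports ROUGE-L and cosine-similarity scores against the expert concepts. As a rough illustration of the cosine-similarity half (a generic sketch, not the COGENT evaluation code; the toy concept strings are invented), each extracted concept can be matched to its most similar expert concept over TF-IDF vectors:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def mean_best_cosine(extracted, gold):
    """Average, over extracted concepts, of the cosine similarity
    to the closest gold concept (TF-IDF bag-of-words vectors)."""
    vectorizer = TfidfVectorizer().fit(extracted + gold)
    sims = cosine_similarity(vectorizer.transform(extracted),
                             vectorizer.transform(gold))
    return sims.max(axis=1).mean()  # best gold match per extracted concept

extracted = ["a plate is a big slab of rock"]
gold = ["a plate is a large, rigid slab of solid rock",
        "asthenosphere is hot, soft, flowing rock"]
print(mean_best_cosine(extracted, gold))
```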

Page 15:

Key Concept Identification: Concept Relation ID and Classification

• Concept Relation Identification (AKA Link Identification): given two concepts, determine if they should be linked
• Concept Relation Classification (AKA Link Classification): given a linked pair of concepts, assign a label describing their relationship
• This information can be useful both for concept sequencing and for question realization
• Can potentially comprise a separate task

Page 16:

Key Concept Identification: Concept Relation Identification

• Given two concepts, determine if they should be linked
• Approach [De la Chica et al. 2008]:
  • SVM-based classifier
  • Lexical, syntactic, semantic, and document structure features
• Performance: P = 0.2061, R = 0.0153
• The data set is extremely unbalanced: the majority class (no-link) overwhelmingly dominates
• A good starting point for a challenging task worthy of further investigation (a classifier sketch follows below)
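The slides specify an SVM over lexical, syntactic, semantic, and document-structure features, but not the features themselves. A minimal sketch under that description, with two toy stand-in features (nothing here reproduces De la Chica et al.’s actual feature set):

```python
import numpy as np
from sklearn.svm import SVC

def pair_features(a: str, b: str):
    """Toy link-identification features for a concept pair: word
    overlap and length ratio, stand-ins for the richer feature set."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    overlap = len(wa & wb) / max(1, len(wa | wb))
    len_ratio = min(len(wa), len(wb)) / max(1, max(len(wa), len(wb)))
    return [overlap, len_ratio]

# X: feature vectors for concept pairs; y: 1 = linked, 0 = no link.
pairs = [("plate tectonics theory", "a plate is a rigid slab of rock"),
         ("mantle convection carries heat", "asthenosphere is soft rock"),
         ("earthquakes release stress", "pitch of a violin string")]
X = np.array([pair_features(a, b) for a, b in pairs])
y = np.array([1, 1, 0])

# class_weight='balanced' partially compensates for the heavy
# no-link majority the slide mentions.
clf = SVC(kernel="rbf", class_weight="balanced").fit(X, y)
print(clf.predict(X))
```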

Page 17:

Key Concept Identification: Concept Relation Classification

• Towards a gold standard:
  • Experts labeled links on concept maps [Ahmad et al. 2008]
  • Discourse-like labels: cause, evidence, defines, elaborates, ...
  • Domain-specific labels: technique, type of, slower than
  • Vocabulary unspecified; the 10 most frequent labels account for 64% of the links
  • With some refinement, RST or Penn Discourse labels could be used to create a gold standard
• Next steps:
  • Create a more reliable link classifier
  • Develop a link relation classifier

Page 18:

Key Concept Identification: Graph Analysis

• Given a concept map (graph), identify the key or central concepts (versus supporting concepts)
• Approach: graph analysis using the PageRank + HITS algorithms (a sketch follows below). Key concepts are the intersection of:
  • concepts selected by PageRank + HITS
  • concepts with the highest ratio of incoming vs. outgoing links
  • concepts with the highest term density
• Evaluation:
  • No gold standard set of core concepts
  • Experts asked to identify subtopic regions on the concept map (earthquake types, tsunamis, theory of continental drift, ...)
  • 80% core concept coverage of 25 subtopics
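A sketch of the graph-analysis step using networkx on a toy concept map. The toy graph, the top-k cutoff, and the omission of the term-density criterion are all assumptions for illustration:

```python
import networkx as nx

# Toy directed concept map: edges point from a concept to one it supports.
G = nx.DiGraph([("mantle convection", "plate tectonics"),
                ("plate", "plate tectonics"),
                ("plate tectonics", "earthquakes"),
                ("stress buildup", "earthquakes")])

pr = nx.pagerank(G)                # global centrality scores
hubs, authorities = nx.hits(G)     # hub / authority scores

def top(scores, k=2):
    """Top-k nodes by score (the cutoff is an assumption)."""
    return {n for n, _ in sorted(scores.items(), key=lambda x: -x[1])[:k]}

# In-link vs. out-link ratio, as on the slide (+1 avoids division by zero).
ratio = {n: (G.in_degree(n) + 1) / (G.out_degree(n) + 1) for n in G}

# Key concepts: intersection of the top-scoring sets (term density omitted).
key_concepts = top(pr) & top(authorities) & top(ratio)
print(key_concepts)
```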

Page 19:

Concept Sequencing

Goal: create a directed acyclic graph that represents the logical order in which concepts should be introduced in a lesson or tutorial dialogue (with respect to a pedagogy)

Partial ordering example:
1. Pitch represents the perceived fundamental frequency of a sound.
2. A shorter string produces a higher pitch.
3. A tighter string produces a higher pitch.
4. A discussion of the difference in pitch across each of the strings of a violin and a cello.

[Diagram: a DAG in which concept 1 precedes concepts 2 and 3, which both precede concept 4]

Page 20:

Concept Sequencing: Straw Man Approach

• Aim: show the viability of a concept sequencing task
• Intuition: concepts that should precede other concepts will exhibit this behavior across the corpus of digital library resources
• Issues:
  • Concepts may not appear in their entirety in a document
  • Aspects of concepts may show up earlier than the concept as a whole
• Approach: treat concept-to-document alignment as an information retrieval task

Page 21:

Concept Sequencing: Implementation

• Indexed the original 20 CLICK resources at the sentence level using Lucene (StandardAnalyzer, similarity score threshold = 0.26)
• Concepts are queries
• A concept’s position in a resource is the sentence number of the earliest matching sentence (a sketch follows below)

[Diagram: sentence positions 1-6 in Resources 1-3, marking the earliest matching sentence for Concepts A, B, and C]
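The slides index with Lucene; the self-contained sketch below substitutes TF-IDF cosine similarity for Lucene’s scoring, keeping the slide’s 0.26 similarity threshold and the earliest-matching-sentence rule (the toy resource text is invented):

```python
from typing import List, Optional
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

THRESHOLD = 0.26  # similarity cutoff from the slide

def concept_position(concept: str, sentences: List[str]) -> Optional[int]:
    """Return the 1-based number of the earliest sentence whose
    similarity to the concept meets the threshold, else None
    (the concept does not appear in this resource)."""
    vec = TfidfVectorizer().fit(sentences + [concept])
    sims = cosine_similarity(vec.transform([concept]),
                             vec.transform(sentences))[0]
    for i, score in enumerate(sims):
        if score >= THRESHOLD:
            return i + 1
    return None

resource = ["Plates are rigid slabs of rock.",
            "Mantle convection carries heat from the core.",
            "Earthquakes occur when stress is released."]
print(concept_position("mantle convection carries heat", resource))  # -> 2
```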

Page 22:

Concept Sequencing: Implementation

• With concept positions identified and tabulated, compute pairwise precedence comparisons between all concepts’ sentence numbers
• If a concept does not appear in a resource, exclude it from that resource’s comparisons
• Concepts with an identical number of predecessors are considered to be at the same level
(The precedence tables and a code sketch follow below.)

Pairwise precedence tallies (row concept precedes column concept; X = comparison skipped because a concept does not appear in that resource):

Resource 1:
Precedes   A   B   C
A          -   1   1
B          -   -   1
C          -   -   -

Resource 2:
Precedes   A   B   C
A          -   1   X
B          -   -   X
C          -   -   -

Resource 3:
Precedes   A   B   C
A          -   -   -
B          1   -   1
C          -   -   -

Total:
Precedes   A   B   C
A          -   2   1
B          1   -   2
C          -   -   -
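A minimal sketch of the tally-and-level computation on this slide; the data layout (a dict of earliest sentence positions per resource) and the toy numbers are assumptions:

```python
from itertools import combinations
from collections import defaultdict

# positions[resource][concept] = earliest matching sentence number;
# a concept absent from a resource is simply missing from its dict.
positions = [
    {"A": 1, "B": 3, "C": 5},   # Resource 1
    {"A": 2, "B": 4},           # Resource 2 (C does not appear -> X)
    {"B": 1, "C": 6, "A": 8},   # Resource 3
]

precedes = defaultdict(int)     # (x, y) -> number of resources where x < y
concepts = {c for res in positions for c in res}
for res in positions:
    # Only compare pairs present in the same resource.
    for x, y in combinations(sorted(res), 2):
        if res[x] < res[y]:
            precedes[(x, y)] += 1
        elif res[y] < res[x]:
            precedes[(y, x)] += 1

# A concept's level is its number of distinct predecessors; ties share
# a level. Note: mutual precedence across resources can create loops,
# which the slides say were removed manually.
n_preds = {c: sum(1 for (x, y) in precedes if y == c) for c in concepts}
levels = defaultdict(list)
for c, n in n_preds.items():
    levels[n].append(c)
print(dict(levels))
```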

Page 23:

Concept Sequencing: Results

[Figure: concept sequencing system output]

Page 24:

Concept Sequencing: Evaluation Data

• Currently there is no canonical concept sequence for the CLICK data
• Instead, gold-standard evaluation data were derived from a set of expert-provided remediation strategies for individual students’ essays

Remediation strategies, listed in remediation order:

Student Essay Sentence Number    Concept Number
21, 23                           85, 88, 92, 94, 176
1, 3                             210, 215, 217, 53, 55, 57, 58
24, 26                           444, 324, 342, 360
19, 31                           94, 95, 96, 138
42, 44, 45, 46                   610, 615, 613, 616, 618, 627

Page 25:

Concept Sequencing: Evaluation Data

• Of 55 key concepts, 14 did not occur in any of the remediation strategies, leaving 41 to define the concept sequence evaluation
• Used frequency of precedence across remediations to create a first-pass concept sequence
• Manually removed loops and errant orderings

Page 26:

Concept Sequencing: Evaluation Data

[Figure: gold-standard evaluation sequence]

Page 27:

Concept Sequencing: Evaluation

• F1-measure, computed from the following (transcribed into code below):
  • Average Instance Recall (IR) over all h gold-standard key concepts that have predecessors
  • Average Instance Precision (IP) over all l non-initial system-output concepts that are aligned to gold-standard key concepts
• G_i: all predecessors of the i-th gold-standard key concept
• O_j: all predecessors of the j-th system-output concept

R = \frac{1}{h} \sum_{i=1}^{h} IR_i = \frac{1}{h} \sum_{i=1}^{h} \frac{|G_i \cap O_i|}{|G_i|}

P = \frac{1}{l} \sum_{j=1}^{l} IP_j = \frac{1}{l} \sum_{j=1}^{l} \frac{|O_j \cap G_j|}{|O_j|}
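A direct transcription of these two measures into Python; the representation (predecessor sets keyed by concept id) is an assumption, and the alignment between system and gold concepts is taken as given:

```python
def instance_recall(gold_preds, sys_preds):
    """R: average of |G_i ∩ O_i| / |G_i| over gold concepts
    that have predecessors."""
    terms = [len(g & sys_preds.get(c, set())) / len(g)
             for c, g in gold_preds.items() if g]
    return sum(terms) / len(terms)

def instance_precision(gold_preds, sys_preds):
    """P: average of |O_j ∩ G_j| / |O_j| over non-initial
    system concepts aligned to gold concepts."""
    terms = [len(o & gold_preds[c]) / len(o)
             for c, o in sys_preds.items() if o and c in gold_preds]
    return sum(terms) / len(terms)

# Predecessor sets keyed by concept id (toy example).
gold = {"A": set(), "B": {"A"}, "C": {"A", "B"}}
system = {"A": set(), "B": {"A"}, "C": {"B"}}
R = instance_recall(gold, system)
P = instance_precision(gold, system)
F1 = 2 * P * R / (P + R)
print(R, P, F1)
```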

Page 28:

Concept Sequencing: Results and Discussion

• F1 = 0.526 (P = 0.383, R = 0.726)
• Gold standard: multiple initial nodes
• System output:
  • One single initial node
  • Linear hierarchies: all nodes with the same number of predecessors at the same level
  • The all-inclusive ordering favors recall
• Future work:
  • Utilize pairwise data to produce less densely packed graphs
  • More sophisticated measures of semantic similarity
  • Make use of concept map link relationships (cause, define, ...)
  • Conduct expert studies to get gold-standard sequences and concepts

Page 29:

Tutorial Dialogue and Question Realization

• Dialogue-based ITSs are labor intensive
• Effort centers on authoring of dialogue content and flow
• Design of dialogue states is non-trivial

Page 30:

Tutorial Dialogue and Question Realization

So what does Target Concept Identification buy us?
• Critical steps towards more automated ITS creation: decreased effort, scalability, contextual grounding
• TCI mappings to dialogue management (sketched below):
  • Key concepts = states or frames
  • Concept sequence = default dialogue management strategy
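One way to make this mapping concrete (a hypothetical sketch, not the authors’ dialogue manager): each key concept becomes a dialogue state, and the concept sequence supplies the default order in which states are visited. The concept ids below are taken from the example on the next slide.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class DialogueState:
    """One dialogue state (frame) per key concept."""
    concept_id: int
    prompt: str
    completed: bool = False

# The concept sequence (a topological order of the concept DAG) becomes
# the default dialogue-management strategy: visit states in order.
sequence = [DialogueState(486, "Can you define what an earthquake is?"),
            DialogueState(561, "What causes an earthquake to begin?")]

def next_state(states: List[DialogueState]) -> Optional[DialogueState]:
    """Default strategy: first uncompleted state in the sequence."""
    return next((s for s in states if not s.completed), None)

sequence[0].completed = True          # student has covered concept 486
print(next_state(sequence).prompt)    # -> asks about concept 561
```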

Page 31:

Tutorial Dialogue and Question Realization

Example: two concepts linked by a “caused-by” relation:
• Concept 486: an earthquake is the sudden slip of part of the Earth’s crust...
• Concept 561: ...When the stress in a particular location is great enough... an earthquake begins

Suppose the student has stated a paraphrase of 486. The ITS can produce:
• “Now that you have defined what an earthquake is, can you explain what causes them?”

Page 32:

Final Thoughts

• Defined Target Concept Identification
• Baseline and past results suggest the feasibility of TCI subtasks
• We challenge the QG community to continue to think of QG as the product of several tasks, including TCI

Page 33:

Acknowledgements

Advisers and colleagues at:
• The University of Colorado at Boulder
• The Center for Computational Language and EducAtion Research (CLEAR)
• Boulder Language Technologies

Support from:
• The National Science Foundation (NSF DRL-0733322, DRL-0733323, DRL-0835393, IIS-0537194)
• The Institute of Education Sciences (IES R3053070434)

Any findings, recommendations, or conclusions are those of the authors and do not necessarily represent the views of NSF or IES.

Page 34:

References

1. F. Ahmad, S. de la Chica, K. Butcher, T. Sumner, and J.H. Martin. Towards automatic conceptual personalization tools. In Proc. 7th ACM/IEEE-CS Joint Conference on Digital Libraries. ACM, 2007.
2. I.L. Beck, M.G. McKeown, C. Sandora, L. Kucan, and J. Worthy. Questioning the author: A year-long classroom implementation to engage students with text. The Elementary School Journal, 98:385-414, 1996.
3. B.S. Bloom. Taxonomy of Educational Objectives: The Classification of Educational Goals. Susan Fauer Company, Inc., 1956.
4. S. de la Chica, F. Ahmad, J.H. Martin, and T. Sumner. Pedagogically useful extractive summaries for science education. In Proc. COLING, volume 1, pages 177-184. Association for Computational Linguistics, 2008.
5. A. Graesser, V. Rus, and Z. Cai. Question classification schemes. In Proc. WS on the QGSTEC, 2008.
6. Q. Gu, S. de la Chica, F. Ahmad, H. Khan, T. Sumner, J.H. Martin, and K. Butcher. Personalizing the selection of digital library resources to support intentional learning. In Proc. European Conference on Research and Advanced Technology for Digital Libraries, 2008.
7. P.W. Jordan, B. Hall, M. Ringenberg, Y. Cue, and C. Rose. Tools for authoring a dialogue agent that participates in learning studies. In Proc. AIED, pages 43-50, Amsterdam, The Netherlands, 2007. IOS Press.
8. W.C. Mann and S.A. Thompson. Rhetorical structure theory: Toward a functional theory of text organization. Text, 8(3):243-281, 1988.
9. R.D. Nielsen. Question generation: Proposed challenge tasks and their evaluation. In Proc. WS on the QGSTEC, 2008.
10. R.D. Nielsen, J. Buckingham, G. Knoll, B. Marsh, and L. Palen. A taxonomy of questions for question generation. In Proc. WS on the Question Generation Shared Task and Evaluation Challenge, 2008.
11. R. Prasad, N. Dinesh, A. Lee, E. Miltsakaki, L. Robaldo, A. Joshi, and B. Webber. The Penn Discourse TreeBank 2.0. In Proc. LREC, 2008.
12. R. Prasad and A. Joshi. A discourse-based approach to generating why-questions from texts. In Proc. WS on the QGSTEC, 2008.
13. D. Radev, T. Allison, S. Blair-Goldensohn, J. Blitzer, A. Çelebi, S. Dimitrov, E. Drabek, A. Hakim, W. Lam, D. Liu, J. Otterbacher, H. Qi, H. Saggion, S. Teufel, M. Topper, A. Winkel, and Z. Zhang. MEAD - a platform for multidocument multilingual text summarization. In Proc. LREC 2004, 2004.
14. C.M. Reigeluth. The elaboration theory: Guidance for scope and sequence decisions. In Instructional-Design Theories and Models: A New Paradigm of Instructional Theory. Lawrence Erlbaum Assoc., 1999.
15. V. Rus, Z. Cai, and A.C. Graesser. Question generation: An example of a multi-year evaluation campaign. In Proc. WS on the QGSTEC, 2008.
16. R. Soricut and D. Marcu. Sentence level discourse parsing using syntactic and lexical information. In Proc. HLT/NAACL, pages 228-235, 2003.
17. S. Susarla, A. Adcock, R. Van Eck, K. Moreno, A.C. Graesser, and the Tutoring Research Group. Development and evaluation of a lesson authoring tool for AutoTutor. In V. Aleven, U. Hoppe, J. Kay, R. Mizoguchi, H. Pain, F. Verdejo, and K. Yacef, editors, Proc. AIED 2003, pages 378-387, 2003.
18. L. Vanderwende. The importance of being important. In Proc. WS on the QGSTEC, 2008.
19. H. Wainer. Computer-Adaptive Testing: A Primer. 2000.