crowdsourcing disagreement on open-domain questions

Crowdsourcing Disagreement on Open-Domain Questions, July 18th 2014. By Benjamin Timmermans, Lora Aroyo (VU) & Chris Welty (IBM)

Upload: crowdtruth

Post on 26-Jun-2015

TRANSCRIPT

Page 1: Crowdsourcing Disagreement on Open-Domain Questions

Crowdsourcing Disagreement on Open-Domain Questions

July 18th 2014

By Benjamin Timmermans, Lora Aroyo (VU) & Chris Welty (IBM)

Page 2: Crowdsourcing Disagreement on Open-Domain Questions

Cognitive Systems

● Natural language

● Generate and evaluate hypotheses

● Learn from evidence

Page 3: Crowdsourcing Disagreement on Open-Domain Questions

Gold Standard

● Distant supervision incomplete (Min et al., 2013)

● Crowdsourcing ground-truth (Inel et al., 2013)

● Large diverse volume of annotations

● Crowd Truth

Page 4: Crowdsourcing Disagreement on Open-Domain Questions

Closed Questions

Is rice a grain? (Yes)

Are all snakes brown? (No)

Is it hard to get into Stanford?

Is MO polar?

Do isotopes of the same element have the same atomic weight?

Page 5: Crowdsourcing Disagreement on Open-Domain Questions

Do isotopes of the same element have the same atomic weight?

1. 1913: Henry Moseley, working from Van den Broek's earlier idea, introduces concept of atomic number to fix inadequacies of Mendeleev's periodic table, which had been based on atomic weight. 1913: Frederick Soddy proposes the concept of isotopes, that elements with the same chemical properties may have differing atomic weights.

2. A chemical element consists of atoms with a specific number of protons in their nuclei, but different atomic weights owing to variations in the number of neutrons. Atoms of the same element with differing atomic weights are called isotopes.

3. A primary isotope shift is the change in chemical shift observed between spectra of different isotopes of the same element, such as proton and deuterium. Primary isotope shifts between 1H and 2D and are usually less than 0.1 ppm.

4. A radioisotope does not "rot"; it decays by turning into another isotope of the same element - or even into another element entirely.

5. Although receiving considerable support it was eventually rejected when it was found that many elements have non-integral weights (e.g. chlorine: 35.453). Frederick Soddy in 1913 had introduced the idea of isotopes; that is, the same chemical element in different forms having differing weights. Aston established that isotopes are not restricted to radioactive elements but are common throughout the periodic table.

6. An alternate form of an element that has the usual number of protons but a nonstandard number of neutrons; the fewer or additional neutrons give the isotope a different atomic weight than the regular element and may make the isotope radioactive, but otherwise an isotope has the same chemical action as the regular element. Because of this, isotopes (such as radioactive carbon) are used as tracers in biological systems or processes.

7. An isotope is one of several kinds of atoms of the same element that have different masses. These atoms have the same number of protons in their nuclei, but different numbers of neutrons, and therefore different mass numbers.

8. As noted in the discussion of deuterium, tritium can only be separated from protium due to the differences in mass. The chemical properties of isotopes with the same parent element make them otherwise indistinguishable, and hence purely chemical means cannot be used to separate them.

9. At the turn of the nineteenth century Dalton extended and refined Prout's remarkable conclusion that compounds were of fixed composition by proposing that atoms of the same element had the same atomic weight. Today, following the discovery of isotopes in this century, we define an atomic weight for a blend of isotopes of an element as the ratio of the average mass per atom of the element to one-twelfth the mass of an atom of 12C

10. Atoms of the same atomic number but different atomic weights are called isotopes. Elements can exist in both stable and unstable (radioactive) forms.

11. Atoms of the same element whose nuclei contain a different number of neutrons are said to be different isotopes of the element. A pure element can exist as monatomic units or as diatomic or polyatomic units comprising the same kind of atoms.

12. Isotopes are forms of the same element having different atomic weights

13. Atoms of the same element with differing atomic weights are called isotopes. Radioactive decay is a spontaneous process in which an isotope (the parent) loses particles from its nucleus to form an isotope of a new element (the daughter)

14. By the way, an isotope means the forms of an element having different atomic weights because of the difference in number of neutrons.

15. Carbon has atomic number 6 and atomic weight 12.011, and is represented by the symbol C. It occurs in two different isotopes. Isotopes share the same atomic number, hence the same identity as elements and same chemical behavior, but have different atomic weights. Put differently, isotopes have the same number of protons but a different number of neutrons. The isotopes of carbon are carbon-12 (six protons plus six neutrons) and carbon-14 (six protons plus eight neutrons)

16. Deuterium is an isotope of Hydrogen. An isotope is any of two or more forms of an element having the same atomic number but with different weights (mass). Hydrogen has 2 isotopes, Deuterium (stable, i.e. non-radioactive) and Tritium (radioactive)

17. Dr. W. D. Harkins, Professor of Physical Chemistry at the University of Chicago, who earlier in the week had announced his discovery of Zeta rays, described his original work with isotopes, or elements having the same chemical structure but different atomic weights.

18. During this time, he worked with American chemist and physicist Harold Urey at Columbia University on gaseous diffusion techniques for the separation of uranium isotopes (isotopes are different forms of the same element having the same atomic number but different atomic weights)

19. During this time, he worked with American chemist and physicist Harold Urey at Columbia University on gaseous diffusion techniques for the separation of uranium isotopes (isotopes are different forms of the same element having the same atomic number but different atomic weights). After the war, he accepted an appointment as a professor of chemistry at the University of Chicago and began to conduct research at the Institute of Nuclear Studies.

20. Five elements have seven stable isotopes, eight have six stable isotopes, nine have five stable isotopes, nine have four stable isotopes, nine have three stable isotopes, 16 have two stable isotopes (counting m as stable), and 26 elements have only a single stable isotope (of these, 19 are so-called mononuclidic elements, having a single primordial stable isotope that dominates and fixes the atomic weight of the natural element to high precision;

21. For an element with three naturally occurring isotopes the method is the same: sum the masses of the isotopes weighted by atom-fraction.

22. For an element with three naturally occurring isotopes the method is the same: sum the masses of the isotopes weighted by atom-fraction. This method of calculating the average mass takes into account the relative abundance of all of the isotopes of an element, so that this mass number always gives the same total number of atoms, for a natural sample of any element.

23. He called atoms of the second group isotopes, atoms of the same element with different atomic weights. In any natural sample of an element, there may be several types of isotopes. As a result, the atomic weight of an element that was calculated by Berzelius was actually an average of all the isotope weights for that element. This was the reason that some elements did not fall into the right order on Mendeleev's Periodic Table--the average atomic weight depended on how much of each kind of isotope was present. Soddy suggested placing the elements in the Periodic Table by similarity of chemical reactions and then numbering them in order.

24. He called such chemically identical elements, with slightly differing atomic weights, isotopes (from the Greek words meaning in the same place)

25. His assistant, Francis Aston, developed Thomson's instrument further and with the improved version was able to discover isotopes - atoms of the same element with different atomic weights - in a large number of nonradioactive elements.

26. In 1913, Richards proved the existence of lead isotopes (two or more atoms of the same element that differ in atomic weight) by investigating the weight of lead from various sources.

27. In 1914 he received the Nobel Prize in Chemistry for accurately determining the atomic weights of more than 25 chemical elements and ascertaining the existence of isotopes, chemical elements that have the same atomic number and position in the periodic table but different atomic masses and physical properties.

28. In a short letter to the editor of Nature, published on December 4, 1913, he first proposed the term isotope to designate chemically identical elements with different atomic weights (in modern terms, elements with the same atomic numbers but different mass numbers)

29. In a short letter to the editor of Nature, published on December 4, 1913, he first proposed the term isotope to designate chemically identical elements with different atomic weights (in modern terms, elements with the same

Page 6: Crowdsourcing Disagreement on Open-Domain Questions

Machine way of thinking

Crowdsourcing Architecture

Page 7: Crowdsourcing Disagreement on Open-Domain Questions

High amount of annotations

● 40 Answer Passages

● 3000 Questions

● 10 Workers

● 0 - 20 Matching Terms

Page 8: Crowdsourcing Disagreement on Open-Domain Questions

Passage Alignment Template

Page 9: Crowdsourcing Disagreement on Open-Domain Questions

Result of Alignment

Figure axes: Answer Passage Terms, Question Terms

Page 10: Crowdsourcing Disagreement on Open-Domain Questions

Annotated Relation Types

Page 11: Crowdsourcing Disagreement on Open-Domain Questions

Negation in Passages vs Wiki Answers

Figure legend: o Yes, x No. Axis: Question ID. Highlighted question: Is MO polar?

Page 12: Crowdsourcing Disagreement on Open-Domain Questions

Passage Alignment not perfect

● No expertise required

● High variety and volume of annotations

● Difficult to find the answer

● Expensive: $0.06 per task × 10 workers × 120,000 passages = $72,000
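The cost figure on this slide can be checked with a few lines of arithmetic. The sketch below assumes that "per task" means one worker judging one passage; the constant and function names are illustrative and do not come from the slides. Prices are kept in cents so the result is exact.

```python
# Figures are the ones quoted on the slide; names are illustrative.
PRICE_PER_TASK_CENTS = 6      # $0.06 per task, kept in cents to avoid float rounding
WORKERS_PER_PASSAGE = 10
PASSAGES = 120_000


def alignment_cost_usd(price_cents, workers, passages):
    """Total cost in whole dollars when every passage is judged by `workers` workers."""
    total_cents = price_cents * workers * passages
    return total_cents // 100


cost = alignment_cost_usd(PRICE_PER_TASK_CENTS, WORKERS_PER_PASSAGE, PASSAGES)
print(f"${cost:,}")  # $72,000
```

Because the cost is a plain product, it scales linearly with the number of passages, which is why shrinking the passage set matters so much for the budget.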

Page 13: Crowdsourcing Disagreement on Open-Domain Questions

Question Answering Architecture

Page 14: Crowdsourcing Disagreement on Open-Domain Questions

Question Answering Template

Page 15: Crowdsourcing Disagreement on Open-Domain Questions

Question Answering Template

Page 16: Crowdsourcing Disagreement on Open-Domain Questions

Question Answering Template

Page 17: Crowdsourcing Disagreement on Open-Domain Questions

Answer Justification

Figure axes: Question ID, Justifying Passages

Page 18: Crowdsourcing Disagreement on Open-Domain Questions

Reduce dataset

120,000 passages, reduced by 10% and then by 75%, leaves 27,000
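The numbers on this slide are consistent with two successive cuts applied to the running total (120,000 × 0.90 × 0.25 = 27,000). The slide does not say what each filtering step was, so the sketch below only reproduces the arithmetic; the function name is hypothetical.

```python
def apply_reductions(size, percent_cuts):
    """Apply successive percentage cuts to a dataset size using integer arithmetic."""
    for pct in percent_cuts:
        size = size * (100 - pct) // 100
    return size


remaining = apply_reductions(120_000, [10, 75])
print(remaining)  # 27000
```

The cuts compound rather than add: removing 10% and then 75% of what is left keeps 22.5% of the original set, not 15%.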

Page 19: Crowdsourcing Disagreement on Open-Domain Questions

Crowdsourcing Answers vs Wiki Answers

Figure panels: Wiki Answer is Yes, Wiki Answer is No

Figure axes: Question ID, Aggregated Worker Answer

Highlighted questions: Is rex burkhead a senior? Is MO polar?
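The slide plots an aggregated worker answer per question, but the transcript does not say how the aggregation was computed. As a minimal sketch under that caveat, one simple option is to count the answers per question and report the share of "Yes" among the Yes/No votes, which keeps the disagreement visible instead of collapsing it into a single label. The function names and the sample judgments are hypothetical.

```python
from collections import Counter


def aggregate_answers(judgments):
    """Group (question_id, answer) pairs into one answer Counter per question."""
    per_question = {}
    for question_id, answer in judgments:
        per_question.setdefault(question_id, Counter())[answer] += 1
    return per_question


def yes_share(counts):
    """Share of 'Yes' among the Yes/No votes; None when nobody voted Yes or No."""
    decided = counts["Yes"] + counts["No"]
    return counts["Yes"] / decided if decided else None


# Hypothetical judgments, for illustration only.
sample = [("q1", "Yes"), ("q1", "Yes"), ("q1", "No"),
          ("q2", "No"), ("q2", "No"), ("q2", "Other Answer")]
for question_id, counts in aggregate_answers(sample).items():
    print(question_id, dict(counts), yes_share(counts))
```

A share close to 0.5 marks a question where workers disagree, which can then be compared against the Wiki answer for that question.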

Page 20: Crowdsourcing Disagreement on Open-Domain Questions

Tracking low quality workers

Figure labels: No Answer Found, Unanswerable, Yes, No, Other Answer; Closed Question (Yes, No), Open Question (Other Answer)
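The transcript does not describe how low quality workers were identified. As an illustration only, and not the authors' method, one common heuristic scores each worker by how often their answer matches the most frequent answer given by the other workers on the same question. The sketch assumes one judgment per worker per question; the function name and data layout are hypothetical.

```python
from collections import Counter, defaultdict


def worker_agreement(judgments):
    """Map each worker to the share of their answers that match the most
    common answer of the *other* workers on the same question.

    judgments: iterable of (worker_id, question_id, answer) tuples.
    """
    by_question = defaultdict(list)
    for worker, question, answer in judgments:
        by_question[question].append((worker, answer))

    hits, totals = Counter(), Counter()
    for votes in by_question.values():
        for worker, answer in votes:
            others = Counter(a for w, a in votes if w != worker)
            if not others:
                continue  # nobody else answered this question
            top_answer = others.most_common(1)[0][0]
            totals[worker] += 1
            hits[worker] += int(answer == top_answer)
    return {worker: hits[worker] / totals[worker] for worker in totals}
```

Workers with a consistently low score are candidates for manual review rather than automatic removal, since disagreement can also signal a genuinely ambiguous question.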

Page 21: Crowdsourcing Disagreement on Open-Domain Questions

AMT (Amazon Mechanical Turk) vs CF (CrowdFlower) worker efficiency

Low Quality Workers

Page 22: Crowdsourcing Disagreement on Open-Domain Questions

Question Answering Task

● No expertise required

● Judgements on:

● Answer Justification

● Question Type

● Question Answer

120,000 passages, reduced by 10% and then by 75%, leaves 27,000

Page 23: Crowdsourcing Disagreement on Open-Domain Questions

Future Work

● Gamification

● Domain-specific expert annotators

● Improve preprocessing

● Relation tasks

Page 24: Crowdsourcing Disagreement on Open-Domain Questions

Conclusions

● Crowd can answer open-domain questions

● Aligning passages via Crowdsourcing is expensive

● Crowd performs better on AMT than on CF