Providing Help with Statistical Concepts and Terms: Enhanced Glossary and Ontology Stephanie W. Haas Ron Brown Cristina Pattuelli

Post on 19-Oct-2014

Page 1: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Providing Help with Statistical Concepts and Terms:

Enhanced Glossary and Ontology

Stephanie W. Haas

Ron Brown

Cristina Pattuelli

Page 2: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Development of Enhanced Glossary

term
content
format
context specificity
presentations
user control
ontology

Page 3: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Terms

• Include terms that users frequently encounter on agency sites, not a comprehensive dictionary
• Basic level of statistical literacy, not a highly technical resource
• Strategies for term identification
  – examination of frequently visited pages
  – anecdotal evidence from agency and non-agency consultants
  – metadata user study
  – webcrawl of agency sites

Page 4: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Content

• Provide basic level of explanation
• May include:
  – definition
  – example
  – brief tutorial
  – demonstration
  – interactive simulation
  – combination
• May incorporate related terms and concepts
• Give pointers to more complete and/or more technical explanations

Page 5: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Context specificity

• Explanations provided at varying levels of specificity
  – General, context-free, “universal”
  – Agency- or concept-specific, incorporating entities from agriculture, labor, science R&D, energy, etc.
  – Table- or statistic-specific, based on a single row, column, or statistic, e.g., CPI, national death rate, gasoline prices in NY state, etc.

Page 6: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

• Provide explanations of term or concept that are as relevant to user’s current context as possible.

• When user invokes help on a term, the most specific explanations available are offered.

• If there is no explanation for that specific statistic or table, more general (e.g., agency-specific) ones are offered. Default is “universal” level.

• Path from specific to general is based on the ontology.
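The fallback from specific to general can be sketched as a walk up the ontology. This is only an illustration under stated assumptions, not the project's implementation: the `PARENT` links, the context names, the `explain` function, and the placeholder explanation texts are all hypothetical.

```python
# Hypothetical ontology path: each context points to the next more general one.
PARENT = {
    "NY weekly gasoline prices": "EIA weekly gasoline prices",
    "EIA weekly gasoline prices": "energy",
    "energy": "universal",
}

# Hypothetical store of explanations, keyed by (term, context).
EXPLANATIONS = {
    ("sample", "EIA weekly gasoline prices"): "table-specific explanation of 'sample'",
    ("sample", "universal"): "universal explanation of 'sample'",
    ("index", "universal"): "universal explanation of 'index'",
}


def explain(term, context):
    """Return (context, text) for the most specific explanation available.

    Starts at the user's current context and moves to more general levels
    until an explanation is found; returns None if the term is not covered.
    """
    node = context
    while node is not None:
        if (term, node) in EXPLANATIONS:
            return node, EXPLANATIONS[(term, node)]
        node = PARENT.get(node)  # step up to the more general level
    return None


# No NY-specific text exists for "sample", so the table-level one is offered.
level, text = explain("sample", "NY weekly gasoline prices")
print(level)
```

In this sketch, a term with no specific explanation anywhere on the path (here, "index") ends at the "universal" level, which matches the default described above.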

Page 7: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Format

• User can choose desired format of explanation, based on interest, learning style, reading level, hardware/software limitations
  – text
  – text plus audio (narration)
  – graphic
  – animation
  – interactive

Page 8: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

User Control

• Make glossary help attractive and accessible
• Help users understand the statistics they find without interrupting their information-seeking task
• Let users know when help is available
• Let users choose the format and specificity they desire
• Control mechanisms, e.g., means of invocation and termination, pop-up windows, mouse-overs, etc.

Page 9: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Creating the Ontology

• Select an ontology editor that meets our needs
• Include terms and concepts to support the glossary
  – May need “connecting nodes” that aren’t in the glossary
• Relationships
  – standard: is-a, instance, etc.
  – domain-specific: predicts, smooths, etc.
• Visualization tools for end users (future work)

Page 10: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Ontology support for glossary

Relationships support design and display of term explanations

• Specificity of explanations
  – inheritance of more general explanations
• Explanation templates
  – sample: samples for specific surveys
  – index: CPI, Antiknock Index
• Related terms
  – incorporation into tutorial
  – population, sample

Page 11: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Current Coverage

• adjustment
  – universal
  – age adjustment: FL death rates
  – seasonal adjustment: NY unemployment rate
• index
  – universal, CPI, Antiknock Index
• population, parameter, sample, statistic
  – universal, weekly gasoline prices, NY state weekly gasoline prices, height & weight of U.S. adult residents

Page 12: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Mock-ups

Page 13: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

population & sample (1)

Suppose this picture represents the population of people in the entire country. In this population, a certain percentage (p) of people like dogs. In this example, 10 of the 50 people like dogs, so 20% of the people in this population like dogs. The parameter p measures this characteristic of the population: it is the value that you would get if you could survey the entire population.

p = 10/50 = .2 = 20%

[Figure: Population. Legend: Likes dogs, Dislikes dogs]

Page 14: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

population & sample (2)

In real life it is difficult to survey the entire population, so we take a sample. We can then count the number of people in the sample who like dogs and calculate a statistic (P*) that is an estimate of the value of p. In this case, P* overestimates the value of the parameter p.

p = 10/50 = .2 = 20%
P* = 3/10 = .3 = 30%

[Figure: Population and Sample. Legend: Likes dogs, Dislikes dogs]
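The arithmetic in this mock-up can be checked with a few lines of Python. The counts (10 of 50 in the population, 3 of 10 in the sample) come from the slides; the function and variable names are illustrative choices only.

```python
from fractions import Fraction


def proportion(likes_dogs, total):
    """Share of people who like dogs, kept exact as a fraction."""
    return Fraction(likes_dogs, total)


p = proportion(10, 50)      # parameter: computed from the whole population
p_star = proportion(3, 10)  # statistic: computed from the sample only
error = p_star - p          # positive, so the sample overestimates p

print(float(p), float(p_star), float(error))  # prints: 0.2 0.3 0.1
```

A different sample of 10 people could just as easily give 1/10 or 2/10, which is why P* is described as an estimate of p rather than its value.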

Page 15: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

EIA weekly gasoline prices

Every Monday, retail prices for all three grades of gasoline are collected by telephone from a sample of approximately 900 retail gasoline outlets.

Reported in: Weekly U.S. Retail Gasoline Prices, Regular Grade

Dollars per gallon, including all taxes
http://www.eia.doe.gov/oil_gas/petroleum/data_publications/wrgp/mogas_home_page.html

Page 16: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

• text example of population and sample for this table
• graphical example of population and sample for this table

Page 17: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

population: all retail gasoline outlets

sample: 900 retail gasoline outlets

regular gasoline, mean price/gallon, 9/30/02 = $1.413

graphical example of population & sample, gasoline prices

Page 18: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

• text example of sample for NY
• graphical example of sample for NY

Page 19: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

graphical example of sample, NY gasoline prices

sample of New York retail gasoline outlets

9/30/02: mean cost = $1.529 per gallon

Page 20: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

• graphical example of population and sample for body measurements

Page 21: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

graphical example of population and sample for body measurements

each participant represents approximately 50,000 other U.S. residents

5,000 individuals are surveyed annually

Page 22: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

[Ontology diagram]

Population is_described_by Parameter (mean, standard_deviation)
Sample is_described_by Statistic (sample_mean, sample_standard_deviation)
Sample is part of Population
Statistic is a predictor of Parameter

Page 23: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

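[Ontology diagram, with additional predictor links]

Population is_described_by Parameter (mean, standard_deviation)
Sample is_described_by Statistic (sample_mean, sample_standard_deviation)
Sample is part of Population
Statistic is a predictor of Parameter
sample_mean is a predictor of mean
sample_standard_deviation is a predictor of standard_deviation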

Page 24: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

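[Ontology diagram: instances of Population and Sample]

Sample is part of Population

instance of Population:
• U.S. residents
• U.S. retail gasoline outlets
• NY State retail gasoline outlets
• U.S. R&D companies

instance of Sample:
• 5,000 U.S. residents/yr is part of U.S. residents
• 900 U.S. retail gasoline outlets is part of U.S. retail gasoline outlets
• n NY State retail gasoline outlets is part of NY State retail gasoline outlets
• n U.S. R&D companies is part of U.S. R&D companies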
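One way to hold the relationships in this diagram is a list of (subject, relation, object) triples. The sketch below is an assumption about representation, not the ontology editor's actual format: the relation spellings `is_part_of` and `instance_of`, the helper names, and the pairing of each sample with its population are a reading of the diagram labels.

```python
# Triples as read from the diagram; spellings and pairings are assumptions.
TRIPLES = [
    ("Sample", "is_part_of", "Population"),
    ("U.S. residents", "instance_of", "Population"),
    ("U.S. retail gasoline outlets", "instance_of", "Population"),
    ("NY State retail gasoline outlets", "instance_of", "Population"),
    ("U.S. R&D companies", "instance_of", "Population"),
    ("5,000 U.S. residents/yr", "instance_of", "Sample"),
    ("900 U.S. retail gasoline outlets", "instance_of", "Sample"),
    ("n NY State retail gasoline outlets", "instance_of", "Sample"),
    ("n U.S. R&D companies", "instance_of", "Sample"),
    ("5,000 U.S. residents/yr", "is_part_of", "U.S. residents"),
    ("900 U.S. retail gasoline outlets", "is_part_of", "U.S. retail gasoline outlets"),
    ("n NY State retail gasoline outlets", "is_part_of", "NY State retail gasoline outlets"),
    ("n U.S. R&D companies", "is_part_of", "U.S. R&D companies"),
]


def objects(subject, relation):
    """All objects linked from `subject` by `relation`."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def population_of(sample):
    """The population a given sample instance is part of, or None."""
    parts = objects(sample, "is_part_of")
    return parts[0] if parts else None


print(population_of("900 U.S. retail gasoline outlets"))
```

With this shape, a help system could fill an explanation template for "sample" by looking up which population a table's sample belongs to.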

Page 25: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Index

An index combines numbers measuring different things into a single number. The single number represents all the different measures in a compact, easy-to-use form. Values for an index can be compared to each other, for example, over time.

[Diagram: the values 10.1, 103, 24.759, 6, and 42 enter a combiner, which outputs index = 12.3]

Page 26: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

[Chart: index value by month, each produced by that month's combiner; vertical axis from 12 to 14.5]

Jan: 12.3
Apr: 13.1
Jul: 13.9
Oct: 14.3

The index has increased this year.
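Comparing index values over time can be made concrete with the four values shown in this mock-up. The sketch below only compares the published values; it does not attempt to reproduce the combiner itself, whose formula the slides do not give. The names `index_by_month` and `percent_change` are illustrative.

```python
# Index values from the mock-up chart.
index_by_month = {"Jan": 12.3, "Apr": 13.1, "Jul": 13.9, "Oct": 14.3}


def percent_change(old, new):
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100


values = list(index_by_month.values())

# True when every value is higher than the one before it.
increasing = all(earlier < later for earlier, later in zip(values, values[1:]))

change = percent_change(index_by_month["Jan"], index_by_month["Oct"])
print(increasing, round(change, 1))  # prints: True 16.3
```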

Page 27: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Consumer Price Index (CPI)

The Consumer Price Index (CPI) represents changes in prices of all goods and services purchased for consumption by urban households. It combines prices into a single number that can be compared over time.

Items are classified into 8 major groups:
• Food and Beverages
• Housing
• Apparel
• Transportation
• Medical Care
• Recreation
• Education and Communication
• Other

Page 28: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Consumer Price Index

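[Diagram: the CPI combiner takes the eight major groups as inputs: food & beverage, housing, apparel, transportation, medical care, recreation, education & communication, other. The label "Telephone" also appears in the diagram.]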

Page 29: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

The Consumer Price Index has increased since 1995.

[Chart: CPI by year, 1997 to 2001, each value produced by that year's CPI combiner; vertical axis from 160 to 180]

Page 30: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Antiknock Index, also known as Octane Rating

A number used to indicate gasoline’s antiknock performance in motor vehicle engines. The two recognized laboratory engine test methods for determining the antiknock rating, i.e., octane rating, of gasolines are the Research method and the Motor method. In the United States, to provide a single number as guidance to the consumer, the antiknock index (R+M)/2, which is the average of the Research and Motor octane numbers, was developed.

http://www.eia.doe.gov/glossary/glossary_a.htm

Page 31: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Antiknock Index, also known as Octane Rating

[Diagram: the Research method (R) and the Motor method (M) feed the Antiknock combiner, which computes (R + M)/2]

Regular: 85 - 88
Midrange: 88 - 90
Premium: 90 or above
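The antiknock combiner is simple enough to write out. The formula (R + M)/2 and the grade ranges come from the slides; how to treat the shared boundaries at 88 and 90 is not stated, so the half-open ranges below are an assumption, as are the function names and the example octane numbers.

```python
def antiknock_index(research, motor):
    """Average of the Research and Motor octane numbers: (R + M) / 2."""
    return (research + motor) / 2


def grade(aki):
    """Map an antiknock index to a gasoline grade.

    Assumption: a value on a shared boundary (88 or 90) is assigned to
    the higher grade; values below 85 fall outside the listed ranges.
    """
    if aki >= 90:
        return "Premium"
    if aki >= 88:
        return "Midrange"
    if aki >= 85:
        return "Regular"
    return None


# Illustrative octane numbers, not taken from the source.
aki = antiknock_index(92, 82)
print(aki, grade(aki))  # prints: 87.0 Regular
```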

Page 32: Providing Help With Statistical Concepts And Terms Enhanced Glossary And Ontology

Next Steps

• expand coverage of core terms
  – webcrawl indicates measures of central tendency are next: average, mean, median, mode
• expand coverage of ontology
• expand presentation examples
  – animations, simulations
• explore user controls
• user study of effectiveness