Lecture # 31 Category Trees

Upload: blake-whitehead

Post on 25-Dec-2015



Binary Trees

16 leaves

How many steps to reach a leaf?

4

Binary Trees

N leaves

How many steps to reach a leaf?

log2(N)

4 branch trees

16 leaves: how many steps to reach a leaf?

2

4 branch trees

N leaves: how many steps to reach a leaf?

log4(N)

What is the General Algorithm?

N leaves: how many steps to reach a leaf?

logM(N), where M = the “branching factor”
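The formula can be checked by counting levels directly. This is a minimal Python sketch, assuming a perfectly balanced tree in which every node has exactly M branches. The function name `steps_to_leaf` is made up for this illustration.

```python
def steps_to_leaf(n_leaves: int, branching: int) -> int:
    """Count the choices needed so that branching**steps >= n_leaves."""
    if n_leaves < 1 or branching < 2:
        raise ValueError("need at least 1 leaf and 2 branches")
    steps = 0
    reachable = 1  # number of leaves reachable after `steps` choices
    while reachable < n_leaves:
        reachable *= branching
        steps += 1
    return steps


print(steps_to_leaf(16, 2))  # 4, matching log2(16)
print(steps_to_leaf(16, 4))  # 2, matching log4(16)
```

Counting with integers avoids the rounding problems that floating-point logarithms can have on exact powers.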

4 branch trees

If I double the number of branches from 2 to 4, what happens to the number of steps to reach a leaf?

It is cut in half, since log4(N) = log2(N) / 2. This holds for the step from 2 to 4 branches; doubling again from 4 to 8 saves less.
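To see how the step count changes with the branching factor, the sketch below tabulates it for one tree size. It assumes a perfectly balanced tree and a hypothetical N of 4096 leaves, chosen only because it is an exact power of 2, 4, 8 and 16.

```python
def steps_to_leaf(n_leaves, branching):
    """Choices needed to reach one of n_leaves in a balanced tree."""
    steps, reachable = 0, 1
    while reachable < n_leaves:
        reachable *= branching
        steps += 1
    return steps


N = 4096  # hypothetical tree size, not from the lecture
table = {m: steps_to_leaf(N, m) for m in (2, 4, 8, 16)}
print(table)  # {2: 12, 4: 6, 8: 4, 16: 3}
```

Going from 2 to 4 branches halves the steps (12 to 6), and so does going from 4 to 16 (6 to 3). Going from 4 to 8 only reduces them from 6 to 4.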

10 branch trees

Log10(N)

Count digits
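The "count digits" shortcut can be illustrated as follows. This sketch assumes the leaves of a 10-branch tree are labelled 0 to N-1, so that each decimal digit of a label is one branch choice. The name `digits_needed` is hypothetical.

```python
def digits_needed(n_leaves: int) -> int:
    """Decimal digits in the largest label when leaves are numbered 0..n_leaves-1."""
    if n_leaves < 2:
        raise ValueError("need at least 2 leaves")
    return len(str(n_leaves - 1))


for n in (10, 1000, 1_000_000):
    print(n, digits_needed(n))
# 10 1
# 1000 3
# 1000000 6
```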

Binary Trees

How many steps to reach a leaf?

16 leaves

Binary Trees

Balanced Unbalanced

Binary Trees

Really Unbalanced

16 leaves

How many steps?

An average of about 8, roughly N/2
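The two extremes can be compared by listing the depth of every leaf. This is a sketch under two assumptions that are not stated on the slides: the balanced tree has a power-of-two number of leaves, and the "really unbalanced" tree is a chain in which every node has one leaf and one further subtree.

```python
def balanced_depths(n_leaves):
    """Depth of every leaf in a perfectly balanced binary tree."""
    depth = n_leaves.bit_length() - 1  # exact log2 for powers of two
    return [depth] * n_leaves


def chain_depths(n_leaves):
    """Leaf depths in a chain: one leaf per level, two leaves at the bottom."""
    return list(range(1, n_leaves)) + [n_leaves - 1]


def average(values):
    return sum(values) / len(values)


print(average(balanced_depths(16)))  # 4.0
print(average(chain_depths(16)))     # 8.4375
```

For the chain the average grows in proportion to N rather than log2(N), which is the point of the N/2 estimate.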

N branches

• If I have N leaves, why not just have N branches in the tree?

– I can reach each leaf in one step

• The time to choose a branch at each node
  – Binary tree
    • constant time
  – N-ary tree (N branches)
    • N checks (one for each branch)

What if I don’t know which branch to choose?

Try all of them

Average N/2
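The N/2 figure is the average cost of trying branches one at a time. A minimal sketch, assuming every branch is equally likely to be the right one and that finding it in position k costs k checks:

```python
def average_checks(n_branches: int) -> float:
    """Average number of branches tried before hitting the right one."""
    # The target sits in position 1..n with equal probability.
    return sum(range(1, n_branches + 1)) / n_branches


print(average_checks(16))  # 8.5, close to N/2 = 8
```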

Trees - time to find things

• Number of branches (B)
  – logB(N)
  – Too many branches
    • searching for branches is a problem
  – Too few branches
    • too many steps to a leaf
• Balance
• Probability of correct choice
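The trade-off between too many and too few branches can be made concrete with a simple cost model. The model is an assumption for illustration, not something stated in the lecture: the tree is balanced, and at each level the user scans the branches in order, trying (B + 1) / 2 of them on average. The tree size of 4096 leaves is also hypothetical.

```python
def steps_to_leaf(n_leaves, branching):
    """Choices needed to reach one of n_leaves in a balanced tree."""
    steps, reachable = 0, 1
    while reachable < n_leaves:
        reachable *= branching
        steps += 1
    return steps


def expected_checks(n_leaves, branching):
    """Levels to descend times the average branches scanned per level."""
    per_level = (branching + 1) / 2  # assumed linear scan of the branches
    return steps_to_leaf(n_leaves, branching) * per_level


N = 4096  # hypothetical tree size
for b in (2, 4, 8, 16, 64, 4096):
    print(b, expected_checks(N, b))
# 2 18.0
# 4 15.0
# 8 18.0
# 16 25.5
# 64 65.0
# 4096 2048.5
```

Under this model a small branching factor is cheapest. A real user who recognizes the right branch at a glance pays less per level, which is why the probability of a correct choice matters as much as the branch count.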

Examples of trees

• Dewey

• Library of congress

• Biology

• Yahoo

• Menus

• File system

Dewey

• 000 Computers, information, & general reference
• 100 Philosophy & psychology
• 200 Religion
• 300 Social sciences
• 400 Language
• 500 Science
• 600 Technology
• 700 Arts & recreation
• 800 Literature
• 900 History & geography

Dewey

• 500 Science
  – 510 Mathematics
  – 520 Astronomy
  – 530 Physics
  – 540 Chemistry
  – 550 Earth sciences & geology
  – 560 Fossils & prehistoric life
  – 570 Biology & life sciences
  – 580 Plants (Botany)
  – 590 Animals (Zoology)

Dewey

• 500 Science
  – 550 Earth sciences & geology
    • 551 Geology, hydrology, meteorology
    • 552 Petrology
    • 553 Economic geology
    • 554 Earth sciences of Europe
    • 555 Earth sciences of Asia
    • 556 Earth sciences of Africa
    • 557 Earth sciences of North America
    • 558 Earth sciences of South America
    • 559 Earth sciences of other areas

What are the numbers in the Dewey tree?

• A path name

• How many possibilities?
  – 1000

• Where do we get more?
  – 343.123 c
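Reading a Dewey number as a path name can be sketched as follows: each digit picks one branch at one level of a 10-branch tree. The small `DEWEY` table holds only the three entries from the slides that lie on the path to 551, and the function name `dewey_path` is hypothetical.

```python
# Subset of the Dewey classes listed in the slides above.
DEWEY = {
    "500": "Science",
    "550": "Earth sciences & geology",
    "551": "Geology, hydrology, meteorology",
}


def dewey_path(number):
    """Expand a three-digit Dewey number into the branch taken at each level."""
    path = []
    for code in (number[0] + "00", number[:2] + "0", number):
        if code not in path:  # "500" expands to a single step, not three
            path.append(code)
    return path


for code in dewey_path("551"):
    print(code, DEWEY[code])
# 500 Science
# 550 Earth sciences & geology
# 551 Geology, hydrology, meteorology
```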

Using the Dewey tree

• What is the Dewey Decimal number for Ostriches?

• If we don’t know how to choose then how do we find things?

• Search to get Dewey decimal number

What good is the system?

• If we can’t correctly choose the path to a book, then isn’t the search just linear (N/2)?

• Unique location for each book
  – Why not just assign them a number as each book comes into the library?

• Browsing - keep related books together

Library of Congress

• How many books? ~20 million

• How many maps, documents, videos, photos? ~100 million

Library of Congress

• A -- GENERAL WORKS
• B -- PHILOSOPHY. PSYCHOLOGY. RELIGION
• C -- AUXILIARY SCIENCES OF HISTORY
• D -- HISTORY: GENERAL AND OLD WORLD
• E -- HISTORY: AMERICA
• F -- HISTORY: AMERICA
• G -- GEOGRAPHY. ANTHROPOLOGY. RECREATION
• H -- SOCIAL SCIENCES
• J -- POLITICAL SCIENCE
• K -- LAW
• L -- EDUCATION
• M -- MUSIC AND BOOKS ON MUSIC
• N -- FINE ARTS
• P -- LANGUAGE AND LITERATURE
• Q -- SCIENCE
• R -- MEDICINE
• S -- AGRICULTURE
• T -- TECHNOLOGY
• U -- MILITARY SCIENCE
• V -- NAVAL SCIENCE
• Z -- BIBLIOGRAPHY. LIBRARY SCIENCE. INFORMATION RESOURCES (GENERAL)


Why do these subjects get so much space in the tree?

What is the purpose of the Library of Congress?

Organizing trees of information

• Branches

• Balance

• Can the user make a correct choice?
  – Information about each choice

Which of these things is not like the others?

• If you are in first grade?

• If you are a biologist?

Which of these things is not like the others?

• Studying the arctic

• Studying child/adult animal behavior

Tree problems

Literature → Novels → Whales . . .

Industry → Sea → Whaling . . .

Biology → Mammals → Whales . . .

Yahoo.com - a huge tree of WWW information

What are the major items for?

Why are the minor items here?

Click Here

What is this?

• A path name in a tree

• Why doesn’t “Animals” follow “Science”?

What is this section for?

• To identify the user’s purpose


Aliases - connect tree branches

Literature → Novels → Whales . . .

Industry → Sea → Whaling . . .

Biology → Mammals → Whales → Stories, Industry . . .

Aliases - Windows Shortcuts

• Named objects that contain the path name of other objects

• Lets us organize a tree in many ways
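An alias can be sketched as a node that stores only the path name of another node. The tree below reuses the whale example. The alias targets and the placeholder leaf strings are assumptions made for illustration, and the `Alias` class and `lookup` function are hypothetical names, not how Windows shortcuts are implemented.

```python
class Alias:
    """A named object that contains only the path name of another object."""

    def __init__(self, target_path):
        self.target_path = target_path


TREE = {
    "Literature": {"Novels": {"Whales": "whale novels"}},   # placeholder leaf
    "Industry": {"Sea": {"Whaling": "whaling records"}},    # placeholder leaf
    "Biology": {"Mammals": {"Whales": {
        "Stories": Alias("Literature/Novels/Whales"),       # assumed target
        "Industry": Alias("Industry/Sea/Whaling"),          # assumed target
    }}},
}


def lookup(path):
    """Walk the tree along a '/'-separated path, following aliases."""
    node = TREE
    for name in path.split("/"):
        node = node[name]
        if isinstance(node, Alias):
            node = lookup(node.target_path)
    return node


print(lookup("Biology/Mammals/Whales/Stories"))  # whale novels
```

The same item is now reachable from the biology branch and from the literature branch without being stored twice.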

Shortcut symbol

Challenge

• If you run a commercial art house and want to organize all of your pictures into a tree

• If you are creating a new program and want to organize your menu items into a tree

• Who has the answer to the organization?– The users

How to get user input

• Get samples
  – 100 pictures
  – Names of menu items on 3x5 cards

• Give them to users
  – Tell them to organize them into 7-12 stacks
  – Tell them to write a name on each stack
  – Videotape the process and ask them to talk about what they are doing

Trees - time to find

• Number of branches (B)
  – logB(N)
  – Too many branches
    • searching for branches is a problem
  – Too few branches
    • too many steps to a leaf
• Balance
• Probability of correct choice
• “Similar” things together
  – Aliases to apply multiple organizations