design data retrieval and manipulation for subset of ‘gombe’ database using qbe durga gumaste
DESCRIPTION
Design data retrieval and manipulation for subset of ‘Gombe’ database using QBE Durga Gumaste Advisor: Dr. Shashi Shekhar June 10, 2003. Agenda. Objective Background and Motivation Related work and my contribution Porting of tables Query description Query optimization Summary Demo. - PowerPoint PPT PresentationTRANSCRIPT
1
Design data retrieval and manipulation for subset of ‘Gombe’ database using QBE
Durga GumasteAdvisor: Dr. Shashi ShekharJune 10, 2003
2
Agenda Objective Background and Motivation Related work and my contribution Porting of tables Query description Query optimization Summary Demo
3
ObjectiveDesign and implement queries to access andmanipulated ‘Gombe’ chimpanzees data subset,
suchthat queries can be modified by the user having
no background of any Data Manipulation
Language(DML)
4
Agenda Objective Background and Motivation Related work and my contribution Porting of tables Query description Query optimization Summary Demo
5
Background and motivation Data about ‘Gombe’ chimpanzees
Collected since 1953 Behavioral and location data 15 tables Average size: 10-12 MB
Dr. Jane Goodall has done active research of ‘Gombe’ chimpanzee for last 35 years
Jane Goodall Institute's Center for Primate Studies at the University of Minnesota
Data retrieval for analysis on the data Frequent query modification
Ease of modification
6
Sample in ‘Gombe’ database
Name Description No of records
chimp Chimps observed by biologists 212Follow Each time a chimp is observed by
biologists8463
Follow_arrival_ new
Any chimp arriving with the focal chimp
230743
Food_bout What and where a focal chimp eats during the follow
68871
Follow_map_position
Location of the focal chimp during the follow
393615
7
Relationship between tables
8
Agenda Objective Background and Motivation Related work and my contribution Porting of tables Query description Query optimization Summary Demo
9
Related work Earlier implementations
Oracle Paradox
Limitations of earlier implementations Ecologists not comfortable in modifying PL/SQL
queries in Oracle Paradox is not licensed at University of
Minnesota
10
Present Implementation Database: Microsoft Access 2000
Ecologists familiar with MS Access environment
Desktop database Microsoft office suite University of Minnesota has a license for
Microsoft Office Provides QBE
11
My contribution Port ‘Gombe’ database to MS Access
Implement new queries using MS Access
Helped behavioral ecologists to modify queries using MS Access QBE
Optimize queries
12
Agenda Objective Background and Motivation Related work and my contribution Porting of tables Query description Query optimization Summary Demo
13
Port tables to MS Access
Apply primary key constraints Apply referential integrity constraints
Create tables in design view
Import tables using import utility
14
Verification of porting Number of records present in .txt files
Follow: 8459 records
Count of records by count query
15
Agenda Objective Background and Motivation Related work and my contribution Porting of tables Query description Query optimization Summary Demo
16
Queries Nested, join, range
Q1: Find all chimps arriving alone Q2: Include mothers arriving with off springs in Q1Q3: Include siblings in Q1Q4: include mothers and siblings in Q1Q5: Find chimps arriving together with other chimp
Single table, aggregate, pointQ6: Find food count of food items in a particular month of a year (Find % food counts)Q7: Find duration for which food items are eaten in a particular month of a year(Find % food duration)
17
Implementation (Q1)follow_arrival (A)
follow_arrival (B)
Inner join on A and B (self join)
A.date=B.date A.follow=B.follow A.chimp<>B.chimp
Result Set
Arrival time difference between A.chimp and B.chimp > 5 minutes
Follow_map_position (F)
Inner join with F
A.date=F.date A.follow=F.follow A.chimp<>F.focal A.seq = F.seq
1. Inner join (self join) on follow_arrival
2. Select chimps having fa_time_start difference more than 5 minutes for a particular follow on a particular date
3. Take location coordinates for such chimps from follow_map_position table by joining follow_arrival table with follow_map_position table
Chimps are said to be alone when arrival timebetween 2 chimps is more than 5 minutes
18
MS Access Implementation for Q1 Few Inner joins conditions cannot be displayed
in QBE MS Accesses uses Dyna sets Sub-query over base query
Base query in SQL (views) Sub-query in QBE
Sub-query easy to change by ecologists
19
MS Access Implementation for Q1
20
Comparison - PL/SQL & QBE for Q1
21
Comparison - PL/SQL & QBE for Q1
22
Q1 extension
Chimps arriving alone Mothers arriving with off springs are counted as
arriving alone (Q2) Chimps which arrive with their siblings are
counted as arriving alone(Q3) Both Q2 and Q3 (Q4)
23
Derived table
Follow Date Time Map Grpsize
------
chimp_id
AL
AO
AP
AR
AT
Follow_Arrival Follow_map_time
Time interval adjustment10:03 10:0010:11 10:15
Follow_arrival
Certainty Value 1 1 0 0 not observed blank
Sum of certainties
AL740101 1/1/1974
10:00 AM 2 1 1 0 13
AL AO AP AR
Group_composition_table
•Written in VB
24
Agenda Objective Background and Motivation Related work and my contribution Porting of tables Query description Query optimization Summary Demo
25
Query optimization in MS Access Cost bases query optimization
MS Jet 4.0 Table statistics
Rushmore optimization Efficient use of indexes Index intersection,union,minus
26
Performance evaluationExecution times
11.6 13.5 13.2 13.8 18
200
260240
290
450
0
50
100
150
200
250
300
350
400
450
500
Queries
Tim
e(s)
indexno index
27
Compacting databaseCompact database using Compact utility
provided byMS Access
De-fragmentation Reordering database pages Reclaim unused space
Original size: 1.1 GB After compaction: 284 MB
Flags queries needing recompilation
28
Summary Porting of data to MS Access Query modification using QBE
Ease of writing and modifying queries GUI
Queries over views Base queries in SQL and sub-queries in QBE Access uses dyna-sets
Derived tables created in VB Multiple queries Onetime queries
Indexes on join attributes improve the performance by 90-95%
29
Future work Updating group composition table if new
chimp gets added to chimp table
30
Demo Query Description
Find out chimpanzees arriving alone (include relations)
Base query (view) in SQL Sub-query in QBE Modifications in QBE Results of modifications
31
Base query (view) in SQL
32
Sub-query in QBE
33
Modifications in query (1)
34
Results(1)
35
Modifications in query (2)
36
Results(2)
37
Modifications in query (3)
38
Results(3)
39
Acknowledgements I would like to thank Prof. Shekhar for giving me
this wonderful opportunity to work on this project and his precious guidance from time to time
I would also like to thank Prof. Pusey, Carson Murray, and Ian Gilby of Ecology Department for their help in understanding Gombe database.
I would like to thank Prof. Pusey and Prof. Srivastava for their time today
40
Thank You!!