Ontology Generation Based on a User-Specified Ontology Seed

Cui Tao, Data Extraction Research Group, Department of Computer Science, Brigham Young University. Supported by NSF.


Page 1: Ontology Generation Based on a User-Specified Ontology Seed

Ontology Generation Based on a User-Specified Ontology Seed

Cui Tao
Data Extraction Research Group
Department of Computer Science
Brigham Young University

Supported by NSF

Page 2: Ontology Generation Based on a User-Specified Ontology Seed

www.deg.byu.edu

Introduction

Motivation:
• Traditional search engines: return documents
• Ontology-based data extraction: return information

Problem: Build extraction ontologies that meet users' needs

Goal: Automatically build ontologies for users’ needs

Page 3: Ontology Generation Based on a User-Specified Ontology Seed

Example

A biologist is interested in information about large proteins in humans and their functions

Possible queries:
• Find proteins in humans that are >20 kDa
• Find all the proteins in humans that serve as receptors
• ...

Information sources: various online databases
• NCBI
• Gene Cards
• The Gene Ontology
• GPM Proteomics Database
• …

Page 4: Ontology Generation Based on a User-Specified Ontology Seed

Extraction Ontology

Molecular Weight

Regular Expression: ^\d{1,5}(\.\d{1,2})?

Unit: kilodaltons?|kdas?|kds?|das?|daltons?
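As a rough illustration of how the value pattern and the unit pattern on this page could work together, here is a minimal Python sketch. It is not the Data Extraction Research Group's implementation. The function name `find_molecular_weights` is hypothetical, the leading `^` is dropped so the pattern can match inside running text, and the unit pattern is written as this sketch reads the slide.

```python
import re

# Value pattern from the slide; the leading ^ is dropped here so the
# pattern can match inside a longer string (an assumption of this sketch).
VALUE = r"\d{1,5}(?:\.\d{1,2})?"

# Unit pattern as this sketch reads it (assumption: "kds?|das?").
UNIT = r"kilodaltons?|kdas?|kds?|das?|daltons?"

# A value followed by optional whitespace and a unit, case-insensitive.
MOLECULAR_WEIGHT = re.compile(rf"\b({VALUE})\s*({UNIT})\b", re.IGNORECASE)


def find_molecular_weights(text):
    """Return (value, unit) pairs recognized in text (hypothetical helper)."""
    return [(float(value), unit) for value, unit in MOLECULAR_WEIGHT.findall(text)]
```

For instance, calling `find_molecular_weights("Molecular weight: 29175 Daltons")` yields one pair with the value 29175.0 and the unit "Daltons".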

Page 5: Ontology Generation Based on a User-Specified Ontology Seed


User Interface

Select a title for the forms

Page 6: Ontology Generation Based on a User-Specified Ontology Seed

User Interface: Binary Relationship

[Figure: form titled Protein with the field Name]

Page 7: Ontology Generation Based on a User-Specified Ontology Seed

User Interface: Binary Relationship

[Figure: form titled Protein with the fields Name and Molecular Weight]

Page 8: Ontology Generation Based on a User-Specified Ontology Seed

User Interface: N-ary Relationship

[Figure: Chromosome location field with the columns Chromosome number, Start, End, and Orientation]

Page 9: Ontology Generation Based on a User-Specified Ontology Seed

User Interface: N-ary Relationship

[Figure: GO field with the columns GO ID and GO term (also labeled GO phrase)]

Page 10: Ontology Generation Based on a User-Specified Ontology Seed

Overall Form

Protein
• Molecular Weight
• Name
• Chromosome location: Chromosome number, Start, End, Orientation
• GO: GO ID, GO term

Page 11: Ontology Generation Based on a User-Specified Ontology Seed

Ontology View

[Figure: ontology view with the concepts Protein, Name, Molecular weight, Chromosome location, Chromosome number, Start, End, Orientation, GO, GO ID, and GO phrase]
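One way to picture the step from a user-built form to an ontology view is sketched below in Python. The slides do not spell out the generation rules, so everything here is an assumption for illustration: the class names `FormField` and `Form`, the function `ontology_view`, and the rule that a single-value field yields a binary relationship while a multi-column field yields an n-ary relationship.

```python
from dataclasses import dataclass, field


@dataclass
class FormField:
    """A form field; an empty column list means a single-value field."""
    name: str
    columns: list = field(default_factory=list)


@dataclass
class Form:
    """A user-specified ontology seed: a titled form with fields."""
    title: str
    fields: list


def ontology_view(form):
    """Derive concepts and relationships from a form (assumed rules)."""
    concepts = [form.title]
    relationships = []
    for f in form.fields:
        concepts.append(f.name)
        # Binary relationship between the form title and each field.
        relationships.append((form.title, f.name))
        if f.columns:
            # N-ary relationship tying a field to all of its columns.
            concepts.extend(f.columns)
            relationships.append((f.name, *f.columns))
    return concepts, relationships


seed = Form("Protein", [
    FormField("Name"),
    FormField("Molecular Weight"),
    FormField("Chromosome location",
              ["Chromosome number", "Start", "End", "Orientation"]),
    FormField("GO", ["GO ID", "GO term"]),
])
concepts, relationships = ontology_view(seed)
```

Under these assumed rules the Protein form gives eleven concepts, four binary relationships from Protein to its fields, and two n-ary relationships for Chromosome location and GO.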

Page 12: Ontology Generation Based on a User-Specified Ontology Seed

Fill in the Form

[Figure: the empty Protein form with the fields Molecular Weight, Name, Chromosome location (Chromosome number, Start, End, Orientation), and GO (GO ID, GO term)]

Page 13: Ontology Generation Based on a User-Specified Ontology Seed

Fill in the Form

Protein
• Molecular Weight: 29175 Daltons
• Name: 14-3-3 protein epsilon; Mitochondrial import stimulation factor L subunit; Protein kinase C inhibitor protein-1; KCIP-1; 14-3-3E
• Chromosome location: Chromosome number 17; Start 1,250,267; End 1,194,558; Orientation minus
• GO: GO:0019899 (enzyme binding); GO:0019904 (protein domain specific binding)
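The filled-in form can be thought of as one structured record, against which a query such as "proteins in humans that are >20 kDa" from the Example page becomes a simple filter. The Python sketch below is illustrative only: the dictionary layout and the helpers `weight_in_kda` and `is_large` are hypothetical and are not taken from the presentation.

```python
# The sample values entered on this page, held as a nested record
# (the layout is an assumption of this sketch).
filled_form = {
    "Protein": {
        "Molecular Weight": "29175 Daltons",
        "Name": [
            "14-3-3 protein epsilon",
            "Mitochondrial import stimulation factor L subunit",
            "Protein kinase C inhibitor protein-1",
            "KCIP-1",
            "14-3-3E",
        ],
        "Chromosome location": {
            "Chromosome number": "17",
            "Start": "1,250,267",
            "End": "1,194,558",
            "Orientation": "minus",
        },
        "GO": [
            {"GO ID": "GO:0019899", "GO term": "enzyme binding"},
            {"GO ID": "GO:0019904", "GO term": "protein domain specific binding"},
        ],
    }
}


def weight_in_kda(text):
    """Convert '<number> <unit>' to kilodaltons; units starting with 'k' are kDa."""
    value, unit = text.split()
    number = float(value)
    return number if unit.lower().startswith("k") else number / 1000.0


def is_large(record, threshold_kda=20.0):
    """True if the record's molecular weight exceeds the threshold."""
    return weight_in_kda(record["Protein"]["Molecular Weight"]) > threshold_kda
```

With the values on this page, 29175 Daltons converts to 29.175 kDa, so the record passes the >20 kDa filter.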

Page 14: Ontology Generation Based on a User-Specified Ontology Seed

Mapping

[Figure: mapping for Name, with the values 14-3-3 protein epsilon; Mitochondrial import stimulation factor L subunit; Protein kinase C inhibitor protein-1; KCIP-1; 14-3-3E]

Page 15: Ontology Generation Based on a User-Specified Ontology Seed

Mapping

[Figure: mapping for Name, with the values 14-3-3 protein epsilon; Mitochondrial import stimulation factor L subunit; Protein kinase C inhibitor protein-1; KCIP-1; 14-3-3E]

Page 16: Ontology Generation Based on a User-Specified Ontology Seed


Mapping

Name
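The Mapping pages do not describe the mapping procedure itself. One plausible reading, offered here purely as an assumption, is that the sample values the user typed into the form are located in a source record, and a concept is mapped to whichever source fields contain those values. The function `map_concepts` and the source field names below are hypothetical.

```python
def map_concepts(filled_values, source_record):
    """Map each concept to the source fields containing all of its samples.

    filled_values: concept name -> list of sample strings from the form.
    source_record: source field name -> text of that field.
    Matching is case-insensitive substring search (an assumption).
    """
    mapping = {}
    for concept, samples in filled_values.items():
        mapping[concept] = [
            field_name
            for field_name, text in source_record.items()
            if all(sample.lower() in text.lower() for sample in samples)
        ]
    return mapping


# Sample values from the filled-in form.
filled = {
    "Name": ["14-3-3 protein epsilon", "KCIP-1"],
    "Molecular Weight": ["29175"],
}
# A made-up source record; field names are illustrative only.
source = {
    "Protein names": "14-3-3 protein epsilon (14-3-3E) (KCIP-1)",
    "Mass": "29175 Da",
    "Organism": "Homo sapiens",
}
mapping = map_concepts(filled, source)
```

In this toy record, Name lands on the "Protein names" field and Molecular Weight on the "Mass" field. A real source page would need more robust matching than substring search.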

Page 17: Ontology Generation Based on a User-Specified Ontology Seed

Data Frame Generation

• Choose from data frame library
  • Data frames for basic values
    • Numbers within different ranges
    • Integers, floats, etc.
    • Emails, phone numbers, addresses, etc.
  • Domain-specific values (DNA sequences)
  • Units
• Build lexicon files

Page 18: Ontology Generation Based on a User-Specified Ontology Seed

Data Frame Generation

• Find the best-matching data frame from the library
• Find the correct units
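Choosing a data frame from a library can be sketched as scoring each library pattern against the user's sample values and keeping the best scorer. The library contents, the scoring rule (count of full matches), and the name `best_data_frame` are assumptions made for this Python sketch; the slides do not give the actual selection criterion.

```python
import re

# Hypothetical data frame library: name -> regular expression.
LIBRARY = {
    "integer": r"\d+",
    "float": r"\d+\.\d+",
    "number with thousands separators": r"\d{1,3}(?:,\d{3})+",
    "DNA sequence": r"[ACGT]+",
    "email": r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+",
}


def best_data_frame(samples, library=LIBRARY):
    """Return the library entry whose pattern fully matches the most samples.

    Returns None when no pattern matches any sample. Ties go to the
    entry listed first in the library.
    """
    def score(pattern):
        return sum(re.fullmatch(pattern, s) is not None for s in samples)

    name = max(library, key=lambda n: score(library[n]))
    return name if score(library[name]) > 0 else None
```

With the Start and End values from the filled-in form, `best_data_frame(["1,250,267", "1,194,558"])` selects the thousands-separator entry, while the chromosome number "17" selects the integer entry.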

Page 19: Ontology Generation Based on a User-Specified Ontology Seed


Build Lexicon Files

Name
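For a concept such as Name, whose values are better listed than described by a pattern, a lexicon file could be as simple as one known value per line. The file format, the case-insensitive de-duplication, and the helper `build_lexicon` are assumptions of this Python sketch, not details given in the presentation.

```python
from pathlib import Path


def build_lexicon(values, path):
    """Write distinct, trimmed values to path, one per line (assumed format).

    Duplicates are detected case-insensitively; the first spelling seen
    is kept. Returns the entries in the order they were written.
    """
    seen = {}
    for value in values:
        value = value.strip()
        if value:
            seen.setdefault(value.lower(), value)
    entries = sorted(seen.values(), key=str.lower)
    Path(path).write_text("\n".join(entries) + "\n", encoding="utf-8")
    return entries
```

Feeding in the names from the filled-in form would produce a small lexicon for Name that an extraction ontology could match against source text.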

Page 20: Ontology Generation Based on a User-Specified Ontology Seed


Contribution

Automatically generates ontologies depending on users’ requests

Provides a tool for users to easily provide ontology seeds

Automatically generates ontology views from ontology seeds

Automatically maps ontology concepts to source databases