An Ontological Approach to Assessing IC Need to Know
Phillip Burns CTA Inc.
Prof. Amit Sheth LSDIS Lab, University of Georgia
Presented to ARDA PI Meeting, Myrtle Beach, February 16 2005
Contract # NBCHC030083
6/21/2004
A thought to begin with …
You cannot separate two facets of information retrieval (“systematic serendipity”): information recovery and information discovery.
Eugene Garfield … in Essays of an Information Scientist
Objective & Approach
- Determine if (classified) documents reviewed by an IC analyst satisfy his/her “need to know”
- Characterization of “need to know” w.r.t. ontology
- Characterizing document content in terms of ontology
- Discovering weighted semantic relationships between document content and “need to know” characterization
Characterizing “Need to Know” using a Semantic Approach (using Ontology)
- Requires domain ontology
  - models important concepts & relationships of domain (schema), captures factual knowledge (instances)
- Relate analyst’s need to know to concepts & relationships in ontology
  - e.g. terrorist organization, funding sources, facilitators, members, methods
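As a rough illustration of the schema/instance split described on this slide, the sketch below models a tiny ontology in plain Python. The class names, relationship names, and instance data are hypothetical and are not taken from the project's ontology; the deck itself states only that the schema is in RDFS and the instances in RDF.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    # Schema: relationship name -> (domain class, range class)
    schema: dict = field(default_factory=dict)
    # Instances: entity name -> class name
    entity_class: dict = field(default_factory=dict)
    # Factual knowledge: (subject, relationship, object) triples
    facts: set = field(default_factory=set)

    def add_relationship_type(self, name, domain, range_):
        self.schema[name] = (domain, range_)

    def add_entity(self, name, cls):
        self.entity_class[name] = cls

    def add_fact(self, subj, rel, obj):
        """Add a fact only if it conforms to the schema."""
        domain, range_ = self.schema[rel]
        if self.entity_class[subj] != domain or self.entity_class[obj] != range_:
            raise ValueError(f"{subj} {rel} {obj} violates the schema")
        self.facts.add((subj, rel, obj))


# Hypothetical schema and instances, for illustration only.
onto = Ontology()
onto.add_relationship_type("member_of", "Person", "TerrorOrganization")
onto.add_relationship_type("funds", "FundingSource", "TerrorOrganization")
onto.add_entity("Person A", "Person")
onto.add_entity("Org X", "TerrorOrganization")
onto.add_entity("Charity Y", "FundingSource")
onto.add_fact("Person A", "member_of", "Org X")
onto.add_fact("Charity Y", "funds", "Org X")
```

An analyst's need to know could then be expressed by naming classes and relationship types from `schema`, or specific entities from `entity_class`.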
Characterizing document content in terms of ontology: “Semantic Annotation”
- Correlate words/phrases from document with concepts/relationships in ontology
- Meta-data added to document (from associated ontological knowledge)
- Active area of research but practically useful technology now available (e.g., Semagix Freedom)
Semantic Relationships between Document & “Need to Know”
- Semantic associations: relationships between document concepts & “need to know” concepts are discovered and ranked
- Ranking based on multiple factors
  - no. of links, types of links, location in ontology, …
- Ranking indicates degree of semantic “closeness” and therefore how related the document is to the “need to know”
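To make the idea of discovering and ranking semantic associations concrete, here is a minimal sketch that finds a shortest chain of links between two entities and scores it. The triples, the link-type weights, and the scoring rule (weakest link weight divided by the number of links) are all assumptions chosen for illustration; the ranking actually used in the project is described in the referenced papers, not here.

```python
from collections import deque

# Hypothetical ontology facts as (subject, relationship, object) triples.
TRIPLES = [
    ("Person A", "member_of", "Org X"),
    ("Person A", "citizen_of", "Country C"),
    ("Person B", "citizen_of", "Country C"),
    ("Charity Y", "funds", "Org X"),
]

# Assumed per-link-type weights: more specific links count for more.
LINK_WEIGHT = {"member_of": 1.0, "funds": 0.9, "citizen_of": 0.3}


def neighbors(entity):
    """Yield (relationship, other entity), treating links as undirected."""
    for s, r, o in TRIPLES:
        if s == entity:
            yield r, o
        elif o == entity:
            yield r, s


def shortest_association(start, goal, max_links=4):
    """Breadth-first search for a shortest chain of links, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        entity, path = queue.popleft()
        if entity == goal:
            return path
        if len(path) == max_links:
            continue
        for rel, other in neighbors(entity):
            if other not in seen:
                seen.add(other)
                queue.append((other, path + [rel]))
    return None


def closeness(path):
    """Score a path: weakest link weight divided by the number of links."""
    if path is None:
        return 0.0
    if not path:
        return 1.0
    return min(LINK_WEIGHT.get(rel, 0.1) for rel in path) / len(path)


# Rank hypothetical document entities by closeness to a context entity.
candidates = ["Person A", "Person B", "Charity Y"]
ranked = sorted(
    candidates,
    key=lambda e: closeness(shortest_association(e, "Org X")),
    reverse=True,
)
```

The sketch covers two of the factors the slide names (number of links and type of links); location in the ontology is not modeled.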
Research Content
- Discovery & ranking of semantic associations
- Characterizing “need to know” in terms of ontological concepts & relationships (context of investigation)
While applying emerging technologies for
- Ontology design and population
- Meta-data annotation of heterogeneous documents
  - correlation of document content with concepts in ontology
Relevance Ranking of Documents
Four groups of document-ranking:
- Not Related Documents
  - unable to determine relation to context
- Ambiguously Related Documents
  - some relationship exists to the context
- Closely Related Documents
  - Entities are closely related to the context
- Highly Related Documents
  - Entities are a direct match to the context
Cut-off values determine grouping of documents w.r.t. relevance
- These are customizable cut-off values (more control and more meaningful parameters compared to, say, automatic classification or statistical approaches)
“Inspection” of a document is possible via (a) original document or (b) original document with highlighted entities
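The grouping by cut-off values can be sketched as a simple threshold function. The numeric cut-offs and the assumption that relevance is a score in [0, 1] are placeholders; the slide says only that the cut-offs are customizable.

```python
def relevance_group(score, cutoffs=(0.25, 0.50, 0.75)):
    """Map a relevance score in [0, 1] to one of the four groups.

    The three cut-off values are placeholders, not values from the deck.
    """
    ambiguous, close, high = cutoffs
    if not ambiguous <= close <= high:
        raise ValueError("cut-offs must be in ascending order")
    if score >= high:
        return "Highly Related"
    if score >= close:
        return "Closely Related"
    if score >= ambiguous:
        return "Ambiguously Related"
    return "Not Related"


# Hypothetical scores for three documents.
scores = {"doc1": 0.9, "doc2": 0.4, "doc3": 0.1}
groups = {doc: relevance_group(s) for doc, s in scores.items()}
```

Passing a different `cutoffs` tuple moves documents between groups without touching the scoring itself, which is the kind of control the slide contrasts with automatic classification.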
IA Context of Investigation (characterization of “Need to Know”)
We define the context of investigation as a combination of the following:
- A set of entity classes and relationships, and/or a negation of a set of entity classes and relationships
- A set of entity instance names, and/or a negation of a set of entity instance names
- A set of keyword values that might appear at any attribute of the populated instance data, and/or a negation of a set of keyword values
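One possible way to represent such a context is a record of included and excluded sets. The sketch below is an assumption-laden illustration: the field names are invented, relationships are left out for brevity, and the rule that a negated (excluded) set overrides an included one is a design choice of this sketch, not something the slide specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    classes: frozenset = frozenset()
    excluded_classes: frozenset = frozenset()
    instances: frozenset = frozenset()
    excluded_instances: frozenset = frozenset()
    keywords: frozenset = frozenset()
    excluded_keywords: frozenset = frozenset()

    def matches(self, name, cls, attributes):
        """Return True if an entity falls inside the context.

        attributes is a list of attribute value strings for the entity.
        Exclusions are checked first, so they always win (an assumption).
        """
        text = " ".join(attributes).lower()
        if cls in self.excluded_classes or name in self.excluded_instances:
            return False
        if any(k.lower() in text for k in self.excluded_keywords):
            return False
        return (
            cls in self.classes
            or name in self.instances
            or any(k.lower() in text for k in self.keywords)
        )


# Hypothetical context of investigation.
ctx = Context(
    classes=frozenset({"Terror Organization"}),
    excluded_instances=frozenset({"Org Z"}),
    keywords=frozenset({"explosives"}),
)
```

With this context, an entity of class `Terror Organization` matches unless it is the excluded instance `Org Z`, and any entity whose attributes mention "explosives" matches through the keyword set.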
Context of Investigation (cont)
- Goal is to capture, at a high level, the types of entities (or relationships) that are considered important.
- Relationships can be constrained to be associated with specified class types
  - E.g., it can be specified that a relation ‘affiliated with’ is part of the context only when it is connected with an entity that belongs to a specific class, say, ‘Terror Organization’
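The class constraint on a relationship can be sketched as a lookup plus an endpoint check. The function name, the dictionary layout, and the instance data below are hypothetical; only the ‘affiliated with’ / ‘Terror Organization’ pairing comes from the slide.

```python
def relationship_in_context(triple, entity_class, constraints):
    """Return True if a (subject, relation, object) triple is in the context.

    constraints maps a relation name to the class that one endpoint must
    belong to, or to None when the relation is unconstrained.
    """
    subj, rel, obj = triple
    if rel not in constraints:
        return False
    required = constraints[rel]
    if required is None:
        return True
    return required in (entity_class.get(subj), entity_class.get(obj))


# Hypothetical instance data.
entity_class = {
    "Person A": "Person",
    "Org X": "Terror Organization",
    "Club Q": "Company",
}
constraints = {"affiliated with": "Terror Organization"}

in_ctx = relationship_in_context(
    ("Person A", "affiliated with", "Org X"), entity_class, constraints
)
out_ctx = relationship_in_context(
    ("Person A", "affiliated with", "Club Q"), entity_class, constraints
)
```

Here the same relation is inside the context when it touches `Org X` and outside it when it touches `Club Q`, because only the former endpoint belongs to the required class.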
Graph-based creation of a context of investigation
- 26,489 entities
- 34,513 (explicit) relationships
- Add relationship to context
Additional Semantic Constraints
Components of Document Relevance
[Figure: the same example ontology graph shown three times, once per component below. Nodes: e1:Person, e2:Country, e3:Person, e4:Watch-List, e5:Person, e6:State, e6:Company, e7:Terror Organization, e8:Event, e9:Person. Edge labels: works at, friends with, citizen of, listed in, lives in, claims responsibility for.]
Context of Investigation (specific entities): Abu Abdallah, Turkmenistan, Konduz Province, …
1. Entities belong to classes in the Context: type(entity) ∈ Context
2. Relationship constraints: Relationship [Class]
3. Entities match a list of entities of interest (in the Context): entity ∈ Entities-List
Some thoughts along the way
“An object by itself is intensely uninteresting.”
Grady Booch, Object Oriented Design with Applications, 1991
I might as well join my better known colleagues:
“Relationship is at the heart of semantics. Ontology is at the heart of the Semantic Web.”
Schematic of Ontological Approach to the Legitimate Access Problem
[Schematic diagram; labeled component: Semagix Freedom]
Show me the stuff …
here you go … demonstration
Semantic Annotation
- Document searched for entity names (or synonyms) contained in ontology
- Then document entities are annotated with additional information from corresponding entities in ontology, including named relationships to other entities
- Following chart is example
  - Highlighted text are entities found corresponding to concepts in ontology
  - XML is corresponding meta-data annotation
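The two annotation steps on this slide (find entity names or synonyms, then attach ontology knowledge) can be sketched with the standard `re` module. The ontology entries, the annotation dictionary layout, and the sample sentence are hypothetical; this is not the Semagix Freedom annotation format, which the deck shows only as a chart.

```python
import re

# Hypothetical ontology entries: canonical name -> class, synonyms, relations.
ONTOLOGY = {
    "Org X": {
        "class": "Terror Organization",
        "synonyms": ["Organization X"],
        "relations": [("funded by", "Charity Y")],
    },
    "Country C": {"class": "Country", "synonyms": [], "relations": []},
}


def annotate(text):
    """Find entity names or synonyms in text and attach ontology metadata."""
    annotations = []
    for name, info in ONTOLOGY.items():
        for surface in [name, *info["synonyms"]]:
            # Whole-word, case-insensitive match of the surface form.
            pattern = r"\b" + re.escape(surface) + r"\b"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                annotations.append(
                    {
                        "start": m.start(),
                        "end": m.end(),
                        "text": m.group(0),
                        "entity": name,
                        "class": info["class"],
                        "relations": info["relations"],
                    }
                )
    return sorted(annotations, key=lambda a: a["start"])


sample = "Funds moved from Country C to Organization X last year."
found = annotate(sample)
```

Each annotation keeps the character span, so the matched text can be highlighted in the original document, and carries the entity's class and named relationships as metadata.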
Relevance Measures for Documents (relating document content to IA “need to know”)
- Relevance engine input
  - the set of semantically annotated documents
  - the context of investigation for the assignment
  - the ontology schema represented in RDFS, and
  - the ontology instances represented in RDF
- Relevance measure function used to verify whether the entity annotations in the annotated document can be fit into the entity classes, entity instances, and/or keywords specified in the context of investigation.
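A relevance measure of this kind might look like the sketch below, which checks each entity annotation against the instances, classes, and keywords of a context and averages the results. The per-match scores (1.0, 0.6, 0.3), the averaging, and the dictionary layouts are assumptions made for illustration; the deck does not give the actual function.

```python
def entity_score(annotation, context):
    """Score one annotation against the context (assumed weights)."""
    if annotation["entity"] in context["instances"]:
        return 1.0  # direct match to a listed entity
    if annotation["class"] in context["classes"]:
        return 0.6  # entity belongs to a class of interest
    attrs = annotation.get("attributes", "").lower()
    if any(k.lower() in attrs for k in context["keywords"]):
        return 0.3  # only a keyword appears in the entity's attributes
    return 0.0


def document_relevance(annotations, context):
    """Average entity score over a document's annotations (0.0 if none)."""
    if not annotations:
        return 0.0
    return sum(entity_score(a, context) for a in annotations) / len(annotations)


# Hypothetical context and annotated document.
context = {
    "instances": {"Org X"},
    "classes": {"Person"},
    "keywords": {"explosives"},
}
doc = [
    {"entity": "Org X", "class": "Terror Organization"},
    {"entity": "Person A", "class": "Person"},
    {"entity": "Country C", "class": "Country", "attributes": "exports textiles"},
]
score = document_relevance(doc, context)
```

The resulting score could then be compared against cut-off values to place the document in one of the four relevance groups.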
Challenges we have addressed
- Discovery of Semantic Associations per entity per document
- Input/Visualization/Management of Context of Investigation
- Scalability on number of documents & ontology size
  - Performs well (in terms of time and scalability) with thousands of documents and for scenarios when an IA investigation has involved hundreds of documents
- No systematic measure of quality for this specific application/scenario (general evaluation of research is done)
Challenges to be addressed
- Scalability to a million+ documents (possibly with preprocessing/filtering)
- Further development/enrichment of the ontology
- Improved measure of the strength of Semantic Associations
- Evaluations by human subjects
- Visualization and interactive discovery
Conclusions
New Semantic Approach to a class of challenging problems: vendor vetting, knowledge discovery, …
Viability demonstrated on a small scale (comprehensive demonstration)
Significant new research that builds upon the latest Semantic Platform
A parting thought
“Discovery commences with an awareness of anomaly …”
Thomas S. Kuhn, in The Structure of Scientific Revolutions