KAIROS: Knowledge-directed Artificial Intelligence Reasoning Over Schemas. Boyan Onyshkevych. January 9, 2019. Approved for Public Release. Distribution Unlimited.


Page 1

KAIROS: Knowledge-directed Artificial Intelligence Reasoning Over Schemas

Boyan Onyshkevych

January 9, 2019

Approved for Public Release. Distribution Unlimited.

Page 2

Agenda

Start   End     Session
9:00    10:00   Registration
10:00   10:15   Security Briefing
10:15   11:00   Contracts Management Office Briefing (Mark Jones, DARPA Contracting Management Office)
11:00   12:00   KAIROS Presentation (Dr. Boyan Onyshkevych, Program Manager, DARPA I2O)
12:00   1:15    Break (Each attendee may speak for 2 minutes for teaming purposes. No slides or handouts. DARPA representatives will not be present.)
1:15    2:30    Question Answering Session

Page 3

Proposers’ Day Information

BAA Location
• Posted on FedBizOpps website (http://www.fedbizopps.gov) and Grants.gov website (http://www.grants.gov)

Questions Today
• Questions can be submitted until 12:00 to [email protected] or on 3x5 cards
• Questions will be answered during Q&A session in the afternoon

Proposers Day Website
• Proposers’ Day presentations will be posted
• Frequently Asked Questions (FAQ) will be updated with Q/A from [email protected]

BAA Precedence
• If anything said or addressed during this presentation or in the FAQ conflicts with the published solicitation, the BAA takes precedence. The Government may issue amendments to the BAA to effect any changes deemed necessary in response to the FAQ. Such amendments would be posted to FBO and Grants.gov prior to the solicitation closing date and would supersede previous versions of the solicitation.

Page 4

KAIROS Program Goal

Create schema-based artificial intelligence capability to enable contextual and temporal reasoning about complex real-world events in order to generate actionable understanding of these events and predict how they will unfold

Page 5

KAIROS Contribution

[Figure: two panels of events laid out along a Time axis]

Static Events: No way to connect these without temporal information and event patterns (schemas)

Temporally and Contextually Ordered Events: Temporal information and event patterns (schemas) enable connection of seemingly disjoint events

Schema
1. Acquisition
2. Delivery
3. Reuse
4. Testing

Event labels shown: Truck Acquisition, Soccer Game, Laptop Acquisition, Demonstration, Truck Modification, Missile Launch Test, Marathon, Laptop Delivery, Software Loading, Fertilizer Acquisition, Tire Purchase, Delivery, Truck Repair, Meeting

Page 6

What Schemas Are

• A schema is an organized unit of knowledge for an event or series of events, first posited by the cognitive theorist Jean Piaget in 1923

• Schemas are based on past experience and are accessed to guide current understanding or action and predict future events and participants

• Schemas are dynamic – they develop and change based on new information and experiences and thereby support contextual adaptation

• Schemas have deeper levels of organization that tie them together with other schemas that share attributes

• Schemas have been applied to AI problems using First Wave and Second Wave methods

Example Schema - Changing a Tire

[Figure: step labels, listed as extracted and not necessarily in sequence] Take out jack, Take off hubcap, Remove bad tire, Put away jack, Position jack, Unscrew lug nuts, Put on new tire
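As an illustration only, the tire-changing example can be treated as a set of steps plus "must happen before" constraints. The sketch below assumes a plausible ordering (the slide does not spell one out) and uses Python's standard `graphlib` module. The names `TIRE_SCHEMA`, `respects_schema`, and `one_valid_order` are hypothetical, not part of KAIROS.

```python
from graphlib import TopologicalSorter

# Each step maps to the steps that must already have happened.
# ASSUMPTION: this precedence relation is a common-sense guess, not from the slide.
TIRE_SCHEMA = {
    "Take out jack": set(),
    "Take off hubcap": set(),
    "Position jack": {"Take out jack"},
    "Unscrew lug nuts": {"Take off hubcap"},
    "Remove bad tire": {"Position jack", "Unscrew lug nuts"},
    "Put on new tire": {"Remove bad tire"},
    "Put away jack": {"Put on new tire"},
}


def respects_schema(observed, schema=TIRE_SCHEMA):
    """Return True if every observed step occurs after all of its prerequisites."""
    seen = set()
    for step in observed:
        if step not in schema or not schema[step] <= seen:
            return False
        seen.add(step)
    return True


def one_valid_order(schema=TIRE_SCHEMA):
    """Return one ordering of the steps that satisfies the schema."""
    return list(TopologicalSorter(schema).static_order())
```

Representing the schema as a partial order rather than a fixed list leaves room for steps that can happen in either order, such as taking out the jack and taking off the hubcap.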

Page 7

Automated Schema Generation

Observed Reuse Instances

Instance 1: Accessible Truck (source: Startracks)
1. Get metal cutters, etc. (2d)
2. Remove door (2h)
3. Buy lift, brackets, door, and frame (1mo)
4. Attach lift and door (5h)

Instance 2: Camper (source: DoItYourselfRV)
1. Get carpentry tools
2. Buy beds, kitchen fixtures (1mo)
3. Find lumber (30min)
4. Build structure (5d)
5. Install structure and fixtures (2d)

Instance 3: Self-Driving Car (source: Ubuntu Forums)
1. Buy cameras, lidar, servers (3mo)
2. Make sensor frame (4h)
3. Get metalwork tools (10min)
4. Configure servers (5d)
5. Connect frame, lidar, camera (2d)

Generalized Reuse Schema

Purchase Schema
1. Identify (1min-3mo)
2. Det. Price (30min)
…

Acquisition Schema
1. Purchase (1hr-3mo)
2. Find (30min)
3. Make (2h-5d)

Reuse Schema
1. Acquisition (10m-3mo)
2. Get tools (10min-2d)
3. Remove old structure (1h-2h)
4. Install new structure (2h-2d)
…

[Figure annotations: "In any order", "or"]

Process of Schema Generation
• Intake of massive amounts of open-source event data
• Detection, classification, and clustering of events from input
• Automatic learning of common schemas like Purchase Schema by generalizing from instances in big data
• Composition of complex less common schemas like Reuse Schema using common schemas as building blocks
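One way to picture the generalization step is to merge the durations seen in individual instances into a per-step range. The sketch below is a toy illustration under stated assumptions: assigning each instance step to a generalized step is a manual guess, a month is taken as 30 days, and the helper names (`to_minutes`, `generalize`) are hypothetical. Its output need not reproduce the ranges printed on the slide, which presumably draw on more instances than the three listed.

```python
# Minutes per unit; ASSUMPTION: one month ("mo") is treated as 30 days.
UNIT_MINUTES = {"min": 1, "h": 60, "d": 24 * 60, "mo": 30 * 24 * 60}


def to_minutes(duration):
    """Convert strings such as '30min', '2h', '5d', '3mo' to minutes."""
    for unit in ("min", "mo", "h", "d"):
        if duration.endswith(unit):
            return int(duration[: -len(unit)]) * UNIT_MINUTES[unit]
    raise ValueError(f"unrecognized duration: {duration!r}")


# (generalized step, observed duration) pairs for the three reuse instances.
# ASSUMPTION: the step-to-category mapping below is illustrative only.
INSTANCES = [
    [("Get tools", "2d"), ("Remove old structure", "2h"),
     ("Acquisition", "1mo"), ("Install new structure", "5h")],
    [("Acquisition", "1mo"), ("Acquisition", "30min"),
     ("Acquisition", "5d"), ("Install new structure", "2d")],
    [("Acquisition", "3mo"), ("Acquisition", "4h"),
     ("Get tools", "10min"), ("Install new structure", "2d")],
]


def generalize(instances):
    """Return {step: (shortest, longest)} in minutes across all instances."""
    ranges = {}
    for instance in instances:
        for step, duration in instance:
            minutes = to_minutes(duration)
            low, high = ranges.get(step, (minutes, minutes))
            ranges[step] = (min(low, minutes), max(high, minutes))
    return ranges
```

With this mapping, `generalize(INSTANCES)["Get tools"]` spans 10 minutes to 2 days, which agrees with the "Get tools" range in the generalized Reuse Schema.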

Page 8

Example

• Can we determine technological capabilities from available data, such as commercial transactions?

• Without KAIROS, we may have enough information about potentially-relevant entities and events to populate a knowledge base, but we don’t know which are important or how they fit together

• KAIROS will help us link the relevant entities, locations, events, and sub-events, and make inferences and predictions

Page 9

Example – Disjoint vs. Connected Events

[Figure: event graphs linking events to entities through role labels]

Events: Truck Purchase, Travel, Truck Delivery
Entities: Special Vehicle Factory, Missile Factory, John Smith, Adam, Dr. Scott
Role and relation labels: Seller, Buyer, Location, Participant, Origin, Destination, Recipient, Employee

Discerning Event Structures
• These two seemingly unconnected events may actually be the first two steps in a more complex event
• These two individuals are shown by their travel plans possibly to be the same person

Page 10

Example - Status of a Complex Event

Previously Extracted Generalized Hierarchical Schemas

New Capability Development Schema
1. Acquisition
2. Delivery
3. Reuse
4. Testing
5. Deployment

Acquisition Schema
1. Purchase
   i. Negotiate
   ii. Pay

Delivery Schema
1. Shipping
2. Pick-up
   i. Travel to
   ii. Take possession
   iii. Travel from

Reuse Schema
1. Acquisition
2. Get tools
3. Remove old structure
4. Install new structure

[Figure: New Capability Development timeline along a Time axis]
1st Step: Truck Acquisition (Nov 2011)
2nd Step: Truck Delivery (Jan 2012)
3rd Step: Truck Reuse Modification (Mar-Dec 2012)
Sub-events: Negotiation (May-Nov 2011), Travel (3 Jan 2012)
Entities: Matthew Taylor, John Smith, Dr. Adam Scott, Special Vehicle Factory, Missile Factory
Role and relation labels: Seller, Buyer, Recipient, Participant, Employee, Origin, Destination, Location

Page 11

Example – Prediction

• Discover relevant generalized event schema

• Transfer learning for new scenario adaptation

Previously Extracted Generalized Hierarchical Schemas

New Capability Development Schema
1. Acquisition
2. Delivery
3. Reuse
4. Testing
5. Deployment

Testing Schema
1. Travel
2. Load equipment
3. Execute
4. Evaluate

[Figure: Missile Launch Capability Development timeline along a Time axis]
1st Step: Truck Acquisition (Nov 2011)
2nd Step: Truck Delivery (Jan 2012)
3rd Step: Truck Reuse Modification (Mar-Dec 2012)
4th Step: Launcher Testing (?)
Sub-events: Negotiation (May-Nov 2011), Travel (3 Jan 2012), Travel
Predicted location: New Missile Test Site???
Annotation: Analyst schema specialization and search for potential test site
Entities: Matthew Taylor, John Smith, Dr. Adam Scott, Special Vehicle Factory, Missile Factory
Role and relation labels: Seller, Buyer, Recipient, Participant, Employee, Origin, Destination, Location
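The status and prediction examples can be read as a matching problem: line up the observed events against the ordered steps of a schema, then report the first step not yet observed. The sketch below is a deliberately simple illustration of that reading, not the KAIROS method. The function name `match_status` is hypothetical, and the dates approximate the slide's month-level labels by assuming the first day of each month.

```python
from datetime import date

# Ordered steps of the New Capability Development Schema from the slides.
NEW_CAPABILITY_SCHEMA = ["Acquisition", "Delivery", "Reuse", "Testing", "Deployment"]

# Observed events, deliberately out of order.
# ASSUMPTION: day-of-month values are placeholders; the slide gives months only.
OBSERVED = [
    ("Delivery", date(2012, 1, 1)),
    ("Acquisition", date(2011, 11, 1)),
    ("Reuse", date(2012, 3, 1)),
]


def match_status(schema, observed):
    """Return (completed steps, next expected step or None).

    Events are sorted by time, then matched against the schema prefix;
    matching stops at the first step that disagrees with the schema order.
    """
    ordered = [step for step, _ in sorted(observed, key=lambda event: event[1])]
    completed = []
    for expected, seen in zip(schema, ordered):
        if expected != seen:
            break
        completed.append(expected)
    next_step = schema[len(completed)] if len(completed) < len(schema) else None
    return completed, next_step
```

For the truck example, the three observed steps match the first three schema steps, so the next expected step is Testing, consistent with the "Launcher Testing (?)" prediction on the slide.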

Page 12

New Methods for Complex Event Analysis

[Figure: processing diagram with a learning-time path and a run-time path]

Learning Time (Archived Data): Bottom-up Schema Learning Using 1st and 2nd Wave Methods

Run Time (New Data): Top-Down Contextual Inference & Temporal Reasoning Using New Methods

Components shown: Instance Extraction, Generalization, Composition, Specialization, Schema Library, Schema Application, Knowledge Base

Page 13

KAIROS Architecture

[Figure: architecture diagram]

Input: Multi-media Multilingual Information in Big Data (Text, Speech, Images, Video)

Learning Process (TA1): Schema Generalization, One-Time Triage of Generalized Schemas, Schema Composition, Domain-Specific Schema Curation & Specialization, Curated Schema Library

Run-time System Flow (TA2, TA3): Input Analysis with Temporal Annotation, Schema Matching & Temporal Reasoning, Temporal Knowledge Base, Predictive Analysis, User Interaction

Page 14

TA1 Generation of Schemas for Events

Focus
• Analyzing complex events in terms of subsidiary elements, arguments, and temporal and sequential information to compose and generalize event schemas

Inputs and Outputs
• Input: multi-media, multilingual batch data produced by TA4
• Output: Schemas representing the structure of events and their subsidiary elements, how the events evolve, and what the typical durations and orderings of subsidiary elements are

Additional Requirements
• Proposers to TA1 should have available previously-developed entity, relationship, and event extraction (detection, classification, and representation) technology
• TA1 algorithms must be able to communicate their results to a user in a human-readable form and accept changes from users
• TA1 performers will be responsible for a user interface to the TA1 platform and user staffing
• TA1 and TA2 performers are expected to provide their KAIROS software to TA3 in a Docker container or similar form

[Figure: Learning Process (TA1): Multi-media Multilingual Information in Big Data, Schema Generalization, One-Time Triage of Generalized Schemas, Schema Composition, Domain-Specific Schema Curation & Specialization, Curated Schema Library]

Page 15

TA2 Representation and Use of Temporal Knowledge and Schemas

Focus
• Design and implementation of a temporal knowledge base for use during run-time using the schemas developed in TA1

Inputs and Outputs
• Input: multi-media, multilingual streaming or batch data produced by TA4
• Output: Knowledge base containing information about instantiated schemas from the run-time data, all the events and participants that relate to the schemas, and the temporal relations of the event elements and participants; predictions of possible subsequent events on the basis of the schemas

Additional Requirements
• Proposers to TA2 should have available previously-developed entity, relationship, and event extraction (detection, classification, and representation) technology
• TA2 performers are expected to provide their KAIROS software to TA3 in a Docker container or similar form

[Figure: Run-time System Flow (TA2, TA3): Text, Speech, Images, Video; Input Analysis with Temporal Annotation; Schema Matching & Temporal Reasoning; Temporal Knowledge Base; Curated Schema Library; Predictive Analysis; User Interaction]

Page 16

TA3 System Integration and User Interface

Focus
• Design a platform that enables multimedia input in streaming mode or from a corpus and allows all system components to communicate with an interface that enables users to query the system and control a visualization module

Inputs and Outputs
• Input: Algorithms from TA1 and TA2 in Docker container or similar form; data in streaming mode or from a corpus
• Output: Platform able to pass data to TA1 and TA2, enable the schema library built by TA1 to be accessible to TA2, and allow interface between the users and the TA2 algorithms, enabling the users to read, edit, and/or visualize the TA2 instantiated schemas

Additional Requirements
• The TA3 interface must also present relative or absolute temporal information
• Proposers to TA3 must have the capability to handle classified data
• Developing APIs (TA1&TA2), schema format (TA1), & knowledge base format (TA2)

[Figure: Run-time System Flow (TA2, TA3): Text, Speech, Images, Video; Input Analysis with Temporal Annotation; Schema Matching & Temporal Reasoning; Temporal Knowledge Base; Curated Schema Library; Predictive Analysis; User Interaction]

Page 17

TA4 Data Creation for Training and Evaluation

Focus
• Development of novel techniques for effective creation, collection, and annotation of the data necessary for KAIROS research, development, and evaluation.

Inputs and Outputs
• Input: Direction from DARPA on appropriate data selection
• Output: two types of data - one for the learning of schemas and the other for the run-time development and evaluation

Additional Requirements
• Learning Data: large volumes of open source material, including news and public posts, videos, etc.
  • English and one additional language for which NLP training data sets are available
  • Roughly 1,000,000 documents containing on the order of 100 different types of events complex enough to give rise to multi-level schemas
  • At least five different instances of each type of event, with multiple sources for each event instance when possible
  • Proposers to TA4 should propose annotation schemes
• Run-Time Data: resources selected to contain events and schema to support training and evaluation
  • English for every evaluation and one other language per evaluation
  • Five different scenarios: one scenario for development and four for evaluation
  • The annotation for the evaluation scenarios should consist of the labeling of: complex events and schemas of interest, the participants in each event, temporal information, and all properties essential for comprehension of each event and its relation to the schema(s)

Page 18

Evaluation and Metrics

• NIST will conduct evaluations of TA1 and TA2
• Evaluation Schedule
  • Pilot – 9 months after kickoff
  • Baseline – end of phase 1
  • Full evaluation – mid phase 2, end of phase 2, and end of phase 3
• The TA1 schema-generation module will be evaluated for accuracy, consistency, and completeness
• The TA2 representation and use of temporal knowledge and schemas module will be evaluated to determine which schema in the schema library is related to the event being analyzed and the accuracy of all sub-events and their actors
  • During Phase 1, TA2 will use manually-constructed schemas
  • Starting in Phase 2, TA2 will use automated schemas, with reduced improvement expected
• Accuracy targets for the different evaluations of KAIROS relative to the baselines

                            Phase 1   Phase 2 Mid   Phase 2 Final   Phase 3 Final
TA1 – Schema Generation     7%        23%           46%             73%
TA2 – Predictive Analysis   12%       33%           54%             79%

• Percentages represent F-value, the harmonic mean of precision and recall
• Transition partner will conduct evaluation using their own data and performance metrics
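For reference, the F-value named above can be computed as follows. This is a minimal sketch of the standard harmonic-mean formula; the function name `f_value` and the example inputs are illustrative, and nothing here reflects how NIST would actually score KAIROS systems.

```python
def f_value(precision, recall):
    """Harmonic mean of precision and recall (both in the range 0..1)."""
    if precision + recall == 0:
        return 0.0  # convention: avoid dividing by zero when both are zero
    return 2 * precision * recall / (precision + recall)


def precision_recall(true_positives, false_positives, false_negatives):
    """Derive precision and recall from raw counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall
```

Because the harmonic mean is dominated by the smaller of its two inputs, a system cannot reach a high F-value by maximizing precision alone or recall alone.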

Page 19

Automatic Determination of Validity Periods

• Rate of validity decay is learned from data like Wikipedia, news, birth announcements, etc.
• Examples include:

[Figure: likelihood of current validity plotted over time for four facts]
Date of Birth (time axis: Birth to Death)
Married to First Spouse (time axis: Marriage to Death)
Employed as President (time axis: Term 1 to Term 3)
On Business Travel (time axis: Mon to Sun)
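To make the idea of a validity-decay rate concrete, the sketch below models the likelihood that a fact is still true as exponential decay with a per-fact half-life. This functional form is purely an assumption chosen for illustration; the slide says only that the rate is learned from data. The function name and half-life values are hypothetical.

```python
def likelihood_still_valid(elapsed_days, half_life_days):
    """Likelihood that a fact observed `elapsed_days` ago still holds.

    ASSUMPTION: exponential decay with a fixed half-life. A half-life of
    None marks a fact that never expires (for example, a date of birth).
    """
    if half_life_days is None:
        return 1.0
    return 0.5 ** (elapsed_days / half_life_days)


# Hypothetical half-lives in days; a real system would learn these from data.
HALF_LIFE_DAYS = {
    "date_of_birth": None,
    "on_business_travel": 3,
    "employed_as_president": 4 * 365,
}
```

Under these made-up values, an "on business travel" observation from three days ago is rated 0.5, while a date of birth stays at 1.0 however old the observation is.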

Page 20

Validity Periods and Pattern of Life Analysis

[Figure: profile attributes with likelihood of current validity plotted from the point of observation along a timeline marked 2005, 2011, Now, Future]

Name: Adam; Dr. Scott
DOB: 1 Jan. 1942
Education: Nuclear Physics
Job: Chief Engineer & Nuclear Scientist; Workers' Party Leader
Research: Manufacturing; Nuclear Physics
Location: Missile Test Site; Hometown; Research Center; ?

Page 21

Schedule

[Figure: program schedule chart]

Month axis: 0, 6, 12, 18, 24, 30, 36, 42, 48, 54
Phases: Phase 1, Phase 2, Phase 3
TA1: Schema Generation
TA2: Temporal KB for Predictive Analysis
TA3: System Integration & User Interface
TA4: Data Creation for Training & Evaluation
Evaluation: Pilot, Baseline, Eval 1, Eval 2, Eval 3

Page 22

Deadlines

Abstracts Due (optional but helpful): January 23, 2019, 12:00 noon (ET)

Proposal Deadline: February 27, 2019, 12:00 noon (ET)

Note: the primary filter to be applied in the abstract review process is for relevance and scope. A rough assessment of technical plausibility may be additionally applied.

Page 23

Miscellaneous Proposal Information

Government-Furnished Information
• For every non-English language chosen for any scenario, the Government will provide linguistic resources and tools of a quality and composition to be determined, but consisting at least of the type and size found in a LORELEI Related Language Pack

Intellectual Property
• The program will emphasize creating and leveraging open source technology and architecture. Intellectual property rights asserted by proposers are strongly encouraged to be aligned with open source regimes.

Security Clearance Requirements
• At the time of proposal submission, all proposers to TA3 must have personnel with Top Secret clearances who are eligible for SCI, access to facilities to store and process SCI material and hold SCI discussions, and the ability to conduct experiments on classified data in government facilities

Teaming
• Proposers are welcome to team up, and teaming agreements should be specified in the proposals

Non-US Entities
• Non-U.S. organizations and/or individuals may participate as a prime or a sub-contractor to the extent that such participants comply with any necessary nondisclosure agreements, security regulations, export control laws, and other governing statutes applicable under the circumstances

Page 24

Miscellaneous Proposal Information (cont.)

Travel
• All proposers should expect to send appropriately-sized teams to PI meetings throughout the continental U.S. for the kickoff and then every six months
• Proposers to TA3 must be prepared to travel to both CONUS and OCONUS transition partner sites

Number of Awards
• DARPA anticipates multiple awards for Technical Areas 1 and 2 and single awards for Technical Areas 3 and 4. No awards are anticipated for evaluation

Awards for Multiple TAs
• Proposals for TA1 and TA2 may be combined into a single proposal, and proposals for TA3 and TA4 may be combined into a single proposal, but no other combinations are allowed. The decision as to which proposal to consider for award is at the discretion of the Government
• While a proposer may submit proposals for all four technical areas, a particular proposer (as identified by CAGE Code), if selected for TA3 or TA4, will be dispreferred for selection for any portion of TA1 and/or TA2. This preference is intended to avoid OCI situations between the research TAs and the integration and evaluation activities, to ensure objective evaluation

Page 25

How to Ask Questions

Questions Today
• Questions can be submitted until 12:00 to [email protected] or on 3x5 cards
• Questions will be answered during Q&A session in the afternoon
• Answers will be posted on the KAIROS Proposers Day website

Questions in the Future
• Email questions to [email protected]
• The Frequently Asked Questions section on the KAIROS Proposers Day website will be updated as new questions come in

Page 26

www.darpa.mil

Page 27

Approaches to Schema Creation

[Table: approaches (columns) versus attributes (rows), with X marking each attribute an approach has; cell positions are not recoverable here]

Approaches: No Temporal Information or Schemas; Manually-Created Schemas; Supervised Machine Learning of Schemas; KAIROS Multi-Layer Learning of Schemas

Attributes: Extracts Basic Event Elements; Creates Schemas without SME Bias; Discovers Specific Schemas from Data; Deals with Uncommon Events; Adapts Dynamically

Page 28

Generalization, Composition, & Specialization

• Generalization. This will entail identifying, from a large corpus of openly-available news and other public data, schemas describing primitive and complex events. For example, we observe multiple instances of people buying sandwiches. From those instances, we generalize to produce a schema for buying a sandwich. Similarly, we could build schemas for buying books, tires, lumber, etc. for other event instances observed in the data set. From these schemas, we can generalize to produce a generic purchase transaction schema (which has a buyer, a seller, and an exchange of payment for goods or services). This purchase transaction schema will then be available for further hierarchical generalization, for example, to create a general financial transaction schema. It is expected that generalization will be primarily automated, possibly with some manual curation in which a user can triage putative generalized schemas.

• Composition. The purchase transaction schema will also be available for composition to form complex event schemas like flip house, which would be composed of a sequence of multiple schemas, including a purchase transaction schema with a house as the goods purchased, multiple purchase transactions of building materials by the buyer of the house, and a purchase transaction in which the buyer of the house in the first purchase is the seller. Composition therefore requires the use of role constraints (e.g., the buyer of the house is likely to be identical to the buyer of the building materials and the seller of the house) and temporal constraints (the buying of the house will likely take place before the buying of the building materials, which in turn must precede the selling of the house).

• Specialization. The purchase transaction schema can be specialized to include useful domain-specific knowledge about domains not observed in the original training data set. For example, if we have never observed real estate transactions in the training data set, a user can add information about the domain-specific ways in which a real estate transaction differs from other types of purchase transactions, such as adding the role of an agent, title insurance, or escrow. It is expected that specialization will be primarily accomplished through user curation, possibly with some assistance from automated suggestions for added constraints.
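The flip house composition can be sketched as a check over purchase transaction instances that enforces the role and temporal constraints described in the Composition bullet. This is a simplified illustration: it treats constraints the text calls "likely" as hard requirements, and the `Purchase` class and `matches_flip_house` function are hypothetical names, not a KAIROS schema format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Purchase:
    """One instance of the generic purchase transaction schema."""
    buyer: str
    seller: str
    goods: str
    when: date


def matches_flip_house(house_buy, material_buys, house_sale):
    """Check the composed schema's role and temporal constraints.

    ASSUMPTION: all constraints are treated as strict, although the source
    describes some of them only as likely.
    """
    if not material_buys:
        return False
    flipper = house_buy.buyer
    # Role constraints: the same person buys the house, buys the materials,
    # and later sells the same house.
    roles_ok = (
        house_sale.seller == flipper
        and house_sale.goods == house_buy.goods
        and all(m.buyer == flipper for m in material_buys)
    )
    # Temporal constraints: house purchase < material purchases < house sale.
    times_ok = all(house_buy.when < m.when < house_sale.when for m in material_buys)
    return roles_ok and times_ok
```

A fuller treatment would presumably score soft constraints rather than reject on the first violation, since the text allows that the buyer of the materials is only likely to be the buyer of the house.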