data tactics unified dataspace architecture and description


Upload: datatactics

Post on 26-Jan-2015


Page 1: Data Tactics Unified Dataspace Architecture and Description

WWW.DATA–TACTICS.COM ARCHITECT – ENGINEER – INTEGRATE – SOLUTIONS © 2012 Data Tactics

Data Tactics

Unified DataSpace

Page 2: Data Tactics Unified Dataspace Architecture and Description


Cloud

Page 3: Data Tactics Unified Dataspace Architecture and Description


SYSTEMS ENGINEERING

• Data Ingestion Frameworks (structured, unstructured, semi-structured)

• Semantic DataSpace Enrichment

• Cloud Management Systems (CMS)

• Cloudbase/Accumulo

– Pig (Big Data) Plug-in

• Dissemination and Reporting Tools

• Data Mining, Exploitation, and Correlation Tools

Systems Engineering & Integration

SYSTEM INTEGRATION

• Ingestion

– Generalized Ingest / NiagaraFiles

• Geospatial Capabilities

• Biometric Capabilities

Page 4: Data Tactics Unified Dataspace Architecture and Description


Cloud Experience

17 Enclaves at SECRET//NOFORN

• 3 in Tyson’s

• 1 at GISA, Ft. Bragg

• 2 in Hawaii

• 2 in Germany

• 7 at Aberdeen

• 2 in Afghanistan

6 Enclaves at TS//SCI

• AF TENCAP

• NRL

• DARPA

• INSCOM

• DCGS-A

• DHS OI&A

4 Enclaves for NATO ISAF

• 2 in Afghanistan

• 1 at GISA, Fort Bragg

• 1 in Germany

US BICES Cloud in Germany

Over a dozen at UNCLASS//FOUO

• Supporting real-world missions on contract

• At various levels of complexity

Cloud domains are where we live

Data is the hard problem

Page 5: Data Tactics Unified Dataspace Architecture and Description


Data – The Hard Part

Page 6: Data Tactics Unified Dataspace Architecture and Description


Data Tactics has delivered solutions that manage PETABYTES of data and provide mission relevant analytics, metrics and user interfaces

• DESIGN, DEVELOPMENT AND INTEGRATION OF REFERENCE ARCHITECTURES

– Ghost Machine

– Stratus

• SECURE DATABASE ARCHITECTURES

– Secure Entity Database (SED)

– Defense Cross-Domain Analytic Capability (DCAC)

• DATA MIGRATION, EXTRACTION, TRANSFORM AND PARSING

• FEDERATED DATA MANAGEMENT

– Federated Search, Multi-Source / Multi-Vendor Integration

– Storage Cluster Management

• DATA MINING AND FORENSIC ANALYSIS

• SPATIAL, MULTI-DOMAIN, AND CLOUD DATA SERVICES

BigData Architecture

Page 7: Data Tactics Unified Dataspace Architecture and Description


Data Models

Unified DataSpace

The Wild

• Data sources with rich data & semantic context locked in domain silos

• Data tightly coupled to data-models

• Data-models tightly coupled to storage models

Silos isolated by

• Implementation technology

• Storage structure

• Data representation

• Data modality

[Figure labels: Unstructured Data; Structured Data; Rich data context; Rich semantic context; Segment 1 - Artifact Description; Segment 2 - Data Description; Segment 3 - Model Description]

Integration, Enrichment, Exploitation, Exploration across all sources

Page 8: Data Tactics Unified Dataspace Architecture and Description


Unified DataSpace

High-Level Conceptual Model of the DataSpace and Ingest/Extraction Flows

[Figure: conceptual model diagram. Labels: Segment 0 - Artifacts; Segment 1 - Artifact Semantics (ARTIFACT); Segment 2 - Data Semantics (TERM, STATEMENT); Segment 3 - Model Semantics (CONCEPT, PREDICATE); ARTIFACT_ASSOCIATION; CONCEPT_ASSOCIATION; PREDICATE_ASSOCIATION; SOURCE; Metadata; Data + Metadata; Semantics + Metadata; Uses; Ingest; Extraction]

Page 9: Data Tactics Unified Dataspace Architecture and Description


Unified DataSpace

High-Level Conceptual Model of the DataSpace and Ingest/Extraction Flows


• Segment 0 is an artifact store (i.e., the binary representation of artifacts).

• Segment 1 represents artifact semantics and includes artifact metadata and associations between the artifacts. Indexing of Segment 1 supports search on text content, geospatial data, and artifact metadata.

• Segment 2 represents the data and semantics of structured data elements extracted from artifacts. Indexing of Segment 2 supports search for entities (e.g., Person, Location) based on their properties and relationships.

• Segment 3 represents data-models extracted from artifacts, together with the models used for aligning, disambiguating, and enriching the elements of Segments 1 and 2.
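
The division of labor between the four segments can be illustrated with a small in-memory sketch. Everything in it is an assumption made for illustration: the `DataSpace` class, its field and method names, and the use of plain dictionaries and lists are hypothetical and are not taken from the Unified DataSpace implementation, which the slides do not show.

```python
from dataclasses import dataclass, field


@dataclass
class DataSpace:
    """Toy stand-in for the four segments (hypothetical layout, not the real schema)."""
    segment0: dict = field(default_factory=dict)  # artifact id -> raw bytes
    segment1: dict = field(default_factory=dict)  # artifact id -> artifact metadata
    segment2: list = field(default_factory=list)  # (subject, predicate, object) statements
    segment3: dict = field(default_factory=dict)  # concept name -> set of predicate names

    def ingest(self, artifact_id, payload, metadata, statements, model):
        """Store one artifact plus everything extracted from it."""
        self.segment0[artifact_id] = payload
        self.segment1[artifact_id] = dict(metadata)
        self.segment2.extend(statements)
        for concept, predicates in model.items():
            self.segment3.setdefault(concept, set()).update(predicates)

    def find_artifacts(self, key, value):
        """Segment 1 style search: artifacts whose metadata matches."""
        return sorted(a for a, m in self.segment1.items() if m.get(key) == value)

    def find_subjects(self, predicate, obj):
        """Segment 2 style search: entities by property or relationship."""
        return sorted(s for s, p, o in self.segment2 if p == predicate and o == obj)


# Illustrative data only; the identifiers below are made up.
space = DataSpace()
space.ingest(
    artifact_id="doc-1",
    payload=b"raw artifact bytes",
    metadata={"format": "text"},
    statements=[("Washington/Person", "hasPhoto", "GeorgeWashingtonImage.jpg")],
    model={"Person": {"hasPhoto"}},
)
```

A single `ingest` call touches all four segments, which mirrors the ingest flow in the conceptual model; the two search helpers correspond loosely to the Segment 1 and Segment 2 indexes described above.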

Page 10: Data Tactics Unified Dataspace Architecture and Description


• DDF – looks at data in the following ways

– Mention: A chunk of data, either physically located within a tangible artifact, or contained within an analyst’s mind

• E.g., “Washington” at offset x in file Y

– Sign: A representation of all disambiguated mentions that are identical except for their indexicality

• E.g., “Washington”

– Concept: An abstract idea, defined explicitly or implicitly by a source data-model

• E.g., City, Person, Name, Address, Photo

– Predicate: An abstract idea used to express a relationship between “things”

• E.g., isCity, isPerson, hasName, hasAddress, hasPhoto

– Term: A disambiguated sign abstracted from the source artifact or asserting analyst

• E.g., Washington Person; Washington Location

– Statement: Encodes a binary relationship between a subject (term) and an object mediated by a predicate

• E.g., [Washington, Person] hasPhoto [GeorgeWashingtonImage.jpg]

Data Description Framework
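
The six DDF notions can be written down as plain data types to make their relationships concrete. This is a minimal sketch under stated assumptions: the class and field names are hypothetical, and modeling a Term as a Sign paired with the Concept that disambiguates it is one reading of the “Washington Person; Washington Location” example, not a structure confirmed by the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    text: str      # the chunk of data as it appears
    artifact: str  # where it was found (a file name here)
    offset: int    # position inside the artifact


@dataclass(frozen=True)
class Sign:
    text: str      # a mention with its location (indexicality) stripped


@dataclass(frozen=True)
class Concept:
    name: str      # e.g. City, Person


@dataclass(frozen=True)
class Predicate:
    name: str      # e.g. hasPhoto


@dataclass(frozen=True)
class Term:
    sign: Sign
    concept: Concept  # assumption: the concept is what disambiguates the sign


@dataclass(frozen=True)
class Statement:
    subject: Term
    predicate: Predicate
    obj: object    # another Term, or a literal such as a file name


def sign_of(mention: Mention) -> Sign:
    """Drop artifact and offset so identical mentions share one sign."""
    return Sign(mention.text)


# Two mentions in different (made-up) files collapse to the same sign.
m1 = Mention("Washington", "fileY.txt", 10)
m2 = Mention("Washington", "fileZ.txt", 99)
sign = sign_of(m1)

# The same sign yields two distinct terms once disambiguated.
person = Term(sign, Concept("Person"))
location = Term(sign, Concept("Location"))

stmt = Statement(person, Predicate("hasPhoto"), "GeorgeWashingtonImage.jpg")
```

Frozen dataclasses are used so that equality is by value: two mentions of “Washington” map to equal signs, while the Person and Location terms stay distinct.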

Page 11: Data Tactics Unified Dataspace Architecture and Description


Unified DataSpace

Page 12: Data Tactics Unified Dataspace Architecture and Description


Data Model Example

Page 13: Data Tactics Unified Dataspace Architecture and Description


DataSpace Workbench

Page 14: Data Tactics Unified Dataspace Architecture and Description


Elastic Data Ingest

• Components: QueueLoader, Artifact Processor, Persistence Manager, Index Manager, Error Manager

• Java Message Service (JMS) queues: Artifact Processor Queue, Persistence Manager Queue, Index Manager Queue, Error Manager Queue

• Modules: Artifact Processor Modules, Persistence Manager Modules

• Storage and indexing: File System, Hadoop DFS, BigTable, Lucene

• Legend: UDS Components, Custom Components
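
The component and queue names suggest a staged, queue-driven pipeline. The sketch below shows one way such a flow could work; it is an assumption, not the documented design. The stage order (QueueLoader, then Artifact Processor, Persistence Manager, and Index Manager, with failures routed to the Error Manager) is inferred from the labels alone, in-memory deques stand in for JMS queues, and dictionaries stand in for Hadoop DFS, BigTable, and Lucene.

```python
from collections import deque

# In-memory deques standing in for the JMS queues named on the slide.
queues = {
    "artifact_processor": deque(),
    "persistence_manager": deque(),
    "index_manager": deque(),
    "error_manager": deque(),
}

store = {}   # stand-in for Hadoop DFS / BigTable: artifact name -> tokens
index = {}   # stand-in for Lucene: token -> set of artifact names
errors = []  # what the error manager has recorded


def queue_loader(files):
    """Put every (name, content) pair from the 'file system' on the first queue."""
    for name, content in files.items():
        queues["artifact_processor"].append((name, content))


def artifact_processor(msg):
    name, content = msg
    if not isinstance(content, str):
        raise ValueError(f"cannot parse {name}")
    queues["persistence_manager"].append((name, content.lower().split()))


def persistence_manager(msg):
    name, tokens = msg
    store[name] = tokens
    queues["index_manager"].append(msg)


def index_manager(msg):
    name, tokens = msg
    for token in tokens:
        index.setdefault(token, set()).add(name)


def error_manager(msg):
    errors.append(msg)


HANDLERS = {
    "artifact_processor": artifact_processor,
    "persistence_manager": persistence_manager,
    "index_manager": index_manager,
    "error_manager": error_manager,
}


def run():
    """Drain every queue; a failing stage sends its message to the error queue."""
    while any(queues.values()):
        for stage, handler in HANDLERS.items():
            while queues[stage]:
                msg = queues[stage].popleft()
                try:
                    handler(msg)
                except Exception as exc:
                    queues["error_manager"].append((stage, msg, str(exc)))


# Made-up input: one parseable artifact and one that fails processing.
queue_loader({"a.txt": "Data Tactics", "b.bin": None})
run()
```

Because each stage only reads from its own queue and writes to the next, stages could be scaled out independently, which is presumably what “elastic” refers to; the single-threaded `run` loop here exists only to keep the example deterministic.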