anzo smart data integration dgiq 2014
TRANSCRIPT
©2014 Cambridge Semantics Inc. All rights reserved. Company Confidential.
Smart Enterprise Data Management
DGIQ 2014
Cambridge Semantics Contact:
Marty Loughlin
Vice President, Financial Services
Cambridge Semantics
141 Tremont St., 6th Floor, Boston, [email protected]
(o) 617.855.9565
Introduction to Cambridge Semantics
Anzo, a software suite driven by Semantic Web standards to execute Smart Data Solutions for diverse data from varied sources
Company:
Founded by senior members of IBM’s Advanced Internet Technology Group
Software:
Our Anzo software suite is built on W3C Semantic Web open data standards
Currently 3rd generation of the product in production use
Smart Enterprise Data Management
Smart Enterprise Data Management: a new, sensible paradigm for managing enterprise data at the conceptual level.
1. Common conceptual models
2. Models “glue” related data
3. Automates data integration and data services
4. Operationalizes Data Governance
5. Accessible to domain experts
Smart Data Integration Overview
Smart Data Integration uses common conceptual models with existing ETL tools to increase the speed and reduce the cost of high-quality, governed data integration projects by a factor of 10 or more.
The Data Integration Challenge
Source Systems
• Lead (SFA system)
• Quote (Quote system)
• Order (OMS system)
• Contract (CMS system)
Target Systems
• Customer 360
• Enterprise Warehouse
• Business Data Marts
• Compliance and Regulatory Reporting
S x T ETL Jobs
Each Job
• Define Mapping Requirements (Business Analyst)
• Code ETL Job (Developers)
• Test & Deploy (QA & Ops)
Each ETL Project:
• Manually coded
• Requires source & target SMEs
• Many hand-offs
The Common Model is “Data Glue”
Source Systems
• Lead (SFA system)
• Quote (Quote system)
• Order (OMS system)
• Contract (CMS system)
Common Model (“Data Glue”)
• Many common concepts across disparate systems
• Semantic data science connects these common concepts
• Data is “glued” together by its underlying business meaning
• Potential to use industry standard models, e.g., FIBO
Business Analysts and IT can use conceptual models to:
• Create data services
• Understand the data landscape
• Track data lineage
• Conduct downstream analytics
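As a rough illustration of the “data glue” idea, the sketch below renames fields from two source systems to shared concepts so that records from both can be read through one vocabulary. The field names (`cust_nm`, `buyer_name`, and so on) and the concept names are hypothetical and are not taken from Anzo or FIBO.

```python
# Hypothetical field-to-concept maps; every name here is an illustrative assumption.
SOURCE_TO_CONCEPT = {
    "Lead": {"cust_nm": "Customer.name", "est_val": "Deal.amount"},
    "Order": {"buyer_name": "Customer.name", "order_total": "Deal.amount"},
}


def to_common_model(system: str, record: dict) -> dict:
    """Rename a source record's fields to common-model concepts."""
    mapping = SOURCE_TO_CONCEPT[system]
    return {mapping[field]: value for field, value in record.items() if field in mapping}


lead = to_common_model("Lead", {"cust_nm": "Acme", "est_val": 5000})
order = to_common_model("Order", {"buyer_name": "Acme", "order_total": 4800})

# Both records now share the same keys, so they can be matched on meaning.
same_customer = lead["Customer.name"] == order["Customer.name"]
print(lead, order, same_customer)
```

Once both systems are expressed in the same concepts, downstream consumers no longer need to know each system's physical column names.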
Anzo Smart Data Integration
Source Systems
• Lead (SFA system)
• Quote (Quote system)
• Order (OMS system)
• Contract (CMS system)
Common Model (“Data Glue”)
Target Systems
• Customer 360
• Enterprise Warehouse
• Business Data Marts
• Compliance and Regulatory Reporting
S + T Maps
Each Job
• Map Source to Conceptual Model (Business Analyst)
• Automatically Generate ETL (Anzo SDI)
Each ETL Job:
• Generated from map
• Only source SME required
• Hours, not months
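The difference between S x T point-to-point ETL jobs and S + T maps through a common model can be checked with a little arithmetic. The sketch below uses the four source and four target systems named on the slides, and assumes the worst case in which every source feeds every target; the function names are illustrative only.

```python
def point_to_point_jobs(sources: int, targets: int) -> int:
    """One hand-coded ETL job per source/target pair (S x T)."""
    return sources * targets


def common_model_maps(sources: int, targets: int) -> int:
    """One map per system to or from the common model (S + T)."""
    return sources + targets


sources, targets = 4, 4  # Lead, Quote, Order, Contract; four target systems
jobs = point_to_point_jobs(sources, targets)
maps = common_model_maps(sources, targets)
print(jobs, maps)  # 16 8

# Adding a fifth target adds S new jobs point-to-point, but only one new map.
extra_jobs = point_to_point_jobs(sources, targets + 1) - jobs
extra_maps = common_model_maps(sources, targets + 1) - maps
print(extra_jobs, extra_maps)  # 4 1
```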
ASDI Use Case: Creating a Data Lake
Sources
Anzo Smart Data Integration
Data Lake
Common Conceptual Model
Self-service Analytics
Self-service Data Extracts & Marts
• ASDI streamlines and automates data ingestion
• Anzo provides a platform-wide common conceptual model
• Anzo enables end-user self-service using models
Anzo Smart Data Integration Capabilities
• Map to/from conceptual models
• Automatically generate ETL jobs
• Create Data Marts and Extracts On-demand
• Support data lineage and governance
ASDI: Map to & from Conceptual Models
• ASDI Data Mapper
– Built for Business Analysts
– Uses an Excel-based interface to map systems to/from conceptual models
• Supports mappings:
– Physical-to-Conceptual
– Conceptual-to-Conceptual
– Physical-to-Physical
ASDI: Automatically Generate ETL Jobs
• Mappings are combined and reused to define projects
• Projects are automatically compiled into ETL jobs to run on Pentaho or Informatica
– Support for SSIS, DataStage, Ab Initio, Talend, and others to follow
• Generated jobs include automated quality checks and error handling
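To make the idea of combining mappings concrete, the sketch below chains a physical-to-conceptual map with a conceptual-to-physical map and emits a plain SQL statement. This is not how Anzo compiles jobs (the slides say it targets Pentaho or Informatica); the table names, column names, concept names, and both functions are hypothetical.

```python
def compose(source_map: dict, target_map: dict) -> dict:
    """Chain a physical-to-conceptual map with a conceptual-to-physical map.

    Returns {target_column: source_column} for concepts present in both maps.
    """
    concept_to_source = {concept: column for column, concept in source_map.items()}
    return {
        target_column: concept_to_source[concept]
        for concept, target_column in target_map.items()
        if concept in concept_to_source
    }


def generate_sql(source_table: str, target_table: str, column_map: dict) -> str:
    """Render the composed map as a single INSERT ... SELECT statement."""
    target_columns = ", ".join(column_map)
    source_columns = ", ".join(column_map.values())
    return (
        f"INSERT INTO {target_table} ({target_columns}) "
        f"SELECT {source_columns} FROM {source_table}"
    )


# Hypothetical maps: source columns -> concepts, and concepts -> target columns.
holdings_to_model = {"sec_id": "Asset.identifier", "qty": "Asset.quantity"}
model_to_target = {"Asset.identifier": "asset_id", "Asset.quantity": "units"}

column_map = compose(holdings_to_model, model_to_target)
sql = generate_sql("holdings", "asset_positions", column_map)
print(sql)
```

Because each map refers only to the conceptual model, the same source map can be reused with a different target map to generate a second job without touching the source side.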
ASDI: Create Data Marts and Extracts On-demand
• Publish well-described data services into a shared, searchable data catalog
• Analysts use an “iTunes-like” interface to select data elements needed for analysis
• Populate existing or new data marts and extracts in minutes, as needed
ASDI: Support Data Lineage and Governance
• Lineage is generated automatically as a by-product of data integration projects
• Lineage can be searched and browsed through governance dashboards
• Track and audit revisions of models and mappings
• Detect changes in the schemas of source databases
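Detecting schema changes can be pictured as a diff between two snapshots of a source table's columns. The sketch below is one simple way to do that under assumed inputs; the snapshot format (column name to type string) and the column names are hypothetical and say nothing about how ASDI stores or compares schemas.

```python
def diff_schema(old: dict, new: dict) -> dict:
    """Compare two {column: type} snapshots and report what changed."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    retyped = sorted(col for col in set(old) & set(new) if old[col] != new[col])
    return {"added": added, "removed": removed, "retyped": retyped}


# Hypothetical snapshots of one source table taken at two points in time.
old_snapshot = {"sec_id": "VARCHAR(12)", "qty": "INTEGER", "px": "DECIMAL(10,2)"}
new_snapshot = {"sec_id": "VARCHAR(12)", "qty": "BIGINT", "currency": "CHAR(3)"}

changes = diff_schema(old_snapshot, new_snapshot)
print(changes)
```

A non-empty result would flag the mappings that reference the affected columns for review.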
Anzo Smart Data Integration Architecture
Smart Data Integrator (web application) includes:
• Project manager
• Schema manager
• Model manager
• Data feed manager
• Governance dashboards
Data Mapper (Excel-based)
Conceptual Model Editor
Anzo Smart EDM Server
• ETL Compiler
• Mapping Registry
• Conceptual Model Registry
• Schema & Sample Data Registry
• Data Source Registry
• Data Feeds Catalog
Services:
• Sample data service
• Data feed persistence service
• Revision & audit service
• Access control service
ETL Engine
Data formats:
• SQL
• CSV/TSV
• XML
• Proprietary
ASDI User Roles
Full User
• Defines projects and mappings
• Configures data sources & schemas
• Publishes projects to ETL tools
• Populates Data Catalog with Data Feeds
Data Consumer
• Searches and browses Data Catalog
• Creates on-demand data marts and extracts from Data Catalog
Governance User
• Manages models
• Browses and searches projects
• Browses and searches data lineage
Administrator
• Configures users and roles
• Configures dashboards and templates
Example Smart Data Integration Use Cases
Customer Onboarding
A large financial institution offering data analytics as a service to its customers needed to streamline the multi-month process of combining data from internal systems with data from a customer’s own systems.
Regulatory Reporting
A large consumer lending organization faces rapidly evolving regulatory reporting requirements that place an unsustainable burden on its compliance teams. It needs a flexible, self-service Data Lake with interactive reporting capabilities.
Data Migration
A large consulting company helping its customers migrate from multiple legacy systems to a new platform looks to leverage its domain expertise to accomplish high-quality migrations an order of magnitude faster than competitors.
ASDI Lead Customer Value
• Projected value from a lead customer for a complex data integration project:
– Estimated development cost reduced from $1.5M to ~$100K
– Estimated development time reduced from over 6 months to ~1 month
Existing Process
• 1 Business Analyst for 6 Months
• 4 Developers for 6 Months
• 3 Testers for 6 Months
• Total Cost: ~$1.5M
New Anzo SDI Process
• 1 Business Analyst for 1 Month
• 1 Developer for 1 Month
• 1 Tester for 1 Month
• Total Cost: ~$100K
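The staffing figures above can be turned into person-months to see how the projected savings line up with the "10 times or more" claim. The sketch below only restates the slide's numbers; the dictionary layout and function name are illustrative.

```python
# Figures copied from the slide; costs are the approximate totals it quotes.
existing = {"people": 1 + 4 + 3, "months": 6, "cost_usd": 1_500_000}
anzo_sdi = {"people": 1 + 1 + 1, "months": 1, "cost_usd": 100_000}


def person_months(process: dict) -> int:
    """Total effort as head count multiplied by duration in months."""
    return process["people"] * process["months"]


effort_ratio = person_months(existing) / person_months(anzo_sdi)  # 48 / 3
cost_ratio = existing["cost_usd"] / anzo_sdi["cost_usd"]          # 1.5M / 100K
print(person_months(existing), person_months(anzo_sdi), effort_ratio, cost_ratio)
```

On these numbers the projected effort falls from 48 to 3 person-months (16x) and the projected cost falls by 15x.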
Anzo Smart Data Integration Demo
Source Systems
• Holdings
• Security Master
Conceptual Asset Model
Target Systems
• MySQL
• Oracle QA
Scenario 1
• Map Holdings database to Conceptual Asset Model
• Map Conceptual Asset Model to MySQL target database
• Publish and run Pentaho job
Scenario 2/3
• Add Oracle QA database as an additional target
• Add Securities Master (SMF) as an additional source
• Publish and run Pentaho jobs