Anzo Smart Data Integration, February 2015
TRANSCRIPT
©2015 Cambridge Semantics Inc. All rights reserved. Company Confidential.
Anzo Smart Data Integrator
Cambridge Semantics Contact:
Marty Loughlin
Vice President, Financial Services
Cambridge Semantics
141 Tremont St., 6th Floor, Boston, MA
www.cambridgesemantics.com
(o) 617.855.9565
Introduction to Cambridge Semantics (CSI)
The Anzo Smart Data Platform is used to create data analytics and management solutions with diverse data from varied sources
Company:
Founded in 2007 by senior team from IBM’s Advanced Internet Technology Group
Privately funded
Select customers:
Software:
Market-leading Anzo software suite is built on open Semantic Web standards
Currently 3rd generation of the product in production use
Business Intelligence / Analytics Solutions
2013 (Winner), 2014 (Finalist), 2014 Innovation Showcase
Data Integration Scenarios
Customer Data On-boarding
Data aggregation for regulatory reporting
Data Lake
Cloud Migration
Mergers and Acquisitions
Anzo Smart Data Integrator Overview
Anzo Smart Data Integration uses common, conceptual models with existing ETL tools to increase the speed and decrease the cost of completing high-quality, governed data integration projects by 10 times or more.
The Data Integration Challenge
Source Systems
• Lead (SFA system)
• Quote (Quote system)
• Order (OMS system)
• Contract (CMS system)
Target Systems
• Customer 360
• Enterprise Warehouse
• Business Data Marts
• Compliance and Regulatory Reporting
S x T ETL Jobs
Each Job
• Define Mapping Requirements: Business Analyst
• Code ETL Job: Developers
• Test & Deploy: QA & Ops
Each ETL Project:
• Manually coded
• Requires source & target SMEs
• Many hand-offs
The Common Model is “Data Glue”
Source Systems
• Lead (SFA system)
• Quote (Quote system)
• Order (OMS system)
• Contract (CMS system)
Common Model (“Data Glue”)
• Data is “glued” together by its underlying business meaning
• Potential to use industry standard models, e.g., FIBO, CDISC, HL7
Business Analysts and IT can use conceptual models to:
• Create data services
• Understand the data landscape
• Track data lineage
• Conduct downstream analytics
Anzo Smart Data Integration
Source Systems
• Lead (SFA system)
• Quote (Quote system)
• Order (OMS system)
• Contract (CMS system)
Common Model (“Data Glue”)
Target Systems
• Customer 360
• Enterprise Warehouse
• Business Data Marts
• Compliance and Regulatory Reporting
S + T Maps
Each Job
• Map Source to Conceptual Model: Business Analyst
• Automatically Generate ETL: Anzo SDI
Each ETL Job:
• Generated from map
• Only source SME required
• Hours, not months
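The difference between the S x T jobs of the point-to-point approach and the S + T maps of the common-model approach can be checked with simple arithmetic. The sketch below uses the four source and four target systems named on the slides; the function names are illustrative only and are not part of Anzo.

```python
# Source and target systems as named on the slides.
sources = ["Lead", "Quote", "Order", "Contract"]
targets = [
    "Customer 360",
    "Enterprise Warehouse",
    "Business Data Marts",
    "Compliance and Regulatory Reporting",
]


def point_to_point_jobs(num_sources: int, num_targets: int) -> int:
    """One hand-coded ETL job per (source, target) pair: S x T."""
    return num_sources * num_targets


def common_model_maps(num_sources: int, num_targets: int) -> int:
    """One map per system, to or from the common model: S + T."""
    return num_sources + num_targets


s, t = len(sources), len(targets)
print(point_to_point_jobs(s, t))  # 16
print(common_model_maps(s, t))    # 8
```

With four systems on each side the gap is 16 versus 8. Adding a fifth source adds four new jobs in the point-to-point case but only one new map in the common-model case.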
Anzo Smart Data Integration Capabilities
• Map to/from conceptual models
• Combine maps & automatically generate ETL jobs
• Create data marts and extracts on-demand
• Explore data provenance
Anzo Smart Data Integration Components
Model Manager
Data Connections
Schema Manager
Project Manager
Mapping Manager
Provenance Explorer
Anzo Smart Data Integration Demo
Scenario
• Map Holdings database to Conceptual Asset Model
• Map Conceptual Asset Model to MySQL target database
• Publish and run Pentaho job
• Demonstrate lineage
Source System: Holdings Table
Conceptual Asset Model
Target System: MySQL
ASDI Lead Customer Value
Existing Process
1 Business Analyst for 6 Months
4 Developers for 6 Months
3 Testers for 6 Months
Total Cost: ~$1.5M
• Projected value from a lead customer for a complex data integration project:
– Estimated development cost reduced from $1.5M to ~$100K
– Estimated development time reduced from over 6 months to ~1 month
New Anzo SDI Process
1 Business Analyst for 1 Month
1 Developer for 1 Month
1 Tester for 1 Month
Total Cost: ~$100K
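The staffing and cost figures above can be compared directly. The sketch below only restates the slide's own approximate numbers (headcount, months, and total cost); the variable and function names are illustrative.

```python
# Staffing figures from the slide: role -> (headcount, months).
existing = {"Business Analyst": (1, 6), "Developers": (4, 6), "Testers": (3, 6)}
new = {"Business Analyst": (1, 1), "Developer": (1, 1), "Tester": (1, 1)}

# Approximate total costs quoted on the slide, in US dollars.
existing_cost = 1_500_000
new_cost = 100_000


def person_months(staffing):
    """Sum headcount x duration over all roles."""
    return sum(people * months for people, months in staffing.values())


existing_pm = person_months(existing)
new_pm = person_months(new)

print(existing_pm)                   # 48
print(new_pm)                        # 3
print(existing_pm / new_pm)          # 16.0
print(existing_cost / new_cost)      # 15.0
```

Both ratios, roughly 16 times less effort and 15 times less cost, are in line with the "10 times or more" figure stated in the overview.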
Appendix
Anzo Architecture & Capabilities
Layers: Applications, Middleware, Enterprise Data Fabric, External Sources
Anzo Server
Reasoning & Rules
Workflow
Semantic Services
Anzo Connect
Enterprise Directory Connect
Anzo Unstructured
Data Marts & Warehouses
Enterprise Applications
Directory (LDAP, AD)
………
3rd Party Databases & Applications
• User self-serve
• Interactive
• Conceptual model
• Search, filter, BI analytics, forms, alerts,…
• Cache or virtualize data based on W3C semantic standards
• Based on real-time event based architecture
• Embedded graph database
• Two-way integration to existing systems
• Anzo Unstructured pipeline allows easy plug-in of 3rd Party NLP and crawlers
Anzo Smart Data Integration Architecture
Smart Data Integrator (web application)
…includes:
• Project manager
• Schema manager
• Model manager
• Data feed manager
• Governance dashboards
Data Mapper (Excel-based)
Conceptual Model Editor
Anzo Smart EDM Server
ETL Compiler
Mapping Registry
Conceptual Model Registry
Schema & Sample Data Registry
Data Source Registry
Data Feeds Catalog
Services:
• Sample data service
• Data feed persistence service
• Revision & audit service
• Access control service
ETL Engine
• SQL
• CSV/TSV
• XML
• Proprietary
ASDI Use Case: Creating a Data Lake
Sources
Anzo Smart Data Integration
Data Lake
Common Conceptual Model
Self-service Analytics
Self-service Data Extracts & Marts
• ASDI streamlines and automates data ingestion
• Anzo provides a platform-wide common conceptual model
• Anzo enables end-user self-service using models
ASDI User Roles
Full User
• Defines projects and mappings
• Configures data sources & schemas
• Publishes projects to ETL tools
• Populates Data Catalog with Data Feeds
Data Consumer
• Searches and browses Data Catalog
• Creates on-demand data marts and extracts from Data Catalog
Governance User
• Manages models
• Browses and searches projects
• Browses and searches data lineage
Administrator
• Configures users and roles
• Configures dashboards and templates
ASDI: Map to & from Conceptual Models
• ASDI Data Mapper
– Built for Business Analysts
– Uses an Excel-based interface to map systems to/from conceptual models
• Supports mappings:
– Physical-to-Conceptual
– Conceptual-to-Conceptual
– Physical-to-Physical
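One way to picture how a physical-to-conceptual map and a conceptual-to-physical map combine is as two lookup tables chained through the shared conceptual properties. This is a minimal sketch only: the column names, property names, and the `compose` helper are hypothetical and do not reflect the ASDI Data Mapper's actual format.

```python
# Hypothetical physical-to-conceptual map: source column -> conceptual property.
source_to_concept = {
    "holdings.sec_id": "Asset.identifier",
    "holdings.qty": "Asset.quantity",
    "holdings.mkt_val": "Asset.marketValue",
}

# Hypothetical conceptual-to-physical map: conceptual property -> target column.
concept_to_target = {
    "Asset.identifier": "asset_dim.asset_id",
    "Asset.quantity": "asset_dim.units",
}


def compose(src_map, tgt_map):
    """Chain two maps through the conceptual properties they share.

    Source columns whose conceptual property the target does not use
    are simply left out of the result.
    """
    return {
        src: tgt_map[concept]
        for src, concept in src_map.items()
        if concept in tgt_map
    }


combined = compose(source_to_concept, concept_to_target)
for src, tgt in combined.items():
    print(f"{src} -> {tgt}")
```

Because each side is mapped only to the conceptual model, the same source map can be reused against any other target map without being rewritten.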
ASDI: Automatically Generate ETL Jobs
• Mappings are combined and reused to define projects
• Projects are automatically compiled into ETL jobs to run on Pentaho or Informatica
– Support for SSIS, DataStage, Ab Initio, Talend, and others to follow
• Generated jobs include automated quality checks and error handling
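To illustrate what "quality checks and error handling" in a generated job might look like in spirit, the sketch below applies a column map to rows and sets aside rows that fail a required-field check instead of aborting the run. It is an assumption-laden illustration in plain Python, not the output of the ETL Compiler and not a Pentaho or Informatica job; `run_job` and all column names are hypothetical.

```python
def run_job(rows, column_map, required):
    """Apply a source-to-target column map to each row.

    Rows where a required source column is missing or None are rejected
    with a reason, so one bad row does not stop the whole load.
    """
    loaded, rejected = [], []
    for index, row in enumerate(rows):
        missing = [col for col in required if row.get(col) is None]
        if missing:
            rejected.append({"row": index, "reason": "missing " + ", ".join(missing)})
            continue
        loaded.append({target: row.get(source) for source, target in column_map.items()})
    return loaded, rejected


# Hypothetical column map and sample rows.
column_map = {"sec_id": "asset_id", "qty": "units"}
rows = [
    {"sec_id": "A1", "qty": 100},
    {"sec_id": None, "qty": 5},
]

loaded, rejected = run_job(rows, column_map, required=["sec_id"])
print(loaded)    # [{'asset_id': 'A1', 'units': 100}]
print(rejected)  # [{'row': 1, 'reason': 'missing sec_id'}]
```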
ASDI: Create Data Marts and Extracts On-demand
• Publish well-described data services into a shared, searchable data catalog
• Analysts use an “iTunes-like” interface to select data elements needed for analysis
• Populate existing or new data marts and extracts in minutes, as needed
ASDI: Support Data Lineage and Governance
• Lineage is automatically generated as a by-product of data integration projects
• Lineage can be searched and browsed through governance dashboards
• Track and audit revisions of models and mappings
• Detect changes in the schemas of source databases
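Detecting schema changes in a source database can be thought of as comparing two snapshots of a table's columns. The sketch below shows that idea under simple assumptions: each snapshot is a plain column-to-type dictionary, and `diff_schema` and the sample columns are hypothetical, not an ASDI interface.

```python
def diff_schema(old, new):
    """Compare two {column: type} snapshots of the same table."""
    old_cols, new_cols = set(old), set(new)
    return {
        "added": sorted(new_cols - old_cols),
        "removed": sorted(old_cols - new_cols),
        "retyped": sorted(c for c in old_cols & new_cols if old[c] != new[c]),
    }


# Hypothetical snapshots of a holdings table taken at two points in time.
previous = {"sec_id": "VARCHAR(12)", "qty": "INT", "mkt_val": "DECIMAL(18,2)"}
current = {"sec_id": "VARCHAR(12)", "qty": "BIGINT", "currency": "CHAR(3)"}

changes = diff_schema(previous, current)
print(changes)
# {'added': ['currency'], 'removed': ['mkt_val'], 'retyped': ['qty']}
```

A non-empty result would flag the mappings that reference the affected columns for review.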