Page 1: Semi-automated mapping of industry standards · the first AI PoC. Follow us, share with us! Intelligent Data Management Camelot Approach Proven track record of over 15 years in data

Semi-automated mapping of industry standards

A modern approach

Köln, 18.09.2019, eCl@ss Congress

Slide 2 | © CAMELOT 2019 | Semi-automated mapping of industry standards

Value chain management at the leading edge – from strategy to solutions

Intelligent Data Management Camelot Approach

CAMELOT Management Consultants: Thought Leader in Value Chain Management for more than 20 years

Global industry specialist in Life Sciences, Chemicals, Consumer Packaged Goods and Industrial Manufacturing

Specialized expertise for superior project quality and results

Camelot Innovative Technologies Lab is our incubator for Digital Innovations and business applications

Co-Innovation Partnerships with the leading software providers such as SAP SE & IBM

Data-centric and digital business solutions

Sourcing & Network Collaboration

Supply Chain Operations & Manufacturing

Distribution & Logistics

Sales & Customer Centricity

Strategy & Business Model Innovation

Organization & Business Transformation

Finance & Performance Management

Data & Analytics

Slide 3 | © CAMELOT 2019 | Semi-automated mapping of industry standards

AI@CAMELOT

Our partnerships bridge science and business:

DFKI

Student networks at the University of Mannheim

Start-up companies

Regular presentations at renowned conferences

HOLISTIC APPROACH

We combine data, people, technology and processes to deliver superior quality solutions that build competitive advantage at any stage of the value chain.

TEAM OF EXPERTS

Our team of highly qualified data scientists and business experts develops solutions for your individual needs and helps to generate value starting with the first AI PoC.

EXPERIENCE

Proven track record of over 15 years in data and information management

10 creative workshops conducted for AI MDM community members

8 successfully delivered PoCs (SAP MDG Assistant chatbot, rule mining, web crawling, data plausibility check, and others)

COLLABORATION PLATFORM FOR AI FORERUNNERS

Quarterly newsletter and AI MDM webinar series for already 100+ members

ai-mdm.com

Follow us, share with us!

Slide 4 | © CAMELOT 2019 | Semi-automated mapping of industry standards

eCl@ss & Camelot for automated mapping

March 2018: eCl@ss Workshop I

April 2018: eCl@ss Workshop II

January 2019: eCl@ss Cross Company Call

July 2019: AI MDM Community Workshop

Slide 5 | © CAMELOT 2019 | Semi-automated mapping of industry standards

Business Problem

Mapping of common standards and individual mapping to ERP data create a high entry barrier for companies.

Labor-intensive and time-consuming

ERP classifications are unharmonized and have grown historically; several standards exist

Product/engineering know-how and standards knowledge required

We envision a generic tool to ease the mapping and encourage companies to join eCl@ss.

Slide 6 | © CAMELOT 2019 | Semi-automated mapping of industry standards

Business Value

HIGHER eCl@ss ACCEPTANCE: Companies will have an additional incentive to join eCl@ss.

ENABLER FOR INDUSTRY 4.0 IoT: A greater number of members and an easier technical compliance mechanism increase the potential of eCl@ss becoming the Industry 4.0 standard.

BETTER AND FASTER: Less time required and higher mapping quality through elimination of the human factor.

Reduced entry barriers for new companies

Scalability of mapping automation

Minimal human interaction required

Automated interfaces between manufacturer and seller

[Figure: options for required mappings between manufacturer and seller; each side has its own classes, characteristics and values.]

Slide 7 | © CAMELOT 2019 | Semi-automated mapping of industry standards

Business case: mapping between standards

Calculation based on the mapping between eCl@ss and the APPLiA – Pi Standard

Scope: 9 classes, 700 features, 2000 values

Result: 36 man-days

Assumptions*:

Efforts:

Team: 5 persons

Meetings: 3 meetings of 2 days each

Preparation: 10%

Consolidation: 10%

➔ 5 persons x 3 meetings x 2 days = 30 man-days; preparation and consolidation add 2 x 10% of 30 = 2 x 3 = 6 man-days; total 30 + 6 = 36 man-days

*based on the real mapping run

Efficiency gain through automated mapping of 4 man-days per class
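The effort calculation above can be reproduced in a few lines. This is a sketch that uses only the figures stated on the slide; the variable names are illustrative. Dividing the total by the 9 classes gives 4 man-days per class, which matches the quoted figure, although reading that figure as the manual effort per class is an interpretation.

```python
# Figures taken from the slide's assumptions; variable names are illustrative.
team_size = 5             # persons
meetings = 3              # number of meetings
days_per_meeting = 2      # days per meeting
overhead_percent = {"preparation": 10, "consolidation": 10}
classes_mapped = 9

# Effort spent in meetings: 5 x 3 x 2 = 30 man-days
meeting_effort = team_size * meetings * days_per_meeting

# Preparation and consolidation are each 10% of the meeting effort: 3 + 3 = 6
overhead = sum(meeting_effort * p / 100 for p in overhead_percent.values())

total_effort = meeting_effort + overhead
effort_per_class = total_effort / classes_mapped

print(f"meeting effort:   {meeting_effort} man-days")
print(f"overhead:         {overhead} man-days")
print(f"total effort:     {total_effort} man-days")
print(f"effort per class: {effort_per_class} man-days")
```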

Slide 8 | © CAMELOT 2019 | Semi-automated mapping of industry standards

ML/DS task for business problem

1. Mapping of common standards based on:

Synonyms (A), open source

Characteristic definition, continuous text (B)

Characteristic/value metadata (C)

Patterns of characteristic assignments to classes (D)

2. Mapping of company-specific standards to common standards based on:

Similarity of the characteristic/value name and historical mappings done by other companies (A)

Characteristic/value metadata (B)

Patterns of characteristic assignments to classes (C)
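As an illustration of signal 2 (A), the sketch below ranks characteristics of a common standard for one company-specific name by string similarity and boosts a target that an earlier mapping already used. The function names, the bonus value and the sample characteristics are assumptions made for this example; the slides do not specify how name similarity and historical mappings are combined.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_candidates(company_name, standard_names, history=None, history_bonus=0.2):
    """Rank standard characteristics for one company-specific characteristic.

    history maps lower-cased company names to the standard name chosen in an
    earlier mapping (hypothetical data); such targets get a fixed bonus.
    """
    history = history or {}
    scored = []
    for std in standard_names:
        score = name_similarity(company_name, std)
        if history.get(company_name.lower()) == std:
            score = min(1.0, score + history_bonus)
        scored.append((std, round(score, 3)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical characteristics of a common standard and one historical mapping.
standard = ["Rated voltage", "Net weight", "Housing material"]
history = {"weight (net)": "Net weight"}

ranking = rank_candidates("Weight (net)", standard, history)
for name, score in ranking:
    print(f"{name}: {score}")
```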

Slide 9 | © CAMELOT 2019 | Semi-automated mapping of industry standards

High-level solution envisioning: the problem can be solved by transferring well-established algorithms from the area of data integration

Mapping of common standards

ALGORITHM: SIMILARITY FLOODING

Developed by database researchers from Stanford and Leipzig University

Well established in the data science community

Widely used for data integration topics, e.g. mergers & acquisitions

Similarity between individual attributes is computed from the semantic similarity of name, description, etc., using vectorization through thought vectors.

Local similarities are distributed along the structure of the standard. Two attributes are similar if their semantics and context are similar.

Each pair of attributes receives a similarity score. The best matches are taken as mapping candidates.

Standard mapped

Similarity based on semantics

Context-sensitive semantic similarity
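The last step, taking the best matches as mapping candidates, could look like the sketch below. The threshold, the function name and the sample scores are assumptions for illustration; the slides do not state how candidates are filtered.

```python
def best_matches(scores, threshold=0.5):
    """For each source attribute, keep the highest-scoring target
    whose score is at or above the threshold."""
    candidates = {}
    for (source, target), score in scores.items():
        if score < threshold:
            continue
        if source not in candidates or score > candidates[source][1]:
            candidates[source] = (target, score)
    return candidates


# Hypothetical similarity scores for attribute pairs of two standards.
scores = {
    ("Rated voltage", "Nominal voltage"): 0.91,
    ("Rated voltage", "Net weight"): 0.12,
    ("Colour", "Color"): 0.88,
    ("Colour", "Nominal voltage"): 0.05,
    ("Depth", "Net weight"): 0.30,
}

candidates = best_matches(scores)
for source, (target, score) in candidates.items():
    print(f"{source} -> {target} ({score})")
```

Attributes without any score above the threshold, such as "Depth" here, stay unmapped and would be left to the user.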

Slide 10 | © CAMELOT 2019 | Semi-automated mapping of industry standards

The algorithm starts with a pairwise comparison of the entities in the ontologies

Mapping on Schema Level

Initial similarity comparison via:

Property names

Property values

Data Types
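A minimal sketch of such an initial comparison, assuming each property is described by a name, a set of allowed values and a data type. The weights, the dictionary layout and the sample properties are hypothetical; the slides only name the three signals.

```python
from difflib import SequenceMatcher


def initial_similarity(p, q, weights=(0.5, 0.3, 0.2)):
    """Weighted mix of name, value and data-type similarity, in [0, 1]."""
    w_name, w_values, w_type = weights

    # Property names: case-insensitive string similarity.
    name_sim = SequenceMatcher(None, p["name"].lower(), q["name"].lower()).ratio()

    # Property values: Jaccard overlap of the allowed value sets.
    values_p, values_q = set(p["values"]), set(q["values"])
    union = values_p | values_q
    value_sim = len(values_p & values_q) / len(union) if union else 0.0

    # Data types: exact match or nothing.
    type_sim = 1.0 if p["dtype"] == q["dtype"] else 0.0

    return w_name * name_sim + w_values * value_sim + w_type * type_sim


# Hypothetical properties from two classifications.
colour_a = {"name": "Colour", "values": {"red", "blue", "green"}, "dtype": "string"}
colour_b = {"name": "Color", "values": {"red", "blue", "black"}, "dtype": "string"}
weight_b = {"name": "Net weight", "values": set(), "dtype": "float"}

print(initial_similarity(colour_a, colour_b))
print(initial_similarity(colour_a, weight_b))
```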


Slide 11 | © CAMELOT 2019 | Semi-automated mapping of industry standards

Local similarities are distributed along the hierarchy

Mapping on Schema Level

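[Figure: two hierarchies are compared, Computer (children: Laptop, Desktop PC) vs. Electronics (children: Notebook, TV). The pair Computer vs. Electronics has a current similarity of 0.5; the child pairs Laptop vs. Notebook, Desktop PC vs. Notebook, Laptop vs. TV and Desktop PC vs. TV are shown with the scores 0.3, 0.2, 0.7 and 0.9.]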

Compute a new similarity score for each pair:

Consider the similarities of neighboring nodes

Combine them into a new score via predefined rules

Iterate until the system converges

Saves up to 50% of the manual effort
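The iteration can be sketched on the Computer vs. Electronics example. This is a simplified variant of similarity flooding written for illustration, not the implementation behind the slides: each pair keeps its initial score, adds its current score and receives an equal share of the score of every neighboring pair, after which all scores are divided by the maximum. The initial child-pair scores below are assumed values; the assignment of the slide's scores to pairs is not recoverable from the transcript.

```python
from itertools import product


def similarity_flooding(edges_a, edges_b, initial, max_iter=100, eps=1e-6):
    """Propagate pair similarities along two hierarchies until they stabilize."""
    # Pairs (a, b) and (a2, b2) are neighbors if a -> a2 is an edge in the
    # first hierarchy and b -> b2 is an edge in the second one.
    neighbors = {pair: set() for pair in initial}
    for (pa, ca), (pb, cb) in product(edges_a, edges_b):
        parent_pair, child_pair = (pa, pb), (ca, cb)
        if parent_pair in neighbors and child_pair in neighbors:
            neighbors[parent_pair].add(child_pair)
            neighbors[child_pair].add(parent_pair)

    sigma = dict(initial)
    for _ in range(max_iter):
        updated = {}
        for pair in sigma:
            # Every neighbor splits its current score equally among its neighbors.
            incoming = sum(sigma[n] / len(neighbors[n]) for n in neighbors[pair])
            updated[pair] = initial[pair] + sigma[pair] + incoming
        top = max(updated.values())
        updated = {pair: value / top for pair, value in updated.items()}
        delta = max(abs(updated[pair] - sigma[pair]) for pair in sigma)
        sigma = updated
        if delta < eps:
            break
    return sigma


edges_a = [("Computer", "Laptop"), ("Computer", "Desktop PC")]
edges_b = [("Electronics", "Notebook"), ("Electronics", "TV")]

# 0.5 for the parent pair is from the slide; the child-pair values are assumed.
initial = {
    ("Computer", "Electronics"): 0.5,
    ("Laptop", "Notebook"): 0.9,
    ("Desktop PC", "Notebook"): 0.7,
    ("Laptop", "TV"): 0.2,
    ("Desktop PC", "TV"): 0.3,
}

result = similarity_flooding(edges_a, edges_b, initial)
for pair, score in sorted(result.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

Because the scores are normalized by the maximum in every round, they are only meaningful relative to each other, which is sufficient for ranking mapping candidates.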

Slide 12 | © CAMELOT 2019 | Semi-automated mapping of industry standards

We want to leverage machine learning technologies to minimize the manual effort.

Mapping on Schema Level

Learn to compare

Use of current NLP Technology to improve initial matches

Natural language understanding for descriptions

Unsupervised learning of semantic descriptions

Topic modelling via Latent Dirichlet Allocation

Encoding of semantics via Word Vectors
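To illustrate the word-vector idea, the sketch below encodes a description as the average of its word vectors and compares descriptions by cosine similarity. The three-dimensional vectors are hand-made toy values used only for this example; a real system would load pretrained embeddings, and averaging is just one simple way to encode a description.

```python
import math

# Hypothetical toy word vectors; real embeddings have hundreds of dimensions.
TOY_VECTORS = {
    "laptop": (0.90, 0.10, 0.00),
    "notebook": (0.85, 0.15, 0.00),
    "portable": (0.70, 0.20, 0.10),
    "computer": (0.80, 0.10, 0.10),
    "tv": (0.10, 0.90, 0.00),
    "television": (0.10, 0.90, 0.05),
    "screen": (0.30, 0.70, 0.00),
}


def embed(text):
    """Average the vectors of all known words; None if no word is known."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dim))


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


query = embed("portable computer")
print(cosine(query, embed("notebook")))
print(cosine(query, embed("television screen")))
```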

Learn to combine

The message passing in the similarity flooding algorithm is structurally similar to the information flow in a recurrent neural network

Given the right training data, we can refine the update rules to achieve higher accuracy

Slide 13 | © CAMELOT 2019 | Semi-automated mapping of industry standards

Even when corresponding classes and properties have been identified, transformations still have to be applied to complete the mapping on instance level.

Mapping on Instance Level

[Figure: unlabeled instances 𝑋1, 𝑋2, 𝑋3 become labeled training pairs (𝑋1, 𝑌1), (𝑋2, 𝑌2), (𝑋3, 𝑌3).]

Active Learning of instance mappings

The machine decides which instances to translate for training.

Uses the most informative examples

Requires only very few examples
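The slide does not name the selection strategy. One common choice is uncertainty sampling, assumed here: the system asks the user to translate those instances for which the current model is least sure. The function name and the sample probabilities are hypothetical.

```python
def most_informative(probabilities, k=1):
    """Return the k instance ids whose predicted probability is closest to 0.5,
    i.e. where the current model is least certain."""
    return sorted(probabilities, key=lambda i: abs(probabilities[i] - 0.5))[:k]


# Hypothetical model confidence that the current transformation is correct.
pool = {"X1": 0.95, "X2": 0.52, "X3": 0.10, "X4": 0.40}

to_label = most_informative(pool, k=2)
print(to_label)
```

The selected instances are translated by the user, added to the training set as (X, Y) pairs, and the model is retrained before the next selection round.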

Slide 14 | © CAMELOT 2019 | Semi-automated mapping of industry standards

Every AI application stands or falls with its user interface.

User Interface

The system needs to be designed to make the task of schema mapping as efficient and as comfortable as possible. Key ingredients are:

A comprehensive visualization of the mapping

Intuitive correction functionalities

Simple integration into enterprise processes

Additionally, user feedback can be continuously fed back into the system to improve the machine learning components:

Improve the matching quality in general

Refine your own (e.g. company-specific) matcher

Slide 15 | © CAMELOT 2019 | Semi-automated mapping of industry standards

System Architecture

Architecture Overview

Classification A Classification B

Matching Algorithm

Probabilistic Mapping

User Interface

Matching algorithm

Input: classifications 𝐴, 𝐵

Output: probabilistic mapping 𝜌: 𝐴 → 𝐵

Algorithm capable of mapping arbitrary industry standards against each other with minimal need for customization such as additional training (if any)

User interface

Finalization of the mapping by the user

Graphical presentation for quick and convenient validation, e.g. visualization of the hierarchy

Model refinement through user feedback

Active Learning

ML Model

AI Core system

User Interaction
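One way to read the probabilistic mapping 𝜌: 𝐴 → 𝐵 is as a probability distribution over the classes of 𝐵 for every class of 𝐴. The sketch below assumes that reading: it row-normalizes similarity scores and lets a user confirmation override the model output. Function names and sample scores are hypothetical, and the slides do not specify how 𝜌 is computed.

```python
def probabilistic_mapping(similarity):
    """Turn {a: {b: score}} into {a: {b: probability}} by row normalization."""
    mapping = {}
    for a, row in similarity.items():
        total = sum(row.values())
        mapping[a] = {b: (s / total if total else 0.0) for b, s in row.items()}
    return mapping


def apply_feedback(mapping, a, confirmed_b):
    """Pin a user-confirmed match to probability 1 and all others to 0."""
    mapping[a] = {b: (1.0 if b == confirmed_b else 0.0) for b in mapping[a]}
    return mapping


# Hypothetical similarity scores between classes of classifications A and B.
similarity = {
    "Laptop": {"Notebook": 0.9, "TV": 0.1},
    "Desktop PC": {"Notebook": 0.6, "TV": 0.4},
}

rho = probabilistic_mapping(similarity)
print(rho["Desktop PC"])

rho = apply_feedback(rho, "Desktop PC", "Notebook")
print(rho["Desktop PC"])
```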

Slide 16 | © CAMELOT 2019 | Semi-automated mapping of industry standards

Envisioning workshop leading to the creation of a company-specific AI innovation roadmap

REASONING

Your specific needs for the application of AI in information management

Kickstart for newcomers in the AI MDM Community

Enrichment of a team event, innovation day or a training

2-day workshop moderated by Camelot

Extensive data intelligence experience and data science basics training

AI innovation roadmap

Management presentation

DELIVERABLES CONTENT

Design thinking methodology

Rapid prototyping

Insights from Camelot data science and domain experts

Lessons learned from other customers and the AI MDM Community

INVESTMENT

Regular price: € 12,000

For AI MDM Community members: € 8,000 at their location, € 6,000 in one of Camelot's offices

Possible cost sharing with other members of the AI MDM Community

Slide 17 | © CAMELOT 2019 | Semi-automated mapping of industry standards

Entering, changing, approving data records in a system

Analyzing and processing of data, leading to identification of patterns, records or classifications

Identifying relevant information in heterogeneous sources including unstructured documents and web

Extraction of classification attributes from data sheets

Gathered expertise and validated data science models enable accelerated delivery of intelligent data management projects

Data understanding Data maintenance Data extraction

Optical Character Recognition

Table Extraction

Web crawling

Rule mining

Natural language understanding

Data classification

Fuzzy matching

Vectorization

Highlighting of found attributes on the data sheet

Editing of extracted values

Robotic process automation

Automated validation in the vendor data management in a shared service center

Extraction of relevant information and consistency check

Identification of supply chain scenarios and scenario based rules

[Table row labels: Area, Approaches, Realized projects]

Contact

Aleksandra Baumann, Senior Consultant – AI for IM

Camelot MC AG, Theodor-Heuss-Anlage 12, 68165 Mannheim, Germany

Tel: +49 621 86298 154, Mob: +49 1732338419

[email protected]

Thomas Geyer, Principal – EDM

Camelot MC AG, Theodor-Heuss-Anlage 12, 68165 Mannheim, Germany

Tel: +49 621 86298 372, Mob: +49 172 7412248

[email protected]

Dr. Faried Abu Zaid, Senior Consultant – AI for IM

Camelot MC AG, Radlkoferstr. 2, 81373 Munich, Germany

Tel: +49 621 86298 431, Mob: +49 1724966404

[email protected]