knowledge graphs webinar- 11/7/2017

Knowledge Graphs: #1 Database for Connected Data. Jeff Morris, Head of Product Marketing, 11/7/17

Post on 23-Jan-2018

TRANSCRIPT

Page 1: Knowledge Graphs Webinar- 11/7/2017

Knowledge Graphs

#1 Database for Connected Data

Jeff Morris

Head of Product Marketing

11/7/17

Page 2: Knowledge Graphs Webinar- 11/7/2017

Agenda

• Introduction to Neo4j

• Neo4j Definition of Knowledge Graph

• Examples

Page 3: Knowledge Graphs Webinar- 11/7/2017

Who We Are: The Graph Platform for Connected Data

Neo4j is an enterprise-grade native graph platform that enables you to:

• Store, reveal and query data relationships

• Traverse and analyze data at any level of depth in real time

• Add context and connect new data on the fly

Designed, built and tested natively for graphs from the start for:

• Performance

• ACID Transactions

• Agility

• Graph Algorithms

• Developer Productivity

• Hardware Efficiency

• Global Scale

• Graph Adoption

Page 4: Knowledge Graphs Webinar- 11/7/2017

Knowledge Graphs in The Age of Connections

CONSUMER DATA, PRODUCT DATA, PAYMENT DATA, SOCIAL DATA, SUPPLIER DATA

The next wave of competitive advantage will be all about using connections to identify and build knowledge

Page 5: Knowledge Graphs Webinar- 11/7/2017

Perspective: Discrete Data Problems | Connected Data Problems

Query Language: SQL, e.g. SELECT foo FROM emp | Cypher, e.g. (Ann)-[:LOVES]->(Dan)

DBMS Architecture: RDBMS | GRAPH DB
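
To make the connected-data side of this slide concrete, here is a minimal Python sketch in which the pattern (Ann)-[:LOVES]->(Dan) is stored as a first-class relationship and matched directly. The `Relationship` tuple, the `match` helper and the Dan/Eve edge are hypothetical, invented for illustration; they are not Neo4j APIs or data from the webinar.

```python
from typing import NamedTuple, Optional


class Relationship(NamedTuple):
    """One directed, typed edge, e.g. (Ann)-[:LOVES]->(Dan)."""
    start: str
    rel_type: str
    end: str


# Toy data: only the Ann/Dan pair comes from the slide, the rest is assumed.
GRAPH = [
    Relationship("Ann", "LOVES", "Dan"),
    Relationship("Dan", "KNOWS", "Eve"),
]


def match(graph, start: Optional[str] = None, rel_type: Optional[str] = None,
          end: Optional[str] = None):
    """Return the edges that fit the pattern; None acts as a wildcard."""
    return [
        r for r in graph
        if (start is None or r.start == start)
        and (rel_type is None or r.rel_type == rel_type)
        and (end is None or r.end == end)
    ]


# "Whom does Ann love?" expressed as a pattern over relationships.
loved_by_ann = [r.end for r in match(GRAPH, start="Ann", rel_type="LOVES")]
```

The point of the sketch is only that the relationship itself is the unit being queried, rather than something reassembled from foreign keys at query time.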

Page 6: Knowledge Graphs Webinar- 11/7/2017

Neo4j’s Amazing Customers

NASA explores graph database for deep insights into space

International Consortium of Investigative Journalists Wins Pulitzer Prize

Page 7: Knowledge Graphs Webinar- 11/7/2017

Business Problem

• Find relationships between people, accounts, shell companies and offshore accounts

• Journalists are non-technical

• Biggest “Snowden-Style” document leak ever; 11.5 million documents, 2.6TB of data

Solution and Benefits

• Pulitzer Prize winning investigation resulted in robust coverage of fraud and corruption

• Prime Ministers of Iceland and Pakistan resigned; exposed Putin, prime ministers, gangsters and celebrities (Messi)

• Led to assassination of journalist in Malta

Background

• International Consortium of Investigative Journalists (ICIJ), small team of data journalists

• International investigative team specializing in cross-border crime, corruption and accountability of power

• Works regularly with leaks and large datasets

ICIJ Panama Papers INVESTIGATIVE JOURNALISM

Fraud Detection / Knowledge Graph

Page 8: Knowledge Graphs Webinar- 11/7/2017

Business Problem

• Find relationships between people, accounts, shell companies and offshore accounts

• Journalists are non-technical

• 2017 leak from Appleby, a tax-sheltering law firm; 13.4 million account records matched with public business registration data from across the Caribbean

Solution and Benefits

• Exposed tax sheltering practices of Apple, Nike

• Revealed hidden connections among politicians and nations, like Wilbur Ross & Putin’s son-in-law

• Triggered government tax evasion investigations in US, UK, Europe, India, Australia, Bermuda, Canada and Cayman Islands within 2 days.

Background

• International Consortium of Investigative Journalists (ICIJ), Pulitzer Prize winning journalists

• Fourth blockbuster investigation using Neo4j to reveal connections in text-based and account-based data leaked from offshore law firms and government records about the “1% Elite”

ICIJ Paradise Papers INVESTIGATIVE JOURNALISM

Fraud Detection / Knowledge Graph

Page 9: Knowledge Graphs Webinar- 11/7/2017

“Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.”

By the end of 2018, 70% of leading organizations will have one or more pilot or proof-of-concept efforts underway utilizing graph databases.

“Forrester estimates that over 25% of enterprises will be using graph databases by 2017”

IT Market Clock for Database Management Systems, 2014

https://www.gartner.com/doc/2852717/it-market-clock-database-management

TechRadar™: Enterprise DBMS, Q1 2014

http://www.forrester.com/TechRadar+Enterprise+DBMS+Q1+2014/fulltext/-/E-RES106801

Making Big Data Normal with Graph Analysis for the Masses, 2015

http://www.gartner.com/document/3100219

Analyst Expectations Three Years Ago

Page 10: Knowledge Graphs Webinar- 11/7/2017

The Largest Graph Innovation Network

10,000,000+ Neo4j downloads & Docker pulls

250+ customers, 500+ startups; 50% from Global 2000

100+ Technology and Services Partners

450+ annual events & 10k attendees: Graph and Neo4j awareness and training

1,000+ Neo4j GraphConnect NYC Attendees

100,000+ Online and Classroom Education Registrants & Meetup Members

Page 11: Knowledge Graphs Webinar- 11/7/2017

Neo4j Adoption

SOFTWARE, FINANCIAL SERVICES, RETAIL, MEDIA, SOCIAL NETWORKS, TELECOM, HEALTH

Page 12: Knowledge Graphs Webinar- 11/7/2017

Users Love Neo4j

Page 13: Knowledge Graphs Webinar- 11/7/2017

“Neo4j continues to dominate the graph database market.”

Noel Yuhanna, Forrester Market Overview: Graph Database Vendors, October 2017

Page 14: Knowledge Graphs Webinar- 11/7/2017

Why is Neo4j Succeeding?

Focus on Simplifying the Adoption, Awareness and Success of Graphs

Open Source business model

• Commitment to developers – DevRel, Training, Events, etc.

• Commitment to sharing Cypher, the SQL for graphs, on Apache

Native Graph Technology Leadership

• Commitment to data integrity, scale and performance

• Expanding User Communities to Data Scientists, IT, Analysts & Business Users

Highest Investment in Customer Success

• Applications offer real impact, and we spread these success stories

Page 15: Knowledge Graphs Webinar- 11/7/2017

Neo4j Graph Platform

Platform components: Development & Administration, Analytics Tooling, Graph Analytics, Graph Transactions, Data Integration, Discovery & Visualization, Drivers & APIs

Users and consumers: BUSINESS USERS, DEVELOPERS, ADMINS, DATA ANALYSTS, DATA SCIENTISTS, APPLICATIONS, AI, BIG DATA IT

Page 16: Knowledge Graphs Webinar- 11/7/2017

Grow Graphs by reaching deeper into the enterprise with support for more users, roles and use cases

Page 17: Knowledge Graphs Webinar- 11/7/2017

Connecting Roles in the Enterprise

Data Scientists

Real-time Graph traversal

Application

Data Lake & DWHS

Big Data IT & Architecture

Developers & Prod Mgrs

AI

Analysts and Business Users

Chief Officers of … Compliance, Data, Digital, Information, Innovation, Marketing, Operations, Risk & Security…

Knowledge Graphs

Digital Transformation Initiatives

Page 18: Knowledge Graphs Webinar- 11/7/2017

Relationship-Driven Applications

• Real-Time Recommendations

• Dynamic Pricing

• Artificial Intelligence & IoT-applications

• Fraud Detection

• Network Management

• Customer Engagement

• Supply Chain Efficiency

• Identity and Access Management

Page 19: Knowledge Graphs Webinar- 11/7/2017

Sample of Connected Graphs

Organization, Identity & Access, Network & IT Ops

Page 20: Knowledge Graphs Webinar- 11/7/2017

The Knowledge Graph Problem

Organizations have difficulty maintaining their corporate memory for a variety of reasons:

• Growth, which drives the need for new and continuous education

• Digitalization / Digital Transformation initiatives to identify new markets

• Turnover, where long-term knowledge is lost

• Aging infrastructures and siloed information

Page 21: Knowledge Graphs Webinar- 11/7/2017

Negative Consequences

• Lack of knowledge sharing slows project progress, and creates inconsistencies even among team members.

• Organizations don’t know what they don’t know, nor do they know what they know.

• Data Scientists, and therefore the organization, are slow to recognize or react to changing market conditions, so they miss opportunities to innovate

• Bad information is spread inadvertently, which erodes corporate trust

• Brand damage when this information is used in front of customers

Page 22: Knowledge Graphs Webinar- 11/7/2017

Data Lives Across the Enterprise

Data sources: Purchases (RELATIONAL DB), Views (WIDE COLUMN STORE), User Review (DOCUMENT STORE), In-Store Purchase (RELATIONAL DB), Shopping Cart (KEY VALUE STORE), Product Catalogue (DOCUMENT STORE)

Entities: Products, Customers / Users, Location

Attributes: Category, Price, Configurations, Location, Purchase, View, Review, Return, In-store Purchases, Inventory

Page 23: Knowledge Graphs Webinar- 11/7/2017

Data Lake

Data sources: Purchases (RELATIONAL DB), Product Catalogue (DOCUMENT STORE), Views (WIDE COLUMN STORE), User Review (DOCUMENT STORE), In-Store Purchase (RELATIONAL DB), Shopping Cart (KEY VALUE STORE)

• Good for Analytics, BI, Map Reduce

• Non-Operational, Slow Queries

• Recommendations require an operational workload: it’s in the moment, real-time!

Page 24: Knowledge Graphs Webinar- 11/7/2017

Data sources: Purchases (RELATIONAL DB), Product Catalogue (DOCUMENT STORE), Views (WIDE COLUMN STORE), User Review (DOCUMENT STORE), In-Store Purchase (RELATIONAL DB), Shopping Cart (KEY VALUE STORE)

Connector

Apps and Systems

Real-Time Queries

Page 25: Knowledge Graphs Webinar- 11/7/2017

Simple Enterprise Knowledge Graphs

Diagram: Customer Graph, Product Graph and Supply Graph, with nodes labeled Customer, Address, Email, Phone, Store, Product, CategoryX, CategoryY, Street and Region

Page 26: Knowledge Graphs Webinar- 11/7/2017

Simple Enterprise Knowledge Graph

Diagram: Customer Graph, Product Graph and Supply Graph, with nodes labeled Customer, Address, Email, Phone, Store, Product, CategoryX, CategoryY, Street and Region

Page 27: Knowledge Graphs Webinar- 11/7/2017

Unlock the Institutional Memory

Diagram: Customer Graph, Product Graph and Supply Graph, with nodes labeled Customer, Address, Email, Phone, Store, Product, CategoryX, CategoryY, Street and Region

• Real-time product recommendations

• Fraud Detection

• Real-time supply chain management

• Risk Management
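
As a rough illustration of how a combined customer and product graph can drive real-time product recommendations, the following Python sketch ranks products bought by customers who share at least one purchase with the target customer. The customers, products and the `recommend` function are hypothetical and invented for this example; the slides do not specify the actual recommendation logic.

```python
from collections import Counter

# Hypothetical purchase relationships (customer -> products bought).
BOUGHT = {
    "alice": {"tent", "stove"},
    "bob": {"tent", "lantern"},
    "carol": {"tent", "lantern", "stove"},
    "dave": {"novel"},
}


def recommend(customer, bought, limit=3):
    """Rank products bought by customers who share a purchase with `customer`."""
    mine = bought.get(customer, set())
    scores = Counter()
    for other, theirs in bought.items():
        # Skip the customer themselves and anyone with no purchase in common.
        if other == customer or not (mine & theirs):
            continue
        for product in theirs - mine:
            scores[product] += 1
    # Sort by score (descending), then by name, so results are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [product for product, _ in ranked[:limit]]


suggestions = recommend("alice", BOUGHT)
```

In a graph database the same idea is a two-hop traversal (customer to product to other customer to product), which is why the slides stress an operational, real-time workload for this use case.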

Page 28: Knowledge Graphs Webinar- 11/7/2017

How it should be

• Teams, especially in Analytics, Research departments and customer service, should have a searchable, consistent repository, or representation of a repository, in which to store and from which to draw institutional knowledge.

• Corporations that maintain a knowledge graph will develop higher degrees of consistency across all areas of business.

• Improving long term corporate memory should be a mandate from the C-suite

Page 29: Knowledge Graphs Webinar- 11/7/2017

What’s required to get there

• Institutional memory requires a solution that can integrate diverse data sets, often in text form because of the legacy nature of that information, and return “Context” as a result.

• Connections, relationships and cause-and-effect correlations need to be materialized and persisted permanently.

• All information must be indexed, searchable and shareable.

• The solution must be agile, easily expandable and adaptable to changing business conditions

• The solution needs to be a combination of text-based NLP, Elasticsearch and Graphs.

• Information must be easy to visualize and leverage in your processes and workflows

Page 30: Knowledge Graphs Webinar- 11/7/2017

Neo4j In Action

Transaction sources: Money Transferring, Purchases, Bank Services

SENSE: Transaction stream

RESPOND: Alerts & notification

Neo4j Cluster: Neo4j powers 360° view and update of information in real-time; SETS Context for Traversals

Connected systems: Relational database, Elasticsearch & Data Lake, Visualization UI

Data Science team: Develop Patterns, Fine Tune Patterns

Additional data: Merchant Data, Credit Score Data, Other 3rd Party Data

Data-set used to explore new insights and develop new algorithms as graph expands

Page 31: Knowledge Graphs Webinar- 11/7/2017

Graph Boosted Artificial Intelligence

• Knowledge Graphs: Provide Rich Context for AI

• AI Visibility: Human-Friendly Graph Visualization

• Graph Enhanced AI Models: Faster, More Accurate Development

• Graph Execution of AI: Operationalize Real-Time OLAP and Monitoring

• Graph Analytics: Enrich AI Inputs with Graph Algorithms

• Graph System of Record: Maintain a Source of Connected AI Truth

Page 32: Knowledge Graphs Webinar- 11/7/2017

Case Studies

Neo4j Case Studies

Page 33: Knowledge Graphs Webinar- 11/7/2017

Background

• Brazil's largest bank, #38 on Forbes G2000

• $61B annual sales, 95K employees

• Most valuable brand in Brazil

• 28.9M credit card & 25.6M debit card accounts

• High integrity, customer-centric values

Business Problem

• Data silos made assessing credit worthiness hard

• High sensitivity to fraud activity

• 73% of all transactions over internet and mobile

• Needed real-time detection for 2,000 analysts

• Scale to trillions of relationships

Solution and Benefits

• Credit monitoring and fraud detection application

• 4.2M nodes & 4B relationships for 100 analysts

• Grow to 93T relationships for 2000 analysts by 2021

• Real time visibility into money flow across multiple customers

Itau Unibanco FINANCIAL SERVICES

Fraud Detection / Credit Monitoring

CE Customer since 2016 Q1

EE Customer since Q2 2017

Page 34: Knowledge Graphs Webinar- 11/7/2017

Background

• Large global bank

• Deploying Reference Data to users and systems

• 12 data domains, 18 datasets, 400+ integrations

• Complex data management infrastructure

Business Problem

• Master data silos were inflexible and hard to consume

• Needed simplification to reduce redundancy

• Reduce risk when data is in consumers’ hands

• Dramatically improve efficiency

Solution and Benefits

• Data distribution flows improved dramatically

• Knowledge Base improves consumer access

• Ad-hoc analytics improved

• Governance, lineage and trust improved

• Better service level from IT to data consumers

UBS FINANCIAL SERVICES

Master Data Management / Knowledge Graph

CE Customer since 2016 Q1

EE Customer since 2015

Page 35: Knowledge Graphs Webinar- 11/7/2017

Background

• SF-based C2C rental platform

• Dataportal democratizes data access for growing number of employees while improving discoverability and trust

• Data strewn everywhere—in silos, in segmented departments, nothing was universally accessible

Business Problem

• Data-driven culture hampered by variety and dependability of data, tribal knowledge and word-of-mouth distribution

• Needed visibility into information usage, context, lineage and popularity across company of 3,000+

Solution and Benefits

• Offers search with context & metadata, user & team-centric pages for origin & lineage

• Nodes are resources: data tables, dashboards, reports, users, teams, business outcomes, etc.

• Relationships reflect consumption, production, association, etc.

• Neo4j, Elasticsearch, Python

Airbnb Dataportal TRAVEL TECHNOLOGY

Knowledge Graph, Metadata Management

CE users since 2017
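
The node and relationship model described on this slide (resources as nodes, consumption and production as relationships) lends itself to lineage queries. The Python sketch below walks "consumes" relationships upstream from a dashboard. The resource names and the `lineage` helper are hypothetical, invented for illustration; they do not describe Airbnb's actual Dataportal schema.

```python
# Hypothetical resources; each entry reads "resource -> resources it consumes".
CONSUMES = {
    "dashboard:bookings": ["table:bookings_daily"],
    "table:bookings_daily": ["table:raw_bookings", "table:listings"],
    "table:raw_bookings": [],
    "table:listings": [],
}


def lineage(resource, consumes):
    """Return every upstream resource reachable from `resource` (depth-first)."""
    seen = []
    stack = list(consumes.get(resource, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        stack.extend(consumes.get(current, []))
    return seen


upstream = lineage("dashboard:bookings", CONSUMES)
```

Reversing the direction of the same traversal would answer the impact question instead: which dashboards and reports depend on a given table.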

Page 36: Knowledge Graphs Webinar- 11/7/2017

Background

• 5-year-long drug discovery research

• Parse & Navigate over 25 Million scientific papers

• Sourced from National Library of Medicine and tagged with “Medical Subject Headings” (MeSH tags)

Business Problem

• Seeking to automate phenotype, compound and protein cell behavior research by using previously documented research more effectively

• Text mining for research elements like DNA strings, proteins, RNA, chemicals and diseases

Solution and Benefits

• Found ways to identify compound interaction behavior from millions of research documents

• Relations between biological entities can be identified and validated by biologic experts

• Still very challenging to keep up-to-date, add genomics data, and find a breakthrough

Novartis PHARMACEUTICAL RESEARCH

Content Management / Biomedical Research

CE Customer since 2016 Q1

CE Customer since 2012

Page 37: Knowledge Graphs Webinar- 11/7/2017

Background

• How Neo4j is used in investigations

• Non-technical reporters manually gather data

• “Low-tech” data curation

• Journalists want to model data as a story, not as data

Business Problem

• Identify repeated business relationships among individuals and their holdings and accounts

• Scan documents and identify possible entities, then create relationships between people and documents.

• Names and alias variances

Solution and Benefits

• Uses Neo4j in “story discovery” phase

• Uncovers shortest paths for leads for reporters

• Many investigations underway now

Columbia University EDUCATION

Investigative Journalism / Fraud Detection

CE Customer since 2016 Q1

EE Customer since 2015 Q4
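
The "shortest paths for leads" idea from this slide can be sketched with a plain breadth-first search over people, companies and accounts. The entity names below and the `shortest_path` function are hypothetical, written for illustration in Python; in Neo4j itself this kind of question would normally be asked through the database's own path queries rather than hand-written code.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Breadth-first search over an undirected graph given as (a, b) pairs.

    Returns the nodes on one shortest path, or None if no path exists.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    if source not in neighbors or target not in neighbors:
        return None

    previous = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbors[node]):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical entities, invented for illustration only.
EDGES = [
    ("Person A", "Shell Co 1"),
    ("Shell Co 1", "Offshore Account X"),
    ("Person B", "Offshore Account X"),
    ("Person B", "Shell Co 2"),
]

lead = shortest_path(EDGES, "Person A", "Person B")
```

Each intermediate node on the returned path is a candidate lead: a company or account that links two people who otherwise appear unrelated.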

Page 38: Knowledge Graphs Webinar- 11/7/2017

Background

• Large Nordic Telecom Provider

• 1M Broadband routers deployed in Sweden

• Half of subscribers are over 55 years old

• Each household connects 10 devices

• Goal to improve customer experience

Business Problem

• Broadband router enhancement to improve customer experience

• Context-based in-home services

• How to build smart home platform that allows vendors to build new “home-centric” apps

Solution and Benefits

• New Features deployed to 1M homes

• API-based platform for easy apps that:

• Automatically assemble Spotify playlists based on who is in the house

• Notify parents when children get home

• Build smart shopping lists

TELIA ZONE TELECOMMUNICATIONS

Smart Home / Internet of Things

EE Customer since 2016 Q4

Page 39: Knowledge Graphs Webinar- 11/7/2017

Business Problem

• Needed new asset management backbone to handle scheduling, ads, sales and pushing linear streams to satellites

• Novell LDAP content hierarchy not flexible enough to store graph-based business content

Solution and Benefits

• Neo4j selected for performance and domain fit

• Flexible, native storage of content hierarchy

• Graph includes metadata used by all systems: TV series-->Episodes-->Blocks with Tags-->Linked Content, tagged with legal rights, actors, dubbing et al

Background

• Nashville-based developer of lifestyle-oriented content for TV, digital, mobile and publishing

• Web properties generate tens of millions of unique visitors per month

Scripps Networks MEDIA AND ENTERTAINMENT

Knowledge Graph / Asset Management

Page 40: Knowledge Graphs Webinar- 11/7/2017

Business Problem

• Needed to reimagine existing system to beat competition and provide 360-degree view of customers

• Channel complexity necessitated move to graph database

• Needed an enterprise-ready solution

Solution and Benefits

• Leapfrogged competition and increased digital business by 23%

• Handles new data from mobile, social networks, experience and governance sources

• After launch of new Neo4j MDM, Pitney Bowes stock declared a Buy

Background

• Connecticut-based leader in digital marketing communications

• Helps clients provide omni-channel experience with in-context information

Pitney Bowes MARKETING COMMUNICATIONS

Master Data Management

Page 41: Knowledge Graphs Webinar- 11/7/2017

Background

• Large Public University – “U-Dub”

• IT staff for 80K+ students and employees

• Transforming IT systems from mainframe to cloud

• Providing IT & data warehousing services to 3 campuses, 6 hospitals, and 6,300 EDW users

Business Problem

• Old Sharepoint metadata was too complicated for users, not flexible and not transparent

• $1B project to migrate HR system from mainframe to Workday needed to be smooth

• Future projects needed repeatable predictability

• Needed new glossary, impact analysis, analytics

Solution and Benefits

• Consulted with NDU peers, built simple model

• Built Visualizer with Elasticsearch, Neo4j & D3.js

• Improved predictability, lineage, and impact understanding for over 6,300 users

University of Washington EDUCATION & RESEARCH

Metadata Management, IT & Network Operations

CE Customer since 2016 Q1

Page 42: Knowledge Graphs Webinar- 11/7/2017

Background

• World's largest hospitality / hotel company

• 7th largest web site on internet

• 1.5 M hotel rooms offered online by 2018

• Revenue Management System that allows property managers to update their pricing rates

Business Problem

• Provide the right room & price at the right time

• Old rate program was inflexible and bogged down as they increased the pricing options per property per day

• Lay the path to be an innovator in the future

Solution and Benefits

• 2016-era rate program embeds Neo4j as "cache"

• Created a graph per hotel for 4500 properties in 3 clusters

• 1000% increase in volume over 4 years

• 50% decrease in infrastructure costs

• "Use Neo4j Support!"

MARRIOTT TRAVEL & HOSPITALITY SERVICES

Pricing Recommendations Engine

EE Customer since 2014 Q2

Page 43: Knowledge Graphs Webinar- 11/7/2017

Case Studies for Knowledge Graphs and Recommendation Engines

eBay ShopBot

Page 44: Knowledge Graphs Webinar- 11/7/2017

Background

• Personal shopping assistant

• Converses with buyer via text, picture and voice to provide real-time recommendations

• Combines AI and natural language understanding (NLU) in Neo4j Knowledge Graph

• First of many apps in eBay's AI Platform

Business Problem

• Improve personal context in online shopping

• Transform buyer-provided context into ideal purchase recommendations over social platforms

• "Feels like talking to a friend"

Solution and Benefits

• 3 developers, 8M nodes, 20M relationships

• Needed high-performance traversals to respond to live customer requests

• Easy to train new algorithms and grow model

• Generating revenue since launch

eBay ShopBot ONLINE RETAIL

Knowledge Graph powers Real-Time Recommendations

EE Customer since 2016 Q3

Page 45: Knowledge Graphs Webinar- 11/7/2017

Case Study: Knowledge Graphs at eBay

Page 46: Knowledge Graphs Webinar- 11/7/2017

Case Study: Knowledge Graphs at eBay

Page 47: Knowledge Graphs Webinar- 11/7/2017

Case Study: Knowledge Graphs at eBay

Page 48: Knowledge Graphs Webinar- 11/7/2017

Case Study: Knowledge Graphs at eBay

Page 49: Knowledge Graphs Webinar- 11/7/2017

Bags

Case Study: Knowledge Graphs at eBay

Page 50: Knowledge Graphs Webinar- 11/7/2017

Men’s Backpack

Handbag

Case Study: Knowledge Graphs at eBay

Page 51: Knowledge Graphs Webinar- 11/7/2017

Try it out at: https://shopbot.ebay.com/

Case Study: Knowledge Graphs at eBay

Page 52: Knowledge Graphs Webinar- 11/7/2017

Case Studies for Knowledge Graphs and Recommendation Engines

eBay ShopBot