![Page 1: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/1.jpg)
All You Need to Know to Build a Product Knowledge Graph
Nasser Zalmout, Amazon
Chenwei Zhang, Amazon
Xian Li, Amazon
Yan Liang, Amazon
Xin Luna Dong, Amazon → Facebook
![Page 2: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/2.jpg)
Outline
Overview and Introduction 20 min
Knowledge Extraction 40 min
Knowledge Cleaning 25 min
Break 20 min
Ontology Mining 25 min
Applications 20 min
Conclusion and Future Directions 10 min
![Page 3: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/3.jpg)
Overview and Introduction
Overview and Introduction 20 min
Knowledge Extraction
Knowledge Cleaning
Q&A
Break
Ontology Mining
Applications
Conclusion and Future Directions
Q&A
![Page 4: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/4.jpg)
Knowledge Graph Example for 2 Songs
[Figure: a knowledge graph for the songs “Shake it off” and “Love Story”. Entity nodes: mid345, mid346, mid127, mid128, mid129. Relationships: artist, song_writer, genre, name, birth_date, type. Literal values: “Taylor Alison Swift”, “Taylor Swift”, “Country pop”, “Dance-pop”, “Pop”, 12/13/1989. Entity types: Recording, Genre. Legend: entity type, entity, relationship.]
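A knowledge graph like the one on this slide is commonly stored as (subject, relationship, object) triples. The sketch below is one possible encoding, not the authors' implementation. The relationship names and literal values come from the slide; which identifier (mid127, mid345, ...) belongs to which entity, and the helper names `build_index`, `entities_named` and `recordings_by_artist`, are assumptions made for illustration.

```python
from collections import defaultdict

# Triples for the two-song example. The mapping of identifiers to entities
# is an ASSUMPTION; the slide layout that would confirm it is not recoverable.
TRIPLES = [
    ("mid345", "type", "Recording"),
    ("mid345", "name", "Shake it off"),
    ("mid345", "artist", "mid127"),
    ("mid345", "song_writer", "mid127"),
    ("mid345", "genre", "mid128"),
    ("mid346", "type", "Recording"),
    ("mid346", "name", "Love Story"),
    ("mid346", "artist", "mid127"),
    ("mid346", "genre", "mid129"),
    ("mid127", "name", "Taylor Swift"),
    ("mid127", "name", "Taylor Alison Swift"),
    ("mid127", "birth_date", "12/13/1989"),
    ("mid128", "type", "Genre"),
    ("mid128", "name", "Dance-pop"),
    ("mid128", "name", "Pop"),
    ("mid129", "type", "Genre"),
    ("mid129", "name", "Country pop"),
]


def build_index(triples):
    """Group triples as subject -> relationship -> list of objects."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, relationship, obj in triples:
        index[subject][relationship].append(obj)
    return index


def entities_named(index, name):
    """Return the identifiers of all entities that carry the given name."""
    return sorted(e for e, props in index.items() if name in props.get("name", []))


def recordings_by_artist(index, artist_name):
    """Return the names of entities whose artist edge points to the named artist."""
    artists = set(entities_named(index, artist_name))
    names = []
    for props in index.values():
        if any(a in artists for a in props.get("artist", [])):
            names.extend(props.get("name", []))
    return sorted(names)


if __name__ == "__main__":
    idx = build_index(TRIPLES)
    print(recordings_by_artist(idx, "Taylor Alison Swift"))
```

Because both name variants point to the same entity node, a lookup by either "Taylor Swift" or "Taylor Alison Swift" reaches the same recordings, which is the main benefit of separating entities from their names.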
![Page 5: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/5.jpg)
Product Graph Example for 2 Songs
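[Figure: the two-song graph from the previous slide, shown as the starting point of a product graph. Entity nodes: mid345, mid346, mid127, mid128, mid129. Relationships: artist, song_writer, genre, name, birth_date, type. Literal values: “Shake it off”, “Love Story”, “Taylor Alison Swift”, “Taylor Swift”, “Country pop”, “Dance-pop”, “Pop”, 12/13/1989.]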
![Page 6: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/6.jpg)
Product Graph Example for 2 Songs
[Figure: the two-song graph extended with product nodes. Song-graph nodes: mid345, mid346, mid127, mid128, mid129. Added nodes: mid567, mid568, mid569, mid570, mid571. Added relationships: product, ASIN, type. ASIN values: B0035QUXWR, B0067XLIG8, B0035QUXWQ, B0067XLIG4. Types: Release, Track.]
![Page 7: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/7.jpg)
Product Graph Example for 2 Products
![Page 8: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/8.jpg)
Use Case I: Providing Information
![Page 9: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/9.jpg)
Use Case II: Providing Choices
![Page 10: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/10.jpg)
Use Case III: Improving Search
![Page 11: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/11.jpg)
Use Case III: Improving Search
![Page 12: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/12.jpg)
Use Case III: Improving Search
![Page 13: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/13.jpg)
Use Case IV: Improving Recommendation
![Page 14: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/14.jpg)
Product Graph vs. Knowledge Graph
[Figure: three candidate relationships between a generic KG and a product graph (PG), labeled (A), (B) and (C).]
![Page 15: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/15.jpg)
Product Graph vs. Knowledge Graph
[Figure: the candidate relationships (A), (B) and (C) between a generic KG and a PG, with one option marked ✅. Labels: Generic KG; Product KG (hardline, softline, consumables, etc.); Movie, Music, Book, etc.]
![Page 16: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/16.jpg)
But, Is The Problem Harder?
[Figure labels: Generic KG; Product KG (hardline, softline, consumables, etc.); Movie, Music, Book, etc.]
We focus on retail products in this tutorial.
![Page 17: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/17.jpg)
Challenges in Building Product Graph I
❑Sparse and noisy structured data
![Page 18: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/18.jpg)
Challenges in Building Product Graph II
❑Extremely complex domains
❑How to identify the millions of product types?
❑How to organize types into a taxonomy tree?
Sellers’ view Buyers’ view
![Page 19: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/19.jpg)
Challenges in Building Product Graph III
❑Big variety across product types
❑Different attributes apply to different product types
❑Different value vocabularies and different patterns
![Page 21: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/21.jpg)
Scale Up in 3 Dimensions
Thousands of attributes
Millions of categories
Hundreds of languages
Big challenge: Limited training labels for large-scale, rich data
![Page 22: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/22.jpg)
Can We Build A Self-Driving Product Understanding System?
![Page 23: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/23.jpg)
Our Goal: Self-Driving Product Knowledge Collection
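Catalog (input):

| Product | Type | Flavor | Color |
|---|---|---|---|
| Product 1 | Snacks | Cherry | |
| Product 2 | Candy | ? | ? |
| Product 3 | Candy | Choc. | Gold |

Taxonomy (input, by level): Grocery; Snacks, Drinks; Candy
User logs (input)
[Figure: AutoKnow takes the catalog, the taxonomy and user logs as input and outputs a Product KG: a taxonomy (Grocery; Snacks, Drinks; Candy, Pretzels) and a graph over Prod. 1, Prod. 2 and Prod. 3 with hasType, flavor, color and synonym edges (color: Gold; Choc. synonym Chocolate).]
Dong et al., AutoKnow: Self-driving knowledge collection for products of thousands of types, SIGKDD, 2020.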
![Page 24: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/24.jpg)
Our Goal: Self-Driving Product Knowledge Collection
[Figure: the same input and output as on the previous slide, with the output Product KG annotated to highlight three kinds of additions: New Type, New Value and Corrected Value.]
![Page 25: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/25.jpg)
Amazon AutoKnow: Self-Driving Product Knowledge Collection
[Figure: the same input and output as on the previous slides.]
● #Types ↑ 3X
● Defect rate ↓ up to 68 percentage points
![Page 26: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/26.jpg)
Amazon AutoKnow: Self-Driving Product Knowledge Collection
• Input Data: PT Taxonomy; Catalog; Behavioral Signals (e.g., search logs, reviews, Q&A)
![Page 27: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/27.jpg)
Amazon AutoKnow: Self-Driving Product Knowledge Collection
• Input Data: PT Taxonomy; Catalog; Behavioral Signals (e.g., search logs, reviews, Q&A)
• Ontology Suite: Taxonomy Enrichment; Relation Discovery
![Page 28: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/28.jpg)
Amazon AutoKnow: Self-Driving Product Knowledge Collection
• Input Data: PT Taxonomy; Catalog; Behavioral Signals (e.g., search logs, reviews, Q&A)
• Ontology Suite: Taxonomy Enrichment; Relation Discovery
• Data Suite: Data Imputation; Data Cleaning; Synonym Discovery
![Page 29: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/29.jpg)
Amazon AutoKnow: Self-Driving Product Knowledge Collection
• Input Data: PT Taxonomy; Catalog; Behavioral Signals (e.g., search logs, reviews, Q&A)
• Ontology Suite: Taxonomy Enrichment; Relation Discovery
• Data Suite: Data Imputation; Data Cleaning; Synonym Discovery
• Broad Graph (output): Ontology; {product, attribute, value}; {value, synonym, value}
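The two triple formats of the Broad Graph, {product, attribute, value} and {value, synonym, value}, can be illustrated with the small catalog used in the earlier slides. This is a minimal sketch under stated assumptions, not AutoKnow's code: the function names are hypothetical, the empty and "?" cells of the catalog are both modeled as `None`, and the Type column is mapped to the `hasType` edge shown in the figure.

```python
# Catalog rows from the slide; None marks an empty or unknown ("?") cell.
CATALOG = [
    {"product": "Product 1", "Type": "Snacks", "Flavor": "Cherry", "Color": None},
    {"product": "Product 2", "Type": "Candy", "Flavor": None, "Color": None},
    {"product": "Product 3", "Type": "Candy", "Flavor": "Choc.", "Color": "Gold"},
]

# Synonym pair shown in the figure (Choc. / Chocolate).
SYNONYMS = [("Choc.", "Chocolate")]


def catalog_to_triples(rows):
    """Emit one {product, attribute, value} triple per filled catalog cell."""
    triples = []
    for row in rows:
        for attribute, value in row.items():
            if attribute == "product" or value is None:
                continue
            # Assumption: the Type column becomes the hasType edge.
            predicate = "hasType" if attribute == "Type" else attribute.lower()
            triples.append((row["product"], predicate, value))
    return triples


def synonym_triples(pairs):
    """Emit one {value, synonym, value} triple per synonym pair."""
    return [(a, "synonym", b) for a, b in pairs]


def missing_cells(rows):
    """List the (product, attribute) cells a data imputation step would target."""
    return [
        (row["product"], attribute)
        for row in rows
        for attribute, value in row.items()
        if value is None
    ]


if __name__ == "__main__":
    for triple in catalog_to_triples(CATALOG) + synonym_triples(SYNONYMS):
        print(triple)
    print("to impute:", missing_cells(CATALOG))
```

Listing the missing cells separately makes the role of the Data Suite visible: imputation fills those cells, cleaning revisits the filled ones, and synonym discovery adds the value-to-value triples.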
![Page 30: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/30.jpg)
Self Driving to Navigate a Large Space
• Automatic: Fully ML-based
• Annotation free: Weak learning based on existing Catalog data and user behavior
• One-size-fits-all: Few taxonomy-aware models
• Self guidance: Identify important attributes and categories to focus efforts
![Page 31: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/31.jpg)
Key Intuition I. Learning with Limited Labels Generated from Existing Catalog Data

Catalog:

| Product | Type | Flavor | Color |
|---|---|---|---|
| Product 1 | Snacks | Cherry | |
| Product 2 | Candy | ? | ? |
| Product 3 | Candy | Choc. | Gold |

Taxonomy (by level): Grocery; Snacks, Drinks; Candy
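One common way to generate training labels from existing catalog data is distant supervision: wherever a value already recorded in the catalog appears in the product text, the matching tokens are tagged with that attribute. The sketch below illustrates the idea with BIO tags. It is an assumption about how such labels could be produced, not a description of the authors' pipeline; the function name `weak_bio_labels` and the example title are hypothetical.

```python
def weak_bio_labels(title_tokens, attribute_values):
    """Tag title tokens with BIO labels by matching known catalog values.

    title_tokens: list of tokens from a product title.
    attribute_values: dict mapping attribute name -> value string from the catalog.
    Returns one label per token: "B-<attr>", "I-<attr>" or "O".
    """
    labels = ["O"] * len(title_tokens)
    lowered = [token.lower() for token in title_tokens]
    for attribute, value in attribute_values.items():
        value_tokens = value.lower().split()
        n = len(value_tokens)
        if n == 0:
            continue
        for start in range(len(lowered) - n + 1):
            window_matches = lowered[start:start + n] == value_tokens
            window_free = all(label == "O" for label in labels[start:start + n])
            if window_matches and window_free:
                labels[start] = f"B-{attribute}"
                for offset in range(start + 1, start + n):
                    labels[offset] = f"I-{attribute}"
    return labels


if __name__ == "__main__":
    # Hypothetical product title and catalog values, for illustration only.
    tokens = "Gold Foil Dark Chocolate Candy Box".split()
    values = {"flavor": "Dark Chocolate", "color": "Gold"}
    for token, label in zip(tokens, weak_bio_labels(tokens, values)):
        print(f"{token}\t{label}")
```

Labels produced this way are noisy (a catalog value can be wrong, or can appear in the text with a different meaning), which is why the slide speaks of limited labels rather than clean annotations.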
![Page 32: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/32.jpg)
Key Intuition II. Rich Customer Behavior
[Figure: the taxonomy (Grocery; Snacks, Drinks; Candy, Pretzels) and the product graph (Prod. 1, Prod. 2, Prod. 3 with hasType, flavor, color and synonym edges) shown together with customer behavior: Query 1 to Query 3 and Prod. 1 to Prod. 3 connected by purchase, co-purchase, co-view and mention edges.]
![Page 33: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/33.jpg)
Key Intuition III. Product Categories as First-Class Citizen in Modeling
[Figure: Prod. 1, Prod. 2 and Prod. 3 linked by hasType edges to categories in the taxonomy (Grocery; Snacks, Drinks; Candy, Pretzels).]
![Page 34: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/34.jpg)
Key Intuition IV. Leverage Graph Structure
• Taxonomy is a tree
• KG is a graph
• Customer behavior forms a graph
[Figure: three panels. Taxonomy tree: Grocery; Snacks, Drinks; Candy, Pretzels. KG: Prod. 1, Prod. 2, Prod. 3 with hasType, flavor, color and synonym edges. Customer behavior: Query 1 to Query 3 and Prod. 1 to Prod. 3 connected by purchase, co-purchase and co-view edges.]
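Treating the taxonomy as a tree means that a product's type implies all of its ancestor categories. The sketch below shows that inference on the example taxonomy. It is illustrative only: the slide does not state which parent Candy and Pretzels hang under, so placing them under Snacks is an assumption, as are the function names. The hasType assignments follow the example catalog.

```python
# Child -> parent links of the taxonomy tree.
# ASSUMPTION: Candy and Pretzels are children of Snacks.
PARENT = {
    "Snacks": "Grocery",
    "Drinks": "Grocery",
    "Candy": "Snacks",
    "Pretzels": "Snacks",
}

# hasType edges, following the example catalog.
HAS_TYPE = {"Prod. 1": "Snacks", "Prod. 2": "Candy", "Prod. 3": "Candy"}


def ancestors(category, parent):
    """Walk up the tree and return the ancestors from nearest to root."""
    path = []
    while category in parent:
        category = parent[category]
        path.append(category)
    return path


def products_under(category, has_type, parent):
    """Return products typed as the category or as any of its descendants."""
    return sorted(
        product
        for product, product_type in has_type.items()
        if product_type == category or category in ancestors(product_type, parent)
    )


if __name__ == "__main__":
    print(ancestors("Candy", PARENT))
    print(products_under("Snacks", HAS_TYPE, PARENT))
```

The same parent-pointer walk lets a model share evidence across sibling types, for example using Candy products when little data exists for Pretzels.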
![Page 35: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/35.jpg)
Key Intuition V. Leverage Different Modals
![Page 36: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/36.jpg)
Key Techniques in AutoKnow
![Page 37: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/37.jpg)
Deliver the Data Business
1,000,000,000
![Page 38: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/38.jpg)
Deliver the Data Business
1
• High precision models
![Page 39: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/39.jpg)
Deliver the Data Business
1,000
• High precision models
• E2E pipeline + AutoML to reduce modeling cost
![Page 40: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/40.jpg)
Deliver the Data Business
1,000,000
• High precision models
• E2E pipeline + AutoML to reduce modeling cost
• Scale-up to reduce #models: 100s attributes, 1000s categories, 10s languages
![Page 41: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/41.jpg)
Deliver the Data Business
1,000,000,000
• High precision models
• E2E pipeline + AutoML to reduce modeling cost
• Scale-up to reduce #models: 100s attributes, 1000s categories, 10s languages
• Higher yield from multi-modal models
![Page 42: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/42.jpg)
Tutorial Structure
[Figure: the AutoKnow pipeline annotated with the tutorial sections Sec 2, Sec 3, Sec 4 and Sec 5. Applications.]
• Input Data: PT Taxonomy; Catalog; Behavioral Signals (e.g., search logs, reviews, Q&A)
• Ontology Suite: Taxonomy Enrichment; Relation Discovery
• Data Suite: Data Imputation; Data Cleaning; Synonym Discovery
• Broad Graph (output): Ontology; {product, attribute, value}; {value, synonym, value}
![Page 43: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/43.jpg)
Section Structure
• Problem definition: What are unique challenges for PG beyond generic KGs?
• Short answer (key intuition): What are key intuitions for building product KGs?
• Long answer (details): What are practical tips?
• Reflection/short answer: Can we apply the techniques to other domains?
![Page 44: All You Need to Know to Build a](https://reader031.vdocuments.net/reader031/viewer/2022021906/620f1dd053c13261bf15833c/html5/thumbnails/44.jpg)
Key Questions We Answer in This Tutorial
• Q1. What are the unique challenges in building a product knowledge graph, and what are the solutions?
• Q2. Are these techniques applicable to building knowledge graphs for other domains?
• Q3. What are practical tips for bringing this to production?