elasticsearch suggesters...context suggester •used with completion suggester •two types of...
TRANSCRIPT
![Page 1: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/1.jpg)
ELASTICSEARCH SUGGESTERS
Beyond Autocomplete
![Page 2: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/2.jpg)
BRETT HAZEN
• Principal Consultant at ILM
• C#/.NET & JavaScript
• Runner
• Very Amateur Musician
• Bird Owner
![Page 3: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/3.jpg)
WHY THIS TALK?
![Page 4: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/4.jpg)
OUR PLAN
• Cover some Elasticsearch basics
• Index and mapping
• Analyzers
• Tour through each type of suggester with example
• Q&A
![Page 5: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/5.jpg)
![Page 6: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/6.jpg)
WHAT DOES AN ANALYZER DO
• Start with the text “Domain Driven Design by Example”
• Standard analyzer
• No character filters
• Tokenizer: standard
• Token filters: standard, lower case, stop
• API for helping debug
![Page 7: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/7.jpg)
WHAT DOES AN ANALYZER DO
• After tokenizer
• [ “Domain”, “Driven”, “Design”, “by”, “Example”]
• After standard filter
• [ “Domain”, “Driven”, “Design”, “by”, “Example”]
• After lowercase filter
• [ “domain”, “driven”, “design”, “by”, “example”]
• After stop filter (actually disabled by default)
• [ “domain”, “driven”, “design”, “by”, “example”]
• If not disabled
• [“domain”, “driven”, “design”, “example”]
![Page 8: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/8.jpg)
WHAT IS A SUGGESTER?
• Suggest similar looking terms
• what that means depends on suggester
• Parts of suggest features still under development
• documentation doesn’t say which parts
• Three types of suggesters
• Completion
• Term
• Phrase
![Page 9: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/9.jpg)
COMPLETION SUGGESTER
• Provide autocomplete
• Only works based on prefix
• Stored as special data structure for speed
• Costly to build
• Stored in memory
![Page 10: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/10.jpg)
TERM SUGGESTER
• Correct a user’s spelling
• Suggestions are based on edit distance
![Page 11: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/11.jpg)
WHAT IS EDIT DISTANCE?
• How many characters need to be changed for a term to match
• Example
• User provided text is “domian”
• Corrected term “domain” has an edit distance of 1
• Only one character had to change for the terms to match
![Page 12: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/12.jpg)
PHRASE SUGGESTER
• Like term suggester, only fancier
• Based on n-gram language model
![Page 13: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/13.jpg)
WHAT IS N-GRAM LANGUAGE MODEL?
• A way of tokenizing a string
• Example
• We are indexing the string “domain”
• We are using a gram size of 2
• We’ll get these tokens:
• do, om, ma, ai, in
![Page 14: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/14.jpg)
N-GRAM LANGUAGE FOR WORDS
• When we want groups of words instead of groups of characters they’re referred to as shingles
• Example
• “Domain Driven Design”
• Minimum shingle size of 2
• Maximum shingle size of 3
• We would get
• “Domain Driven”, “Driven Design”, and “Domain Driven Design”
![Page 15: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/15.jpg)
CONTEXT SUGGESTER
• Used with completion suggester
• Two types of context:
• Category
• Geo
• Requires special mapping
• Increased index size of completion field
• It’s entirely on the heap, tread lightly
![Page 16: elasticsearch suggesters...CONTEXT SUGGESTER •Used with completion suggester •Two types of context: • Category • Geo •Requires special mapping • Increased index size of](https://reader036.vdocuments.net/reader036/viewer/2022071114/5febb8caa65ccd6904719261/html5/thumbnails/16.jpg)
QUESTIONS