Elasticsearch for Westcoast
Charlie Hull, Managing Director, Flax
Nick Gushlow, Systems Architect, Westcoast
Elasticsearch London Meetup
+44 (0) 8700 118334
Twitter: @FlaxSearch
Search is never Simple
Elasticsearch for Westcoast
Building open source search applications since 2001
Independent, honest advice and analysis
Expert design & development, Apache Solr committers
Test-driven relevancy and performance tuning
Custom training & mentoring for your staff
Flexible support up to 24/7/365 with SLAs
Come to our Meetups!
Charlie Hull, Managing Director & co-founder of Flax
Nick Gushlow, Systems Architect at Westcoast
Who are we?
Why Westcoast needed a new search engine
The source data & the plan
The trouble with...
Building an admin panel for search
The search goes live
Lessons learned
What we'll cover today
Largest privately owned IT distributor in UK & Ireland
£1.5 billion turnover
Apple, HP, Lenovo, Microsoft, Samsung, Toshiba
Includes XMA / QC Supplies and Viglen
Who are Westcoast?
Old SQL-based search not accurate enough
8,000 searches per day, 90% SKU based
You searched for iPad... did you actually want an iPad?
Customers used Google / competitors to find part numbers
Static traffic numbers – 3,500 users per day, 7am to 7pm
Increase web revenue further, currently £40m
Why a new search engine?
Business approved a project to implement a change to ‘improve search’
Google Search Appliance
SLI
Apptus
Fredhopper
Elasticsearch
Time for a change
Live pricing
XML data sheets
Business user management interface
Synonyms / Exclusions
Boosts
Search vs Search vs Search
Requirements
0.5m products
Nested data (attributes)
Supplied as XML, one file per product
BUT!
– Live Pricing API will restrict results at search time
– Different for every end customer
– Based on hard to explain business rules
The source data
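As an illustration of the "one XML file per product, with nested attributes" shape, here is a minimal Python sketch of turning one file into a document ready for indexing. The talk does not show the real Westcoast schema or the Java indexer, so every element name, attribute name and field name below is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical product file; the real schema is not shown in the talk.
SAMPLE_XML = """
<product sku="ABC123">
  <name>Example Tablet 16GB</name>
  <manufacturer>ExampleCo</manufacturer>
  <attributes>
    <attribute name="Colour">Silver</attribute>
    <attribute name="Storage">16GB</attribute>
  </attributes>
</product>
"""


def product_to_document(xml_text):
    """Convert one product XML file into a dict with a list of attribute objects."""
    root = ET.fromstring(xml_text.strip())
    return {
        "sku": root.get("sku"),
        "name": root.findtext("name"),
        "manufacturer": root.findtext("manufacturer"),
        # Each attribute stays a separate name/value object so it can be
        # mapped as a nested object in the index.
        "attributes": [
            {"name": a.get("name"), "value": (a.text or "").strip()}
            for a in root.iterfind("attributes/attribute")
        ],
    }


# Assumed mapping fragment: Elasticsearch's "nested" type keeps each
# name/value pair together instead of flattening them across the document.
ATTRIBUTES_MAPPING = {"attributes": {"type": "nested"}}

document = product_to_document(SAMPLE_XML)
```

Note that nothing customer-specific is stored here: per the slide, pricing differs for every end customer and is applied at search time.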
Elasticsearch
Java client
Custom Java indexer (Dropwizard)
Search application (Dropwizard)
Admin panels (AngularJS)
Agile process
The plan
First, do your search
Send 5000 results to legacy pricing API
Merge the pricing information with search results
Now build your facets (including on price)
Hang on, doesn't Elasticsearch do facets for you?
The trouble with facets
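The steps above can be sketched roughly as follows. Because the pricing API decides which results a customer may see, facet counts taken from the search engine alone would presumably not match what is finally displayed, so in this sketch they are counted in the application after the merge. This is an illustration in Python rather than the project's Java code; the field names, the price bands and the shape of the pricing response are all assumptions.

```python
from collections import Counter

MAX_RESULTS = 5000  # batch size quoted on the slide


def merge_pricing(hits, prices):
    """Attach prices to hits, dropping any hit the pricing API did not return."""
    merged = []
    for hit in hits[:MAX_RESULTS]:
        if hit["sku"] in prices:
            merged.append(dict(hit, price=prices[hit["sku"]]))
    return merged


def price_band(price):
    """Hypothetical price bands; the real ones are not given in the talk."""
    if price < 100:
        return "under 100"
    if price < 500:
        return "100 to 499"
    return "500 and over"


def build_facets(results, fields):
    """Count facet values over the priced results, including a price facet."""
    facets = {
        field: Counter(r[field] for r in results if field in r)
        for field in fields
    }
    facets["price"] = Counter(price_band(r["price"]) for r in results)
    return facets


# Stand-ins for the search results and the legacy pricing API response.
hits = [
    {"sku": "A1", "brand": "Apple"},
    {"sku": "B2", "brand": "HP"},
    {"sku": "C3", "brand": "Apple"},
]
prices = {"A1": 399.0, "B2": 89.5}  # C3 is not available to this customer

results = merge_pricing(hits, prices)
facets = build_facets(results, ["brand"])
```

Here the "Apple" count is 1, not 2, because one of the two Apple hits was removed by pricing after the search ran.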
Front end systems built by third party
– Solution: Search app with JSON API (defined by them)
– Encrypted JSON for use during sessions
Data for all the facets must be supplied to the UI
– Full result counts for applied facets need to be returned, in the order they were applied
– Solution: lots of searches
Custom facets for some customers
– Solution: an index of facet definitions
More trouble with facets
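One possible reading of "lots of searches" is one search per applied facet: the first with only the first filter, the second with the first two, and so on, recording the hit count each time. The Python sketch below follows that reading. It is a guess at the approach, not the project's code. The `name` field, the `bool`/`filter` query shape and the `count_hits` callable are assumptions, and `fake_count_hits` is a stand-in that ignores the query text.

```python
def build_query(text, filters):
    """Build a query body: full-text match plus one term filter per applied facet."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"name": {"query": text, "operator": "and"}}}],
                "filter": [{"term": {field: value}} for field, value in filters],
            }
        }
    }


def counts_in_applied_order(text, applied, count_hits):
    """Run one search per applied facet, adding filters in the order applied."""
    counts = []
    for i in range(1, len(applied) + 1):
        query = build_query(text, applied[:i])
        counts.append((applied[i - 1], count_hits(query)))
    return counts


# In-memory stand-in for the search engine, used only to run the sketch.
PRODUCTS = [
    {"name": "laptop", "brand": "HP", "colour": "Silver"},
    {"name": "laptop", "brand": "HP", "colour": "Black"},
    {"name": "laptop", "brand": "Lenovo", "colour": "Silver"},
]


def fake_count_hits(query):
    terms = [next(iter(f["term"].items())) for f in query["query"]["bool"]["filter"]]
    return sum(all(p.get(field) == value for field, value in terms) for p in PRODUCTS)


applied = [("brand", "HP"), ("colour", "Silver")]
counts = counts_in_applied_order("laptop", applied, fake_count_hits)
```

The cost is one extra search per applied facet, which is workable at low query loads.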
Boost for individual items
– Easy! Define in the source data
Term boosts
– e.g. some MacBooks over other MacBooks
– Harder – but still defined in source data
The trouble with boosting
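The talk does not say how the per-item boost from the source data was applied. One way to do it at query time is Elasticsearch's `function_score` query with `field_value_factor`, sketched below in Python as a query body. The field names `name` and `boost` are hypothetical, and this covers only the per-item case, not term boosts.

```python
def boosted_query(text):
    """Query body that multiplies each hit's relevance score by its stored boost."""
    return {
        "query": {
            "function_score": {
                "query": {"match": {"name": {"query": text, "operator": "and"}}},
                # Hypothetical numeric field carried over from the source data;
                # documents without it are treated as having a boost of 1.0.
                "field_value_factor": {"field": "boost", "missing": 1.0},
                "boost_mode": "multiply",
            }
        }
    }


query_body = boosted_query("macbook")
```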
A great way to run search projects!
...unless not everyone can do Agile
The trouble with Agile
Allows Westcoast to adjust
– Synonyms / Exclusions
– Remove items from index
– Test searches
– Test synonyms then push to live
  • Synonyms are index side as default query is AND
Built in AngularJS
Building an Admin panel
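"Index side" synonyms can be expressed in Elasticsearch index settings by putting a synonym token filter in the analyzer used at index time and leaving it out of the one used at search time. The Python sketch below builds such a settings body from a list of rules, as an admin panel might supply them. The filter and analyzer names are hypothetical, and the project's actual configuration is not shown in the talk.

```python
def index_settings(synonym_rules):
    """Build index settings with synonyms applied at index time only."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "product_synonyms": {
                        "type": "synonym",
                        # Rules in Solr format, e.g. "laptop, notebook"
                        "synonyms": list(synonym_rules),
                    }
                },
                "analyzer": {
                    # Used when indexing: expands synonyms into the index.
                    "index_with_synonyms": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "product_synonyms"],
                    },
                    # Used when searching: no expansion, so an AND query
                    # gains no extra required terms.
                    "search_plain": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    },
                },
            }
        }
    }


settings_body = index_settings(["laptop, notebook", "ipad, tablet"])
```

With this arrangement a synonym change only takes effect once the affected documents are reindexed, which fits a "test then push to live" workflow.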
A single node for Elasticsearch
A single node for index & search applications
Ultimately mirrored for failover
Query load very low (1 QPS)
– But this may change!
Business hours support by Flax
The search goes live
Elasticsearch results were good
Business maintenance is large, boring, never-ending work
Changing customer behaviour is slow
Search results over 30% faster on average
Time savings for sales staff
Post live
Integrating with legacy systems is hard
Business rules can be hard to understand & harder to explain
Not everything can be done with search
If you want to do Agile, make sure everyone else can
Lessons learned
Plug
3rd & 4th February 2016, Cambridge UK
Open source search for Bioinformatics
Free event near Cambridge on Wellcome Genome Campus covering both Solr & Elasticsearch, talks & hands-on workshops
http://www.ebi.ac.uk/pdbe/about/events
Plug #2
20th March 2016, Padua, Italy
First International Workshop on Recent Trends in News Information Retrieval
One-day workshop as part of the European Conference on Information Retrieval (ECIR 2016) – submission deadline end of January
http://research.signalmedia.co/newsir16/index.html
(including a free test dataset of 1m news articles!)