finding cars and hunting down logs - elasticsearch @autoscout24

Post on 12-Jan-2017


Finding Cars and Hunting Down Logs: Elasticsearch @ AutoScout24

AutoScout24

24 Nov 2016

Philipp Garbe, Lead developer (philipp.garbe@scout24.com)

Juri Smarschevski, Team lead (juri.smarschevski@scout24.com)

Search

AutoScout24 search journey in a nutshell

Who we are ?

Unique monthly visitors in Europe

… 10 more

Some numbers

Search index contains ~2.6M classifieds

Unique visitors (monthly): ~10M

Search requests per day: ~36M

Index update rate per day: ~400,000 classifieds

Status quo. March 2013.

Endeca used as a search engine

Use case: providing search results and facets for the entire AS24 platform

Problems:
• With new product requirements, Endeca's performance keeps getting slower
• Time to market for the features we need is not sufficient
• Maintenance is complex / expensive

Possible candidates

Solr ?
• <feeling> too complex installation / configuration </feeling>

Sphinx ?
• Support situation is unclear

Elasticsearch ?
• Fresh buzzword
• Built for distributed systems from the beginning (rumors)
• Easy installation / configuration (fact)

POC

Goals

• Performance should be comparable with Endeca
• The solution should be scalable

Rollout plan. 03.2013 - 11.2013

Timeline: 02.2013, 03.2013, 05.2013, 07.2013, 11.2013

Phases:
• POC
• Implementation & migration
• Training
• Go live phase

#real_project_picture_squeezed

Results after 8 months of working.

                                              Endeca        Elasticsearch (0.9.x)   Improvement
Amount of machines                            60            20                      300%
[Re]index time                                ~180 min      ~45 min                 400%
Deploy to live                                up to 2 days  < 3 hours               1000%
Effort for testing an issue on local machine  4 h           1 h                     400%
Performance                                   =             =
Product / dev guys satisfaction               :(            :)                      % ?

No problems after migration ?

Cluster split brain

This had in fact nothing to do with Elasticsearch itself; it was more related to the learning phase at AS24.
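In Elasticsearch versions of that era (before 7.x), split brain is commonly guarded against by requiring a majority of master-eligible nodes before a master may be elected (the `discovery.zen.minimum_master_nodes` setting). The slides do not say how the AS24 clusters were configured, so the following is only a sketch of the quorum arithmetic; the function names and node counts are illustrative assumptions.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Majority quorum for a given number of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect_master(partition_size: int, master_eligible: int) -> bool:
    """True if a network partition of this size still holds a majority."""
    return partition_size >= minimum_master_nodes(master_eligible)


if __name__ == "__main__":
    # Hypothetical cluster sizes, not taken from the talk.
    for n in (3, 4, 5):
        print(f"{n} master-eligible nodes -> quorum {minimum_master_nodes(n)}")
```

With a majority quorum, two halves of a partitioned cluster can never both elect a master, which is why an even split (for example 2 + 2 out of 4 nodes) leaves the cluster without a master rather than with two.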

Deep pagination

Elasticsearch 5.x release notes: “Deep pagination of search results is now possible with the search_after feature, which efficiently skips over previously returned results to return just the next page.”
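A minimal sketch of how `search_after` paging works at the request level: each page is requested with the sort values of the last hit from the previous page. The field names `price` and `classified_id` and the helper names are hypothetical, and the hits below are hand-written stand-ins for a real response.

```python
from typing import Any, Dict, List, Optional


def build_page_query(page_size: int,
                     search_after: Optional[List[Any]] = None) -> Dict[str, Any]:
    """Build a search request body for one page of results."""
    body: Dict[str, Any] = {
        "size": page_size,
        "query": {"match_all": {}},
        # search_after needs a deterministic sort; the second key is a
        # unique tie-breaker (both field names are assumptions).
        "sort": [{"price": "asc"}, {"classified_id": "asc"}],
    }
    if search_after is not None:
        body["search_after"] = search_after
    return body


def next_cursor(hits: List[Dict[str, Any]]) -> Optional[List[Any]]:
    """Sort values of the last hit, or None when there are no more hits."""
    return hits[-1]["sort"] if hits else None


if __name__ == "__main__":
    first_page = build_page_query(2)
    # Stand-in for the "hits" array of a real search response.
    fake_hits = [
        {"_id": "a", "sort": [9500, "a"]},
        {"_id": "b", "sort": [9900, "b"]},
    ]
    second_page = build_page_query(2, next_cursor(fake_hits))
    print(second_page["search_after"])
```

Unlike `from` / `size` paging, the request for page N does not force the cluster to collect and discard all hits of the preceding pages.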

Status quo. November 2014.

Project “Tatsu” has started

.NET => JVM

C# => Scala

IIS / Windows => Play / Linux

Local data center => AWS

Monolith => Micro services

Windows workstations => Mac notebooks

... => ...

=> 2015

Elasticsearch clusters “lift & shift” to AWS ?

AWS Elasticsearch Service ?

Elasticsearch as a service (SaaS) ?

Own hosting in AWS ?

Rolling update in detail (possible scenario).

Time 1: Initial state
Time 2: Node has been replaced (~ 60 sec)
Time 3: Master has been killed
Time 4: Master election
Time 5: Last node has been replaced

Rolling update findings

Master has been killed = Outage ?
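To make the scenario above concrete, here is a toy model that counts how many master elections a rolling replacement triggers, depending on the order in which nodes are replaced. It is not Elasticsearch's election algorithm: the election rule (prefer an already replaced node, otherwise the first remaining old node), the node names, and the function name are all assumptions made for illustration.

```python
from typing import List


def count_elections(nodes: List[str], master: str, order: List[str]) -> int:
    """Replace the old nodes in the given order and count master elections."""
    remaining_old = list(nodes)   # old nodes not yet replaced
    replaced: List[str] = []      # replacement nodes already in the cluster
    elections = 0
    for victim in order:
        remaining_old.remove(victim)
        if victim == master:
            # Killing the current master forces an election.
            elections += 1
            # Toy rule (assumption): prefer a node that was already replaced,
            # otherwise fall back to the first remaining old node.
            candidates = replaced + remaining_old
            master = candidates[0] if candidates else victim + "-new"
        replaced.append(victim + "-new")
    return elections


if __name__ == "__main__":
    nodes = ["n1", "n2", "n3"]
    print(count_elections(nodes, "n1", ["n1", "n2", "n3"]))  # master replaced first
    print(count_elections(nodes, "n1", ["n2", "n3", "n1"]))  # master replaced last
```

Under this toy rule, replacing the master first can hand the role to another old node that is killed later, so the cluster goes through more than one election; replacing the master last needs exactly one.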

Logging

Continuously deployed, immutable and stateful

Some numbers

7.4 billion documents

36 TB EBS

18 nodes à m4.4xlarge (64 GB / 53.5 CPU units)

Unified Logs


Challenge: Deployment time

Rolling updates


Challenge: Costs

First setup
● 18x m4.4xlarge
● 18x 2TB gp2
● 3TB/day cross-zone traffic

Cost/Usage Optimized Setup
● 15x m4.2xlarge
● 15x 384GB gp2
● 6x SpotFleet
● 6x 4TB st1
● 9TB/day cross-zone traffic

Savings: ~40%
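The provisioned EBS capacity implied by the two setups can be checked with a little arithmetic. This sketch assumes decimal units (1 TB = 1000 GB) and one volume per listed instance; it deliberately says nothing about prices, which the slides do not give.

```python
from typing import List, Tuple

GB_PER_TB = 1000  # assumption: decimal units


def total_tb(volumes: List[Tuple[int, int]]) -> float:
    """Sum (count, size_in_gb) pairs and return the total in TB."""
    return sum(count * size_gb for count, size_gb in volumes) / GB_PER_TB


# (number of volumes, size per volume in GB), taken from the slide
first_setup = [(18, 2 * GB_PER_TB)]                   # 18x 2TB gp2
optimized_setup = [(15, 384), (6, 4 * GB_PER_TB)]     # 15x 384GB gp2 + 6x 4TB st1

if __name__ == "__main__":
    print(total_tb(first_setup))      # 36.0
    print(total_tb(optimized_setup))  # 29.76
```

The first figure matches the 36 TB EBS quoted for the logging cluster; in the optimized setup most of the capacity sits on the st1 volumes and only about 5.76 TB stays on gp2.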

Future. What next ?

Percolator (saved search)


Elastic Graph (recommendations)

Freetext search


Conclusion

Here is a simple question: if we could go back in time and start the same journey with Elasticsearch, would we do it the same way ?

Q & A
