deploying immutable infrastructures with rabbitmq and solr

Deploying Immutable Infrastructures

RabbitMQ and Solr

@jordillonch@eloipoch

March 2017

Xavier Sanchez

Who are we?

Eloi Poch Jordi LlonchXavier Sanchez

Agenda

• What is an immutable infrastructure?

• How do we deploy a RabbitMQ cluster?

• How do we deploy a Solr cluster?

• Conclusions

What is an immutable infrastructure?

ChangeChange

Change

once you instantiate something you never

change it

replace instances with other ones with the desired changes already applied

State A

State B

State E

State D

State F

State C

Key benefits• no differences between servers (eventually)

• lower failures & easier troubleshooting

• no downtime

• more confident update (up, running & tested)

• simple update & rollback

...but this is a simplification

we should be aware that infrastructure is divided into “data”

and “everything else”

How do we deploy a RabbitMQ cluster?

What is RabbitMQ?

RabbitMQ @ Wallapop

Domain Event Bus

• preferred system of communication between services

• avoids coupling between origin (publisher) and destiny/destinies (subscribers)

• failure tolerant architecture

Characteristics

• transient data

• no control over the publishers or consumers

• sync publish and async consumption

• soft/near realtime

How do we migrate to the new cluster?

Initial status

Great Service

Bored ServiceWTF Service

RabbitMQ Cluster

Crazy Service

Suicidal Service

Mainstream Service

Create new cluster

Great ServiceBored ServiceWTF Service

RabbitMQ Cluster

Crazy ServiceSuicidal Service Mainstream Service

RabbitMQ Cluster

Copy definitions

RabbitMQ Cluster

newconfigconfig

Create federated queues

Queue A

Queue B

Queue C

Queue A

Queue B

Queue C

message flow

Upstream Downstream

Only when a downstream queue has idle consumers

pull messages

RabbitMQ Cluster

newconfig

Federated Queues

config

Change DNS

RabbitMQ Cluster

Federated Queues

configconfig

Close connections

RabbitMQ Cluster

Federated Queues

configconfig

RabbitMQ Cluster

RabbitMQ ClusterFederated Queues

oldconfigconfig

Delete federated queues

RabbitMQ Cluster

oldconfigconfig

Federated Queues

Delete old cluster

Great Service

Bored ServiceWTF Service

RabbitMQ Cluster

Crazy Service

Suicidal Service

Mainstream Service

How do we deploy a Solr cluster?

What is Apache Solr?

• Solr is an open source enterprise search platform built on Apache Lucene providing advanced full-text search capabilities

• distributed search

• faceted search

• real-time indexing

• index replication

• hit highlighting

• suggesters

• ...

Why Solr?• NRT indexing

• Fast replication

• Handle large volumes of data (>25 millions docs)

• Handle high traffic rates

• Easy to scale

• Geospatial search

• ...

Solr @ Wallapop

Multiple uses• Wall index

• General Search index

• Specialized verticals indexes

• Search suggesters

• Upload suggesters

• Filter suggesters

• ...

Characteristics

• Non-volatile data

• Total control over the publishers and consumers

• Sync/async publish and sync consumption

• Near real-time

How do we deploy a Solr cluster?

• Combining live (sync) indexing and async reindexing

• Approach used in our NRT clusters

• Search service is not affected by the process

• Fast and easy roll-back

Why reindexing?• Changes in the index format and schema require reindexing the

whole corpus!!!

• New features might require changes in schema

• Changes in data type

• Changes in tokenization and analysis chain

• Add new fields or update fields to already indexed documents

• Solr/Lucene version upgrades require reindexing

• We need seamless tools for reindexing

Initial status

Create new cluster

Activate double NRT indexing

Reindex old data

Switch search traffic

Deactivate double NRT indexing

now we can kill our old cluster...

... or maybe we can use both!

Search traffic doubling: Functional testing

• Duplicate search requests

• Send sync request to main cluster

• Send async request to alternative cluster

• Check that everything works correctly

• use real traffic without serving to real users.

Traffic segmentation• A/B testing

• relevance tests

• new ranking algorithms

• changes in tokenization/analysis chain

• new search components

• Performance testing

• Changes in machine configuration

• Changes in Solr configuration

• Changes in java configuration

• Changes in index/schema format

Much more than a traffic split

• DNS traffic splitting has some limitations

• Unable to send different search queries to each cluster

• Splitting traffic at the API level allows us to change Solr queries

• Removing barriers for testing new Solr features.

Conclusions

• there is no silver bullet

• actually know your service

• you are not the only one

• repeat it (even when not necessary)

Thanks!We are hiring!

Mobile Test Automation Engineer Web Test Automation Engineer

Functional QA Software Engineer Junior - Mobile iOS

Android Engineer

deploying immutable infrastructures with rabbitmq and solr

Technology

immutable laws

celery-rabbitmq documentation

rabbitmq and easynetq

rabbitmq on openvms

apache solr + ajax solr

interoperability with rabbitmq

optimizing solr to improve...

generated doc - rabbitmq

amqp and beyond - rabbitmq rabbitmq-jsonrpc-channel...

spring rabbitmq

using the jms client for vfabric rabbitmq - vmware€¦ ·...

functional immutable css

nyc lucene/solr meetup: spark / solr

rabbitmq hands on

rabbitmq install한글

rabbitmq + couchdb = awesome

immutable servers

rabbitmq & postgresql

solr fusion a solr proxy

the immutable journey