Overview of message queues
Vanity slide
• Senior...whatever
• http://techblog.bozho.net
• http://twitter.com/bozhobg
Why queues?
• Decoupling:
  • service.performAction(params);
  • sender.send(params);
  • receiver.subscribe(actionType);
• Async processing
  • Background, redundant
• Integration
  • Multiple systems, multiple programming languages
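The decoupling bullet can be illustrated with a minimal in-process sketch. The class and method names below are hypothetical, and a `java.util.concurrent.BlockingQueue` stands in for a real queue; the point is only that sender and receiver share the queue and nothing else.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical names throughout; an in-process queue stands in for a real broker.
class DecouplingSketch {
    // The queue is the only thing the sender and the receiver have in common.
    static final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Sender side: instead of calling service.performAction(params) directly,
    // hand the params to the queue and return immediately.
    static void send(String params) {
        queue.add(params);
    }

    // Receiver side: drain whatever is currently queued and handle each message.
    static List<String> receiveAll() {
        List<String> handled = new ArrayList<>();
        String params;
        while ((params = queue.poll()) != null) {
            handled.add("handled:" + params);
        }
        return handled;
    }

    public static void main(String[] args) {
        send("resize-image-42");
        send("resize-image-43");
        System.out.println(receiveAll());
    }
}
```

The sender never references the receiver's type, so either side can be replaced, moved to another process, or stopped without changing the other.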
Why queues (2)?
• Fault-tolerance
  • Stopping the processing component doesn’t stop the application
• Spikability
  • (limited to specific types of requests)
• Event-sourcing friendly
Use-cases
• Gathering statistics
  • Visitor stats, used functionalities
• Sending emails
• Push
• Load balancing workers (e.g. video encoding, text analysis)
• System integration
Types of queues
• Message Queue Brokers
  • Usually used as a synonym of “message queue”
• Distributed / brokerless
• In-memory
• Database-based
In-memory
• Simple queues, without the need to
  • distribute
  • guarantee execution
• Brokerless in-memory
  • whether messages are stored on disk or in memory is an implementation detail
In the Java world
• JMS - JavaEE broker standard
• AMQP (RabbitMQ, ActiveMQ, Qpid)
• Kafka
• ZeroMQ
• JGroups
• Hazelcast
• Redis
• Spring @Async
Complications
• Exactly-once delivery
  • Consumer acknowledgments, publisher confirms
• Delivery order
• Persistence of data
• Transactional queues
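One common way to approximate exactly-once behaviour is to accept at-least-once delivery and make the consumer idempotent. The sketch below is an assumption-laden illustration, not a broker API: `IdempotentConsumer`, `onMessage` and the message-ID scheme are all hypothetical, and the in-memory set stands in for a durable store.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: at-least-once delivery combined with an idempotent consumer.
class IdempotentConsumer {
    // In a real system this set would have to be durable (e.g. a database table).
    private final Set<String> processedIds = new HashSet<>();
    private int sideEffects = 0;

    // Returns true as the acknowledgment. A redelivered duplicate is
    // acknowledged again, but its side effect is not repeated.
    boolean onMessage(String messageId, String payload) {
        if (processedIds.add(messageId)) {
            sideEffects++; // stand-in for the real work done with the payload
        }
        return true;
    }

    int sideEffects() {
        return sideEffects;
    }
}
```

If the broker redelivers message `m1` because an acknowledgment was lost, the second call is a no-op. This only holds if recording the ID and performing the side effect happen atomically, which the sketch does not attempt to show.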
There are only two hard problems in distributed systems:
2. Exactly-once delivery
1. Guaranteed order of messages
2. Exactly-once delivery
(author: unknown)
Broker clusters
• No single point of failure
• How do clients connect to the cluster?
  • Hard-coded IPs
  • DNS round-robin
  • Load-balancer
• Multi-datacenter performance; split brain
ZeroMQ
• Allows both broker and brokerless
• Not a “system” but a “library”
• Sockets++
• Messaging patterns - “do-it-yourself”
• => more work
• => learning curve
It’s complicated…
“Simplicity is prerequisite for reliability”
-- E. Dijkstra
The more components to administer, the more “breakable” everything is.
Do we need complex queues?
• Often - no.
  • If you don’t need order guarantees, integration of multiple systems and languages, or transactions
• Alternatives:
  • Simple synchronous calls
  • Async calls within a single app node
  • Database queue + batch processing
  • Hazelcast, JGroups
  • Akka
Synchronous calls
• If:
  • You don’t have multiple applications to integrate
  • You don’t need background processing
  • Compile-time decoupling doesn’t give you much
• You just don’t need a queue
Asynchronous
• Spring @Async / ExecutorService
  • Risk of losing a message if an app node dies
    • Option 1: you don’t care
    • Option 2: client retry
• Akka – Akka Cluster
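A minimal sketch of the ExecutorService option with caller-side retry follows. `AsyncRetrySketch`, `flakyTask` and `runWithRetry` are hypothetical names, and the simulated failure stands in for a lost task; it uses only the standard `java.util.concurrent` API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: in-process async execution where the caller retries on failure.
class AsyncRetrySketch {
    static final AtomicInteger attempts = new AtomicInteger();

    // Simulated task that fails on its first attempt and succeeds afterwards.
    static String flakyTask() {
        if (attempts.incrementAndGet() == 1) {
            throw new IllegalStateException("simulated failure");
        }
        return "done";
    }

    // Submits the task to a thread pool and resubmits it if it failed.
    static String runWithRetry(int maxAttempts) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<String> task = AsyncRetrySketch::flakyTask;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return pool.submit(task).get(5, TimeUnit.SECONDS);
                } catch (ExecutionException e) {
                    // The task threw; loop around and submit it again.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (TimeoutException e) {
                    throw new IllegalStateException(e);
                }
            }
            throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
        } finally {
            pool.shutdown();
        }
    }
}
```

Nothing here survives a crash of the JVM: tasks waiting in the pool are simply gone, which is the risk the slide names. Retrying from the client only helps if the task is safe to run more than once.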
Database queues
• You already have a distributed component – the database
  • (regardless of whether it’s done by using sharding, consistent hashing, master-slave)
• Application node – stores in a table
• Batch processor – reads the table
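The store-then-batch-read flow can be sketched as below. To stay self-contained, an in-memory map stands in for the database table, and the SQL in the comments is only indicative; the table layout (`id`, `payload`, `processed`) and all names are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: an in-memory map stands in for a database table
// with columns (id, payload, processed). Real code would go through SQL/JDBC.
class DatabaseQueueSketch {
    private static final class Row {
        final String payload;
        boolean processed;

        Row(String payload) {
            this.payload = payload;
        }
    }

    private final Map<Long, Row> table = new LinkedHashMap<>();
    private long nextId = 1;

    // Application node: roughly "INSERT INTO tasks (payload, processed) VALUES (?, false)".
    synchronized long enqueue(String payload) {
        long id = nextId++;
        table.put(id, new Row(payload));
        return id;
    }

    // Batch processor: roughly "SELECT ... WHERE processed = false ORDER BY id LIMIT ?",
    // followed by marking each handled row as processed.
    synchronized List<String> processBatch(int batchSize) {
        List<String> handled = new ArrayList<>();
        for (Row row : table.values()) {
            if (handled.size() == batchSize) {
                break;
            }
            if (!row.processed) {
                row.processed = true; // stand-in for the real work plus the UPDATE
                handled.add(row.payload);
            }
        }
        return handled;
    }
}
```

Calling `processBatch` on a schedule gives queue-like behaviour without adding a broker, at the cost of polling latency.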
Distributed batch processing
• How can multiple worker nodes distribute their work?
• Do we need multiple? Failover workers
  • Spring Batch
• Distributed locks
  • Hazelcast, Redisson, ZooKeeper
  • if (lock.tryLock(uniqueId)) { process(uniqueId); }
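The tryLock pattern from the last bullet might look like the following. This is a local sketch: `ReentrantLock` instances stand in for distributed locks, on the assumption that the chosen library exposes a `java.util.concurrent.locks.Lock`-style API. `LockedWorkerSketch` and `tryProcess` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: one lock per work item, so that only one worker handles it at a time.
class LockedWorkerSketch {
    // Local stand-in; a real deployment would ask a distributed lock provider for the lock.
    private final Map<String, Lock> locks = new ConcurrentHashMap<>();
    private final List<String> processed = new ArrayList<>();

    // Returns false without blocking if another worker currently holds the item's lock.
    boolean tryProcess(String uniqueId) {
        Lock lock = locks.computeIfAbsent(uniqueId, id -> new ReentrantLock());
        if (!lock.tryLock()) {
            return false;
        }
        try {
            process(uniqueId);
            return true;
        } finally {
            lock.unlock(); // always release, even if processing throws
        }
    }

    private void process(String uniqueId) {
        synchronized (processed) {
            processed.add(uniqueId); // stand-in for the real work
        }
    }

    List<String> processed() {
        synchronized (processed) {
            return new ArrayList<>(processed);
        }
    }
}
```

The lock only prevents two workers from processing the same item concurrently. The item still has to be marked as done somewhere, otherwise a later worker will acquire the lock and process it again.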
Queues and scalability?
• Often the two are connected
• But queues don’t always give scalability
• Scaling a broker isn’t trivial
Scalability
• Horizontal scaling: just the application nodes
• Taking extra load (spikes):
  • Message queues vs auto-scaling
• Distributing resource-heavy tasks
  • Message queues vs spot instances
Options, options...
• Before choosing what to use:
  • List all the mandatory features you need
  • Try to simplify your use-case
• Why do you need a queue?
  • Distributing load
  • Integrating various components
  • “Because it’s cool”?
Conclusions
• No silver bullet
• It’s important to know all the options
• Use the simplest possible solution, but not simpler