Siphon - Near Real Time Databus Using Kafka, Eric Boyd, Nitin Kumar

Thursday, April 14, 2016

Siphon – Near Real Time Databus Using Kafka

Eric Boyd – CVP Engineering – Microsoft

Nitin Kumar – Principal Eng Manager - Microsoft

Linux is a cancer

Ads Oslo Schedule

Ads Oslo Feature List

Bing Ads Execution

• Shipped once every 6 months

• Averaged 3 marketplace experiments per month

• Big bets on marketplace features that didn’t work.

• Focused teams on 6 tracks with independent metrics.

• Pushed teams to ship as quickly as they could, focusing only on moving their metric.

• Built/borrowed infrastructure to enable much more rapid experimentation.

• Over 3 years got to a rate of >1000 experiments a month

Profitability!!

Eric joins MSFT

What drove the turnaround?

• Focus on small teams with clear metrics each team was driving.

• Pushing each team to experiment and iterate as fast as possible. Data alone determines what gets shipped.

• Iterated on key metrics until we found the ones with the most impact.

• Commitment that we would get 1.5-2% better each month, and ship a package of experimentally tested improvements each month.

Relationship with Open Source

• From “Linux is a cancer…”

• To contributing to open source

  • Storm with C# - SCP.NET (http://www.nuget.org/packages/Microsoft.SCP.Net.SDK/)

  • Spark with C# - Mobius (https://github.com/Microsoft/Mobius)

  • Kafka with C# - C# Client for Kafka (https://github.com/Microsoft/Kafkanet)

  • Bond (https://github.com/Microsoft/bond)

• Across MSFT

  • C#

  • VSCode

  • Hyper-V drivers for Linux

  • https://github.com/Microsoft/ with 18 pages of repositories!

Microsoft Big Data History

• Massive batch-oriented systems

• Hundreds of thousands of machines

• Exabytes of storage

• SQL-like language with C# extensions

Moving to streaming

Data Bus

Devices Services

Streaming Processing

Batch Processing

Applications

Scalable pub/sub for NRT data streams

Interactive analytics

Vision

• A Databus for all Near Real Time (NRT) data in an organization.

• Quick and easy publication, discovery and subscription of NRT datasets.

• Compatibility with various Stream Processing systems like Storm, Spark, Splunk.

Siphon Adoption

15 months since launch

Excel Word Outlook

Windows 10

Bing Ads Campaign perf

Bing Live site telemetry

Cortana

Office 365

Siphon Data Volume (Ingress and Egress)

[Chart series: Volume published (GBps), Volume subscribed (GBps), Total Volume (GBps)]

Siphon Events per second (Ingress and Egress)

[Chart series: EPS In, EPS Out, Total EPS]

1.3 million events per second ingress at peak

~1 trillion events per day processed at peak

3.5 petabytes processed per day

100 thousand unique devices and machines

1,300 production Kafka brokers

Scale: Kafka at Microsoft (Ads, Bing, Office)

Kafka Brokers: 1300+ across 5 datacenters

Operating System: Windows Server 2012 R2

Hardware Spec: 12 cores, 32 GB RAM, 4 x 2 TB HDD (JBOD), 10 Gb network

Incoming Events: 1.3 million per sec (112 billion per day, 500 TB per day)

Outgoing Events: 5 million per sec (~1 trillion per day, 3.5 PB per day)

Kafka Topics/Partitions: 50+/5000+

Kafka Version: 0.8.1.1 (3-way replication)
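The ingress figures in the table can be cross-checked with a little arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes) and a rate sustained over a full day; both are assumptions for illustration, not statements from the slides.

```python
# Back-of-the-envelope check of the scale figures quoted in the table.
SECONDS_PER_DAY = 24 * 60 * 60

ingress_eps = 1.3e6              # incoming events per second (from the table)
egress_eps = 5.0e6               # outgoing events per second (from the table)
ingress_bytes_per_day = 500e12   # 500 TB per day, assuming decimal terabytes

# 1.3 million events/sec sustained for a day is about 112 billion events.
ingress_events_per_day = ingress_eps * SECONDS_PER_DAY

# Average size of one incoming event, in bytes.
avg_ingress_event_bytes = ingress_bytes_per_day / ingress_events_per_day

# How many times each incoming event is read, on average.
fan_out = egress_eps / ingress_eps

print(f"{ingress_events_per_day / 1e9:.0f} billion events/day")
print(f"{avg_ingress_event_bytes / 1e3:.1f} KB per event")
print(f"fan-out of about {fan_out:.1f}x")
```

Under these assumptions the average incoming event is a few kilobytes, and each event is consumed by roughly four subscribers.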

Siphon Architecture

Asia DC

Zookeeper Canary

Collector

Services Data Pull (Agent)

Services Data Push

Device Proxy Services

Consumer API (Push/

Europe DC

Zookeeper Canary

Streaming

Audit Trail

Open Source

Microsoft Internal

Siphon

Multiple sources and schemas

Siphon Bond Schema

A. Main Header

  MessageId

  AuditId

  TimeStamp

B. Extended Header: Key-Value[]

C. Payload

Input formats shown: JSON, CSV; output: Siphon Bond Schema

Bond (https://github.com/Microsoft/bond): cross-platform framework for working with schematized data. Cross-language (de)serialization. Similar to Protobuf, Thrift and Avro.
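The three-part layout (main header, extended header, payload) can be sketched as a plain data structure. This is only an illustration in Python dataclasses, not the actual Bond schema: the snake_case field names, the `wrap` helper, the `source_format` key and the sample values are all hypothetical.

```python
import json
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MainHeader:
    message_id: str   # MessageId on the slide
    audit_id: str     # AuditId on the slide
    timestamp: float  # TimeStamp on the slide (epoch seconds is an assumption)


@dataclass
class Envelope:
    header: MainHeader
    extended_header: Dict[str, str] = field(default_factory=dict)  # Key-Value[]
    payload: bytes = b""  # opaque bytes, whatever the source format was


def wrap(payload: bytes, audit_id: str, **extended: str) -> Envelope:
    """Wrap a source payload in a common envelope (hypothetical helper)."""
    header = MainHeader(
        message_id=str(uuid.uuid4()),
        audit_id=audit_id,
        timestamp=time.time(),
    )
    return Envelope(header=header, extended_header=dict(extended), payload=payload)


# Two sources with different formats end up in the same envelope shape.
json_event = wrap(json.dumps({"clicks": 3}).encode("utf-8"),
                  audit_id="batch-001", source_format="json")
csv_event = wrap(b"campaign,clicks\n42,3\n",
                 audit_id="batch-001", source_format="csv")
```

Keeping the payload opaque means the bus only needs to understand the headers, which is one plausible reason for a layout like this.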

Collector – Data Ingestion (Producer)

• HTTP(S) server

• RESTful API with SSL support.

• Abstraction from Kafka internals (partition, Kafka version)

• Throttling, QPS monitoring

• PII scrubbing

• Load balancing/failover to multiple DCs

• Supported for both Windows and Linux servers.
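The slides do not say how the Collector throttles producers. One common technique for this kind of per-client rate limiting is a token bucket, sketched below. The class name, parameters and the injectable clock are hypothetical and exist only for this illustration.

```python
import time


class TokenBucket:
    """Allow up to `rate_per_sec` requests per second, with bursts up to `burst`."""

    def __init__(self, rate_per_sec: float, burst: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock          # injectable so the example is deterministic
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Deterministic demo with a fake clock instead of real time.
fake_now = [0.0]
bucket = TokenBucket(rate_per_sec=2, burst=2, clock=lambda: fake_now[0])
burst_results = [bucket.allow() for _ in range(3)]   # third request is throttled
fake_now[0] = 1.0                                    # one second later
after_wait = [bucket.allow() for _ in range(3)]      # bucket has refilled
```

A server would typically answer a rejected request with an HTTP 429 status; counting the `allow()` calls per interval also gives a simple QPS metric.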

Collector

Kafka Brokers

Broker

Collector

Services Data Push

Open Source

Microsoft Internal

Siphon

URL: http://localhost/produce/<version>?topic=<topic>

Method: POST
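A producer call against this endpoint might look like the following sketch, which only builds the request and does not send it. The version string `v1`, the JSON body and the `Content-Type` header are assumptions; the slide specifies only the URL shape and the POST method.

```python
import json
import urllib.parse
import urllib.request


def build_produce_request(base_url: str, version: str, topic: str, event: dict):
    """Build a POST to <base_url>/produce/<version>?topic=<topic>."""
    query = urllib.parse.urlencode({"topic": topic})
    url = f"{base_url}/produce/{version}?{query}"
    body = json.dumps(event).encode("utf-8")  # body format is an assumption
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_produce_request("http://localhost", "v1", "clicks", {"campaign": 42})

# Sending is a single call, left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.status)
```

The producer names only a topic; partition choice and Kafka version stay hidden behind the Collector, which matches the abstraction the slide describes.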

Pull & Push Consumers

Virtual Network A

Kafka Brokers

Broker

P1

Collector

Collector

Virtual Network B

Pull

• RESTful API with SSL support

• Works for out-of-network consumers

• Supports metadata and data operations

• Implements simple consumer APIs

• Spark streaming receiver for Kafka REST

Push

• Configurable push to destinations like HDFS, Cosmos, Kafka.

• Utilizes KafkaNet - .NET High Level Consumer (https://github.com/Microsoft/Kafkanet)
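The idea of a configurable push can be sketched as a routing table from topics to destinations. Everything here is hypothetical: the config shape, the sink names and the in-memory sinks stand in for real HDFS or Kafka writers, which the slides do not describe.

```python
from typing import Callable, Dict, List

Sink = Callable[[str, bytes], None]

# In-memory stand-ins for real destinations, so the sketch runs anywhere.
delivered: Dict[str, List[bytes]] = {"hdfs": [], "kafka": []}

SINKS: Dict[str, Sink] = {
    "hdfs": lambda topic, message: delivered["hdfs"].append(message),
    "kafka": lambda topic, message: delivered["kafka"].append(message),
}

# Hypothetical configuration: which destinations each topic is pushed to.
PUSH_CONFIG: Dict[str, List[str]] = {
    "clicks": ["hdfs", "kafka"],
    "errors": ["hdfs"],
}


def push(topic: str, message: bytes,
         config: Dict[str, List[str]] = PUSH_CONFIG,
         sinks: Dict[str, Sink] = SINKS) -> int:
    """Deliver a message to every destination configured for its topic."""
    destinations = config.get(topic, [])
    for name in destinations:
        sinks[name](topic, message)
    return len(destinations)


push("clicks", b"c1")
push("errors", b"e1")
```

Adding a subscriber then becomes a configuration change rather than a code change.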

High Level Consumer

Monitoring using Canary

Collector

Kafka Brokers

Broker

Collector

Services Data Push

Synthetic message

Audit Trail

Canary - https://github.com/Microsoft/Availability-Monitor-for-Kafka
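The canary idea (publish a synthetic message, then check that it arrives and how long it took) can be sketched against an in-memory stand-in for the pipeline. This is not the API of the Availability-Monitor-for-Kafka project; `InMemoryPipeline`, `run_canary` and the message fields are hypothetical names for illustration.

```python
import time
import uuid


class InMemoryPipeline:
    """Hypothetical stand-in for collector -> Kafka -> consumer."""

    def __init__(self):
        self._messages = []

    def produce(self, message: dict) -> None:
        self._messages.append(message)

    def consume_all(self) -> list:
        drained, self._messages = self._messages, []
        return drained


def run_canary(pipeline, clock=time.monotonic) -> dict:
    """Send one synthetic message and report availability and latency."""
    probe = {"canary_id": str(uuid.uuid4()), "sent_at": clock()}
    pipeline.produce(probe)
    for message in pipeline.consume_all():
        if message.get("canary_id") == probe["canary_id"]:
            return {"available": True, "latency_sec": clock() - message["sent_at"]}
    # The probe never came out the other end.
    return {"available": False, "latency_sec": None}


result = run_canary(InMemoryPipeline())
print(result["available"])
```

Running such a probe on a schedule from each datacenter gives an availability and end-to-end latency signal that does not depend on real traffic.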

High Level Consumer

Collector

Kafka Brokers

Broker

Collector

Services Data Push

Audit Trail

Sampled vs Full Auditing support

Data completeness – Audit Trail
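One way to read the audit trail is as a reconciliation of counts per AuditId between what was produced and what was consumed. The sketch below assumes that reading. The function names and the hash-based sampling rule are illustrative choices, not the mechanism Siphon uses.

```python
import zlib
from collections import Counter
from typing import Dict, Iterable


def completeness(produced_audit_ids: Iterable[str],
                 consumed_audit_ids: Iterable[str]) -> Dict[str, float]:
    """Fraction of produced messages seen by the consumer, per audit id."""
    produced = Counter(produced_audit_ids)
    consumed = Counter(consumed_audit_ids)
    # min() guards against duplicates pushing the ratio above 1.0.
    return {
        audit_id: min(consumed[audit_id], sent) / sent
        for audit_id, sent in produced.items()
    }


def is_audited(message_id: str, sample_every: int = 1) -> bool:
    """Deterministic sampling: producer and consumer pick the same messages.

    sample_every=1 audits every message (full auditing); larger values
    audit roughly one message in `sample_every` (sampled auditing).
    """
    return zlib.crc32(message_id.encode("utf-8")) % sample_every == 0


report = completeness(
    produced_audit_ids=["a", "a", "a", "b", "b"],
    consumed_audit_ids=["a", "a", "b", "b"],
)
print(report)  # one message with audit id "a" went missing
```

Sampling trades audit precision for lower overhead, which is presumably why both modes are offered.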

Production Experience – Telemetry Charts

• Monitoring using ELK

• E2E Latency

• Data Completeness

• Processing Lag

• EPS breakdown by data center.
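The slides do not define how processing lag is measured. A common offset-based definition for Kafka consumers is the gap between the newest offset in each partition and the consumer's committed offset, sketched here with made-up numbers. The function name and sample offsets are hypothetical.

```python
from typing import Dict


def partition_lag(log_end_offsets: Dict[int, int],
                  committed_offsets: Dict[int, int]) -> Dict[int, int]:
    """Messages not yet processed, per partition.

    A partition with no committed offset is treated as fully unread.
    """
    return {
        partition: max(end - committed_offsets.get(partition, 0), 0)
        for partition, end in log_end_offsets.items()
    }


lag = partition_lag(
    log_end_offsets={0: 1500, 1: 900, 2: 40},
    committed_offsets={0: 1450, 1: 900},
)
total_lag = sum(lag.values())
print(lag, total_lag)
```

A time-based variant (age of the oldest unprocessed message) is equally plausible for the chart; the offset form is shown only because it needs no timestamps.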

Key Takeaways

• Scale out with Kafka (50K -> 1M -> multi-million Events Per sec)

• Ability to build tunable Auditing/Monitoring

• Producer/Consumer RESTful API provides a nice abstraction

• Config driven Pub/Sub system
