webinar: data streaming with apache kafka & mongodb
Post on 07-Jan-2017
Data Streaming with Apache Kafka & MongoDB
Andrew Morgan – MongoDB Product Marketing
David Tucker – Director, Partner Engineering and Alliances at Confluent
8th November 2016
Agenda
Target Audience
Apache Kafka
MongoDB
Integrating MongoDB and Kafka
Kafka – What’s Next
Next Steps
Target Audience
Apache Kafka / Confluent Platform
What does Kafka do?
Producers
Consumers
Kafka Connect
Kafka Connect
Topic
Your interfaces to the world
Connected to your systems in real time
What is Streaming Data
Synchronous Req/Response: 0 – 100s ms
Near Real Time: > 100s ms
Offline Batch: > 1 hour
KAFKA (Stream Data Platform): Search, RDBMS, Apps, Monitoring, Real-time Analytics, NoSQL, Stream Processing
HADOOP (Data Lake): Impala, DWH, Hive, Spark, Map-Reduce
Confluent’s Offerings
Core: Kafka, Connect, Streams, Java Client
Confluent Platform: More Clients, REST Proxy, Schema Registry, Pre-Built Connectors
Confluent Platform Enterprise: Multi-data-center Replication, Advanced Data Balancing, Stream Monitoring, Connector Management
Confluent Platform: It’s Kafka ++
Feature and benefit, compared across Apache Kafka, Confluent Open Source and Confluent Enterprise:
• Apache Kafka: High throughput, low latency, high availability, secure distributed message system
• Kafka Connect: Advanced framework for connecting external sources/destinations into Kafka
• Kafka Streams: Simple library that enables streaming application development within the Kafka framework
• Additional Clients: Supports non-Java clients; C, C++, Python, etc.
• REST Proxy: Provides universal access to Kafka from any network connected device via HTTP
• Schema Registry: Central registry for the format of Kafka data – guarantees all data is always consumable
• Pre-Built Connectors: HDFS, JDBC, Elasticsearch and other connectors fully certified and fully supported by Confluent
• Confluent Control Center: Enables easy connector management and stream monitoring
• Data Center & Cloud: MDC Replication, auto-data balancing
• Support: Enterprise class support to keep your Kafka environment running at top performance (Apache Kafka: Community; Confluent Open Source: Community; Confluent Enterprise: 24x7x365)
Apache Kafka: Free; Confluent Open Source: Free; Confluent Enterprise: Subscription
Common Kafka Use Cases
Data transport and integration
• Log data
• Database changes
• Sensors and device data
• Monitoring streams
• Call data records
• Stock ticker data
Real-time stream processing
• Monitoring
• Asynchronous applications
• Fraud and security
Kafka Adoption in Large Enterprises
6 of the top 10 travel companies
8 of the top 10 insurance companies
7 of the top 10 global banks
9 of the top 10 telecom companies
People Using Kafka Today
Financial Services
Entertainment & Media
Consumer Tech
Travel & Leisure
Enterprise Tech
Telecom
Retail
MongoDB
Relational
• Expressive Query Language & Secondary Indexes
• Strong Consistency
• Enterprise Management & Integrations
The World Has Changed
Data, Risk, Time, Cost
NoSQL
• Scalability & Performance
• Always On, Global Deployments
• Flexibility
Nexus Architecture
• Scalability & Performance
• Always On, Global Deployments
• Flexibility
• Expressive Query Language & Secondary Indexes
• Strong Consistency
• Enterprise Management & Integrations
Integrating MongoDB and Kafka
Where MongoDB Fits
[Diagram: producers write to Topic A and Topic B; each topic passes through a Filter and the results are merged into Topic C; Topic C is analyzed into Topic D, which drives Take Action. Alongside the pipeline, the Operational Database handles Store Results, Key Events and Reference Data.]
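The Filter, Merge, Analyze and Store Results steps in the diagram can be sketched without a broker. This is a minimal illustration only: plain Python lists stand in for Kafka topics and for the operational database, and the event fields (`ts`, `sensor`, `temp`), the threshold of 50 and all function names are hypothetical, not taken from the webinar.

```python
from statistics import mean

def filter_topic(events, min_temp):
    """Keep only events at or above min_temp (the 'Filter' step)."""
    return [e for e in events if e["temp"] >= min_temp]

def merge_topics(*topics):
    """Combine several streams into one, ordered by timestamp (the 'Merge' step)."""
    return sorted((e for t in topics for e in t), key=lambda e: e["ts"])

def analyze(events):
    """Summarize the merged stream per sensor (the 'Analyze' step)."""
    by_sensor = {}
    for e in events:
        by_sensor.setdefault(e["sensor"], []).append(e["temp"])
    return [
        {"sensor": s, "avg_temp": mean(temps), "count": len(temps)}
        for s, temps in sorted(by_sensor.items())
    ]

# Lists standing in for Topic A and Topic B.
topic_a = [{"ts": 1, "sensor": "a1", "temp": 20}, {"ts": 3, "sensor": "a1", "temp": 80}]
topic_b = [{"ts": 2, "sensor": "b1", "temp": 90}, {"ts": 4, "sensor": "b1", "temp": 10}]

topic_c = merge_topics(filter_topic(topic_a, 50), filter_topic(topic_b, 50))
topic_d = analyze(topic_c)

# 'Store Results': a list standing in for the operational database.
operational_db = []
operational_db.extend(topic_d)
```

In a real deployment each list would be a Kafka topic and the final step a write to MongoDB; the sketch only shows how the stages hand data to one another.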
Where K-Streams Fits
[Diagram: producers write to Topic A and Topic B; Kafka Streams processes them into Topic C and analyzes Topic C into Topic D, which drives Take Action. Alongside the pipeline, the Operational Database handles Store Results, Key Events and Reference Data.]
MongoDB As a Kafka Producer
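One way for MongoDB to act as a producer is to tail the oplog and publish each change, the approach covered by the oplog and change-data-capture links in this deck. The sketch below shows only the mapping step, under stated assumptions: the entry is a dict using the oplog field names `op`, `ns`, `o` and `o2`, the `_id` values are JSON-serializable, and the function name, the `mongo.` topic prefix and the message layout are hypothetical. Reading the oplog and sending to Kafka are left out.

```python
import json

# Oplog operation codes that represent data changes.
OP_NAMES = {"i": "insert", "u": "update", "d": "delete"}

def oplog_to_record(entry, topic_prefix="mongo."):
    """Map one oplog-style entry to (topic, key, value), or None if it is not a data change."""
    op = OP_NAMES.get(entry.get("op"))
    if op is None:
        return None  # e.g. 'n' (no-op) or 'c' (command) entries are skipped
    # Updates carry the target _id in 'o2'; inserts and deletes carry it in 'o'.
    id_source = entry["o2"] if op == "update" else entry["o"]
    topic = topic_prefix + entry["ns"]
    key = json.dumps(id_source["_id"]).encode("utf-8")
    value = json.dumps(
        {"op": op, "ns": entry["ns"], "doc": entry["o"]}, sort_keys=True
    ).encode("utf-8")
    return topic, key, value

insert_entry = {"op": "i", "ns": "shop.orders", "o": {"_id": 7, "total": 12.5}}
update_entry = {"op": "u", "ns": "shop.orders", "o": {"$set": {"total": 15}}, "o2": {"_id": 7}}

insert_record = oplog_to_record(insert_entry)
update_record = oplog_to_record(update_entry)
```

Keying each message by `_id` is a design choice in this sketch: with Kafka's default partitioning, messages with the same key go to the same partition, so changes to one document stay in order.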
Design Pattern: Operationalized Data Lake
[Diagram: Sensors, User Data, Clickstreams and Logs feed a Message Queue; Raw Data and Processed Events flow on to the data stores, with Distributed Processing Frameworks and Kafka Streams doing the processing. Outputs include Churn Analysis, Enriched Customer Profiles, Risk Modeling and Predictive Analytics, serving Customer Data Mgmt, Mobile App, IoT App and Live Dashboards.]
Real-Time Access: Millisecond latency. Expressive querying & flexible indexing against subsets of data. Updates in place. In-database aggregations & transformations
Batch Processing, Batch Views: Multi-minute latency with scans across TB/PB of data. No indexes. Data stored in 128MB blocks. Write-once-read-many & append-only storage model
• Configure where to land incoming data
• Raw data processed to generate analytics models
• MongoDB exposes analytics models to operational apps. Handles real time updates
• Compute new models against MongoDB & HDFS
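The first step of this pattern, configuring where incoming data lands, might look roughly like the following. It is a sketch under stated assumptions rather than a real consumer: a dict stands in for the operational database (updates in place), a list stands in for the append-only batch store, and the topic names, the `ROUTES` table and the `land` function are all hypothetical.

```python
import json

# Hypothetical routing table: topic name -> destinations for its messages.
ROUTES = {
    "raw-events": ("batch",),
    "processed-events": ("operational", "batch"),
}

def land(topic, value, operational, batch):
    """Decode one JSON message and write it to each destination configured for its topic."""
    doc = json.loads(value)
    for dest in ROUTES.get(topic, ()):
        if dest == "operational":
            # Update in place: the latest document per _id replaces the previous one.
            operational[doc["_id"]] = doc
        elif dest == "batch":
            # Append-only: every message is kept for later batch processing.
            batch.append(doc)

operational, batch = {}, []
land("raw-events", b'{"_id": "s1", "temp": 20}', operational, batch)
land("processed-events", b'{"_id": "s1", "avg": 20.0}', operational, batch)
land("processed-events", b'{"_id": "s1", "avg": 22.5}', operational, batch)
```

After these three calls the operational store holds only the latest processed document for `s1`, while the batch store keeps all three messages, which mirrors the contrast between the two storage models described above.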
https://www.mongodb.com/presentations/replacing-traditional-technologies-mongodb-single-platform-all-financial-data-ahl
http://www.slideshare.net/danharvey/change-data-capture-with-mongodb-and-kafka
Kafka – What’s Next
Kafka Connectors
• Confluent-supported connectors (included in CP)
• Community-written connectors (just a sampling)
JDBC
Kafka Futures
• Apache Core
  • Admin API (KIP-4)
  • Exactly-once delivery semantics
  • Time-based topic indexing
• Kafka Streams
  • Exactly-once processing semantics
  • Interactive Queries: enable real-time sharing of application state with other applications
• Confluent Platform Enterprise
  • Multi-cluster views and expanded alerting added to Control Center
Next Steps
MongoDB Atlas
Database as a service for MongoDB
MongoDB Atlas is…
• Automated: The easiest way to build, launch, and scale apps on MongoDB
• Flexible: The only database as a service with all you need for modern applications
• Secured: Multiple levels of security available to give you peace of mind
• Scalable: Deliver massive scalability with zero downtime as you grow
• Highly available: Your deployments are fault-tolerant and self-healing by default
• High performance: The performance you need for your most demanding workloads
MongoDB Atlas Features
Run for You
• Spin up a cluster in minutes
• Replicated & always-on deployments
• Fully elastic: scale out or up in a few clicks with zero downtime
• Automatic patches & simplified upgrades for the newest MongoDB features
Safe & Secure
• Authenticated & encrypted
• Continuous backup with point-in-time recovery
• Fine-grained monitoring & custom alerts
No Lock-In
• On-demand pricing model; billed by the hour
• Multi-cloud support (AWS available with others coming soon)
• Part of a suite of products & services designed for all phases of your app; migrate easily to different environments (private cloud, on-prem, etc.) when needed
MongoDB Enterprise Advanced
Tooling
• MongoDB Ops Manager or MongoDB Cloud Manager Premium
• MongoDB Compass
• MongoDB Connector for BI
• Cloud Foundry Integration
Security
• Encrypted Storage Engine
• LDAP / Kerberos Integration
• DDL & DML Auditing
• FIPS 140-2 Support
Support
• 24 x 7 Support
• 1 hr SLA
• Emergency Patches
• Customer Success Program
• On-Demand Training
License
• Commercial License
Resources
• Data Streaming with Apache Kafka & MongoDB
  https://www.mongodb.com/collateral/data-streaming-with-apache-kafka-and-mongodb
• Implementing a Kafka Consumer for MongoDB
  https://www.mongodb.com/blog/post/mongodb-and-data-streaming-implementing-a-mongodb-kafka-consumer
• Tailing the Oplog on a sharded MongoDB Cluster
  https://www.mongodb.com/blog/post/tailing-mongodb-oplog-sharded-clusters
Old Billingsgate, London
15th November
mongodb.com/europe
Use my discount code for 20% off: andrewmorgan20