Spark Data Streaming Pipeline

Post on 16-Apr-2017

Category: Software

Spark DSM

Data Streaming Pipeline

ORCHESTRATING DATA STORAGE, PROCESSING, AND MOVEMENT

Background

Today’s data landscape for enterprises continues to grow exponentially in volume, variety, and complexity.

Multiple geographic locations, on-premises and cloud
Combination of open source, commercial solutions and custom processing code
  Can be expensive, hard to integrate and maintain
Ever-increasing volumes of data (terabytes, petabytes)
New ways of processing data (Hadoop, Spark, etc.)

.NET developers write large amounts of custom point-solution logic
  Difficult to maintain and orchestrate
  Performance bottlenecks

SparkPipe Framework

A development framework to deliver a .NET information production system that co-ordinates all of this data and processing.

Familiar technologies for .NET developers, including:
  .NET Framework 4.0
  Windows Workflow Foundation
  Task Parallel Library Dataflow

Drag and drop business process pipeline modeling

Designed for performance to scale across processor cores and servers, from the local data center to cloud providers such as Microsoft Azure

Build Solutions

Build data-driven workflows (pipelines) that join, aggregate and transform data sourced from on-premises, cloud-based, and internet data stores.

Transform semi-structured, unstructured and structured data from diverse data sources into trusted information.

Produce data that can be easily consumed by business intelligence (BI) and analytics tools, and by other applications.

Set up complex data processing through simple composition.
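
The join, aggregate and transform steps described above can be illustrated with a minimal, language-neutral sketch. It is written in Python rather than .NET and does not use the SparkPipe API, which these slides do not show; the helper names (`join`, `aggregate`, `transform`, `compose`) and the sample records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from functools import reduce


def join(left, right, key):
    """Inner-join two lists of dicts on a shared key."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def aggregate(rows, group_key, value_key):
    """Sum value_key for each distinct group_key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


def transform(totals):
    """Shape the totals into flat records, sorted by region."""
    return [{"region": k, "total": round(v, 2)} for k, v in sorted(totals.items())]


def compose(*stages):
    """Chain stages left to right into a single callable pipeline."""
    return lambda data: reduce(lambda acc, stage: stage(acc), stages, data)


# Hypothetical sample data standing in for two separate data stores.
orders = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 80.5},
    {"customer_id": 1, "amount": 30.0},
]
customers = [
    {"customer_id": 1, "region": "East"},
    {"customer_id": 2, "region": "West"},
]

pipeline = compose(
    lambda o: join(o, customers, "customer_id"),
    lambda rows: aggregate(rows, "region", "amount"),
    transform,
)
report = pipeline(orders)
print(report)
```

Each stage is an ordinary function, so a pipeline is just an ordered list of stages; that is one plausible reading of "simple composition", not a description of how the framework itself is built.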

Visual Pipeline Design

Built for “Cloud Scale”

Support for Microsoft Azure offerings, including:
  Azure SQL Server
  HDInsight (Hadoop)
  Blob, Tables, Queues and Service Bus

Automatically spin up cloud servers, process data, and then shut them down for cost-effective processing.
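
The spin-up, process, shut-down lifecycle can be sketched as a scoped resource. This is a generic Python illustration, not the framework's mechanism: `temporary_workers`, `fake_provision` and `fake_release` are hypothetical names, and the stand-in functions only record events instead of calling a real cloud API.

```python
from contextlib import contextmanager


@contextmanager
def temporary_workers(count, provision, release):
    """Provision `count` workers, hand them to the caller, always release them."""
    workers = []
    try:
        for i in range(count):
            workers.append(provision(i))
        yield workers
    finally:
        # Runs on success and on failure, so no server is left running.
        for worker in workers:
            release(worker)


# Hypothetical stand-ins for real provisioning calls; they only log events.
events = []


def fake_provision(i):
    events.append(f"start worker-{i}")
    return f"worker-{i}"


def fake_release(name):
    events.append(f"stop {name}")


with temporary_workers(2, fake_provision, fake_release) as workers:
    results = [f"{w} processed batch" for w in workers]
    events.append("processing done")

print(events)
```

Tying shutdown to the end of the processing block means servers are billed only while work is running, including when a processing step raises an error.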

Support for Healthcare

Out-of-the-box components include:
  HL7 v2
  Clinical Document Architecture
  EDI 834
  PGP Encryption
  Secure FTP

Typical Process Flow
