what's new in ibm streams v4.1

22
© 2015 IBM Corporation What’s New With Streams Mike Spicer Lead Architect IBM Streams

Upload: lisanl

Post on 07-Jan-2017

417 views

Category:

Data & Analytics


4 download

TRANSCRIPT

Page 1: What's New in IBM Streams V4.1

© 2015 IBM Corporation

What’s New With Streams

Mike Spicer

Lead Architect IBM Streams

Page 2: What's New in IBM Streams V4.1

• IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal

without notice at IBM’s sole discretion.

• Information regarding potential future products is intended to outline our general product direction

and it should not be relied on in making a purchasing decision.

• The information mentioned regarding potential future products is not a commitment, promise, or

legal obligation to deliver any material, code or functionality. Information about potential future

products may not be incorporated into any contract.

• The development, release, and timing of any future features or functionality described for our

products remains at our sole discretion.

Performance is based on measurements and projections using standard IBM benchmarks in a

controlled environment. The actual throughput or performance that any user will experience will vary

depending upon many factors, including considerations such as the amount of multiprogramming in the

user’s job stream, the I/O configuration, the storage configuration, and the workload processed.

Therefore, no assurance can be given that an individual user will achieve results similar to those stated

here.

Please Note:

2

Page 3: What's New in IBM Streams V4.1

Streams Releases in 2015

2

Streams V4.0• Excel user access

• Automated High Availability

• Next Generation Architecture

• Multi instance management

• Guaranteed processing

• Toolkits enhancements

March

Page 4: What's New in IBM Streams V4.1

Streams Releases in 2015

3

Streams V4.0• Excel user access

• Automated High Availability

• Next Generation Architecture

• Multi instance management

• Guaranteed processing

• Toolkits enhancements

March

July

Streams V4.0.1• Resource Manager support

including YARN

• Toolkit enhancements

• Guaranteed Processing

• Geospatial toolkit

Page 5: What's New in IBM Streams V4.1

Streams Releases in 2015

4

Streams V4.0• Excel user access

• Automated High Availability

• Next Generation Architecture

• Multi instance management

• Guaranteed processing

• Toolkits enhancements

March

July

Streams V4.0.1• Resource Manager support

including YARN

• Toolkit enhancements

• Guaranteed Processing

• Geospatial toolkit

October

Streaming Analytics Service• Fully managed BlueMix Service

• Create an instance in seconds

• Easily deploy and manage applications

• Analytics & Connectors

• Messaging, stores, geospatial,

timeseries, SPSS and more

Page 6: What's New in IBM Streams V4.1

Streams Releases in 2015

5

Streams V4.0• Excel user access

• Automated High Availability

• Next Generation Architecture

• Multi instance management

• Guaranteed processing

• Toolkits enhancements

March

July

Streams V4.1• Data Lineage and Governance

• Java application API

• Customizable & application dashboards

• Support for Spark MLLib

• New DPS (Redis) & CyberSecurity Toolkits

• Stronger and Simpler security

• Platform Symphony Support

• New Platforms including Docker

Streams V4.0.1• Resource Manager support

including YARN

• Toolkit enhancements

• Guaranteed Processing

• Geospatial toolkit

November

October

Streaming Analytics Service• Fully managed BlueMix Service

• Create an instance in seconds

• Easily deploy and manage applications

• Analytics & connectors

• Messaging, stores, geospatial,

timeseries, SPSS and more

Page 7: What's New in IBM Streams V4.1

What’s New in V4.1

Page 8: What's New in IBM Streams V4.1

IBM Streams V4.1 Overview

• Simpler Development

Generating connectors and schemas based on Governance catalog

Java Application API to develop Streams apps entirely in Java

• Simpler Management

Customizable & application centric dashboards

Lineage and Governance of Streams applications

• Spark integration and new analytics & toolkits

Support for using Spark MLLib analytics in Streams applications

New DPS (Redis) & CyberSecurity Toolkits

• Stronger and Simpler security

• New Resource Managers & Platforms

• Improved Guaranteed processing

7

Page 9: What's New in IBM Streams V4.1

Java Application API

• Allow Java developers to create IBM Streams applications in Java

Natural for Java developer, No SPL knowledge required

• Note: This is not a replacement for SPL, its an additional way to develop streaming applications in the Streams ecosystem

Open Source https://github.com/IBMStreams/streamsx.topology

• Functional Style with operations on Streams and Windows

First class objects are TStream and TWindow

• SPL has operator invocation as its first class object

• Utilize existing features & analytics

Parallel Regions, Checkpointing, PE and Host Placement (with host tags)

Support for SPL Streams, Composites and Toolkits

• Executes in embedded and distributed modes

Deployable to Bluemix streaming analytic service

• Dynamic connections with pub/sub mechanism

• Supports Scala through the Java API with implicit conversions

https://github.com/IBMStreams/streamsx.topology/wiki/Scala-Support

8

Page 10: What's New in IBM Streams V4.1

Building & DeployingJava Streaming Applications

1. Declare Topology

Page 11: What's New in IBM Streams V4.1

Building & DeployingJava Streaming Applications

1. Declare Topology

2. Submit Topology

Page 12: What's New in IBM Streams V4.1

Lineage and Governance - Asset Discovery

• IBM Governance Catalog asset discovery in Streams studio

• Drag and drop assets from the catalog directly onto the graph

• Code generation based on assets available properties

• Connection info, Schemas etc.

11

Page 13: What's New in IBM Streams V4.1

Lineage and Governance – Asset Registration

• Registration of new assets in the Catalog from Streams studio

• Automatic asset and flow registration of Streams applications

• Streams instance properties to mark instance as governed

• Applications automatically registered, source, sink operators connected to the underlying catalog assets they use

12

Source and sink operators registered as Input and Output Streams in application

Flows registered to the files and directories used

Details about the application execution

Page 14: What's New in IBM Streams V4.1

Lineage and Governance – Data Lineage

• Streams application data lineage available in Catalog

13

By Application to see data source and targets for a Streams application

By data source/target to see who uses or modifies specific data

By Stream Import & export to see which streams applications connect to each other.

Page 15: What's New in IBM Streams V4.1

Support for SparkMLLib

• Toolkit provides the ability to use SparkMLLib models on Streams

On Spark use SparkMLib to create and save a model

On Streams SparkMLLib operators load the model and perform scoring on the stream

Open Source https://github.com/IBMStreams/streamsx.sparkMLLib

• Operators/Models supported

Classification : Linear SVM and Naïve Bayes

Clustering : Clustering KMeans

Collaborative filtering : Collaborative Filtering ALS

Regression : Isotonic, Linear and Logistic Regression

Tree : Decision tree, Ensemble Gradient Boosted Trees and Ensemble Random Forest

14

Page 16: What's New in IBM Streams V4.1

• Spark

Toward a better Hadoop ecosystem

Better data access & programming language support

• Spark Streaming

Time-based batch processing

Simple integration with Spark

• Streams

Complements both capabilities

Interoperate via Kafka & HDFS

Can Score Spark MLlib models faster (as well as SPSS, R, PMML and Matlab models)

Enterprise integration

Low latency

Event processing

Extended analytics

Spark and Streams: Better Together

Batch

Processing

Stream

ProcessingMicro

Batch

IBM InfoSphere

Streams

Apache Kafka

Streams

Apache Spark

Data Producersand Consumers

Page 17: What's New in IBM Streams V4.1

New Toolkits & Accelerators

• Cybersecurity Accelerator

Provides composites for accelerating development of cybersecurityapplications

• Sources to read packet data

• Models for analyzing DNS traffic :

– DomainProfiling: Detects if any domains are behaving suspiciously

– HostProfiling: Detects if individual hosts are behaving suspiciously

– PredictiveBlacklisting: Predicts if a domain should be added to an internal blacklist

• Distributed Process Store Toolkit

Provides the ability to manage application state using Redis key value store

• State can be shared across PEs, applications and hosts

– Based on open Source https://github.com/IBMStreams/streamsx.dps

– Store API: Managing stores and CRUD operations on KV data

– Time to Live API: A global store where KV data expires after a specified time

– Distributed Locking API: Managing and using distributed locks

16

Page 18: What's New in IBM Streams V4.1

Other Features

• Customizable & Application Centric Dashboard

Create custom dashboards for the data you need, in the format you want

Application dashboard focuses on application management

• Health summary, streams graph, flow rate chart, resource load, PE scatter chart

• Improved Guaranteed processing

Messaging toolkit support

Incremental checkpointing

SPL operator exception handling

• IBM Platform Symphony Resource Manager Support

Supports allocation and release of resources managed by Platform Symphony

• Security Enhancements

Authentication via PKI certificates and pluggable authentication modules

• X.509 certificates and Java Authentication and Authorization Service (JAAS) modules

LDAP Authorization without Anonymous Bind

REST API support for Cross-Origin Resource Sharing (CORS)

• Support for Docker, Power8 Little Endian & RHEL 7

17

Page 19: What's New in IBM Streams V4.1

Streaming Analytics Service on Bluemix

• The capabilities of IBM Streams on the cloud

• Fully managed BlueMix Service

Multi-tenant, Highly available and elastic service

Create an instance in seconds

Easily deploy and manage applications

Grow and Shrink resources available

• Integration with other blue mix services

Internet of Things, Cloudant, Message Hub,BigInsights HDFS

More coming soon

• Analytics & connectors

HTTP, Web Sockets, TCP, FTP, Apache Kafka, MQTT, JMS, and XMS (MQ)

Geospatial, TimeSeries, SPSS, R, CEP, Text and Telco

18

Page 20: What's New in IBM Streams V4.1

19

Notices and Disclaimers

Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form

without written permission from IBM.

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.

Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for

accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to

update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO

EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO,

LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted

according to the terms and conditions of the agreements under which they are provided.

Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.

Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as

illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other

results in other operating environments may vary.

References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services

available in all countries in which IBM operates or does business.

Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the

views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or

other guidance or advice to any individual participant or their specific situation.

It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the

identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the

customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will

ensure that the customer is in compliance with any law.

Page 21: What's New in IBM Streams V4.1

20

Notices and Disclaimers (con’t)

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly

available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance,

compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the

suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to

interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT

LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights,

trademarks or other intellectual property right.

• IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DB2® , DOORS®, Emptoris®, Enterprise

Document Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™,

IBM SmartCloud®, IBM Social Business®, IMS™, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®,

OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®,

pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®,

StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are

trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names

might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark

information" at: www.ibm.com/legal/copytrade.shtml.

Page 22: What's New in IBM Streams V4.1

© 2015 IBM Corporation

Thank You