Multi User Data Science with Zeppelin
Post on 16-Apr-2017
Vinay Shukla
Twitter: @neomythos
Feb 17th, 2016
Multi User Data Science with Zeppelin
Page 2 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Disclaimer
This document may contain product features and technology directions that are under development, may be under development in the future, or may ultimately not be developed.
Project capabilities are based on information that is publicly available within the Apache Software Foundation project websites ("Apache"). Progress of the project capabilities can be tracked from inception to release through Apache; however, technical feasibility, market demand, user feedback and the overarching Apache Software Foundation community development process can all affect timing and final delivery.
This document’s description of these features and technology directions does not represent a contractual commitment, promise or obligation from Hortonworks to deliver these features in any generally available product.
Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Since this document contains an outline of general product development plans, customers should not rely upon it when making purchasing decisions.
Introducing Apache Zeppelin
Web-based Notebook for interactive analytics
Features
• Ad-hoc experimentation
  – Spark, Hive, Shell, Flink, Tajo, Ignite, Lens, etc.
• Deeply integrated with Spark + Hadoop
  – Can be managed via Ambari Stacks
• Supports multiple language backends
  – Pluggable “Interpreters”
• Incubating at Apache
  – 100% open source and open community
Use Case
• Data exploration and discovery
• Visualization
  – tables, graphs and charts
• Interactive snippet-at-a-time experience
• Collaboration and publishing
• “Modern Data Science Studio”
Apache Zeppelin
PySpark / Spark SQL
Spark & Zeppelin Pace of Innovation
[Timeline figure; labels as recovered from the slide]
HDP GA releases: HDP 2.2.4 (Spark 1.2.1), HDP 2.3.0 (Spark 1.3.1), HDP 2.3.2 (Spark 1.4.1), HDP 2.3.4 (Spark 1.5.2*), HDP 2.4.0 (Spark 1.6), HDP 2.5.x (Spark 1.6.1*)
Spark tech previews (TP): Spark 1.3.1 TP, 5/2015; Spark 1.4.1 TP, 8/2015; Spark 1.5.1 TP, Nov/2015; Spark 1.6 TP, Jan/2015
Apache Zeppelin: Zeppelin TP, Oct/2015; Zeppelin TP Refresh, March 1st 2016; Zeppelin GA, Q1, 2016
Other markers on the timeline: Dec 2015; Now; March 1st 2016; Q1, 2016
What’s New in HDP 2.4.0?
• Spark 1.6 GA
  – GA of Dynamic Resource Allocation*
• Zeppelin TP#2
  – Notebook import/export features
  – LDAP Authentication*
Marketing announcement coming March 1st
Requirements for Zeppelin in a Multi-Tenant (M/T) Environment
• Support multiple users
• Security – Provide security sandbox by default
• Authentication – LDAP – Integrate with Corporate Identity Store
• Authorization – Access Control for both Data & Notebooks
• Encryption – Work with both wire encryption & encrypted data
• Audit – Keep track of who did what, when & with what results, with non-repudiation
• Manageability
• Sharing/Collaboration of both data & notebooks
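The authorization and audit requirements above can be illustrated with a minimal Python sketch. Everything in it is hypothetical (the `Notebook` record, the `AuditLog` class, and `check_access` are illustration names, not Zeppelin APIs). It assumes a simple owner/reader model and uses a hash-chained log, which makes tampering detectable but is weaker than true non-repudiation (that would need per-user signatures).

```python
import hashlib
import json
import time
from dataclasses import dataclass, field


@dataclass
class Notebook:
    """Hypothetical notebook record; not Zeppelin's actual data model."""
    name: str
    owners: set = field(default_factory=set)
    readers: set = field(default_factory=set)


def _digest(body):
    # Hash a canonical JSON rendering so key order does not matter.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditLog:
    """Append-only log; each entry's hash also covers the previous entry's hash."""

    def __init__(self):
        self.entries = []

    def record(self, user, action, notebook, allowed, when=None):
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        body = {"user": user, "action": action, "notebook": notebook,
                "allowed": allowed, "when": time.time() if when is None else when,
                "prev": prev_hash}
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self):
        """Return True if every entry still matches its hash and its predecessor."""
        prev_hash = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True


def check_access(user, action, notebook, log):
    """Owners may read and write; readers may only read. Every decision is logged."""
    if action == "write":
        allowed = user in notebook.owners
    elif action == "read":
        allowed = user in notebook.owners or user in notebook.readers
    else:
        allowed = False
    log.record(user, action, notebook.name, allowed)
    return allowed
```

Logging denied requests as well as granted ones is a deliberate choice here: the audit requirement asks who did what and with what result, which includes failed attempts.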
Zeppelin GA – Features
• Ambari Managed Install/Configuration
• Runs in a Kerberos Cluster
• LDAP Authentication
• SSL
• Notebook Import/Export
Coming April, 2016
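As a rough illustration of what notebook import/export involves, the sketch below round-trips a note through JSON. It assumes a note is a name plus an ordered list of paragraphs of text. The field names `name`, `paragraphs`, and `text` and both helper functions are assumptions made for this example, not the confirmed Zeppelin export format.

```python
import json


def export_note(name, paragraphs):
    """Serialize a note (a name plus ordered paragraph texts) to a JSON string."""
    note = {"name": name, "paragraphs": [{"text": p} for p in paragraphs]}
    return json.dumps(note, indent=2)


def import_note(payload):
    """Parse an exported note, rejecting payloads that lack the expected fields."""
    note = json.loads(payload)
    if not isinstance(note, dict) or not isinstance(note.get("name"), str):
        raise ValueError("note is missing a 'name' string")
    paragraphs = note.get("paragraphs")
    if not isinstance(paragraphs, list):
        raise ValueError("note is missing a 'paragraphs' list")
    return note["name"], [p["text"] for p in paragraphs]


# Usage: export on one instance, import on another.
payload = export_note("Sales exploration", ["%pyspark\nprint(sc.version)", "%sql\nshow tables"])
name, paragraphs = import_note(payload)
```

Validating on import matters in a multi-user setting because the payload comes from outside the instance that loads it.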
Zeppelin Missing Features
• R Interpreter
• Better Visualizations
  – GGPlot, Shiny equivalent visualizations
• Access Control on Notebooks
• Library Management
What is coming later? – H2, 2016
• Zeppelin Improvements
  – Zeppelin Access Control
  – Ambari managed LDAP Configuration
  – Pluggable Visualization
  – R Interpreter
Various Apache Zeppelin JIRA/Pull Requests
– Identity Propagation: https://issues.apache.org/jira/browse/ZEPPELIN-645
– LDAP Authentication: https://github.com/apache/incubator-zeppelin/pull/625
– Notebook Access Control: https://github.com/apache/incubator-zeppelin/pull/681
– Notebook Import/Export: https://issues.apache.org/jira/browse/ZEPPELIN-372
– R Interpreter: https://issues.apache.org/jira/browse/ZEPPELIN-156
Thank You
Twitter: @neomythos