Visualize. Explore. Transform. Anaconda Mosaic
TRANSCRIPT
© 2015 Continuum Analytics- Confidential & Proprietary
Lance Ransom, Anaconda Mosaic Product Manager
Christine Doig, Senior Data Scientist
Agenda
• Anaconda Overview
• What is Anaconda Mosaic?
• Flat File Repositories
• Heterogeneous Data
• Interactive Data Visualization
• Summary
• Q&A
Anaconda is…
Leading Open Data Science Platform
Powered by Python, the fastest-growing data science language
• Accelerate Time-to-Value
• Connect Data, Analytics & Compute
• Empower Data Science Teams
ACCELERATE Time-to-Value
• INNOVATE faster through managed agile experimentation
• MOVE from analysis to deployment immediately
• DELIVER high-performance analytics processing
CONNECT Data, Analytics & Compute
• LEVERAGE innovative open source analytics to extract value from data
• MAXIMIZE your computational power to easily analyze all your data
• CONNECT and integrate all your data sources for predictive models
EMPOWER Data Science Teams
• ITERATE quickly to create powerful analysis and predictive models
• COLLABORATE and share with your data science team
• PUBLISH interactive results to the business
Data Science Team
Data Scientist, Biz Analyst, Data Engineer, Developer, DevOps
Explore & Analyze, Collaborate & Publish, Deploy & Operate
Anaconda Mosaic
• Create PORTABLE transformations
• Interactively EXPLORE heterogeneous data
• Easily ANALYZE large flat file repositories
• ELIMINATE data movement and redundant storage
• CATALOG datasets and transformations
• ESTABLISH data lineage
Built on Anaconda Open Source Technology
ANALYTICS: Bokeh, Blaze
DATA: DBs, Excel, Flat file repositories
Loss of Data Visibility
[Diagram: data scattered across many SQL, CSV, REST, and JSON sources]
Poor data visibility inhibits valuable insights
Heterogeneous Environments
[Diagram: Storage, ETL, and Analysis layers spanning Oracle, MySQL, MSSQL, KDB, ZIP, CSV, SQL, Python, DSL, R, Excel, C++, Java, and REST]
Compounding Data Challenges
• Accumulates Quickly
• Disparate Storage, Different Vendors
• Format Changes
• Ad-hoc Usage
• Urgent!
Demo 1: Flat file repositories
• COMBINE individual CSV files
• ELIMINATE data movement and redundant storage
• INCORPORATE names into dataset
• OPTIMIZE computations
Flat file repositories
source: "lux://global-equities/data/daily/us/nasdaq stocks"
extractor: "{}/{Symbol}.{Region}.txt"
Date,Open,High,Low,Close,Volume,OpenInt,Symbol,Region
20151111,18.5,25.9,18,24.5,1584600,0,aaap,us
20151112,24.25,27.12,22.5,25,83000,0,aaap,us
20151113,25.47,26.2,24.55,25.26,67300,0,aaap,us
…
20160322,11.56,11.98,10.8894,11.09,517604,0,zyne,us
20160323,11.3,11.72,9.5,9.75,489743,0,zyne,us
20160324,9.5,10.24,9.22,9.64,188512,0,zyne,us
One dataset with ~5.5 million rows
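The idea behind this demo, combining per-symbol files into one dataset while pulling Symbol and Region out of each file name, can be illustrated with a minimal Python sketch. This is not Mosaic's implementation. The regular expression is an assumed translation of the "{Symbol}.{Region}.txt" extractor pattern, the helper names combine_flat_files and write_sample_repository are hypothetical, and the sketch assumes the individual files do not already contain Symbol and Region columns. The two sample rows are taken from the listing on this slide.

```python
import csv
import re
import tempfile
from pathlib import Path

# Assumed regex equivalent of the slide's "{Symbol}.{Region}.txt" pattern.
FILENAME_PATTERN = re.compile(r"(?P<Symbol>[^.]+)\.(?P<Region>[^.]+)\.txt$")


def combine_flat_files(directory):
    """Yield one dict per CSV row, adding fields parsed from each file name."""
    for path in sorted(Path(directory).glob("*.txt")):
        match = FILENAME_PATTERN.match(path.name)
        if match is None:
            continue  # skip files that do not follow the naming convention
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                row.update(match.groupdict())  # adds Symbol and Region
                yield row


def write_sample_repository(directory):
    """Create two tiny per-symbol files so the sketch runs on its own."""
    header = "Date,Open,High,Low,Close,Volume,OpenInt\n"
    samples = {
        "aaap.us.txt": "20151111,18.5,25.9,18,24.5,1584600,0\n",
        "zyne.us.txt": "20160324,9.5,10.24,9.22,9.64,188512,0\n",
    }
    for name, body in samples.items():
        Path(directory, name).write_text(header + body)


with tempfile.TemporaryDirectory() as repo:
    write_sample_repository(repo)
    combined = list(combine_flat_files(repo))

for row in combined:
    print(row["Date"], row["Close"], row["Symbol"], row["Region"])
```

Because combine_flat_files is a generator, rows are read file by file as they are consumed, so the source files are never copied into a second store.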
Demo 2: Heterogeneous Data
• COMPOSE expressions independent of storage system
• PUSH transformations to the data
• Lazily EVALUATE
• FAMILIAR Pandas-like API
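The points above can be sketched in a few lines of Python. This is a toy illustration only, not the Mosaic or Blaze API: the classes Table, Filter, and Mean and the functions run_in_python and run_in_sqlite are hypothetical names invented for this example. One expression is composed without touching any data, then evaluated against an in-memory list and, separately, compiled to SQL so the filter and average run inside SQLite. The sample values come from the stock listing shown in Demo 1.

```python
import operator
import sqlite3

OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


class Table:
    """A named table symbol; it holds no data."""

    def __init__(self, name):
        self.name = name

    def where(self, column, op, value):
        if op not in OPS:
            raise ValueError(f"unsupported operator: {op}")
        return Filter(self, column, op, value)


class Filter:
    """A deferred row filter; nothing is computed when it is created."""

    def __init__(self, table, column, op, value):
        self.table, self.column, self.op, self.value = table, column, op, value

    def mean(self, column):
        return Mean(self, column)


class Mean:
    """A deferred average over the rows kept by a Filter."""

    def __init__(self, source, column):
        self.source, self.column = source, column


def run_in_python(expr, tables):
    """Evaluate the expression against in-memory lists of dicts."""
    flt = expr.source
    rows = tables[flt.table.name]
    kept = [r[expr.column] for r in rows if OPS[flt.op](r[flt.column], flt.value)]
    return sum(kept) / len(kept)


def run_in_sqlite(expr, connection):
    """Compile the same expression to SQL so the database does the work."""
    flt = expr.source
    sql = (
        f'SELECT AVG("{expr.column}") FROM "{flt.table.name}" '
        f'WHERE "{flt.column}" {flt.op} ?'
    )
    return connection.execute(sql, (flt.value,)).fetchone()[0]


# Compose once; no data has been read at this point.
expr = Table("stocks").where("Volume", ">", 70000).mean("Close")

rows = [
    {"Symbol": "aaap", "Close": 24.5, "Volume": 1584600},
    {"Symbol": "aaap", "Close": 25.0, "Volume": 83000},
    {"Symbol": "aaap", "Close": 25.26, "Volume": 67300},
]

python_result = run_in_python(expr, {"stocks": rows})

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE stocks (Symbol TEXT, Close REAL, Volume INTEGER)")
connection.executemany("INSERT INTO stocks VALUES (:Symbol, :Close, :Volume)", rows)
sqlite_result = run_in_sqlite(expr, connection)

print(python_result, sqlite_result)
```

The expression object stays the same for both backends; only the evaluator changes, which is the sense in which the expression is independent of the storage system.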
Mosaic Ecosystem
Expression, Compute, Data
[Diagram: Storage, ETL, and Analysis layers spanning Oracle, MySQL, MSSQL, KDB, ZIP, CSV, SQL, Python, DSL, R, Excel, C++, Java, and REST]
Demo 3: Interactive Data Visualization
• EXPLORE datasets visually
• Easily CHANGE plot types
• MOVE around and zoom in or out
• PLOT large datasets with DataShader
Learn more about datashader: http://go.continuum.io/datashader/
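One general way to plot very large datasets is to aggregate points onto a fixed-size grid first, so the cost of drawing depends on the number of pixels rather than the number of points. The sketch below illustrates that idea in plain Python. It does not use datashader or reproduce its API, and the function name rasterize, the grid size, and the random sample data are all assumptions made for illustration.

```python
import random


def rasterize(points, x_range, y_range, width, height):
    """Count how many points fall into each cell of a width x height grid."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # outside the current viewport, e.g. after zooming in
        col = min(int((x - x_min) / (x_max - x_min) * width), width - 1)
        row = min(int((y - y_min) / (y_max - y_min) * height), height - 1)
        grid[row][col] += 1
    return grid


random.seed(0)
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100_000)]
grid = rasterize(points, (-4, 4), (-4, 4), width=40, height=20)

# Crude text rendering: denser cells get darker characters.
shades = " .:*#"
peak = max(max(row) for row in grid)
for row in reversed(grid):  # print with y increasing upward
    print("".join(shades[count * len(shades) // (peak + 1)] for count in row))
```

Zooming corresponds to calling rasterize again with narrower x_range and y_range, which re-aggregates only the points inside the new viewport.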
Why Anaconda Mosaic?
EMPOWER Data Science Teams
ACCELERATE Time-to-Value
CONNECT Data, Analytics & Compute
Anaconda Mosaic is available with Anaconda Enterprise Subscription
https://www.continuum.io/anaconda-subscriptions
Next steps
• Demo: Request a private demo for your team
• Documentation: Review Anaconda Mosaic Documentation
• Learn more: Learn more about all the features in Anaconda Enterprise subscriptions
www.continuum.io/anaconda-subscriptions
docs.continuum.io/anaconda/mosaic/
[email protected]