the open analytics platform - github pages · title knime_intro_bwiswedel_22oct2014 author:...

The Open Analytics Platform

Bernd Wiswedel

KNIME.com AG

Copyright © 2014 KNIME.com AG

Agenda

• KNIME.com AG

• The KNIME Platform

• Recognition

• Small Sales Pitch

• KNIME and R – the best of two worlds

• KNIME (Node) Development

A Brief History of KNIME

• 2004: KNIME development commences

• 2006: KNIME v1 released

• 2006: Spin-off in Konstanz, Germany

• 2008: KNIME moves to Zurich

• 2010: Enterprise products released

• 2011: KNIME.com AG founded

• 2013: KNIME opens San Francisco office

• 2014: KNIME opens Berlin office

“KNIME saved my life in a world of scripts that I do not want to learn!”

The KNIME Platform

Data Loading

KNIME loads and integrates data from diverse data sources:

• Different databases

• Various file formats (CSV, XML, SDF, etc.)

ETL

KNIME provides a huge repository of modules for easy-to-use, modular

• Data preprocessing

• Data fusion

• Data transformation

Data Mining

In addition to standard data mining techniques, KNIME adds cutting-edge data analysis algorithms (thanks to its academic roots).

Visualization

Interactive views provide data overviews and insights into the learned models.

Interactive linking and brushing techniques allow for powerful exploration of models and data.

External Tools

Due to its open API and “node-in-a-sandbox” approach, additional (also external) tools are easily integrated, e.g.

• Access to the R Project (statistical analysis/visualizations)

• Complete integration of the machine learning library WEKA

• Application-area-specific integrations, e.g. CDK (Chemistry Development Kit), RDKit, ImageJ, …

KNIME is Eclipse-based: integrating other Eclipse projects such as BIRT, DTP, etc. provides even more functionality.

Over 1000 native and embedded nodes included:

• Statistics; Data Mining; Machine Learning; Web Analytics; Text Mining; Network Analysis; Social Media Analysis; WEKA; R; Community / 3rd

• MySQL, Oracle, etc.; SAS, SPSS, etc.; Excel, Flat, etc.; Hive etc.; XML, PMML; Text, Doc, Image; Web Crawlers; Industry Specific; Community / 3rd

• ETL; Row, Column; Matrix; Text, Image; Time Series; Java; Python; Community / 3rd

• R; JFreeChart; Community / 3rd

• via BIRT; PMML; XML; Databases; Excel, Flat, etc.; Hive etc.; Text, Doc, Image; Industry Specific; Community / 3rd

3rd Party Tools

Commercial partners integrate their proprietary tools

⇒ KNIME serves as an integration platform for tools of various vendors (or your in-house/legacy applications)

Small KNIME Demo

Who’s Using KNIME?

• >25,000 individuals using KNIME

• >3,000 organizations using KNIME

• >300 customers paying for KNIME

(as of January 2014)

[Chart: Annual Unique Downloads (Open Source Users), 2011 to 2013; axis ticks at 20k, 40k, 60k]

Broad Range of KNIME Application Areas

• Advanced Analytics

• Pharma

• Health Care

• Manufacturing

• Finance

• Retail

• Customer Intelligence

Top in User Satisfaction

2012 & 2013 Rexer Analytics Survey

Sales Pitch: The KNIME Server at Work

KNIME in Action: Big Data

“As long as your machine can handle it, KNIME will play along.”

KNIME and Big Data

KNIME and R

The best of two worlds

Why use KNIME and R?

R

• Powerful statistics

• (B)leading edge algorithms

• Powerful/flexible graphics

• Widely accepted language

KNIME

• Powerful GUI

• Good Extract/Transform/Load

• Integrates diverse tools

• Enterprise grade solutions

R and KNIME

• Open source analytics

• Cross platform

• Vibrant communities

Two Integrations

• Community (RServe Integration)

• R Interactive (Today's topic)

Overview of new nodes

• Different input and output options

• Grey ports enable workspace branching

The Interactive Editor

[Screenshot of the interactive editor; labeled areas: Columns, Variables, Code Editor, Workspace Overview, Console]

Templates

[Screenshot of the templates view; labeled areas: List, Preview]

Summary

R Source nodes

• Get data from an R data frame

• Assign output to a data frame named knime.out

• Use with foreign, RCurl, or ...

R Snippet nodes

• Generic data manipulation

• Edit tables or workspaces

• Derive knime.out from knime.in

• Use for cumulative stats, plyr, or ...

R Data Mining nodes

• Use R models in KNIME

• Learner (knime.model) & Predictor motif

• R to PMML support for model portability

R View nodes

• Generic R plots

• plot(knime.in)

• Use with many packages including ggplot2

R in Action: Choropleth Generation

R in Action: Dose Response modeling

A Peek under the Hood:

KNIME (Node) Development

KNIME Analytics Platform: Technology Overview

[Architecture diagram, top to bottom:]

• KNIME Workflow Manager & User Interface

• Node groups, each attached through a Node Interface with Data Mgmt & Execution Ctrl: KNIME I/O; KNIME Native Algorithms; Open Source Integrations (R, BIRT, …); Partner Extensions; Community Extensions

• KNIME Data Management and Execution Layer: Execution Control, Meta Data Handling, Data Management

• Cluster Execution; Multi Core Execution; Distributed Data Storage; Distributed Execution; In Memory Data Handling; Automatic Data Caching; Data Type Extensions

Node Architecture

• KNIME interacts only with a Node

• Node takes care of embedding the node in the infrastructure

• New nodes implement Model/View/Dialog

[Class diagram: class Node (final); class NodeDialogPane (abstract); class NodeView (abstract); class NodeModel (abstract); class NodeFactory (abstract)]
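The split described on the Node Architecture slide can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the KNIME API: every `Sketch*` type, every method name and signature, and the `UpperCaseNodeFactory` example are hypothetical stand-ins chosen only to show a final wrapper class that talks to abstract model, view and dialog parts obtained from an abstract factory.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only: the Sketch* types are hypothetical stand-ins for the
// classes named on the slide and do NOT reproduce the real KNIME API.
public class NodeArchitectureSketch {
    public static void main(String[] args) {
        // The platform side only ever holds a SketchNode, never the parts behind it.
        SketchNode node = new SketchNode(new UpperCaseNodeFactory());
        System.out.println(node.settings());
        System.out.println(node.run(Arrays.asList("a", "b")));
    }
}

/** Assumed role: the computation of a node. */
abstract class SketchNodeModel {
    abstract List<String> execute(List<String> inputRows);
}

/** Assumed role: a display of the model's output. */
abstract class SketchNodeView {
    abstract String render(List<String> outputRows);
}

/** Assumed role: the user-editable settings of a node. */
abstract class SketchNodeDialogPane {
    abstract String describeSettings();
}

/** Assumed role: bundles the three parts so the wrapper can create them. */
abstract class SketchNodeFactory {
    abstract SketchNodeModel createModel();
    abstract SketchNodeView createView();
    abstract SketchNodeDialogPane createDialogPane();
}

/** Final wrapper that embeds model, view and dialog in the infrastructure. */
final class SketchNode {
    private final SketchNodeModel model;
    private final SketchNodeView view;
    private final SketchNodeDialogPane dialog;

    SketchNode(SketchNodeFactory factory) {
        this.model = factory.createModel();
        this.view = factory.createView();
        this.dialog = factory.createDialogPane();
    }

    /** Executes the model and passes its output to the view. */
    String run(List<String> inputRows) {
        return view.render(model.execute(inputRows));
    }

    String settings() {
        return dialog.describeSettings();
    }
}

/** Example of a "new node": it only supplies a model, a view and a dialog. */
final class UpperCaseNodeFactory extends SketchNodeFactory {
    @Override
    SketchNodeModel createModel() {
        return new SketchNodeModel() {
            @Override
            List<String> execute(List<String> inputRows) {
                // Upper-case every input row.
                return inputRows.stream().map(String::toUpperCase).collect(Collectors.toList());
            }
        };
    }

    @Override
    SketchNodeView createView() {
        return new SketchNodeView() {
            @Override
            String render(List<String> outputRows) {
                return String.join(", ", outputRows);
            }
        };
    }

    @Override
    SketchNodeDialogPane createDialogPane() {
        return new SketchNodeDialogPane() {
            @Override
            String describeSettings() {
                return "no settings";
            }
        };
    }
}
```

In this sketch a node author writes only the factory and its three parts; the wrapper stays closed for extension, which is one plausible reading of why the slide marks Node as final and the other classes as abstract.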

Node Extension Wizard

• Included in the KNIME Developer Version

• Allows creation of plugin projects including

functioning KNIME nodes (with sample code)

• Helpful to easily create all node classes

– Generates all Java classes

– Node is registered with the plugin project

– Launch KNIME and enjoy the new node working!

Node Extension Wizard

• Specify all settings to create a new KNIME node

– In a completely new plugin project, or

– Into an existing project

• Node type: Sink, Source, Learner, Predictor, Manipulator, Visualizer, Meta, or Other

• Include sample code or not

Node Extension Wizard

• Contains all Java classes (including sample code)

• Node is registered in the plugin.xml

• NodeDialog and NodeView classes are also created and registered with the NodeFactory

Resources

• KNIME pages (www.knime.org)

– APPLICATIONS for example workflows

– LEARNING HUB under RESOURCES (www.knime.org/learning-hub)

• KNIME Tech pages (tech.knime.org)

– FORUM for questions and answers

– DOCUMENTATION for documentation, FAQ, changelogs, ...

– LABS where to find new experimental nodes

– COMMUNITY CONTRIBUTIONS for development instructions and third party nodes

• KNIME TV channel on

Thank you
