Data Science With Spotfire for Opening Government Data for Innovators and Entrepreneurs
Dr. Brand Niemann
Director and Senior Enterprise Architect – Data Scientist
Semantic Community
http://semanticommunity.info/
AOL Government Blogger
http://gov.aol.com/bloggers/brand-niemann/
February 18, 2012



TIBCO Spotfire 4.0

http://spotfire.tibco.com/

The Top 5 Reasons Why Spotfire Analytics is Better and Smarter:
• Clarity of Visualization
• Freedom of Spreadsheets
• Relevance of Applications
• Confidence of Statistics
• Reach of Reports


TIBCO Silver Spotfire Features Matrix

https://silverspotfire.tibco.com/us/get-spotfire/feature-matrix

Free


TIBCO Silver Spotfire Tutorials

https://silverspotfire.tibco.com/us/tutorials


My Silver Spotfire: Data Science Library in the Cloud

I will show you examples of how I built these later.

Federal Budget 2013 in a day!


Federal Budget 2013 Dashboard

PC Desktop Spotfire


The Value Proposition of Spotfire
• More to Do with Less?
– Take Control of Your Business Data
– Visualize Your Data - Drag and Drop Your Spreadsheets
– Customize Your Dashboards - Instantly Add New Visualization
– Share Your Insights - Publish Your Dashboard
– Get Trial of Silver Spotfire
• Agile Analysis:
– Fastest to Actionable Insight
– Insight Into the Unknown
– Self-Service Discovery
– Universal Analytics Platform

Source: http://www.gartner.com/technology/reprints.do?id=1-196U0P5&ct=120207&st=sg
Source: https://silverspotfire.tibco.com/us/home


The Value Proposition of Agile Analysis - Invert your "bath tub" with Spotfire Analytics

Spotfire offers dimension-free data exploration, data mashups, predictive and event-driven analysis, contextual collaboration, and enterprise-class technology.
Source: Jim Hawley, Spotfire Federal Government


Spotfire is Part of TIBCO

Video Presentation


The Value Proposition of Data Science

• We are interested in learning about Taxonomy and Enterprise Vocabulary for fundamental architectural elements to enable interoperability and provide consistent understanding of shared architecture information across the enterprise.
– Source: Walt Okon, Senior Architect Engineer, Enterprise Architecture & Standards, Department of Defense Chief Information Officer, October 4th, Email.
• Aneesh Chopra: Government’s Big Data Opportunity. “The Federal Government needs Data Science and Data Scientists!”
– Source: O’Reilly STRATA Conference New York, September 20, 2011.


What is Data Science?
• Data science enables the creation of data products.
• Data science is a holistic approach.
• The first step of any data analysis project is “data conditioning,” or getting the data into a state where it is usable.
• Statistics is the “grammar of data science.”
• Edward Tufte’s Visual Display of Quantitative Information is a foundational text for anyone practicing data science. He calls himself a data scientist!
• Data scientists are patient, inherently interdisciplinary, and can think outside the box.
• Some References:
– Data Science Graduate Class at RPI, Troy, NY
– Data Science
– AOL Government


Data Science Architecture

• 1. Create an inventory of documents and data sets.
• 2. Build that inventory in an Excel spreadsheet so it supports faceted search in a Spotfire dashboard.
• 3. Provide a sample knowledgebase of each of the four types of documents (Word, PDF, PowerPoint, and Excel).
• 4. Provide the multiple sample knowledgebases in a Spotfire dashboard so they can be seen, compared, merged, harmonized, sorted, searched, downloaded, and shared on mobile devices (e.g. iPad).
• 5. Scale the previous architectural pattern with more content volume and types if necessary.
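
Steps 1 and 2 above can be sketched in code. This is a minimal illustration only, assuming the inventory is a flat table with one row per document and that CSV text stands in for the Excel spreadsheet. The column names (`title`, `doc_type`, `url`) and the sample entries are hypothetical and do not come from the original slides.

```python
import csv
import io
from pathlib import PurePosixPath

# Map file extensions to the four document types named on the slide.
DOC_TYPES = {
    ".doc": "Word", ".docx": "Word",
    ".pdf": "PDF",
    ".ppt": "PowerPoint", ".pptx": "PowerPoint",
    ".xls": "Excel", ".xlsx": "Excel",
}


def build_inventory(entries):
    """Return one row per (title, url) pair, adding a doc_type facet column."""
    rows = []
    for title, url in entries:
        suffix = PurePosixPath(url).suffix.lower()
        rows.append({
            "title": title,
            "doc_type": DOC_TYPES.get(suffix, "Other"),
            "url": url,
        })
    return rows


def inventory_to_csv(rows):
    """Serialize the inventory as CSV text (a stand-in for the Excel sheet)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "doc_type", "url"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample entries, for illustration only.
sample = [
    ("Budget overview", "http://example.org/files/overview.pdf"),
    ("Summary tables", "http://example.org/files/tables.xlsx"),
]
inventory = build_inventory(sample)
csv_text = inventory_to_csv(inventory)
```

Keeping one column per facet (here `doc_type`) is what lets a dashboard filter on it later; more facet columns can be added the same way.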


Knowledgebase

• What is a knowledgebase?
– Knowledgebase = Model + Instances
– Model = Vocabulary, Taxonomy, and Ontology/Rules
– Instances = Linked Data Semantically Linked to the Model
• How is a knowledgebase built?
– Model = Vocabulary – Glossary in MindTouch
– Taxonomy – Contents and Resources in MindTouch
– Ontology/Rules in Be Informed 4
– Instances = Linked Data Semantically Linked to Model – MindTouch, Excel and Spotfire
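
The definition "Knowledgebase = Model + Instances" can be pictured as a small data structure. The sketch below is one possible reading, not the structure used by MindTouch, Be Informed, or Spotfire. The class and field names are hypothetical, and "semantically linked" is reduced here to the assumption that each instance names a term from the model's vocabulary.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    vocabulary: dict = field(default_factory=dict)  # term -> definition
    taxonomy: dict = field(default_factory=dict)    # term -> parent term
    rules: list = field(default_factory=list)       # free-text ontology rules


@dataclass
class Instance:
    label: str
    term: str  # vocabulary term this instance is linked to
    url: str


@dataclass
class Knowledgebase:
    model: Model
    instances: list = field(default_factory=list)

    def add_instance(self, instance):
        """Accept an instance only if its term exists in the vocabulary."""
        if instance.term not in self.model.vocabulary:
            raise ValueError(f"unknown term: {instance.term}")
        self.instances.append(instance)

    def instances_for(self, term):
        """Return every instance linked to the given vocabulary term."""
        return [i for i in self.instances if i.term == term]


# Illustrative content; the term and its definition are assumptions.
kb = Knowledgebase(Model(vocabulary={"Budget": "A plan of revenue and spending"}))
kb.add_instance(Instance(
    label="Fiscal Year 2013 budget page",
    term="Budget",
    url="http://semanticommunity.info/Budget_of_the_United_States_Government_Fiscal_Year_2013",
))
```

Rejecting instances whose term is missing from the vocabulary keeps the instances tied to the model, which is the property the slide emphasizes.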


The Knowledgebase in MindTouch

• MindTouch is often referred to as the “Swiss Army Knife” of collaboration tools! See MindTouch Web Site.

• So I make MindTouch look like a “Knowledge Hub” (e.g., on top of SharePoint Portal like the Army Corps of Engineers Knowledge Hub) and feature key documents and data sets.

• Relating one or more Spotfire dashboards to the key documents and data sets makes it possible to track progress. It’s all about metrics!


MindTouch Social Knowledge Base – Social Help Center

http://www.mindtouch.com/solutions/knowledge_base

MindTouch provides exceptional, purpose-built social help desks and knowledge bases for some of the world’s largest and most respected technology and media brands. Our solutions layer social and collaborative capabilities over existing systems and deliver strategic value to our customers. Product help is strategic for user assistance teams, product and marketing teams, community managers, and product evangelists as they look to build engaged communities around their brands to increase top and bottom line revenues.


Army Corps of Engineers Knowledge Hub

• The Knowledge Hub is a dynamic online destination to feature products developed by the US Army Corps of Engineers as well as to engage end-users and others in innovative and intuitive interaction. Within the Knowledge Hub is a Navigation Community which provides a forum on which navigation personnel can discuss, share, learn, explore and search products, projects and programs of concern to them. One goal of the Hub is to be a web-based framework for enterprise decision support and tech transfer within the Corps of Engineers.
– POC: Marty Kittrell, [email protected].

Source: http://chl.erdc.usace.army.mil/Media/1/2/2/0/Nav_eNews_Mar-2011.pdf


MindTouch Knowledgebase

http://semanticommunity.info/Budget_of_the_United_States_Government_Fiscal_Year_2013

AOL Government Story
Spotfire Dashboard
Research Notes (Metadata)
Complete Budget Document
Attachments (see next slide)
Comments (see next slide)


MindTouch Knowledgebase

http://semanticommunity.info/Budget_of_the_United_States_Government_Fiscal_Year_2013


Data Science is Part of My System of Systems Architecture

Semantic Index of Linked Data (e.g. Excel)

Dynamic Case Management (e.g. Be Informed)

Data Science Library (e.g. Spotfire)

Data Science Products (e.g. Spotfire)


Agile Methods: Questions on Our Minds
• What Should We Do with Enterprise Architecture?
– Be like a building architect who provides a blueprint with building specifications and a scale (able) model.
• How Should We Do That?
– With Be Informed, an internationally operating, independent software vendor that has been recognized recently by Gartner and Forrester.
• What is Be Structured?
– It is complementary to various well-known development, compliance and architecture frameworks, including ITIL, Cobit, Prince II, RUP, TOGAF, Zachman, SCRUM, Cogniam, DEMO, and Pronto. Note: See my tutorials.


Working Within A Broader Context
• Begin with the End in Mind (see Next Slide):
– Open Innovator's Toolkit
• President Obama emphasizes a “bottom-up” philosophy that taps citizen expertise to make government smarter and more responsive to private sector demands. This philosophy of “open innovation” has already delivered tangible results in public and regulated sectors of the economy – areas like health IT, learning technologies, and smart grid – that are poised to deliver productivity growth and grow the jobs of the future. We have surfaced new or improved policy tools deployed by our government to achieve them. We’ve posted the Open Innovator’s Toolkit as a roster of 20 leading practices that an “open innovator” should consider when confronting any policy challenge – at any level of government. Our aspiration is to build upon this list, adding new tools and case studies to form an evidence base that will help to scale “open innovation” across the public sector.
• Follow 5 Easy Steps:
– 1. Build a table of contents-like index of complex documents with well-defined web addresses in MindTouch.
– 2. Build that index in an Excel spreadsheet so it supports faceted search in a Spotfire dashboard.
– 3. Build a Spotfire knowledgebase with that Excel spreadsheet.
– 4. Build multiple knowledgebases in a Spotfire dashboard so they can be seen, compared, merged, harmonized, sorted, searched, downloaded, and shared on mobile devices (e.g. iPad).
– 5. Scale the previous architectural pattern with more content volume and types if necessary.
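
"Faceted search" in step 2 means narrowing the index by picking values in one or more columns. The sketch below shows that idea in plain Python rather than in Spotfire, assuming the index is a list of rows with one column per facet. The column names and sample rows are hypothetical.

```python
from collections import Counter


def facet_counts(rows, facet):
    """Count how many rows carry each value of one facet column."""
    return Counter(row[facet] for row in rows)


def faceted_filter(rows, **selected):
    """Keep the rows whose facet columns match every selected value."""
    return [
        row for row in rows
        if all(row.get(name) == value for name, value in selected.items())
    ]


# Hypothetical index rows, for illustration only.
index = [
    {"title": "Item 1", "category": "Policy", "source": "MindTouch"},
    {"title": "Item 2", "category": "Data", "source": "Excel"},
    {"title": "Item 3", "category": "Policy", "source": "Excel"},
]

counts = facet_counts(index, "category")
matches = faceted_filter(index, category="Policy", source="Excel")
```

The counts are what a filter panel would display next to each facet value, and the filtered rows are what the remaining visualizations would show.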


Open Government Initiative: Opening Data For Innovators and Entrepreneurs

http://www.whitehouse.gov/open/toolkit

Our aspiration is to build upon this list, adding new tools and case studies to form an evidence base that will help to scale “open innovation” across the public sector.


Step 1. Build a table of contents-like index of complex documents with well-defined web addresses in MindTouch.

http://semanticommunity.info/AOL_Government/Open_Innovator's_Toolkit-Taking_the_Challenge
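
A "well-defined web address" for each section can be sketched as the page address plus an anchor. The example below assumes the anchor is the section heading with spaces replaced by underscores, a pattern inferred from the `#Data_Table_(Excel)` address that appears in this deck. MindTouch may encode other characters differently, and the second heading is hypothetical.

```python
def section_url(page_url, heading):
    """Build an address for one section: spaces in the heading become underscores."""
    anchor = heading.strip().replace(" ", "_")
    return f"{page_url}#{anchor}"


def build_index(page_url, headings):
    """Return table-of-contents rows as (position, heading, web address)."""
    return [
        (number, heading, section_url(page_url, heading))
        for number, heading in enumerate(headings, start=1)
    ]


PAGE = (
    "http://semanticommunity.info/AOL_Government/"
    "Open_Innovator's_Toolkit-Taking_the_Challenge"
)
# "Data Table (Excel)" matches an anchor shown in the deck;
# "Research Notes" is a hypothetical heading.
toc = build_index(PAGE, ["Data Table (Excel)", "Research Notes"])
```

Each row of `toc` can then become one row of the spreadsheet index, with the address column giving a direct link back to the source section.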


2. Build that index in an Excel spreadsheet so it supports faceted search in a Spotfire dashboard.

http://semanticommunity.info/AOL_Government/Open_Innovator's_Toolkit-Taking_the_Challenge#Data_Table_(Excel)

Note: This MindTouch table copies directly to Excel in the next slide.
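
The slides do this step by copy and paste. For readers who would rather script it, the sketch below shows one way to turn an HTML table into spreadsheet rows using only the Python standard library. It assumes a simple table without nested tables or merged cells, and the sample markup is hypothetical rather than taken from the MindTouch page.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_rows(html):
    """Parse HTML text and return the table as a list of rows of cell text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Write the rows as CSV text, which Excel can open directly."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Hypothetical markup, for illustration only.
sample_html = (
    "<table><tr><th>Practice</th><th>Category</th></tr>"
    "<tr><td>Prizes, challenges</td><td>Policy</td></tr></table>"
)
table_rows = html_table_rows(sample_html)
table_csv = rows_to_csv(table_rows)
```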


2. Build that index in an Excel spreadsheet so it supports faceted search in a Spotfire dashboard.

http://semanticommunity.info/@api/deki/files/17378/=OpenInnovator'sToolkitTable02182012.xlsx


3. Build a Spotfire knowledgebase with that Excel spreadsheet.

PC Desktop Spotfire


3. Build a Spotfire knowledgebase with that Excel spreadsheet.

• Building Steps:
– 1 - Drag and Drop Spreadsheet Onto Spotfire (see Scatter Plot automatically).
– 2 - Add New Table to Display Spreadsheet Data (make any adjustments/corrections and Refresh Data).
– 3 - Adjust Scatter Plot Axes, Color by, Shape by, Size by to produce desired display.
– 4 - Add New Text Area, Rename Page as Dashboard and Add MindTouch and Excel with Web links to sources of metadata and data.
– 5 - Insert Action Controls to Reset All Filters and Unmark Marked Rows.
– 6 - Save Spotfire file to hard drive with desired name and then save to Library.
– 7 - Test Web Player version and embed in MindTouch.


Follow 5 Easy Steps

• Step 4. Build multiple knowledgebases in a Spotfire dashboard so they can be seen, compared, merged, harmonized, sorted, searched, downloaded, and shared on mobile devices (e.g. iPad).
– Another example: How To Simplify Benefits Website For Veterans (AOL Government, MindTouch, Excel, Spotfire, and PowerPoint Tutorial).
• Step 5. Scale the previous architectural pattern with more content volume and types if necessary.
– My Silver Spotfire Library in the Cloud!


New Features in Spotfire 4.0

http://stn.spotfire.com/stn/Site/News40.aspx


New Features in Spotfire 4.0
• Information At A Glance:
– Dynamic Values
– Conditional Icons
– Sparklines
– Graphical Summary Table
• Look and Feel:
– All New Graphical Profile
– Pop-Over Filter Panel
– Pop-Over Legend
– Individual Control Over Axis Label Visibility
– More Control Over Legend Contents and Placement
– Fixed Size Layout
– Mix Filters and Controls on the Page
– Nicer Looking Tables
– Combine Different Slices of Data on the Same Page
– Toolbars and Information


New Features in Spotfire 4.0 (Continued)
• Navigation and Interaction:
– Actions
– Page History Navigation
– Embed Interactive Controls
• Building Dashboards:
– Preserve Information When Switching Visualizations
– Change All Fonts in One Place
– Easier Access to Toggling Visualization Features
– More Predefined Categorical Coloring Schemes
– Manage Document Color Schemes
– Better Defaults When Creating Visualizations
– Toggle Auto Column Additions Off
– Analysis Previews
– Control Over Table Header Font


New Features in Spotfire 4.0

• Collaboration:
– Share with TIBBR
– Add TIBBR Discussions to the Analysis
– Embed Dashboards in Other Web Pages
• Other Enhancements:
– Export Footer
– Stepped Linecharts
– Automation Services 4.0


New Features in Spotfire 4.0
• Information At A Glance:
– Dynamic Values:
• What – Dynamically display single values in text areas that respond to filtering and parameter changes.
• Why – Look at most important numbers first before diving into more details.
• My Note: See Next Slides.
– Conditional Icons:
• What – Dynamically calculated conditional icons that respond to filtering and parameter changes.
• Why – Indicate change, comparisons to target and highlight important events.
• My Note: See Next Slides.
– Sparklines:
• What – Dynamically calculated sparklines that respond to filtering and parameter changes.
• Why – Show at a glance and when drilling in whether a metric is trending down, up or varies a lot.
• My Note: See Next Slides.
– Graphical Summary Table:
• What – Dynamic values, conditional icons and sparklines in one compact table broken down by some category.
• Why – Visually show everything you need on a single screen.
• My Note: See Next Slides.
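
The three building blocks above can be illustrated outside Spotfire. The sketch below is a plain Python approximation of the concepts, not Spotfire's implementation: a dynamic value recomputed from the rows that pass a filter, a conditional icon chosen by comparison with a target, and a text sparkline. The category name "Debt Service" echoes a filter used in this deck; the other category, all amounts, and the target are made up.

```python
BARS = "▁▂▃▄▅▆▇█"


def dynamic_value(rows, column, **filters):
    """Sum one column over the rows that pass the current filters."""
    return sum(
        row[column] for row in rows
        if all(row[name] == value for name, value in filters.items())
    )


def conditional_icon(value, target):
    """Pick an icon by comparing a value with its target."""
    if value > target:
        return "▲"
    if value < target:
        return "▼"
    return "●"


def sparkline(values):
    """Render a non-empty series as one bar character per point."""
    low, high = min(values), max(values)
    span = (high - low) or 1  # avoid dividing by zero for a flat series
    top = len(BARS) - 1
    return "".join(BARS[round((v - low) / span * top)] for v in values)


# Made-up figures, for illustration only.
rows = [
    {"category": "Debt Service", "year": 2011, "amount": 10},
    {"category": "Debt Service", "year": 2012, "amount": 15},
    {"category": "Other", "year": 2012, "amount": 40},
]

total = dynamic_value(rows, "amount", category="Debt Service")
icon = conditional_icon(total, target=20)
trend = sparkline([1, 3, 2, 8])
```

A graphical summary table would repeat these three computations once per category and lay the results out as one row each.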


New Features in Spotfire 4.0: Dynamic Values, Conditional Icons, Sparklines, and Graphical Summary Table

PC Desktop Spotfire


New Features in Spotfire 4.0: Dynamic Values, Conditional Icons, Sparklines, and Graphical Summary Table

PC Desktop Spotfire

Filter for Debt Service


New Features in Spotfire 4.0
• Collaboration:
– Share with TIBBR:
• What – Right Click on any visualization or page and share the view in tibbr with a link back to the analysis.
• Why – Easy sharing of insights and findings.
• My Note: Tibbr host name has to be set by Administrator.
– Add TIBBR Discussions to the Analysis:
• What – Integrated tibbr discussions right in the analysis filtered to a particular subject.
• Why – Discuss insights and findings with colleagues directly in the analysis. Subscribe and get notified when someone posts a comment on an analysis you are interested in.
• My Note: See Next Slide.
– Embed Dashboards in Other Web Pages:
• What – One click access to HTML fragments that display a Spotfire page and can be pasted directly into portals and other web pages.
• Why – Put a link to the analysis in your corporate blog or wiki. Integrate Spotfire analysis displays into SharePoint WebPart and other portals.
• My Note: I was already doing this!
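
For readers unfamiliar with embedding, the sketch below shows the general shape of an HTML fragment that displays one web page inside another. It is not the fragment Spotfire generates, which is not reproduced in these slides; the function name, the default sizes, and the address are all hypothetical.

```python
from html import escape


def embed_fragment(analysis_url, width=800, height=600):
    """Return an <iframe> fragment that displays the page at analysis_url."""
    safe_url = escape(analysis_url, quote=True)  # protect quotes and ampersands
    return (
        f'<iframe src="{safe_url}" '
        f'width="{int(width)}" height="{int(height)}"></iframe>'
    )


# Hypothetical address, for illustration only.
fragment = embed_fragment("http://example.org/analysis?id=1&page=2")
```

Pasting a fragment like this into a wiki or portal page is what makes the dashboard appear inline instead of behind a link.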


New Features in Spotfire 4.0: Collaboration: Add TIBBR Discussions to the Analysis


New Features in Spotfire 4.0
• Other Enhancements:
– Export Footer:
• What – Include a footer when exporting or printing pages.
• Why – Make it clear where the printout came from or indicate to the reader that the content is confidential.
• My Note: See Next Slide.
– Stepped Linecharts:
• What – Draw stepped linecharts that only show a change in value at the exact point where the value changed.
• Why – Better representation of discrete data that avoids misleading the user by interpolating values in between data points.
• My Note: See Next Slide.
– Automation Services 4.0:
• What – New task added to remap Information Services catalogs and schemas during an automated Library import.
• Why – Allow for the automation of migrating a Spotfire Information Model from a test to production environment in instances when the test and production instances of the data source are in different database catalogs or schemas.
• My Note: See Slides That Follow.
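
The point about stepped linecharts can be made concrete with a little arithmetic. The sketch below compares the value read between two data points under a stepped line (the last value holds until the next change) with the value read under an ordinary interpolated line. It assumes the points are sorted by x with distinct x values; the data is made up.

```python
from bisect import bisect_right


def _last_point_index(points, x):
    """Index of the last data point at or before x."""
    xs = [px for px, _ in points]
    index = bisect_right(xs, x) - 1
    if index < 0:
        raise ValueError("x precedes the first data point")
    return index


def step_value(points, x):
    """Value at x when each value holds until the next data point."""
    return points[_last_point_index(points, x)][1]


def linear_value(points, x):
    """Value at x when neighbouring points are joined by straight lines."""
    index = _last_point_index(points, x)
    if index == len(points) - 1:
        return points[index][1]
    (x0, y0), (x1, y1) = points[index], points[index + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Made-up discrete data: a value of 5 that changes to 8 at x = 10.
points = [(0, 5), (10, 8)]
stepped = step_value(points, 5)         # 5: no change has happened yet
interpolated = linear_value(points, 5)  # 6.5: a value that never occurred
```

For discrete data the interpolated reading of 6.5 is the misleading in-between value the slide warns about, while the stepped reading stays at the last real value.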


New Features in Spotfire 4.0: Other Enhancements

Stepped Linecharts
Export Footer


New Features in Spotfire 4.0: Other Enhancements: Stepped Linecharts

PC Desktop Spotfire


New Features in Spotfire 4.0: Other Enhancements: Automation Services 4.0

http://stn.spotfire.com/stn/Site/News.aspx

TIBCO Spotfire Automation Services
Selecting a "Set data source credentials" task in the job builder will now allow you to go back and select a different certificate if the first one selected is invalid.


New Features in Spotfire 4.0: Other Enhancements: Automation Services 4.0

http://stn.spotfire.com/stn/Platform/InformationServices.aspx


New Features in Spotfire 4.0: Other Enhancements: Automation Services 4.0

http://semanticommunity.info/Build_DoD_in_the_Cloud/Enterprise_Information_Web_for_Semantic_Interoperability_at_DoD/Spotfire_Information_Designer

My Note: Customize Spotfire Documentation in MindTouch