End User Semantic Web Applications
David Karger, MIT CSAIL
Conclusion
• A key innovation opportunity in the Semantic Web is making it easier for end users to produce, share, and consume structured data
• An urgent need and immediate opportunity for tools that take the drudgery out of data work
• We have to study end users to understand their needs, build tools to meet those needs, and assess how those tools work for them
• Don’t ask what you can do for the Semantic Web; ask what the Semantic Web can do for you
Back Story
• New prof in 1995, aiming to do research in IR
• Got a big grant to buy powerful IR machines
• Then decided computation wasn’t the problem
• People can’t find information because their applications won’t let them store/organize/view it the way they want
  – Hard-coded schemas
  – Fixed visualizations
  – Information fragmentation
Problem-Driven Agenda
• Create a UI that would let end users
  – Collect arbitrary information
  – Define their own schema
  – Design their own visualizations in that schema
• Under the hood, used techniques from
  – HCI
  – Machine learning
  – Programming languages
  – Databases
• But all driven by need to solve a specific problem
Haystack
• Semantic Web app before the Semantic Web
  – Research paid for by the machines I didn’t buy
• Entity-relation data model
• “Lenses” to display individual items
  – Specification of which properties and their layout
• “Views” of collections
  – E.g., lists, thumbnails, tabular
• “Facets” to filter items
• When RDF was invented, it became the Haystack model
  – And Haystack became an early “semantic desktop”
Writing a Brain Research Paper
Adding “Things to Do” Region
Revised Environment
Role of Semantic Web
• Was Haystack a Semantic Web application?
  – How could it be, if created before the Semantic Web?
• Wasn’t “working on the Semantic Web”
  – Rather, was working on a problem users have
  – Seeking solutions from any discipline
• What exactly can the Semantic Web contribute?
• What makes a “Semantic Web application”?
• What use is the Semantic Web anyway?
Semantic Web Applications
• What is novel about Semantic Web applications?
  – Use of triple stores, RDF, inference?
  – Any such app can be emulated with “old” technology
• Motta
  – An application that leverages the semantics of its data
• Karger
  – An application whose schema is expected to change
  – Into whatever its user desires
  – Paradox: cannot leverage semantics of data
  – Creates major challenges for user interface design
Role of Semantic Web
• The Semantic Web holds a big part of the answer to a major problem in end-user information management
• Key contribution is “mutable schema” paradigm
• So yes, Haystack was a Semantic Web application
Rest of Talk
• Continue end-user, problem-centric perspective
  – And thesis that Semantic Web can solve it
• Convince you end-user data problem is serious
  – Channel a CHI talk by Voida, Harman, Al-Ani
• 3 Semantic Web applications that chip away at it
  – All designed around “open schema” principle
  – Supercharge spreadsheets for data interaction
  – A standard for data & visualization in HTML
  – An end-user programmable data-handling agent
• Wrap-up thoughts about SW and ESWC
Homebrew Databases
• A paper by Voida, Harman, Al-Ani
• Published at CHI 2010
• Highlights how bad this problem is
• An embedded user study
  – No tools built
  – Went where users were
  – Watched what they did
  – Tried to understand why
  – SW community needs this badly
“I WANT MY SPREADSHEET DATABASE TO WORK BETTER”
SUPERCHARGING SPREADSHEETS FOR DATA MANAGEMENT
Eirik Bakke, David Karger, Rob Miller
A spreadsheet-based user interface for managing plural relationships in structured data [CHI 2011]
Spreadsheets
• As we’ve seen, a dominant tool for data
• But limited
  – Flat table
  – Hard to represent entity-relationship graphs
  – No types
  – No support for many-many relationships
  – No joins
• Can we add power but preserve look and feel?
  – Yes, with nested cells and data wormholes
Spreadsheets
Alternative: Related Worksheets
One-to-Many / Many-to-Many Relationships
A database with one-to-many and many-to-many relationships, accessed through a general-purpose spreadsheet-like UI
“Related Worksheets” application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept: Data Types for Worksheet Columns
2nd New Concept: Array Types
3rd New Concept: Reference Types (“Each cell in this column refers to a row in a different worksheet”)
3rd New Concept: Reference Types
Reference values are displayed recursively, as configured by the user in the “Show/Hide Columns” tree
4th New Concept: Relationships are bidirectional
Teleport Feature (press Ctrl+Space)
Result: The ability to keep track of one-to-many / many-to-many relationships from within a spreadsheet-like user interface
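The reference-type and bidirectional-relationship concepts above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Related Worksheets implementation (which the talk describes as a Java desktop app over SQL): the `Worksheet` and `ReferenceColumn` classes and the course/instructor sample data are hypothetical names invented for the example.

```python
class Worksheet:
    """A named table; each row is a dict of column name -> value (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # row id -> {column: value}

    def add_row(self, row_id, **cells):
        self.rows[row_id] = dict(cells)


class ReferenceColumn:
    """A many-to-many reference column between two worksheets.

    Each link is recorded in both directions, so the relationship can be
    read from either worksheet, mirroring "relationships are bidirectional".
    """

    def __init__(self, source, target):
        self.source = source
        self.target = target
        self._forward = {}   # source row id -> [target row ids]
        self._backward = {}  # target row id -> [source row ids]

    def link(self, source_id, target_id):
        if source_id not in self.source.rows or target_id not in self.target.rows:
            raise KeyError("both rows must exist before they can be linked")
        self._forward.setdefault(source_id, []).append(target_id)
        self._backward.setdefault(target_id, []).append(source_id)

    def forward(self, source_id):
        """Rows of the target worksheet referenced by one source row."""
        return [self.target.rows[t] for t in self._forward.get(source_id, [])]

    def backward(self, target_id):
        """Rows of the source worksheet that reference one target row."""
        return [self.source.rows[s] for s in self._backward.get(target_id, [])]


# Made-up sample data in the spirit of the course catalog used in the study.
courses = Worksheet("Courses")
instructors = Worksheet("Instructors")
courses.add_row("c1", title="Algorithms")
courses.add_row("c2", title="Databases")
instructors.add_row("i1", name="Ada")
instructors.add_row("i2", name="Grace")

taught_by = ReferenceColumn(courses, instructors)
taught_by.link("c1", "i1")
taught_by.link("c1", "i2")
taught_by.link("c2", "i1")

instructors_of_c1 = [row["name"] for row in taught_by.forward("c1")]
courses_of_i1 = [row["title"] for row in taught_by.backward("i1")]
```

Storing each link once per direction is what lets a "Courses" view nest its instructors and an "Instructors" view nest its courses without the user writing a join.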
User Study
User Study
• Hypothesis: Excel-proficient users will be faster at lookup (read-only) tasks on a database stored in normalized form in our system vs. Microsoft Excel
User Study
• Mechanical Turk
• Remotely screen-recorded
• Lookup tasks on course catalog database in Excel vs. Related Worksheets
• Between-subjects study
• Initial qualification task on Excel only
User Study
Results: Demographics
Results: Correctness and Features Used
Results: Timing
p < 0.05 for Task 4 only (41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system, array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User study shows system usable without instruction, sometimes faster than Excel (more study needed)
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g., on Google Spreadsheets
• So is a Semantic Web application, despite absence of Semantic Web technology
“I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB”
Huynh, Benson, Marcus, Karger, Miller
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
Exhibit: Lightweight Structured Data Publishing [WWW 2007]
The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
Good old days: early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
• High benefit
  – Find the info I need
  – Discover new things
• Low cost
  – One click fetch
  – Instant availability
  – No application to master
Author
• High benefit
  – Be seen
  – Share what I know
  – Impress people
  – Readers’ gratitude
• Low cost
  – No new skills needed
  – Easy to author
Structured Data is Better
• Easier to manage
  – Separate content (data) from presentation
  – Edit data; changes propagate to all uses of it
  – Templates help all data look consistent
• Easier to navigate
  – Sorting and filtering
  – Faceted browsing
  – Aggregate visualizations: compare/contrast
• Easier to reuse
  – Extract from original source
  – Blend with other data
  – Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or HTML
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can’t install/operate/define a database
  – Can’t write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static info-graphic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns used in chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don’t program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
sort
filter
search
template
Image
HTML: <img src=…
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
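As a rough illustration of "one item per row, columns for properties", the sketch below reads a small spreadsheet exported as CSV into a list of items. The sample rows, the `rows_to_items` helper, and the choice of ";" to separate multiple ingredients in one cell are assumptions made for this example; the talk does not specify how multi-valued cells are encoded.

```python
import csv
import io

# Made-up recipe data using the properties named on the slide.
SHEET = """title,source,date,rating,ingredients
Lentil Soup,Example Magazine,2009-03-01,4,lentils;onion;carrot
Pancakes,Example Magazine,2010-01-15,5,flour;egg;milk
"""


def rows_to_items(text, multi_valued=("ingredients",)):
    """Turn each spreadsheet row into an item (a dict of property -> value).

    Columns listed in `multi_valued` are split on ';' into lists, which is
    an assumed convention for storing several values in a single cell.
    """
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for column in multi_valued:
            item[column] = [value for value in row[column].split(";") if value]
        items.append(item)
    return items


items = rows_to_items(SHEET)
```

Each resulting item is just a bag of properties, so adding a column to the spreadsheet adds a property to every item without any schema declaration.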
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
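What a list view bound to a sort property does can be sketched as follows. This is a minimal stand-in, not Exhibit's JavaScript: `list_view`, its parameters, and the priced sample items are hypothetical.

```python
# Made-up items; "price" matches the sort property named in the slide's markup.
items = [
    {"title": "Pancakes", "price": 4.5},
    {"title": "Lentil Soup", "price": 3.0},
    {"title": "Paella", "price": 9.0},
]


def list_view(items, sort_by, show="title", descending=False):
    """Return the `show` property of every item, ordered by `sort_by`."""
    ordered = sorted(items, key=lambda item: item[sort_by], reverse=descending)
    return [item[show] for item in ordered]


by_price = list_view(items, sort_by="price")
by_title = list_view(items, sort_by="title")
```

The author only names the property to sort by; the ordering logic lives in the view, which is the "describe, don't program" idea from the earlier slides.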
Facets
• Way to filter a collection
  – Specify a property
  – E.g., ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
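A facet has two jobs: list the values a property takes (with counts) so the user can click one, and restrict the collection to matching items. The sketch below shows both under assumed names (`values_of`, `facet_counts`, `apply_facet`) and made-up recipe data; it is not taken from Exhibit's source.

```python
from collections import Counter

# Made-up items with a multi-valued "ingredient" property.
items = [
    {"title": "Lentil Soup", "ingredient": ["lentils", "onion"]},
    {"title": "Pancakes", "ingredient": ["flour", "egg"]},
    {"title": "Omelette", "ingredient": ["egg", "onion"]},
]


def values_of(item, prop):
    """Return the property's values as a list, whether stored singly or as a list."""
    value = item.get(prop, [])
    return value if isinstance(value, list) else [value]


def facet_counts(items, prop):
    """Count how many items carry each value: the choices a facet would display."""
    return Counter(v for item in items for v in values_of(item, prop))


def apply_facet(items, prop, selected):
    """Keep items having at least one selected value; no selection keeps everything."""
    if not selected:
        return list(items)
    chosen = set(selected)
    return [item for item in items if chosen & set(values_of(item, prop))]


counts = facet_counts(items, "ingredient")
with_egg = apply_facet(items, "ingredient", {"egg"})
```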
Lenses
• Template for item
• HTML with “fill in the blanks”
• HTML:
  <div ex:role="lens">
    <b><div ex:content="title"></div></b>
    <div ex:content="date"></div>
  </div>
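The "fill in the blanks" idea behind a lens can be sketched with Python's standard `string.Template`. The `$title` / `$date` placeholder syntax and the `render_lens` helper are assumptions for illustration; Exhibit itself marks the blanks with attributes in the HTML, as on the slide.

```python
import html
from string import Template

# A lens as an HTML fragment with named blanks (assumed placeholder syntax).
LENS = Template("<b>$title</b> <span>$date</span>")


def render_lens(lens, item):
    """Fill the lens's blanks with one item's properties, escaping HTML characters."""
    safe = {name: html.escape(str(value)) for name, value in item.items()}
    return lens.substitute(safe)


rendered = render_lens(LENS, {"title": "Fish & Chips", "date": "2010-05-01"})
```

Because the template is separate from the data, the same lens renders every item in a collection consistently, and changing the lens restyles all of them at once.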
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
[Example web pages, each annotated with: Text search, Faceted Browsing, Sorting by Properties, Templated Items]
Impoverished Information Visualization?
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People’s experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific data/viz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g., HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities: 40%
Personal/Hobby: 25%
Organization: 19%
News: 6%
Commercial: 4%
Library: 4%
Conference: 2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph: 27%
Multi-valued Table: 32%
Table: 41%
Cyclic: 22%
Acyclic: 78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON: 69%
Google Spreadsheet: 32%
BibTeX: 2%
RDF: 1%
Excel: 0.1%
CSV: 0.06%
Freebase: 0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title): 59%
Timeline: 19%
Map: 14%
Chart: 8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can’t write HTML?
Wibit
Collaborative Authoring in a Wiki
• Exhibit is HTML file
• Put it in a wiki
• Combine data, interaction, and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn’t use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN’T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel
Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW ’10]
Motivation
• We’re acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs, end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference, 2008.
previous work for the construction of RDF KBs and queries
express behaviors as rules
when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
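The "when <something happens> do <action>" form can be sketched as a condition/action pair evaluated against the user's current state. This is a hedged illustration rather than Atomate's rule language or engine: the `Rule` class, the `fire` function, the state keys (`location`, `weekday`, `hour`), and the choice of 17:00 as the start of "evening" are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """A behavior of the form: when <condition holds> do <action>."""
    description: str
    condition: Callable[[dict], bool]  # predicate over the current state
    action: Callable[[dict], str]      # here the action just returns a message


def fire(rules, state):
    """Run the action of every rule whose condition holds in `state`."""
    return [rule.action(state) for rule in rules if rule.condition(state)]


# Example 1 from the slide; "evening" is assumed to mean hour >= 17.
trash_rule = Rule(
    description="Remind me to take the trash out when I get home on Tuesday evenings",
    condition=lambda s: (
        s["location"] == "home" and s["weekday"] == "Tuesday" and s["hour"] >= 17
    ),
    action=lambda s: "Reminder: take the trash out",
)

at_home_tuesday = fire([trash_rule], {"location": "home", "weekday": "Tuesday", "hour": 18})
at_work_tuesday = fire([trash_rule], {"location": "work", "weekday": "Tuesday", "hour": 18})
```

In this framing, the controlled natural language interface is what lets an end user produce the condition and action without writing the lambdas by hand.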
Conclusionbull A key innovation opportunity in the Semantic
Web is making it easier for end users to produce share and consume structured data
bull An urgent need and immediate opportunity for tools that take the drudgery out of data work
bull We have to study end users to understand their needs build tools to meet those needs and assess how those tools work for them
bull Donrsquot ask what you can do for the Semantic Web ask what the Semantic Web can do for you
Back Storybull New prof in 1995 aiming to do research in IR bull Got a big grant to buy powerful IR machinesbull Then decided computation wasnrsquot the problembull People canrsquot find information because their
applications wonrsquot let them storeorganizeview it the way they wantndash Hard-coded schemasndash Fixed visualizationsndash Information fragmentation
Problem-Driven Agendabull Create a UI that would let end user
ndash Collect arbitrary informationndash Define their own schemandash Design their own visualizations in that schema
bull Under the hood used techniques fromndash HCIndash Machine Learningndash Programming languagesndash Databases
bull But all driven by need to solve a specific problem
Haystackbull Semantic Web app before the Semantic Web
ndash Research paid for by the machines I didnrsquot buybull Entity-relation data modelbull ldquoLensesrdquo to display individual items
ndash Specification of which properties and their layoutbull ldquoViewsrdquo of collections
ndash Eg lists thumbnails tabularbull ldquoFacetsrdquo to filter itemsbull When RDF invented became Haystack model
ndash And haystack became an early ldquosemantic desktoprdquo
Writing a Brain Research Paper
Adding ldquoThings to Dordquo Region
Revised Environment
Role of Semantic Webbull Was Haystack a Semantic Web Application
ndash How could it be if created before Semantic Webbull Wasnrsquot ldquoworking on the semantic webrdquo
ndash Rather was working on a problem users havendash Seeking solutions from any discipline
bull What exactly can the Semantic Web contributebull What makes a ldquoSemantic Web applicationrdquobull What use is the Semantic Web anyway
Semantic Web Applications
bull What is novel about semantic web applicationsndash Use of triple stores RDF inferencendash Any such app can be emulated with ldquooldrdquo technology
bull Mottandash An application that leverages the semantics of its data
bull Kargerndash An application whose schema is expected to changendash Into whatever its user desiresndash Paradox cannot leverage semantics of datandash Creates major challenges for user interface design
Role of Semantic Webbull The Semantic Web holds a big part of the answer
to a major problem in end-user information management
bull Key contribution is ldquomutable schemardquo paradigmbull So yes Haystack was a Semantic Web Application
Rest of Talkbull Continue end-user problem-centric perspective
ndash And thesis that Semantic Web can solve itbull Convince you end-user data problem is serious
ndash Channel a CHI talk by Voida Harman Al-Anibull 3 Semantic Web applications that chip away at it
ndash All designed around ldquoopen schemardquo principlendash Supercharge spreadsheets for data interactionndash A standard for data amp visualization in HTMLndash An end-user programmable data-handling agent
bull Wrap-up thoughts about SW and ESWC
Homebrew Databasesbull A paper by Voida Harman Al Anibull Published at CHI 2010bull Highlights how bad this problem isbull An embedded user study
ndash No tools builtndash Went where users werendash Watched what they didndash Tried to understand whyndash SW community needs this badly
ldquoI WANT MY SPREADSHEET DATABASE TO WORK BETTERrdquo
SUPERCHARGING SPREADSHEETS FOR DATA MANAGEMENT
Eirik Bakke David Karger Rob MillerA spreadsheet-based user interface for managing plural relationships in structured data [CHI 2011]
Spreadsheetsbull As wersquove seen a dominant tool for databull But limited
ndash Flat tablendash Hard to represent entity-relationship graphsndash No typesndash No support for many-many relationshipsndash No joins
bull Can we add power but preserve look and feelndash Yes with nested cells and data wormholes
Spreadsheets
Alternative Related Worksheets
One-to-ManyMany-to-ManyRelationships
A database with one-to-many and many-to-many relationshipsaccessed through a general-purpose spreadsheet-like UI
ldquoRelated Worksheetsrdquo application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept Data Types for Worksheet Columns
2nd New Concept Array Types
3rd New Concept Reference Types(ldquoEach cell in this column refers to a row in a different worksheetrdquo)
3rd New Concept Reference TypesReference values are displayed recursively as configured
by the user in the ldquoShowHide Columnsrdquo tree
1
4th New Concept Relationships are bidirectional
2
4th New Concept Relationships are bidirectional
Teleport Feature(Press Ctrl+Space)
1
2
4th New Concept Relationships are bidirectional
Result The ability to keep track of one-to-manymany-to-manyrelationships from within a spreadsheet-like user interface
User Study
User Studybull Hypothesis Excel-proficient users will be faster at lookup
(read-only) tasks on a database stored in normalized form in our system vs Microsoft Excel
User Studybull Mechanical Turkbull Remotely screen-recordedbull Lookup tasks on course catalog database in
Excel vs Related Worksheets bull Between-subjects studybull Initial qualification task on Excel only
User Study
ResultsDemographics
Results Correctness and Features Used
Results Timing
p lt 005 for Task 4 only
(41 faster)
Conclusionbull Spreadsheets are great homebrew databases
ndash But struggle with multiple tables nesting joinsbull Enhance spreadsheet paradigm with
ndash Column type system array types reference typesndash Bidirectional hierarchical views of reference typesndash to handle plural relationships
bull User Study shows system usable without instruction sometimes faster than Excel (more study needed)
A Semantic Web Applicationbull Implemented as a JavaNetbeans desktop appbull Which connects to SQL databases using JDBC
bull But key contribution is interaction paradigmndash Which expects arbitrary schemas
bull Could easily be ported to the webbull Eg on Google Spreadsheets
bull So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
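The primitives listed in this summary (sorting, filtering, searching, item templates) can be illustrated with a small sketch over spreadsheet-like data. This is not Exhibit's implementation, which is a JavaScript library driven by HTML attributes; the sample items and the function names `facet_filter`, `text_search`, `sort_by`, and `render_lens` are all hypothetical, chosen only to show how little machinery each primitive needs once data is held as items with properties.

```python
from string import Template

# Hypothetical items: one dict per spreadsheet row, keyed by column name.
ITEMS = [
    {"title": "Pesto Pasta", "rating": 4, "ingredient": ["basil", "pasta"]},
    {"title": "Tomato Soup", "rating": 5, "ingredient": ["tomato", "basil"]},
    {"title": "Plain Toast", "rating": 2, "ingredient": ["bread"]},
]


def as_list(value):
    """Treat a single value and a multi-valued cell uniformly."""
    return value if isinstance(value, list) else [value]


def facet_filter(items, prop, selected):
    """Facet: keep items with at least one selected value for `prop`."""
    wanted = set(selected)
    return [it for it in items if wanted & set(as_list(it.get(prop, [])))]


def text_search(items, query):
    """Search: case-insensitive substring match over every property value."""
    q = query.lower()
    return [
        it for it in items
        if any(q in str(v).lower() for vals in it.values() for v in as_list(vals))
    ]


def sort_by(items, prop, descending=False):
    """View: order a collection by one property."""
    return sorted(items, key=lambda it: it[prop], reverse=descending)


def render_lens(item, template):
    """Lens: fill the blanks of an HTML template from one item."""
    return Template(template).substitute(item)


if __name__ == "__main__":
    shown = sort_by(facet_filter(ITEMS, "ingredient", ["basil"]), "rating", descending=True)
    for item in shown:
        print(render_lens(item, "<b>$title</b> (rating $rating)"))
```

Each function takes the collection and a property name, mirroring the idea that the author only binds a visualization element to a column and never writes a query.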
EXTENSIONS
What if you can’t write HTML?
Wibit
Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data, interaction, and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn’t use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF, but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
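To make the point about JSON items and generated URIs concrete, here is a minimal sketch of turning JSON-style items into subject/property/value triples. It is an assumption-laden illustration, not Exhibit's actual export scheme: the base URL, the `/prop/` path segment, the `id` key, and the helper names `item_uri` and `to_triples` are all hypothetical.

```python
from urllib.parse import quote


def item_uri(base, item_id):
    """Mint a URI for an item by percent-encoding its identifier (assumed scheme)."""
    return base.rstrip("/") + "/" + quote(item_id, safe="")


def to_triples(items, base):
    """Flatten JSON-style items into (subject, property, value) triples."""
    prop_base = base.rstrip("/") + "/prop/"  # hypothetical namespace for properties
    triples = []
    for item in items:
        subject = item_uri(base, item["id"])
        for prop, value in item.items():
            if prop == "id":
                continue
            # A multi-valued cell yields one triple per value.
            values = value if isinstance(value, list) else [value]
            for v in values:
                triples.append((subject, prop_base + quote(prop, safe=""), v))
    return triples


if __name__ == "__main__":
    items = [{"id": "Tomato Soup", "rating": 5, "ingredient": ["tomato", "basil"]}]
    for triple in to_triples(items, "http://example.org/exhibit"):
        print(triple)
```

No schema is consulted anywhere: whatever properties the author chose become predicates, which is the "arbitrary schema" stance the slide describes.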
I CAN’T HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel
Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW ’10]
Motivation
• We’re acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs; end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference, 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules:
when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
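The "when <something happens> do <action>" rule shape, applied to the trash reminder, might be sketched as follows. This is only an illustration of the rule structure, not Atomate's engine or its controlled natural language parser: the `Rule` class, the state dictionaries with `location` and `time` keys, and the choice of 17:00 as the start of "evening" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class Rule:
    """A 'when <something happens> do <action>' rule."""
    description: str
    when: Callable[[State, State], bool]  # sees the previous and the current state
    do: Callable[[State], str]


def arrived_home(prev: State, cur: State) -> bool:
    """'I get home' is a transition: not home before, home now."""
    return prev.get("location") != "home" and cur.get("location") == "home"


def tuesday_evening(cur: State) -> bool:
    """Assumes 'evening' starts at 17:00; Monday is weekday 0, so Tuesday is 1."""
    now = cur["time"]
    return now.weekday() == 1 and now.hour >= 17


trash_rule = Rule(
    description="Remind me to take the trash out when I get home on Tuesday evenings",
    when=lambda prev, cur: arrived_home(prev, cur) and tuesday_evening(cur),
    do=lambda cur: "Reminder: take the trash out",
)


def fire(rules: List[Rule], prev: State, cur: State) -> List[str]:
    """Run the action of every rule whose condition holds for this state change."""
    return [rule.do(cur) for rule in rules if rule.when(prev, cur)]


if __name__ == "__main__":
    tuesday = datetime(2011, 5, 31, 18, 30)  # a Tuesday
    print(fire([trash_rule],
               {"location": "work", "time": tuesday},
               {"location": "home", "time": tuesday}))
```

Keeping the condition as a predicate over entity properties (a person's location, the current time) matches the decomposition into actions, conditions, predicates, properties, and entities.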
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities: 40%
Personal/Hobby: 25%
Organization: 19%
News: 6%
Commercial: 4%
Library: 4%
Conference: 2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph: 27%
Multi-valued Table: 32%
Table: 41%
Cyclic: 22%
Acyclic: 78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON: 69%
Google Spreadsheet: 32%
Bibtex: 2%
RDF: 1%
Excel: 0.1%
CSV: 0.06%
Freebase: 0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title): 59%
Timeline: 19%
Map: 14%
Chart: 8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit
Collaborative Authoring in a Wiki
• Exhibit is HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG Editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
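The point about generating URIs for exported items can be illustrated with a small Python sketch that turns JSON-style items into subject/property/value triples. Everything specific here is an assumption for illustration: the `example.org` namespaces, the use of a `label` field as the item identifier, and the function names are hypothetical and do not describe Exhibit's actual export code.

```python
from urllib.parse import quote

# Hypothetical namespaces; a real export would choose its own.
ITEM_BASE = "http://example.org/item/"
PROP_BASE = "http://example.org/prop/"


def item_uri(item):
    """Mint a URI for an item from its label (assumed unique within the data set)."""
    return ITEM_BASE + quote(str(item["label"]), safe="")


def to_triples(items):
    """Flatten JSON-style items into (subject, property, value) triples.

    Multi-valued properties yield one triple per value, so an arbitrary,
    user-defined schema needs no special handling.
    """
    triples = []
    for item in items:
        subject = item_uri(item)
        for prop, value in item.items():
            values = value if isinstance(value, list) else [value]
            for single in values:
                triples.append((subject, PROP_BASE + quote(prop, safe=""), single))
    return triples
```

Nothing in the sketch depends on which properties the author chose, which is the sense in which a JSON-native tool can still treat its data as an open schema.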
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel
Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
actions / conditions / predicates / properties / entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM / RSS / REST APIs; End-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules:
when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
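The "when <something happens> do <action>" form can be sketched as a condition/action pair evaluated against the user's current state. This is a hypothetical Python illustration, not Atomate's rule engine: the `Rule` class, the shape of the state dictionary, and the choice of 17:00 as the start of "evening" are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List


@dataclass
class Rule:
    """when <condition holds for the current state> do <action>."""
    condition: Callable[[Dict], bool]
    action: Callable[[Dict], str]


def at_home_on_tuesday_evening(state: Dict) -> bool:
    """Condition for Example 1; 'evening' is assumed to start at 17:00."""
    now = state["time"]
    return (
        state["location"] == "home"
        and now.weekday() == 1  # datetime counts Monday as 0, so Tuesday is 1
        and now.hour >= 17
    )


RULES: List[Rule] = [
    Rule(at_home_on_tuesday_evening, lambda state: "Reminder: take the trash out"),
]


def fire(rules: List[Rule], state: Dict) -> List[str]:
    """Run the action of every rule whose condition holds for this state."""
    return [rule.action(state) for rule in rules if rule.condition(state)]
```

In this sketch the state would be refreshed from whatever location and calendar feeds are available, and `fire` would be called on each update, so the reminder appears only when both the place and the time match.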
Writing a Brain Research Paper
Adding ldquoThings to Dordquo Region
Revised Environment
Role of Semantic Webbull Was Haystack a Semantic Web Application
ndash How could it be if created before Semantic Webbull Wasnrsquot ldquoworking on the semantic webrdquo
ndash Rather was working on a problem users havendash Seeking solutions from any discipline
bull What exactly can the Semantic Web contributebull What makes a ldquoSemantic Web applicationrdquobull What use is the Semantic Web anyway
Semantic Web Applications
bull What is novel about semantic web applicationsndash Use of triple stores RDF inferencendash Any such app can be emulated with ldquooldrdquo technology
bull Mottandash An application that leverages the semantics of its data
bull Kargerndash An application whose schema is expected to changendash Into whatever its user desiresndash Paradox cannot leverage semantics of datandash Creates major challenges for user interface design
Role of Semantic Webbull The Semantic Web holds a big part of the answer
to a major problem in end-user information management
bull Key contribution is ldquomutable schemardquo paradigmbull So yes Haystack was a Semantic Web Application
Rest of Talkbull Continue end-user problem-centric perspective
ndash And thesis that Semantic Web can solve itbull Convince you end-user data problem is serious
ndash Channel a CHI talk by Voida Harman Al-Anibull 3 Semantic Web applications that chip away at it
ndash All designed around ldquoopen schemardquo principlendash Supercharge spreadsheets for data interactionndash A standard for data amp visualization in HTMLndash An end-user programmable data-handling agent
bull Wrap-up thoughts about SW and ESWC
Homebrew Databasesbull A paper by Voida Harman Al Anibull Published at CHI 2010bull Highlights how bad this problem isbull An embedded user study
ndash No tools builtndash Went where users werendash Watched what they didndash Tried to understand whyndash SW community needs this badly
ldquoI WANT MY SPREADSHEET DATABASE TO WORK BETTERrdquo
SUPERCHARGING SPREADSHEETS FOR DATA MANAGEMENT
Eirik Bakke David Karger Rob MillerA spreadsheet-based user interface for managing plural relationships in structured data [CHI 2011]
Spreadsheetsbull As wersquove seen a dominant tool for databull But limited
ndash Flat tablendash Hard to represent entity-relationship graphsndash No typesndash No support for many-many relationshipsndash No joins
bull Can we add power but preserve look and feelndash Yes with nested cells and data wormholes
Spreadsheets
Alternative Related Worksheets
One-to-ManyMany-to-ManyRelationships
A database with one-to-many and many-to-many relationshipsaccessed through a general-purpose spreadsheet-like UI
ldquoRelated Worksheetsrdquo application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept Data Types for Worksheet Columns
2nd New Concept Array Types
3rd New Concept Reference Types(ldquoEach cell in this column refers to a row in a different worksheetrdquo)
3rd New Concept Reference TypesReference values are displayed recursively as configured
by the user in the ldquoShowHide Columnsrdquo tree
1
4th New Concept Relationships are bidirectional
2
4th New Concept Relationships are bidirectional
Teleport Feature(Press Ctrl+Space)
1
2
4th New Concept Relationships are bidirectional
Result The ability to keep track of one-to-manymany-to-manyrelationships from within a spreadsheet-like user interface
User Study
User Studybull Hypothesis Excel-proficient users will be faster at lookup
(read-only) tasks on a database stored in normalized form in our system vs Microsoft Excel
User Studybull Mechanical Turkbull Remotely screen-recordedbull Lookup tasks on course catalog database in
Excel vs Related Worksheets bull Between-subjects studybull Initial qualification task on Excel only
User Study
ResultsDemographics
Results Correctness and Features Used
Results Timing
p lt 005 for Task 4 only
(41 faster)
Conclusionbull Spreadsheets are great homebrew databases
ndash But struggle with multiple tables nesting joinsbull Enhance spreadsheet paradigm with
ndash Column type system array types reference typesndash Bidirectional hierarchical views of reference typesndash to handle plural relationships
bull User Study shows system usable without instruction sometimes faster than Excel (more study needed)
A Semantic Web Applicationbull Implemented as a JavaNetbeans desktop appbull Which connects to SQL databases using JDBC
bull But key contribution is interaction paradigmndash Which expects arbitrary schemas
bull Could easily be ported to the webbull Eg on Google Spreadsheets
bull So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. an HTML editor
• Pure client side
  – No need to design/admin a server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
  Schools/Universities  40%
  Personal/Hobby        25%
  Organization          19%
  News                   6%
  Commercial             4%
  Library                4%
  Conference             2%
• Source: 50 of the top-trafficked Exhibits in our dataset
Data Model
  Graph               27%
  Multi-valued Table  32%
  Table               41%

  Cyclic   22%
  Acyclic  78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
  JSON                69%
  Google Spreadsheet  32%
  Bibtex               2%
  RDF                  1%
  Excel                0.1%
  CSV                  0.06%
  Freebase             0.06%
Hmm…
Single-View Exhibits
  Text-only (list, table, title)  59%
  Timeline                        19%
  Map                             14%
  Chart                            8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF, but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
[Callouts on the examples: actions, conditions, predicates, properties, entities]
What we need
• a way for users to express what they want to happen, and when, in terms of predicates relating the states and properties of people, places + things in their world
  – Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places, and things
  – ATOM/RSS/REST APIs, End-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
  – previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
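The "when <something happens> do <action>" shape can be sketched as a rule object whose condition is a predicate over a snapshot of the user's world. Everything here is hypothetical: the state layout, the `runRules` helper, and the choice of 17:00 as the start of "evening" are assumptions for illustration and do not describe Atomate's actual rule engine.

```javascript
// Hypothetical rule shape: { when: predicate over the world state, then: action }.
const trashRule = {
  description: "Remind me to take the trash out when I get home on Tuesday evenings",
  when: (state) =>
    state.me.location === "home" &&
    state.time.weekday === "Tuesday" &&
    state.time.hour >= 17, // assumed cutoff for "evening"
  then: (state, outbox) => outbox.push("Reminder: take the trash out"),
};

// Evaluate every rule against one snapshot of the world and collect actions.
function runRules(rules, state) {
  const outbox = [];
  for (const rule of rules) {
    if (rule.when(state)) rule.then(state, outbox);
  }
  return outbox;
}

// Made-up snapshots of entities and their properties.
const tuesdayEvening = {
  me: { location: "home" },
  time: { weekday: "Tuesday", hour: 18 },
};
const wednesdayNoon = {
  me: { location: "home" },
  time: { weekday: "Wednesday", hour: 12 },
};

console.log(runRules([trashRule], tuesdayEvening));
console.log(runRules([trashRule], wednesdayNoon));
```

A real system would more likely fire on the transition of arriving home than on every evaluation; the sketch only shows how entities, properties, predicates, and actions fit together in one rule.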
Role of Semantic Web
• Was Haystack a Semantic Web Application?
  – How could it be, if created before the Semantic Web?
• Wasn't "working on the semantic web"
  – Rather, was working on a problem users have
  – Seeking solutions from any discipline
• What exactly can the Semantic Web contribute?
• What makes a "Semantic Web application"?
• What use is the Semantic Web anyway?
Semantic Web Applications
• What is novel about semantic web applications?
  – Use of triple stores, RDF, inference?
  – Any such app can be emulated with "old" technology
• Motta
  – An application that leverages the semantics of its data
• Karger
  – An application whose schema is expected to change
  – Into whatever its user desires
  – Paradox: cannot leverage semantics of data
  – Creates major challenges for user interface design
Role of Semantic Web
• The Semantic Web holds a big part of the answer to a major problem in end-user information management
• Key contribution is "mutable schema" paradigm
• So yes, Haystack was a Semantic Web Application
Rest of Talk
• Continue end-user problem-centric perspective
  – And thesis that Semantic Web can solve it
• Convince you end-user data problem is serious
  – Channel a CHI talk by Voida, Harman, Al-Ani
• 3 Semantic Web applications that chip away at it
  – All designed around "open schema" principle
  – Supercharge spreadsheets for data interaction
  – A standard for data & visualization in HTML
  – An end-user programmable data-handling agent
• Wrap-up thoughts about SW and ESWC
Homebrew Databases
• A paper by Voida, Harman, Al-Ani
• Published at CHI 2010
• Highlights how bad this problem is
• An embedded user study
  – No tools built
  – Went where users were
  – Watched what they did
  – Tried to understand why
  – SW community needs this badly
"I WANT MY SPREADSHEET DATABASE TO WORK BETTER"
SUPERCHARGING SPREADSHEETS FOR DATA MANAGEMENT
Eirik Bakke, David Karger, Rob Miller. A spreadsheet-based user interface for managing plural relationships in structured data [CHI 2011]
Spreadsheets
• As we've seen, a dominant tool for data
• But limited
  – Flat table
  – Hard to represent entity-relationship graphs
  – No types
  – No support for many-many relationships
  – No joins
• Can we add power but preserve look and feel?
  – Yes, with nested cells and data wormholes
Spreadsheets
Alternative: Related Worksheets
One-to-Many / Many-to-Many Relationships
A database with one-to-many and many-to-many relationships, accessed through a general-purpose spreadsheet-like UI
"Related Worksheets" application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept: Data Types for Worksheet Columns
2nd New Concept: Array Types
3rd New Concept: Reference Types ("Each cell in this column refers to a row in a different worksheet")
  – Reference values are displayed recursively, as configured by the user in the "Show/Hide Columns" tree
4th New Concept: Relationships are bidirectional
Teleport Feature (press Ctrl+Space)
Result: The ability to keep track of one-to-many/many-to-many relationships from within a spreadsheet-like user interface
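Reference-typed columns and bidirectional relationships can be sketched with two arrays of rows, where one column holds the ids of rows in the other worksheet. The worksheet contents and the `resolveReferences` and `inverseReferences` helpers are assumptions for illustration; they are not taken from the Related Worksheets implementation.

```javascript
// Hypothetical data: two worksheets; "instructors" cells hold row ids of
// the other worksheet (an array-of-references column).
const courses = [
  { id: "c1", title: "Algorithms", instructors: ["p1", "p2"] },
  { id: "c2", title: "Databases", instructors: ["p2"] },
];
const people = [
  { id: "p1", name: "Ada" },
  { id: "p2", name: "Grace" },
];

// Follow references forward: replace ids with the rows they point to.
function resolveReferences(rows, column, targetRows) {
  const byId = new Map(targetRows.map((row) => [row.id, row]));
  return rows.map((row) => ({
    ...row,
    [column]: row[column].map((id) => byId.get(id)),
  }));
}

// Derive the reverse direction: for each target row, which rows refer to it.
function inverseReferences(rows, column, targetRows) {
  return targetRows.map((target) => ({
    ...target,
    referencedBy: rows
      .filter((row) => row[column].includes(target.id))
      .map((row) => row.id),
  }));
}

const coursesWithPeople = resolveReferences(courses, "instructors", people);
const peopleWithCourses = inverseReferences(courses, "instructors", people);
console.log(coursesWithPeople[0].instructors.map((p) => p.name));
console.log(peopleWithCourses[1].referencedBy);
```

Storing the relationship once and deriving the reverse view is what lets a many-to-many relationship be read from either worksheet without a manual join.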
User Study
• Hypothesis: Excel-proficient users will be faster at lookup (read-only) tasks on a database stored in normalized form in our system vs. Microsoft Excel
• Mechanical Turk
• Remotely screen-recorded
• Lookup tasks on course catalog database in Excel vs. Related Worksheets
• Between-subjects study
• Initial qualification task on Excel only
Results: Demographics
Results: Correctness and Features Used
Results: Timing
• p < 0.05 for Task 4 only (41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system: array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User study shows system usable without instruction, sometimes faster than Excel (more study needed)
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g. on Google Spreadsheets
• So is a Semantic Web application, despite absence of Semantic Web technology
"I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB"
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
• Exhibit: Lightweight Structured Data Publishing [WWW 2007]
• The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
• Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
good old days: early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
• High benefit
  – Find the info I need
  – Discover new things
• Low cost
  – One-click fetch
  – Instant availability
  – No application to master
Author
• High benefit
  – Be seen
  – Share what I know
  – Impress people
  – Readers' gratitude
• Low cost
  – No new skills needed
  – Easy to author
Structured Data is Better
• Easier to manage
  – Separate content (data) from presentation
  – Edit data, changes propagate to all uses of it
  – Templates help all data look consistent
• Easier to navigate
  – Sorting and filtering
  – Faceted browsing
  – Aggregate visualizations: compare/contrast
• Easier to reuse
  – Extract from original source
  – Blend with other data
  – Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can't install/operate/define a database
  – Can't write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static info-graphic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns are used in chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don't program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use?
  – General enough to capture a good part of what people want to do with structured data authoring/publishing?
sort
filter
search
template
Image
HTML: <img src=…
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivationbull Wersquore acquiring large structured data
repositories and real-time streamsbull Lots of repetitive labor involved in users
followingreacting to these streamsbull How can users author queriesrules against these
streams to reduce the drudgery
Examples
bull remind me to take out the trash when I get home on Tuesdays
bull bug my friend who hasnrsquot replied to me in 2 daysbull send me my shopping list when I arrive at the
grocery storebull remind friends of an event Irsquom going to attendbull text me important emails when I am traveling
actionsconditionspredicatespropertiesentities
What we need
bull a way for users to express what they want to happen and when in terms of predicates relating
the states and properties of people places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
bull a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people places and things
ATOMRSSREST APIs End-user mashups + RDF
Abraham Bernstein and Esther Kaufmann and Christian Kaiser and Christoph Kiefer Ginseng A Guided Input Natural Language Search Engine for Querying Ontologies Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as ruleswhen ltsomething happensgt do ltactiongt
query statement
Controlled Natural Language Interface
Example 1bull Simple Context-Sensitive Remindingbull Remind me to take the trash out when I get
home on Tuesday evenings
Revised Environment
Role of Semantic Webbull Was Haystack a Semantic Web Application
ndash How could it be if created before Semantic Webbull Wasnrsquot ldquoworking on the semantic webrdquo
ndash Rather was working on a problem users havendash Seeking solutions from any discipline
bull What exactly can the Semantic Web contributebull What makes a ldquoSemantic Web applicationrdquobull What use is the Semantic Web anyway
Semantic Web Applications
bull What is novel about semantic web applicationsndash Use of triple stores RDF inferencendash Any such app can be emulated with ldquooldrdquo technology
bull Mottandash An application that leverages the semantics of its data
bull Kargerndash An application whose schema is expected to changendash Into whatever its user desiresndash Paradox cannot leverage semantics of datandash Creates major challenges for user interface design
Role of Semantic Webbull The Semantic Web holds a big part of the answer
to a major problem in end-user information management
bull Key contribution is ldquomutable schemardquo paradigmbull So yes Haystack was a Semantic Web Application
Rest of Talkbull Continue end-user problem-centric perspective
ndash And thesis that Semantic Web can solve itbull Convince you end-user data problem is serious
ndash Channel a CHI talk by Voida Harman Al-Anibull 3 Semantic Web applications that chip away at it
ndash All designed around ldquoopen schemardquo principlendash Supercharge spreadsheets for data interactionndash A standard for data amp visualization in HTMLndash An end-user programmable data-handling agent
bull Wrap-up thoughts about SW and ESWC
Homebrew Databasesbull A paper by Voida Harman Al Anibull Published at CHI 2010bull Highlights how bad this problem isbull An embedded user study
ndash No tools builtndash Went where users werendash Watched what they didndash Tried to understand whyndash SW community needs this badly
ldquoI WANT MY SPREADSHEET DATABASE TO WORK BETTERrdquo
SUPERCHARGING SPREADSHEETS FOR DATA MANAGEMENT
Eirik Bakke David Karger Rob MillerA spreadsheet-based user interface for managing plural relationships in structured data [CHI 2011]
Spreadsheetsbull As wersquove seen a dominant tool for databull But limited
ndash Flat tablendash Hard to represent entity-relationship graphsndash No typesndash No support for many-many relationshipsndash No joins
bull Can we add power but preserve look and feelndash Yes with nested cells and data wormholes
Spreadsheets
Alternative Related Worksheets
One-to-ManyMany-to-ManyRelationships
A database with one-to-many and many-to-many relationshipsaccessed through a general-purpose spreadsheet-like UI
ldquoRelated Worksheetsrdquo application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept Data Types for Worksheet Columns
2nd New Concept Array Types
3rd New Concept Reference Types(ldquoEach cell in this column refers to a row in a different worksheetrdquo)
3rd New Concept Reference TypesReference values are displayed recursively as configured
by the user in the ldquoShowHide Columnsrdquo tree
1
4th New Concept Relationships are bidirectional
2
4th New Concept Relationships are bidirectional
Teleport Feature(Press Ctrl+Space)
1
2
4th New Concept Relationships are bidirectional
Result The ability to keep track of one-to-manymany-to-manyrelationships from within a spreadsheet-like user interface
User Study
User Studybull Hypothesis Excel-proficient users will be faster at lookup
(read-only) tasks on a database stored in normalized form in our system vs Microsoft Excel
User Studybull Mechanical Turkbull Remotely screen-recordedbull Lookup tasks on course catalog database in
Excel vs Related Worksheets bull Between-subjects studybull Initial qualification task on Excel only
User Study
ResultsDemographics
Results Correctness and Features Used
Results Timing
p lt 005 for Task 4 only
(41 faster)
Conclusionbull Spreadsheets are great homebrew databases
ndash But struggle with multiple tables nesting joinsbull Enhance spreadsheet paradigm with
ndash Column type system array types reference typesndash Bidirectional hierarchical views of reference typesndash to handle plural relationships
bull User Study shows system usable without instruction sometimes faster than Excel (more study needed)
A Semantic Web Applicationbull Implemented as a JavaNetbeans desktop appbull Which connects to SQL databases using JDBC
bull But key contribution is interaction paradigmndash Which expects arbitrary schemas
bull Could easily be ported to the webbull Eg on Google Spreadsheets
bull So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  - And a JavaScript library to interpret it
• Application independent
  - Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  - No need to design/admin server
• Freely interleave data with other HTML
  - Complete control of design
  - Integration with whatever other elements you like
• Static feels becomes interactive data visualization
Usage
• Deployed 2007
  - ~1900 exhibits on 800 domains
  - Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  - ~1900 exhibits on 800 domains
  - Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  - What kind of data do people want to publish?
  - And how do they want to publish it?
  - (subject to limitations imposed by Exhibit)
Domains
Schools/Universities   40%
Personal/Hobby         25%
Organization           19%
News                    6%
Commercial              4%
Library                 4%
Conference              2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph                27%
Multi-valued Table   32%
Table                41%
Cyclic               22%
Acyclic              78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON                 69%
Google Spreadsheet   32%
BibTeX                2%
RDF                   1%
Excel               0.1%
CSV                0.06%
Freebase           0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title)   59%
Timeline                         19%
Map                              14%
Chart                             8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  - More effective communication
  - Easier to maintain (like CSS)
  - Creates enthusiasm for working with data
• Data is exposed as a side effect
  - Enabling reuse
  - Alternative visualizations
  - Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  - Sorting, filtering, searching
  - Lists, Maps, Timelines, Plots
  - Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit
Collaborative Authoring in a Wiki
• Exhibit is HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG Editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  - Native data format is JSON
  - Can read RDF but nobody does
  - Dominant data model likely to be spreadsheets
  - No inference
  - We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
actions / conditions / predicates / properties / entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs, End-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
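The "when <something happens> do <action>" shape can be sketched as a predicate paired with an action. The following Python is only a minimal illustration, not Atomate's rule language or engine; the `Rule` class, the context keys (`location`, `weekday`, `hour`), and the reading of "evening" as 17:00 or later are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A 'when <condition> do <action>' pair."""
    condition: Callable[[Dict], bool]   # predicate over the user's current context
    action: Callable[[Dict], str]       # what to do when the predicate holds


def run_rules(rules: List[Rule], context: Dict) -> List[str]:
    """Fire every rule whose condition holds for this context snapshot."""
    return [rule.action(context) for rule in rules if rule.condition(context)]


# "Remind me to take the trash out when I get home on Tuesday evenings"
trash_rule = Rule(
    condition=lambda ctx: (
        ctx["location"] == "home"
        and ctx["weekday"] == "Tuesday"
        and ctx["hour"] >= 17          # assumed meaning of "evening"
    ),
    action=lambda ctx: "Reminder: take the trash out",
)

if __name__ == "__main__":
    print(run_rules([trash_rule], {"location": "home", "weekday": "Tuesday", "hour": 19}))
    print(run_rules([trash_rule], {"location": "work", "weekday": "Tuesday", "hour": 19}))
```

In this framing, the controlled natural language interface would be the layer that builds the predicate from the user's sentence; the sketch hard-codes it instead.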
Role of Semantic Web
• Was Haystack a Semantic Web Application?
  - How could it be if created before Semantic Web?
• Wasn't "working on the semantic web"
  - Rather was working on a problem users have
  - Seeking solutions from any discipline
• What exactly can the Semantic Web contribute?
• What makes a "Semantic Web application"?
• What use is the Semantic Web anyway?
Semantic Web Applications
• What is novel about semantic web applications?
  - Use of triple stores, RDF, inference?
  - Any such app can be emulated with "old" technology
• Motta
  - An application that leverages the semantics of its data
• Karger
  - An application whose schema is expected to change
  - Into whatever its user desires
  - Paradox: cannot leverage semantics of data
  - Creates major challenges for user interface design
Role of Semantic Web
• The Semantic Web holds a big part of the answer to a major problem in end-user information management
• Key contribution is "mutable schema" paradigm
• So yes, Haystack was a Semantic Web Application
Rest of Talk
• Continue end-user problem-centric perspective
  - And thesis that Semantic Web can solve it
• Convince you end-user data problem is serious
  - Channel a CHI talk by Voida, Harman, Al-Ani
• 3 Semantic Web applications that chip away at it
  - All designed around "open schema" principle
  - Supercharge spreadsheets for data interaction
  - A standard for data & visualization in HTML
  - An end-user programmable data-handling agent
• Wrap-up thoughts about SW and ESWC
Homebrew Databases
• A paper by Voida, Harman, Al-Ani
• Published at CHI 2010
• Highlights how bad this problem is
• An embedded user study
  - No tools built
  - Went where users were
  - Watched what they did
  - Tried to understand why
  - SW community needs this badly
"I WANT MY SPREADSHEET DATABASE TO WORK BETTER"
SUPERCHARGING SPREADSHEETS FOR DATA MANAGEMENT
Eirik Bakke, David Karger, Rob Miller. A spreadsheet-based user interface for managing plural relationships in structured data [CHI 2011]
Spreadsheets
• As we've seen, a dominant tool for data
• But limited
  - Flat table
  - Hard to represent entity-relationship graphs
  - No types
  - No support for many-many relationships
  - No joins
• Can we add power but preserve look and feel?
  - Yes, with nested cells and data wormholes
Spreadsheets
Alternative: Related Worksheets
One-to-Many / Many-to-Many Relationships
A database with one-to-many and many-to-many relationships, accessed through a general-purpose spreadsheet-like UI
"Related Worksheets" application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept: Data Types for Worksheet Columns
2nd New Concept: Array Types
3rd New Concept: Reference Types ("Each cell in this column refers to a row in a different worksheet")
3rd New Concept: Reference Types. Reference values are displayed recursively, as configured by the user in the "Show/Hide Columns" tree
4th New Concept: Relationships are bidirectional
Teleport Feature (Press Ctrl+Space)
Result: The ability to keep track of one-to-many / many-to-many relationships from within a spreadsheet-like user interface
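To make these concepts concrete, here is a small Python sketch of an array-of-references column and of the reverse direction of a relationship derived from the forward one. It is not the Related Worksheets implementation; the `courses` and `instructors` worksheets, their contents, and the helper functions are hypothetical.

```python
from collections import defaultdict

# Two worksheets, keyed by row id. The "instructors" column of `courses` is an
# array of references: each entry points at a row in the `instructors` worksheet.
instructors = {
    1: {"name": "Ada"},
    2: {"name": "Grace"},
}
courses = {
    10: {"title": "Databases", "instructors": [1, 2]},
    11: {"title": "Interfaces", "instructors": [2]},
}


def resolve(course_id):
    """Follow references forward: show a course row with its instructor rows nested."""
    row = courses[course_id]
    return {"title": row["title"],
            "instructors": [instructors[i]["name"] for i in row["instructors"]]}


def reverse_index():
    """Derive the opposite direction: for each instructor, the courses referring to it."""
    taught = defaultdict(list)
    for course_id, row in courses.items():
        for instructor_id in row["instructors"]:
            taught[instructor_id].append(course_id)
    return dict(taught)


if __name__ == "__main__":
    print(resolve(10))
    for instructor_id, course_ids in reverse_index().items():
        titles = [courses[c]["title"] for c in course_ids]
        print(instructors[instructor_id]["name"], "teaches", titles)
```

Because the reverse index is computed from the forward references, the two directions cannot drift apart, which is one plausible reading of "relationships are bidirectional".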
User Study
• Hypothesis: Excel-proficient users will be faster at lookup (read-only) tasks on a database stored in normalized form in our system vs. Microsoft Excel
• Mechanical Turk
• Remotely screen-recorded
• Lookup tasks on course catalog database in Excel vs. Related Worksheets
• Between-subjects study
• Initial qualification task on Excel only
Results: Demographics
Results: Correctness and Features Used
Results: Timing
p < 0.05 for Task 4 only (41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  - But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  - Column type system, array types, reference types
  - Bidirectional hierarchical views of reference types
  - to handle plural relationships
• User Study shows system usable without instruction, sometimes faster than Excel (more study needed)
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  - Which expects arbitrary schemas
• Could easily be ported to the web
• E.g. on Google Spreadsheets
• So is a Semantic Web application, despite absence of Semantic Web technology
"I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB"
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
• Exhibit: Lightweight Structured Data Publishing [WWW 2007]
• The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
• Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
good old days: early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
• High Benefit
  - Find the info I need
  - Discover new things
• Low Cost
  - One click fetch
  - Instant availability
  - No application to master
Author
• High Benefit
  - Be seen
  - Share what I know
  - Impress people
  - Readers' gratitude
• Low Cost
  - No new skills needed
  - Easy to author
Structured Data is Better
• Easier to manage
  - Separate content (data) from presentation
  - Edit data, changes propagate to all uses of it
  - Templates help all data look consistent
• Easier to navigate
  - Sorting and filtering
  - Faceted browsing
  - Aggregate visualizations: compare/contrast
• Easier to reuse
  - Extract from original source
  - Blend with other data
  - Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Why?
• Professional sites implement a rich data model
  - Information stored in databases
  - Extracted using complex queries
  - Results feed into templating web frameworks
• Plain authors left behind
  - Can't install/operate/define a database
  - Can't write the queries to extract the data
  - Limited to unstructured text pages (and blogs, wikis)
  - Less power to communicate effectively
  - Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So can communicate like professional web sites
  - their incentive
• And their data is available in high fidelity for combination and reuse with other data
  - social benefit
Do We Need This?
• Analyzed 21 Blogs in 2009
  - Top 10 and Trending 10 from Technorati
  - Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  - Half described in text
  - Half as HTML table or static info-graphic
  - None had interactive data
Approach
• Publishing data is easy
  - Just put a spreadsheet online
  - Rows are items, columns are properties
• Identify key elements of interactive visualizations
  - Like spreadsheet charts
• Add them to the HTML document vocabulary
  - Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in Spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns used in chart
Example: HTML
• Standardized vocabulary for document structure
  - Paragraphs, headings, italics, quotations
• A description of the document
  - Not an imperative program for generating it
• User describes structure
  - Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  - Data
  - Visualizations of that data
  - Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  - Describe, don't program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  - Simple enough for regular people to use
  - General enough to capture a good part of what people want to do with structured data authoring/publishing
sort
filter
search
template
Image
HTML: <img src=…>
Data
• Items (Recipes)
• Each has properties
  - Title
  - Source magazine
  - Publication date
  - Rating
  - Ingredients
• Publish as spreadsheet
  - One item per row
  - Columns for properties
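One way to picture "one item per row, columns for properties" is to turn a spreadsheet export into a list of items. The sketch below is a minimal Python illustration under stated assumptions: the CSV text, the column names, the `;` separator for multi-valued cells, and the `{"items": [...]}` wrapper are all invented for the example rather than taken from a documented format.

```python
import csv
import io
import json

# A hypothetical spreadsheet export: one recipe per row, one property per column.
SHEET = """title,source,date,rating,ingredients
Pesto Pasta,Food Monthly,2009-05,4,basil;pasta;pine nuts
Tomato Soup,Home Cooking,2008-11,3,tomato;onion
"""


def rows_to_items(text, multi_valued=("ingredients",), sep=";"):
    """Convert CSV rows into item dicts, splitting multi-valued cells into lists."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for column in multi_valued:
            item[column] = [v.strip() for v in item[column].split(sep) if v.strip()]
        items.append(item)
    return items


if __name__ == "__main__":
    print(json.dumps({"items": rows_to_items(SHEET)}, indent=2))
```

Splitting the ingredients cell is what turns a plain table into a multi-valued table, so a single row can match several facet values.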
Views
• Show a collection
  - Bar chart
  - Sortable list (here)
  - Map
  - Thumbnail set
• Bound to properties
  - Sort by property
  - Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
Semantic Web Applications
bull What is novel about semantic web applicationsndash Use of triple stores RDF inferencendash Any such app can be emulated with ldquooldrdquo technology
bull Mottandash An application that leverages the semantics of its data
bull Kargerndash An application whose schema is expected to changendash Into whatever its user desiresndash Paradox cannot leverage semantics of datandash Creates major challenges for user interface design
Role of Semantic Webbull The Semantic Web holds a big part of the answer
to a major problem in end-user information management
bull Key contribution is ldquomutable schemardquo paradigmbull So yes Haystack was a Semantic Web Application
Rest of Talkbull Continue end-user problem-centric perspective
ndash And thesis that Semantic Web can solve itbull Convince you end-user data problem is serious
ndash Channel a CHI talk by Voida Harman Al-Anibull 3 Semantic Web applications that chip away at it
ndash All designed around ldquoopen schemardquo principlendash Supercharge spreadsheets for data interactionndash A standard for data amp visualization in HTMLndash An end-user programmable data-handling agent
bull Wrap-up thoughts about SW and ESWC
Homebrew Databasesbull A paper by Voida Harman Al Anibull Published at CHI 2010bull Highlights how bad this problem isbull An embedded user study
ndash No tools builtndash Went where users werendash Watched what they didndash Tried to understand whyndash SW community needs this badly
ldquoI WANT MY SPREADSHEET DATABASE TO WORK BETTERrdquo
SUPERCHARGING SPREADSHEETS FOR DATA MANAGEMENT
Eirik Bakke David Karger Rob MillerA spreadsheet-based user interface for managing plural relationships in structured data [CHI 2011]
Spreadsheetsbull As wersquove seen a dominant tool for databull But limited
ndash Flat tablendash Hard to represent entity-relationship graphsndash No typesndash No support for many-many relationshipsndash No joins
bull Can we add power but preserve look and feelndash Yes with nested cells and data wormholes
Spreadsheets
Alternative Related Worksheets
One-to-ManyMany-to-ManyRelationships
A database with one-to-many and many-to-many relationshipsaccessed through a general-purpose spreadsheet-like UI
ldquoRelated Worksheetsrdquo application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept Data Types for Worksheet Columns
2nd New Concept Array Types
3rd New Concept Reference Types(ldquoEach cell in this column refers to a row in a different worksheetrdquo)
3rd New Concept Reference TypesReference values are displayed recursively as configured
by the user in the ldquoShowHide Columnsrdquo tree
1
4th New Concept Relationships are bidirectional
2
4th New Concept Relationships are bidirectional
Teleport Feature(Press Ctrl+Space)
1
2
4th New Concept Relationships are bidirectional
Result The ability to keep track of one-to-manymany-to-manyrelationships from within a spreadsheet-like user interface
User Study
User Studybull Hypothesis Excel-proficient users will be faster at lookup
(read-only) tasks on a database stored in normalized form in our system vs Microsoft Excel
User Studybull Mechanical Turkbull Remotely screen-recordedbull Lookup tasks on course catalog database in
Excel vs Related Worksheets bull Between-subjects studybull Initial qualification task on Excel only
User Study
ResultsDemographics
Results Correctness and Features Used
Results Timing
p lt 005 for Task 4 only
(41 faster)
Conclusionbull Spreadsheets are great homebrew databases
ndash But struggle with multiple tables nesting joinsbull Enhance spreadsheet paradigm with
ndash Column type system array types reference typesndash Bidirectional hierarchical views of reference typesndash to handle plural relationships
bull User Study shows system usable without instruction sometimes faster than Excel (more study needed)
A Semantic Web Applicationbull Implemented as a JavaNetbeans desktop appbull Which connects to SQL databases using JDBC
bull But key contribution is interaction paradigmndash Which expects arbitrary schemas
bull Could easily be ported to the webbull Eg on Google Spreadsheets
bull So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
Schools/Universities  40%
Personal/Hobby        25%
Organization          19%
News                   6%
Commercial             4%
Library                4%
Conference             2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph               27%
Multi-valued Table  32%
Table               41%

Cyclic   22%
Acyclic  78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON                69%
Google Spreadsheet  32%
Bibtex               2%
RDF                  1%
Excel                0.1%
CSV                  0.06%
Freebase             0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title)  59%
Timeline                        19%
Map                             14%
Chart                            8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
"I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!"
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel
Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
[Legend: actions, conditions, predicates, properties, entities]
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
  – Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
  – ATOM/RSS/REST APIs, end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules:
  when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
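The "when <something happens> do <action>" shape of Example 1 can be sketched in a few lines of Python. This is only an illustration of the rule pattern, not Atomate's implementation: the `World` and `Rule` classes, the `fire` helper, the "home" location label, and the choice of 17:00 to 21:59 as "evening" are all assumptions made for the example. It also tests the current state (being at home) rather than the arrival event the rule text implies.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass
class World:
    """Snapshot of the user's context (hypothetical fields)."""
    now: datetime
    my_location: str


@dataclass
class Rule:
    """A 'when <condition> do <action>' pairing."""
    description: str
    condition: Callable[[World], bool]
    action: Callable[[World], str]


def is_tuesday_evening(world: World) -> bool:
    # weekday() == 1 is Tuesday; "evening" is assumed to mean 17:00-21:59.
    return world.now.weekday() == 1 and 17 <= world.now.hour < 22


trash_rule = Rule(
    description="Remind me to take the trash out when I get home on Tuesday evenings",
    condition=lambda w: w.my_location == "home" and is_tuesday_evening(w),
    action=lambda w: "Reminder: take the trash out",
)


def fire(rules, world):
    """Return the messages produced by every rule whose condition holds."""
    return [rule.action(world) for rule in rules if rule.condition(world)]
```

Calling `fire([trash_rule], World(now=..., my_location="home"))` with a Tuesday-evening timestamp yields the reminder; any other day, hour, or location yields an empty list.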
Role of Semantic Web
• The Semantic Web holds a big part of the answer to a major problem in end-user information management
• Key contribution is "mutable schema" paradigm
• So yes, Haystack was a Semantic Web Application
Rest of Talk
• Continue end-user problem-centric perspective
  – And thesis that Semantic Web can solve it
• Convince you end-user data problem is serious
  – Channel a CHI talk by Voida, Harman, Al-Ani
• 3 Semantic Web applications that chip away at it
  – All designed around "open schema" principle
  – Supercharge spreadsheets for data interaction
  – A standard for data & visualization in HTML
  – An end-user programmable data-handling agent
• Wrap-up thoughts about SW and ESWC
Homebrew Databases
• A paper by Voida, Harman, Al-Ani
• Published at CHI 2010
• Highlights how bad this problem is
• An embedded user study
  – No tools built
  – Went where users were
  – Watched what they did
  – Tried to understand why
  – SW community needs this badly
"I WANT MY SPREADSHEET DATABASE TO WORK BETTER"
SUPERCHARGING SPREADSHEETS FOR DATA MANAGEMENT
Eirik Bakke, David Karger, Rob Miller
A spreadsheet-based user interface for managing plural relationships in structured data [CHI 2011]
Spreadsheets
• As we've seen, a dominant tool for data
• But limited
  – Flat table
  – Hard to represent entity-relationship graphs
  – No types
  – No support for many-many relationships
  – No joins
• Can we add power but preserve look and feel?
  – Yes, with nested cells and data wormholes
Spreadsheets
Alternative: Related Worksheets
One-to-Many / Many-to-Many Relationships
A database with one-to-many and many-to-many relationships, accessed through a general-purpose spreadsheet-like UI
"Related Worksheets" application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept: Data Types for Worksheet Columns
2nd New Concept: Array Types
3rd New Concept: Reference Types ("Each cell in this column refers to a row in a different worksheet")
  – Reference values are displayed recursively, as configured by the user in the "Show/Hide Columns" tree
4th New Concept: Relationships are bidirectional
Teleport Feature (Press Ctrl+Space)
Result: The ability to keep track of one-to-many / many-to-many relationships from within a spreadsheet-like user interface
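The reference-type and bidirectionality concepts above can be illustrated with a small Python sketch. It is not the Related Worksheets implementation (described later as a Java desktop app backed by SQL); the `Worksheet` class, the `follow` and `inverse` helpers, and the course/instructor data are all hypothetical. The point is only that one stored reference column can be read in both directions.

```python
from dataclasses import dataclass, field


@dataclass
class Worksheet:
    """A worksheet as a mapping from row id to {column name: value}."""
    name: str
    rows: dict = field(default_factory=dict)


instructors = Worksheet("Instructors", {
    "i1": {"name": "Ada"},
    "i2": {"name": "Grace"},
})

# "taught_by" plays the role of an array-typed reference column:
# each cell holds the ids of rows in the Instructors worksheet.
courses = Worksheet("Courses", {
    "c1": {"title": "Databases", "taught_by": ["i1", "i2"]},
    "c2": {"title": "HCI", "taught_by": ["i2"]},
})


def follow(sheet, row_id, column, target):
    """Forward direction: resolve a reference cell to the rows it points at."""
    return [target.rows[ref] for ref in sheet.rows[row_id][column]]


def inverse(source, column, target_row_id):
    """Reverse direction: every row in `source` whose reference cell names the target."""
    return [row for row in source.rows.values() if target_row_id in row[column]]
```

`follow(courses, "c1", "taught_by", instructors)` lists the instructors of one course, while `inverse(courses, "taught_by", "i2")` lists the courses of one instructor, without storing the relationship twice.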
User Study
• Hypothesis: Excel-proficient users will be faster at lookup (read-only) tasks on a database stored in normalized form in our system vs. Microsoft Excel
• Mechanical Turk
• Remotely screen-recorded
• Lookup tasks on course catalog database in Excel vs. Related Worksheets
• Between-subjects study
• Initial qualification task on Excel only
Results: Demographics
Results: Correctness and Features Used
Results: Timing
• p < 0.05 for Task 4 only (41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system, array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User study shows system usable without instruction, sometimes faster than Excel (more study needed)
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g. on Google Spreadsheets
• So is a Semantic Web application despite absence of Semantic Web technology
"I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB"
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
Exhibit: Lightweight Structured Data Publishing [WWW 2007]
The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
• High benefit
  – Find the info I need
  – Discover new things
• Low cost
  – One click fetch
  – Instant availability
  – No application to master
Author
• High benefit
  – Be seen
  – Share what I know
  – Impress people
  – Readers' gratitude
• Low cost
  – No new skills needed
  – Easy to author
Structured Data is Better
• Easier to manage
  – Separate content (data) from presentation
  – Edit data, changes propagate to all uses of it
  – Templates help all data look consistent
• Easier to navigate
  – Sorting and filtering
  – Faceted browsing
  – Aggregate visualizations: compare/contrast
• Easier to reuse
  – Extract from original source
  – Blend with other data
  – Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or HTML
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can't install/operate/define a database
  – Can't write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static info-graphic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns are used in the chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don't program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
sort
filter
search
template
Image
HTML: <img src=…>
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
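As a rough sketch of "one item per row, columns for properties", the Python below reads a small CSV sheet into a list of items, each a mapping from property name to value. The recipe rows, the magazine name, the `rows_to_items` helper, and the convention of separating multiple ingredients with `;` are all assumptions made for illustration; they are not Exhibit's actual import format.

```python
import csv
import io

# Hypothetical recipe sheet: the header row names the properties.
SHEET = """title,source,date,rating,ingredients
Pad Thai,Hypothetical Cooking Monthly,2010-03,4,noodles;peanuts;lime
Guacamole,Hypothetical Cooking Monthly,2010-05,5,avocado;lime
"""


def rows_to_items(text, multi_valued=("ingredients",)):
    """Turn each spreadsheet row into one item: a dict from property name to value."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for prop in multi_valued:
            # A cell such as "avocado;lime" becomes a list of values.
            item[prop] = [v for v in item[prop].split(";") if v]
        items.append(item)
    return items


items = rows_to_items(SHEET)
```

Every cell stays a string here (the rating included); attaching types to columns is a separate step this sketch leaves out.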
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML:
    <div ex:role="view" ex:viewClass="list" ex:sort="price"></div>
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML:
    <div ex:role="facet" ex:expression="ingredient"></div>
Lenses
• Template for item
• HTML with "fill in the blanks"
• HTML:
    <div ex:role="lens">
      <b><div ex:content="title"></div></b>
      <div ex:content="date"></div>
    </div>
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
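The four primitives above can be mimicked over plain Python data to show how they compose. This is a sketch under stated assumptions, not Exhibit's JavaScript: the recipe items and the helper names `facet_counts`, `restrict`, `list_view`, and `lens` are invented for the example.

```python
from collections import Counter

# Data: a few hypothetical recipe items, one dict per spreadsheet row.
ITEMS = [
    {"title": "Pad Thai", "rating": 4, "ingredients": ["noodles", "peanuts", "lime"]},
    {"title": "Guacamole", "rating": 5, "ingredients": ["avocado", "lime"]},
    {"title": "Peanut Soup", "rating": 3, "ingredients": ["peanuts", "stock"]},
]


def as_list(value):
    """Treat single-valued and multi-valued properties uniformly."""
    return value if isinstance(value, list) else [value]


def facet_counts(items, prop):
    """Facet: the values of one property, with how many items carry each value."""
    return Counter(v for item in items for v in as_list(item[prop]))


def restrict(items, prop, chosen):
    """Keep the items that have at least one of the chosen facet values."""
    return [i for i in items if set(as_list(i[prop])) & set(chosen)]


def list_view(items, sort_by):
    """View: a collection ordered by one property."""
    return sorted(items, key=lambda i: i[sort_by])


def lens(item, template="{title} (rating {rating})"):
    """Lens: fill the blanks of a template with one item's properties."""
    return template.format(**item)


# A "page": pick 'peanuts' in the ingredient facet, sort by rating, render each item.
page = [lens(i) for i in list_view(restrict(ITEMS, "ingredients", ["peanuts"]), "rating")]
```

The page author only names properties ("ingredients", "rating", "title"); nothing in the helpers depends on the recipe schema, which is the sense in which the vocabulary expects arbitrary schemas.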
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CAN’T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW ’10]
Motivation
• We’re acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen, and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places, and things
ATOM/RSS/REST APIs, end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference, 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
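The "when <something happens> do <action>" rule form can be sketched in a few lines of Python. This is only an illustration of the idea, not Atomate's implementation: the `Rule` class, the `run_rules` helper, and the state keys (`location`, `weekday`, `time_of_day`) are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical snapshot of the user's world, e.g. assembled from web feeds.
State = Dict[str, str]


@dataclass
class Rule:
    """A 'when <condition> do <action>' rule."""
    name: str
    condition: Callable[[State], bool]
    action: Callable[[State], str]


def run_rules(rules: List[Rule], state: State) -> List[str]:
    """Fire every rule whose condition holds for the current state."""
    return [rule.action(state) for rule in rules if rule.condition(state)]


# "Remind me to take the trash out when I get home on Tuesday evenings"
trash_rule = Rule(
    name="trash reminder",
    condition=lambda s: (
        s["location"] == "home"
        and s["weekday"] == "Tuesday"
        and s["time_of_day"] == "evening"
    ),
    action=lambda s: "Reminder: take the trash out",
)

at_home = {"location": "home", "weekday": "Tuesday", "time_of_day": "evening"}
at_work = {"location": "office", "weekday": "Tuesday", "time_of_day": "evening"}

fired = run_rules([trash_rule], at_home)
not_fired = run_rules([trash_rule], at_work)
```

In this sketch the condition is a predicate over properties of people, places, and things, and the action is whatever should happen once the predicate holds; a controlled natural language front end would be one way to let end users write such rules without code.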
Homebrew Databases
• A paper by Voida, Harman, and Al Ani
• Published at CHI 2010
• Highlights how bad this problem is
• An embedded user study
  – No tools built
  – Went where users were
  – Watched what they did
  – Tried to understand why
  – SW community needs this badly
“I WANT MY SPREADSHEET DATABASE TO WORK BETTER”
SUPERCHARGING SPREADSHEETS FOR DATA MANAGEMENT
Eirik Bakke, David Karger, Rob Miller. A spreadsheet-based user interface for managing plural relationships in structured data [CHI 2011]
Spreadsheets
• As we’ve seen, a dominant tool for data
• But limited
  – Flat table
  – Hard to represent entity-relationship graphs
  – No types
  – No support for many-many relationships
  – No joins
• Can we add power but preserve look and feel?
  – Yes, with nested cells and data wormholes
Spreadsheets
Alternative: Related Worksheets
One-to-Many/Many-to-Many Relationships
A database with one-to-many and many-to-many relationships, accessed through a general-purpose spreadsheet-like UI
“Related Worksheets” application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept: Data Types for Worksheet Columns
2nd New Concept: Array Types
3rd New Concept: Reference Types (“Each cell in this column refers to a row in a different worksheet”)
3rd New Concept: Reference Types. Reference values are displayed recursively, as configured by the user in the “Show/Hide Columns” tree
4th New Concept: Relationships are bidirectional
Teleport Feature (Press Ctrl+Space)
Result: The ability to keep track of one-to-many/many-to-many relationships from within a spreadsheet-like user interface
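A rough Python sketch of reference-typed, array-valued columns and their bidirectional view follows. It is not the Related Worksheets code (which the talk describes as a Java desktop application); the worksheet contents and the helper names `reverse_references` and `show_nested` are assumptions made for illustration.

```python
from collections import defaultdict

# Two hypothetical worksheets, keyed by row id.
# The "instructors" column has an array-of-references type:
# each value points at a row in the "people" worksheet.
courses = {
    "c1": {"title": "Algorithms", "instructors": ["p1", "p2"]},
    "c2": {"title": "Databases", "instructors": ["p2"]},
}
people = {
    "p1": {"name": "Ada"},
    "p2": {"name": "Grace"},
}


def reverse_references(sheet, column):
    """Derive the opposite direction of a reference column,
    so the relationship can be read from either worksheet."""
    reverse = defaultdict(list)
    for row_id, row in sheet.items():
        for target_id in row[column]:
            reverse[target_id].append(row_id)
    return dict(reverse)


def show_nested(row, column, target_sheet, shown_column):
    """Display referenced rows inline, one level deep,
    showing only the column the user chose."""
    return [target_sheet[ref][shown_column] for ref in row[column]]


# Courses -> people (forward) and people -> courses (derived reverse).
instructor_names = show_nested(courses["c1"], "instructors", people, "name")
courses_taught = reverse_references(courses, "instructors")
```

The point of the sketch is that the many-to-many relationship is stored once, on the courses side, and the people side is computed from it, which is one simple way to keep the two directions consistent.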
User Study
• Hypothesis: Excel-proficient users will be faster at lookup (read-only) tasks on a database stored in normalized form in our system vs. Microsoft Excel
• Mechanical Turk
• Remotely screen-recorded
• Lookup tasks on course catalog database in Excel vs. Related Worksheets
• Between-subjects study
• Initial qualification task on Excel only
Results: Demographics
Results: Correctness and Features Used
Results: Timing
• p < 0.05 for Task 4 only (41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system, array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User study shows system usable without instruction, sometimes faster than Excel (more study needed)
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
  – E.g. on Google Spreadsheets
• So is a Semantic Web application despite absence of Semantic Web technology
“I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB”
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
• Exhibit: Lightweight Structured Data Publishing [WWW 2007]
• The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
• Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
good old days: early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
• High benefit: find the info I need; discover new things
• Low cost: one-click fetch; instant availability; no application to master
Author
• High benefit: be seen; share what I know; impress people; readers’ gratitude
• Low cost: no new skills needed; easy to author
Structured Data is Better
• Easier to manage
  – Separate content (data) from presentation
  – Edit data; changes propagate to all uses of it
  – Templates help all data look consistent
• Easier to navigate
  – Sorting and filtering
  – Faceted browsing
  – Aggregate visualizations: compare/contrast
• Easier to reuse
  – Extract from original source
  – Blend with other data
  – Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or HTML
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can’t install/operate/define a database
  – Can’t write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static infographic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns are used in chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don’t program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
sort
filter
search
template
Image
HTML: <img src=…
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
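To make "one item per row, columns for properties" concrete, here is a small Python sketch that reads a spreadsheet exported as CSV into a list of items. The sample rows, the magazine names, and the convention of separating multiple ingredients with ";" are invented for this example and are not taken from the talk.

```python
import csv
import io

# Hypothetical recipe spreadsheet exported as CSV.
# Assumption: ';' separates multiple values inside one cell.
SHEET = """title,source,date,rating,ingredients
Pad Thai,Magazine A,2009-03-01,4,noodles;peanuts;lime
Guacamole,Magazine B,2008-07-15,5,avocado;lime
"""


def rows_to_items(text, multi_valued=("ingredients",)):
    """Turn each spreadsheet row into an item: a dict of property -> value."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for column in multi_valued:
            # A multi-valued property becomes a list of values.
            item[column] = item[column].split(";")
        items.append(item)
    return items


items = rows_to_items(SHEET)
```

Each item is just a bag of named properties, so the same loading code works for any schema the author happens to put in the header row.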
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
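What a list view bound to a sort property has to do can be sketched as follows. This is a stand-in written in Python, not Exhibit's JavaScript; the `list_view` function and the sample items (including their `price` values) are hypothetical.

```python
def list_view(items, sort_by, descending=False):
    """Return the collection ordered by one property,
    as a sortable list view would display it."""
    return sorted(items, key=lambda item: item[sort_by], reverse=descending)


# Hypothetical items with a numeric "price" property.
recipes = [
    {"title": "Pad Thai", "price": 12},
    {"title": "Guacamole", "price": 5},
    {"title": "Risotto", "price": 9},
]

titles_by_price = [r["title"] for r in list_view(recipes, sort_by="price")]
titles_by_price_desc = [
    r["title"] for r in list_view(recipes, sort_by="price", descending=True)
]
```

The author only names the property; the ordering logic is generic, which is what lets the same view work over any data set.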
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
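The facet behaviour described above (list the values of a property, let the user pick, restrict the collection) can be sketched in Python as below. The function names `facet_counts` and `apply_facet` and the sample data are assumptions for illustration, not part of Exhibit.

```python
from collections import Counter


def facet_counts(items, prop):
    """Count how many items carry each value of the property;
    these are the choices a facet would offer."""
    counts = Counter()
    for item in items:
        counts.update(set(item[prop]))
    return counts


def apply_facet(items, prop, selected):
    """Keep items that have at least one selected value.
    An empty selection means no restriction."""
    if not selected:
        return list(items)
    return [item for item in items if selected & set(item[prop])]


# Hypothetical items with a multi-valued "ingredient" property.
recipes = [
    {"title": "Pad Thai", "ingredient": ["noodles", "peanuts", "lime"]},
    {"title": "Guacamole", "ingredient": ["avocado", "lime"]},
    {"title": "Risotto", "ingredient": ["rice", "stock"]},
]

counts = facet_counts(recipes, "ingredient")
with_lime = [r["title"] for r in apply_facet(recipes, "ingredient", {"lime"})]
unfiltered = apply_facet(recipes, "ingredient", set())
```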
Lenses
• Template for item
• HTML with “fill in the blanks”
• HTML:
<div ex:role="lens">
  <b><div ex:content="title"></div></b>
  <div ex:content="date"></div>
</div>
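The "fill in the blanks" idea behind a lens can be illustrated with a tiny template filler. This is a simplified stand-in, assuming placeholders of the form `<span content="...">` rather than Exhibit's own attribute syntax; `LENS` and `render_lens` are hypothetical names.

```python
import html
import re

# Simplified lens template: each empty <span content="prop"> is a blank.
LENS = '<div><b><span content="title"></span></b> <span content="date"></span></div>'


def render_lens(template, item):
    """Fill each content="prop" placeholder with the item's value,
    escaped so that data cannot inject markup."""
    def fill(match):
        prop = match.group(1)
        return "<span>" + html.escape(str(item.get(prop, ""))) + "</span>"

    return re.sub(r'<span content="(\w+)"></span>', fill, template)


rendered = render_lens(LENS, {"title": "Fish & Chips", "date": "2009-03-01"})
```

The template says which properties to show and how to lay them out; the same template is then applied to every item in the collection.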
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, thumbnails, maps, scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization?
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People’s experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. an HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• By fetching and analyzing these exhibits we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities: 40%
Personal/Hobby: 25%
Organization: 19%
News: 6%
Commercial: 4%
Library: 4%
Conference: 2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph: 27%
Multi-valued table: 32%
Table: 41%
Cyclic: 22%
Acyclic: 78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON: 69%
Google Spreadsheet: 32%
BibTeX: 2%
RDF: 1%
Excel: 0.1%
CSV: 0.06%
Freebase: 0.06%
Hmm…
SUPERCHARGING SPREADSHEETS FOR DATA MANAGEMENT
Eirik Bakke David Karger Rob MillerA spreadsheet-based user interface for managing plural relationships in structured data [CHI 2011]
Spreadsheetsbull As wersquove seen a dominant tool for databull But limited
ndash Flat tablendash Hard to represent entity-relationship graphsndash No typesndash No support for many-many relationshipsndash No joins
bull Can we add power but preserve look and feelndash Yes with nested cells and data wormholes
Spreadsheets
Alternative Related Worksheets
One-to-ManyMany-to-ManyRelationships
A database with one-to-many and many-to-many relationshipsaccessed through a general-purpose spreadsheet-like UI
ldquoRelated Worksheetsrdquo application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept Data Types for Worksheet Columns
2nd New Concept Array Types
3rd New Concept Reference Types(ldquoEach cell in this column refers to a row in a different worksheetrdquo)
3rd New Concept Reference TypesReference values are displayed recursively as configured
by the user in the ldquoShowHide Columnsrdquo tree
1
4th New Concept Relationships are bidirectional
2
4th New Concept Relationships are bidirectional
Teleport Feature(Press Ctrl+Space)
1
2
4th New Concept Relationships are bidirectional
Result The ability to keep track of one-to-manymany-to-manyrelationships from within a spreadsheet-like user interface
User Study
User Studybull Hypothesis Excel-proficient users will be faster at lookup
(read-only) tasks on a database stored in normalized form in our system vs Microsoft Excel
User Studybull Mechanical Turkbull Remotely screen-recordedbull Lookup tasks on course catalog database in
Excel vs Related Worksheets bull Between-subjects studybull Initial qualification task on Excel only
User Study
ResultsDemographics
Results Correctness and Features Used
Results Timing
p lt 005 for Task 4 only
(41 faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system, array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User study shows system usable without instruction, sometimes faster than Excel (more study needed)
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g. on Google Spreadsheets
• So is a Semantic Web application, despite absence of Semantic Web technology
"I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB"
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
Exhibit: Lightweight Structured Data Publishing [WWW 2007]
The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
good old days: early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
• High benefit: find the info I need; discover new things
• Low cost: one-click fetch; instant availability; no application to master
Author
• High benefit: be seen; share what I know; impress people; readers' gratitude
• Low cost: no new skills needed; easy to author
Structured Data is Better
• Easier to manage
  – Separate content (data) from presentation
  – Edit data, changes propagate to all uses of it
  – Templates help all data look consistent
• Easier to navigate
  – Sorting and filtering
  – Faceted browsing
  – Aggregate visualizations: compare/contrast
• Easier to reuse
  – Extract from original source
  – Blend with other data
  – Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can't install/operate/define a database
  – Can't write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static info-graphic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns used in chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don't program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
sort
filter
search
template
Image
HTML: <img src=…
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
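As a rough sketch of "one item per row, columns for properties", the following Python turns a small spreadsheet export into a list of items. The CSV layout, the `;` separator for multi-valued cells, the function name `rows_to_items`, and the two recipe rows are all made up for illustration; the talk does not specify any of them.

```python
import csv
import io

# Hypothetical spreadsheet export: header row = properties, other rows = items.
SHEET = """title,source,date,rating,ingredients
Pad Thai,Example Monthly,2009-03,4,noodles;peanuts;lime
Guacamole,Sample Kitchen,2008-07,5,avocado;lime
"""


def rows_to_items(text, multi_valued=("ingredients",)):
    """Read each row as one item; split assumed multi-valued columns on ';'."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for col in multi_valued:
            item[col] = item[col].split(";")
        items.append(item)
    return items


items = rows_to_items(SHEET)
print(items[0]["title"], items[0]["ingredients"])
```

Keeping multi-valued properties as lists matters later, because a property such as ingredients can then be filtered on one value at a time.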
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
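What "bound to properties" means for a sortable list can be shown with a tiny Python sketch: the view is generic, and the author only names the property to sort by. The function name `list_view` and the sample items are assumptions for illustration, not part of Exhibit.

```python
def list_view(items, sort_by, descending=False):
    """Return the collection ordered by one named property (hypothetical helper)."""
    return sorted(items, key=lambda item: item[sort_by], reverse=descending)


# Made-up sample data.
items = [
    {"title": "Pad Thai", "price": 12},
    {"title": "Guacamole", "price": 7},
    {"title": "Ramen", "price": 10},
]

for item in list_view(items, sort_by="price"):
    print(item["title"], item["price"])
```

Changing the view from price order to title order is a one-word change to the binding (`sort_by="title"`), with no change to the view code.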
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
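A facet, as described above, does two things: it lists the values of a property with counts, and it restricts the collection to items matching the values the user clicked. A minimal Python sketch follows, with hypothetical function names and made-up recipes; it is not Exhibit's own code.

```python
from collections import Counter


def facet_counts(items, prop):
    """Count how many items carry each value of a multi-valued property."""
    counts = Counter()
    for item in items:
        counts.update(item.get(prop, []))
    return counts


def apply_facet(items, prop, selected):
    """Keep items having at least one selected value; no selection keeps all."""
    if not selected:
        return list(items)
    wanted = set(selected)
    return [item for item in items if wanted & set(item.get(prop, []))]


# Made-up sample data.
recipes = [
    {"title": "Pad Thai", "ingredient": ["noodles", "peanuts", "lime"]},
    {"title": "Guacamole", "ingredient": ["avocado", "lime"]},
    {"title": "Ramen", "ingredient": ["noodles", "egg"]},
]

print(facet_counts(recipes, "ingredient")["lime"])                           # 2
print([r["title"] for r in apply_facet(recipes, "ingredient", ["noodles"])])  # ['Pad Thai', 'Ramen']
```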
Lenses
• Template for item
• HTML with "fill in the blanks"
• HTML:
    <div ex:role="lens">
      <b><div ex:content="title"></div></b>
      <div ex:content="date"></div>
    </div>
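The "fill in the blanks" idea behind a lens can be sketched with Python's standard `string.Template`. The template markup, the name `render_lens`, and the sample item are assumptions for illustration; Exhibit itself reads the blanks from HTML attributes rather than `$name` placeholders.

```python
from html import escape
from string import Template

# Hypothetical lens: fixed markup with blanks named after item properties.
LENS = Template("<div><b>$title</b> <span>$date</span></div>")


def render_lens(item, lens=LENS):
    """Fill each blank with the item's property value, HTML-escaped."""
    safe = {key: escape(str(value)) for key, value in item.items()}
    return lens.substitute(safe)


print(render_lens({"title": "Fish & Chips", "date": "2009-03"}))
# <div><b>Fish &amp; Chips</b> <span>2009-03</span></div>
```

The same lens is applied to every item in a collection, which is what keeps the presentation consistent when the data changes.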
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization?
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People's experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static feels becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these, we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities  40%
Personal/Hobby        25%
Organization          19%
News                   6%
Commercial             4%
Library                4%
Conference             2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph               27%
Multi-valued Table  32%
Table               41%
Cyclic              22%
Acyclic             78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON                69%
Google Spreadsheet  32%
Bibtex               2%
RDF                  1%
Excel              0.1%
CSV               0.06%
Freebase          0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title)  59%
Timeline                        19%
Map                             14%
Chart                            8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF, but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel
Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
  Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
  ATOM/RSS/REST APIs, end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
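The "when <something happens> do <action>" shape of a rule, applied to the trash reminder, can be sketched in Python as below. This is a hedged illustration only: the `Rule` class, `run_rules`, the shape of the state dictionary, and the choice of 17:00 as the start of "evening" are all assumptions, and nothing here is taken from Atomate's implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass
class Rule:
    when: Callable[[dict], bool]  # predicate over the user's current state
    do: Callable[[dict], str]     # action, here just producing a message


def run_rules(rules, state):
    """Fire every rule whose condition holds for the given state."""
    return [rule.do(state) for rule in rules if rule.when(state)]


# "Remind me to take the trash out when I get home on Tuesday evenings"
trash_rule = Rule(
    when=lambda s: s["location"] == "home"
    and s["time"].weekday() == 1   # Monday is 0, so Tuesday is 1
    and s["time"].hour >= 17,      # assumed meaning of "evening"
    do=lambda s: "Reminder: take the trash out",
)

tuesday_evening = {"location": "home", "time": datetime(2010, 4, 27, 18, 30)}
wednesday_evening = {"location": "home", "time": datetime(2010, 4, 28, 18, 30)}

print(run_rules([trash_rule], tuesday_evening))    # ['Reminder: take the trash out']
print(run_rules([trash_rule], wednesday_evening))  # []
```

The condition is a conjunction of predicates over entities and their properties (the user's location, the current time), which is the vocabulary a controlled natural language interface would need to expose.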
Spreadsheetsbull As wersquove seen a dominant tool for databull But limited
ndash Flat tablendash Hard to represent entity-relationship graphsndash No typesndash No support for many-many relationshipsndash No joins
bull Can we add power but preserve look and feelndash Yes with nested cells and data wormholes
Spreadsheets
Alternative Related Worksheets
One-to-ManyMany-to-ManyRelationships
A database with one-to-many and many-to-many relationshipsaccessed through a general-purpose spreadsheet-like UI
ldquoRelated Worksheetsrdquo application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept Data Types for Worksheet Columns
2nd New Concept Array Types
3rd New Concept Reference Types(ldquoEach cell in this column refers to a row in a different worksheetrdquo)
3rd New Concept Reference TypesReference values are displayed recursively as configured
by the user in the ldquoShowHide Columnsrdquo tree
1
4th New Concept Relationships are bidirectional
2
4th New Concept Relationships are bidirectional
Teleport Feature(Press Ctrl+Space)
1
2
4th New Concept Relationships are bidirectional
Result The ability to keep track of one-to-manymany-to-manyrelationships from within a spreadsheet-like user interface
User Study
User Studybull Hypothesis Excel-proficient users will be faster at lookup
(read-only) tasks on a database stored in normalized form in our system vs Microsoft Excel
User Studybull Mechanical Turkbull Remotely screen-recordedbull Lookup tasks on course catalog database in
Excel vs Related Worksheets bull Between-subjects studybull Initial qualification task on Excel only
User Study
ResultsDemographics
Results Correctness and Features Used
Results Timing
p lt 005 for Task 4 only
(41 faster)
Conclusionbull Spreadsheets are great homebrew databases
ndash But struggle with multiple tables nesting joinsbull Enhance spreadsheet paradigm with
ndash Column type system array types reference typesndash Bidirectional hierarchical views of reference typesndash to handle plural relationships
bull User Study shows system usable without instruction sometimes faster than Excel (more study needed)
A Semantic Web Applicationbull Implemented as a JavaNetbeans desktop appbull Which connects to SQL databases using JDBC
bull But key contribution is interaction paradigmndash Which expects arbitrary schemas
bull Could easily be ported to the webbull Eg on Google Spreadsheets
bull So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivationbull Wersquore acquiring large structured data
repositories and real-time streamsbull Lots of repetitive labor involved in users
followingreacting to these streamsbull How can users author queriesrules against these
streams to reduce the drudgery
Examples
bull remind me to take out the trash when I get home on Tuesdays
bull bug my friend who hasnrsquot replied to me in 2 daysbull send me my shopping list when I arrive at the
grocery storebull remind friends of an event Irsquom going to attendbull text me important emails when I am traveling
actionsconditionspredicatespropertiesentities
What we need
bull a way for users to express what they want to happen and when in terms of predicates relating
the states and properties of people places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
bull a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people places and things
ATOMRSSREST APIs End-user mashups + RDF
Abraham Bernstein and Esther Kaufmann and Christian Kaiser and Christoph Kiefer Ginseng A Guided Input Natural Language Search Engine for Querying Ontologies Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as ruleswhen ltsomething happensgt do ltactiongt
query statement
Controlled Natural Language Interface
Example 1bull Simple Context-Sensitive Remindingbull Remind me to take the trash out when I get
home on Tuesday evenings
Spreadsheets
Alternative Related Worksheets
One-to-ManyMany-to-ManyRelationships
A database with one-to-many and many-to-many relationshipsaccessed through a general-purpose spreadsheet-like UI
ldquoRelated Worksheetsrdquo application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept Data Types for Worksheet Columns
2nd New Concept Array Types
3rd New Concept Reference Types(ldquoEach cell in this column refers to a row in a different worksheetrdquo)
3rd New Concept Reference TypesReference values are displayed recursively as configured
by the user in the ldquoShowHide Columnsrdquo tree
1
4th New Concept Relationships are bidirectional
2
4th New Concept Relationships are bidirectional
Teleport Feature(Press Ctrl+Space)
1
2
4th New Concept Relationships are bidirectional
Result The ability to keep track of one-to-manymany-to-manyrelationships from within a spreadsheet-like user interface
User Study
User Studybull Hypothesis Excel-proficient users will be faster at lookup
(read-only) tasks on a database stored in normalized form in our system vs Microsoft Excel
User Studybull Mechanical Turkbull Remotely screen-recordedbull Lookup tasks on course catalog database in
Excel vs Related Worksheets bull Between-subjects studybull Initial qualification task on Excel only
User Study
ResultsDemographics
Results Correctness and Features Used
Results Timing
p lt 005 for Task 4 only
(41 faster)
Conclusionbull Spreadsheets are great homebrew databases
ndash But struggle with multiple tables nesting joinsbull Enhance spreadsheet paradigm with
ndash Column type system array types reference typesndash Bidirectional hierarchical views of reference typesndash to handle plural relationships
bull User Study shows system usable without instruction sometimes faster than Excel (more study needed)
A Semantic Web Applicationbull Implemented as a JavaNetbeans desktop appbull Which connects to SQL databases using JDBC
bull But key contribution is interaction paradigmndash Which expects arbitrary schemas
bull Could easily be ported to the webbull Eg on Google Spreadsheets
bull So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lenses
• Template for item
• HTML with "fill in the blanks"
• HTML: <div ex:role="lens">
    <b> <div ex:content="title"> </b> <div ex:content="date">
  </div>
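The "fill in the blanks" idea can be sketched as plain string templating. The `{{property}}` placeholder syntax and the `renderLens` function are assumptions for illustration only; Exhibit itself marks the blanks with attributes on HTML elements, as on the slide. The sketch also skips HTML escaping, which a real page would need.

```javascript
// Hypothetical lens: replace each {{property}} blank with the item's value.
function renderLens(template, item) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, property) =>
    Object.prototype.hasOwnProperty.call(item, property)
      ? String(item[property])
      : "" // Unknown properties are left blank.
  );
}

// Made-up template and item for illustration only.
const lens = "<b>{{title}}</b> {{date}}";
const rendered = renderLens(lens, { title: "Lentil Soup", date: "2009-03-01" });
console.log(rendered);
```

The same template is applied to every item in the collection, which is what keeps all items looking consistent.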
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
[Screenshots of example pages, each labeled: Text search, Faceted Browsing, Sorting by Properties, Templated Items]
Impoverished Information Visualization?
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People's experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities 40
Personal/Hobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON 69%
Google Spreadsheet 32%
BibTeX 2%
RDF 1%
Excel 0.1%
CSV 0.06%
Freebase 0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title) 59
Timeline 19
Map 14
Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data, interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO --- Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
[Annotation labels: actions, conditions, predicates, properties, entities]
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs, End-user mashups + RDF
Previous work for the construction of RDF KBs and queries:
Abraham Bernstein, Esther Kaufmann, Christian Kaiser and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
Express behaviors as rules: when <something happens> do <action>
[Figure labels: query, statement]
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
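The "when <something happens> do <action>" shape can be sketched as a predicate paired with an action. This is only an illustration of the rule structure, not Atomate's implementation or its controlled natural language; the `state` fields, the `runRules` function and the sample values are all hypothetical.

```javascript
// Hypothetical rule: "remind me to take out the trash when I get home on Tuesdays".
const trashRule = {
  // Condition: a predicate over the state of people, places and things.
  when: (state) => state.location === "home" && state.weekday === "Tuesday",
  // Action: what should happen when the condition holds.
  action: (state) => `Reminder for ${state.user}: take out the trash`,
};

// Evaluate every rule against the current state and collect the fired actions.
function runRules(rules, state) {
  return rules.filter((rule) => rule.when(state)).map((rule) => rule.action(state));
}

// Made-up states for illustration only.
const atHomeTuesday = { user: "me", location: "home", weekday: "Tuesday" };
const atWorkTuesday = { user: "me", location: "work", weekday: "Tuesday" };

const fired = runRules([trashRule], atHomeTuesday);
const notFired = runRules([trashRule], atWorkTuesday);
console.log(fired, notFired);
```

In a real system the state would be assembled from the heterogeneous feeds listed above rather than written by hand.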
Alternative: Related Worksheets
One-to-Many/Many-to-Many Relationships
A database with one-to-many and many-to-many relationships, accessed through a general-purpose spreadsheet-like UI
"Related Worksheets" application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept: Data Types for Worksheet Columns
2nd New Concept: Array Types
3rd New Concept: Reference Types ("Each cell in this column refers to a row in a different worksheet")
3rd New Concept: Reference Types. Reference values are displayed recursively, as configured by the user in the "Show/Hide Columns" tree
4th New Concept: Relationships are bidirectional
Teleport Feature (Press Ctrl+Space)
Result: The ability to keep track of one-to-many/many-to-many relationships from within a spreadsheet-like user interface
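A reference column plus its automatically derived inverse is what makes a relationship bidirectional. A minimal sketch, assuming rows are plain objects with an `id` and a reference column holding an array of ids; the worksheets, column names and the `inverse` function are hypothetical and unrelated to the actual Related Worksheets code.

```javascript
// Hypothetical worksheets: each course row references instructor rows by id.
const instructors = [
  { id: "i1", name: "Ada" },
  { id: "i2", name: "Grace" },
];
const courses = [
  { id: "c1", title: "Databases", instructors: ["i1", "i2"] },
  { id: "c2", title: "HCI", instructors: ["i2"] },
];

// Derive the reverse direction: for each referenced row, which rows point at it?
function inverse(rows, referenceColumn) {
  const back = new Map();
  for (const row of rows) {
    for (const target of row[referenceColumn]) {
      if (!back.has(target)) back.set(target, []);
      back.get(target).push(row.id);
    }
  }
  return back;
}

const coursesByInstructor = inverse(courses, "instructors");
for (const person of instructors) {
  console.log(person.name, coursesByInstructor.get(person.id) || []);
}
```

Because the inverse is computed from the forward references, the user only ever enters the relationship once.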
User Study
• Hypothesis: Excel-proficient users will be faster at lookup (read-only) tasks on a database stored in normalized form in our system vs. Microsoft Excel
• Mechanical Turk
• Remotely screen-recorded
• Lookup tasks on course catalog database in Excel vs. Related Worksheets
• Between-subjects study
• Initial qualification task on Excel only
Results: Demographics
Results: Correctness and Features Used
Results: Timing
• p < 0.05 for Task 4 only (41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system, array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User study shows system usable without instruction, sometimes faster than Excel (more study needed)
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g. on Google Spreadsheets
• So is a Semantic Web application despite absence of Semantic Web technology
"I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB"
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
• Exhibit: Lightweight Structured Data Publishing [WWW 2007]
• The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
• Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
Good old days: early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
• High Benefit
  – Find the info I need
  – Discover new things
• Low Cost
  – One click fetch
  – Instant availability
  – No application to master
Author
• High Benefit
  – Be seen
  – Share what I know
  – Impress people
  – Readers' gratitude
• Low Cost
  – No new skills needed
  – Easy to author
Structured Data is Better
• Easier to manage
  – Separate content (data) from presentation
  – Edit data, changes propagate to all uses of it
  – Templates help all data look consistent
• Easier to navigate
  – Sorting and filtering
  – Faceted browsing
  – Aggregate visualizations --- compare/contrast
• Easier to reuse
  – Extract from original source
  – Blend with other data
  – Create alternative visualizations
[Figure labels: sort, filter, search, template]
Today
• Mere mortals just write text or HTML
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can't install/operate/define a database
  – Can't write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 Blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static info-graphic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
  Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places, and things
  ATOM/RSS/REST APIs, end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
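The "when <something happens> do <action>" rule form can be sketched in a few lines of Python. This is only an illustration of the idea, not Atomate's implementation: the `Context` and `Rule` classes, the `run_rules` helper, and the choice of 17:00 as the start of "evening" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List


@dataclass
class Context:
    """Hypothetical snapshot of the user's world at one moment."""
    now: datetime
    location: str


@dataclass
class Rule:
    """A 'when <condition> do <action>' rule: a predicate plus a callback."""
    condition: Callable[[Context], bool]
    action: Callable[[Context], str]


def run_rules(rules: List[Rule], ctx: Context) -> List[str]:
    """Return the output of every rule whose condition holds in ctx."""
    return [rule.action(ctx) for rule in rules if rule.condition(ctx)]


# "Remind me to take the trash out when I get home on Tuesday evenings."
# weekday() == 1 is Tuesday; "evening" is assumed to start at 17:00.
trash_rule = Rule(
    condition=lambda c: c.location == "home"
    and c.now.weekday() == 1
    and c.now.hour >= 17,
    action=lambda c: "Reminder: take the trash out",
)
```

A controlled natural language interface would sit in front of something like `Rule`, letting the user pick the entities, properties, and predicates instead of writing the lambda by hand.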
A database with one-to-many and many-to-many relationships, accessed through a general-purpose spreadsheet-like UI
"Related Worksheets" application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept: Data Types for Worksheet Columns
2nd New Concept: Array Types
3rd New Concept: Reference Types ("Each cell in this column refers to a row in a different worksheet")
3rd New Concept: Reference Types. Reference values are displayed recursively, as configured by the user in the "Show/Hide Columns" tree
4th New Concept: Relationships are bidirectional
Teleport Feature (Press Ctrl+Space)
Result: The ability to keep track of one-to-many/many-to-many relationships from within a spreadsheet-like user interface
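A rough sketch of how array-valued reference columns and bidirectional relationships fit together, assuming a toy in-memory model. The worksheet contents and the `reverse_references` and `show` helpers are hypothetical and are not taken from the Related Worksheets code.

```python
from collections import defaultdict

# Hypothetical worksheets: each maps a row id to a dict of column values.
courses = {
    "c1": {"title": "Algorithms", "instructors": ["p1", "p2"]},  # array of references
    "c2": {"title": "Databases", "instructors": ["p2"]},
}
people = {
    "p1": {"name": "Ada"},
    "p2": {"name": "Grace"},
}


def reverse_references(source, column):
    """Derive the reverse direction of a reference column.

    Maps each referenced row id to the list of source row ids pointing at it.
    """
    reverse = defaultdict(list)
    for row_id, row in source.items():
        for target_id in row[column]:
            reverse[target_id].append(row_id)
    return dict(reverse)


def show(row, column, target, display_column):
    """Display referenced rows by one of their columns instead of by raw id."""
    return [target[ref][display_column] for ref in row[column]]


# The course -> instructors column implies a person -> courses relationship.
taught_by = reverse_references(courses, "instructors")
```

Storing the relationship once and deriving the other direction keeps both worksheets consistent, which is the point of treating relationships as bidirectional.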
User Study
• Hypothesis: Excel-proficient users will be faster at lookup (read-only) tasks on a database stored in normalized form in our system vs. Microsoft Excel
• Mechanical Turk
• Remotely screen-recorded
• Lookup tasks on course catalog database in Excel vs. Related Worksheets
• Between-subjects study
• Initial qualification task on Excel only
Results: Demographics
Results: Correctness and Features Used
Results: Timing
p < 0.05 for Task 4 only (41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system: array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User Study shows system usable without instruction, sometimes faster than Excel (more study needed)
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g., on Google Spreadsheets
• So is a Semantic Web application despite absence of Semantic Web technology
"I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB"
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
Exhibit: Lightweight Structured Data Publishing [WWW 2007]
The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
good old days: early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
  High Benefit: Find the info I need; Discover new things
  Low Cost: One click fetch; Instant availability; No application to master
Author
  High Benefit: Be seen; Share what I know; Impress people; Readers' gratitude
  Low Cost: No new skills needed; Easy to author
Structured Data is Better
• Easier to manage
  – Separate content (data) from presentation
  – Edit data, changes propagate to all uses of it
  – Templates help all data look consistent
• Easier to navigate
  – Sorting and filtering
  – Faceted browsing
  – Aggregate visualizations --- compare/contrast
• Easier to reuse
  – Extract from original source
  – Blend with other data
  – Create alternative visualizations
Today: sort, filter, search, template
Mere mortals just write text or HTML
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can't install/operate/define a database
  – Can't write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static info-graphic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in Spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns used in chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don't program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
sort, filter, search, template
Image
HTML: <img src=…
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
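The "one item per row, columns for properties" model can be sketched as follows. This is an illustrative reading of the slide, not Exhibit's own loader: the sample sheet, the `load_items` function, and the convention of separating multiple ingredients with semicolons are assumptions made for the example.

```python
import csv
import io

# Hypothetical spreadsheet export: one recipe per row, one property per column.
SHEET = """title,source,date,rating,ingredients
Pad Thai,Magazine A,2009-03-01,4,noodles;peanuts;lime
Chili,Magazine B,2008-11-15,5,beans;beef;tomato
"""


def load_items(text, multi_valued=("ingredients",), sep=";"):
    """Turn each spreadsheet row into an item: a dict from property name to value."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for prop in multi_valued:
            # A cell holding several values becomes a list (a multi-valued property).
            item[prop] = item[prop].split(sep) if item[prop] else []
        items.append(item)
    return items


items = load_items(SHEET)
```

Every value stays a string here; a fuller version would also carry column types so that ratings and dates sort correctly.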
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
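What a list view bound to a sort property has to compute can be sketched in a couple of lines. The `list_view` function and the sample recipes are hypothetical; the real Exhibit view is a Javascript component configured through the HTML attributes on the slide.

```python
def list_view(items, sort_by, descending=False):
    """Return the items ordered by one property, as a sortable list view would show them."""
    return sorted(items, key=lambda item: item[sort_by], reverse=descending)


# Hypothetical items with a numeric "price" property.
recipes = [
    {"title": "Chili", "price": 12.0},
    {"title": "Pad Thai", "price": 9.5},
    {"title": "Salad", "price": 6.0},
]

by_price = [r["title"] for r in list_view(recipes, "price")]
```

The author only names the property; the ordering logic lives in the view, which is what "describe, don't program" amounts to here.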
Facets
• Way to filter a collection
  – Specify a property
  – E.g., ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
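A facet does two things: it lists the values of a property with counts, and it restricts the collection to items matching the values the user clicked. A minimal sketch, assuming items are dicts with list-valued properties; `facet_counts` and `apply_facet` are hypothetical names, not Exhibit functions.

```python
from collections import Counter

# Hypothetical items with a multi-valued "ingredient" property.
recipes = [
    {"title": "Pad Thai", "ingredient": ["noodles", "lime"]},
    {"title": "Guacamole", "ingredient": ["avocado", "lime"]},
    {"title": "Chili", "ingredient": ["beans", "tomato"]},
]


def facet_counts(items, prop):
    """Count how many items carry each value of prop (the numbers shown in a facet)."""
    counts = Counter()
    for item in items:
        counts.update(set(item.get(prop, [])))
    return counts


def apply_facet(items, prop, selected):
    """Keep items with at least one selected value; an empty selection keeps everything."""
    if not selected:
        return list(items)
    return [item for item in items if selected & set(item.get(prop, []))]


with_lime = [r["title"] for r in apply_facet(recipes, "ingredient", {"lime"})]
```

Selecting several values within one facet is treated as a union here; combining several facets would intersect their results.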
Lenses
• Template for item
• HTML with "fill in the blanks"
• HTML:
  <div ex:role="lens">
    <b><div ex:content="title"></div></b>
    <div ex:content="date"></div>
  </div>
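The "fill in the blanks" idea behind a lens can be sketched with ordinary string templating. This swaps Exhibit's attribute-based template for Python's `str.format` purely for illustration; `render_lens` and the `LENS` template are hypothetical.

```python
from html import escape


def render_lens(template, item):
    """Fill each {property} blank in the template with the item's HTML-escaped value."""
    safe = {key: escape(str(value)) for key, value in item.items()}
    return template.format(**safe)


# Hypothetical lens: show the title in bold, followed by the date.
LENS = "<div><b>{title}</b> <span>{date}</span></div>"

rendered = render_lens(LENS, {"title": "Fish & Chips", "date": "2009-03-01"})
```

Because the template names properties rather than values, the same lens renders every item in the collection.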
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization?
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People's experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a Javascript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static feels becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities 40%
Personal/Hobby 25%
Organization 19%
News 6%
Commercial 4%
Library 4%
Conference 2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27%
Multi-valued Table 32%
Table 41%
Cyclic 22%
Acyclic 78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON 69%
Google Spreadsheet 32%
Bibtex 2%
RDF 1%
Excel 0.1%
CSV 0.06%
Freebase 0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title) 59%
Timeline 19%
Map 14%
Chart 8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• Javascript slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivationbull Wersquore acquiring large structured data
repositories and real-time streamsbull Lots of repetitive labor involved in users
followingreacting to these streamsbull How can users author queriesrules against these
streams to reduce the drudgery
Examples
bull remind me to take out the trash when I get home on Tuesdays
bull bug my friend who hasnrsquot replied to me in 2 daysbull send me my shopping list when I arrive at the
grocery storebull remind friends of an event Irsquom going to attendbull text me important emails when I am traveling
actionsconditionspredicatespropertiesentities
What we need
bull a way for users to express what they want to happen and when in terms of predicates relating
the states and properties of people places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
bull a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people places and things
ATOMRSSREST APIs End-user mashups + RDF
Abraham Bernstein and Esther Kaufmann and Christian Kaiser and Christoph Kiefer Ginseng A Guided Input Natural Language Search Engine for Querying Ontologies Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as ruleswhen ltsomething happensgt do ltactiongt
query statement
Controlled Natural Language Interface
Example 1bull Simple Context-Sensitive Remindingbull Remind me to take the trash out when I get
home on Tuesday evenings
ldquoRelated Worksheetsrdquo application at startup
Creating a new worksheet
After entering some simple tabular data
1st New Concept Data Types for Worksheet Columns
2nd New Concept Array Types
3rd New Concept Reference Types(ldquoEach cell in this column refers to a row in a different worksheetrdquo)
3rd New Concept Reference TypesReference values are displayed recursively as configured
by the user in the ldquoShowHide Columnsrdquo tree
1
4th New Concept Relationships are bidirectional
2
4th New Concept Relationships are bidirectional
Teleport Feature(Press Ctrl+Space)
1
2
4th New Concept Relationships are bidirectional
Result The ability to keep track of one-to-manymany-to-manyrelationships from within a spreadsheet-like user interface
User Study
User Studybull Hypothesis Excel-proficient users will be faster at lookup
(read-only) tasks on a database stored in normalized form in our system vs Microsoft Excel
User Studybull Mechanical Turkbull Remotely screen-recordedbull Lookup tasks on course catalog database in
Excel vs Related Worksheets bull Between-subjects studybull Initial qualification task on Excel only
User Study
ResultsDemographics
Results Correctness and Features Used
Results Timing
p lt 005 for Task 4 only
(41 faster)
Conclusionbull Spreadsheets are great homebrew databases
ndash But struggle with multiple tables nesting joinsbull Enhance spreadsheet paradigm with
ndash Column type system array types reference typesndash Bidirectional hierarchical views of reference typesndash to handle plural relationships
bull User Study shows system usable without instruction sometimes faster than Excel (more study needed)
A Semantic Web Applicationbull Implemented as a JavaNetbeans desktop appbull Which connects to SQL databases using JDBC
bull But key contribution is interaction paradigmndash Which expects arbitrary schemas
bull Could easily be ported to the webbull Eg on Google Spreadsheets
bull So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivationbull Wersquore acquiring large structured data
repositories and real-time streamsbull Lots of repetitive labor involved in users
followingreacting to these streamsbull How can users author queriesrules against these
streams to reduce the drudgery
Examples
bull remind me to take out the trash when I get home on Tuesdays
bull bug my friend who hasnrsquot replied to me in 2 daysbull send me my shopping list when I arrive at the
grocery storebull remind friends of an event Irsquom going to attendbull text me important emails when I am traveling
actionsconditionspredicatespropertiesentities
What we need
bull a way for users to express what they want to happen and when in terms of predicates relating
the states and properties of people places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
bull a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people places and things
ATOMRSSREST APIs End-user mashups + RDF
Abraham Bernstein and Esther Kaufmann and Christian Kaiser and Christoph Kiefer Ginseng A Guided Input Natural Language Search Engine for Querying Ontologies Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as ruleswhen ltsomething happensgt do ltactiongt
query statement
Controlled Natural Language Interface
Example 1bull Simple Context-Sensitive Remindingbull Remind me to take the trash out when I get
home on Tuesday evenings
Creating a new worksheet
After entering some simple tabular data
1st New Concept Data Types for Worksheet Columns
2nd New Concept Array Types
3rd New Concept Reference Types(ldquoEach cell in this column refers to a row in a different worksheetrdquo)
3rd New Concept Reference TypesReference values are displayed recursively as configured
by the user in the ldquoShowHide Columnsrdquo tree
1
4th New Concept Relationships are bidirectional
2
4th New Concept Relationships are bidirectional
Teleport Feature(Press Ctrl+Space)
1
2
4th New Concept Relationships are bidirectional
Result The ability to keep track of one-to-manymany-to-manyrelationships from within a spreadsheet-like user interface
User Study
User Studybull Hypothesis Excel-proficient users will be faster at lookup
(read-only) tasks on a database stored in normalized form in our system vs Microsoft Excel
User Studybull Mechanical Turkbull Remotely screen-recordedbull Lookup tasks on course catalog database in
Excel vs Related Worksheets bull Between-subjects studybull Initial qualification task on Excel only
User Study
ResultsDemographics
Results Correctness and Features Used
Results Timing
p lt 005 for Task 4 only
(41 faster)
Conclusionbull Spreadsheets are great homebrew databases
ndash But struggle with multiple tables nesting joinsbull Enhance spreadsheet paradigm with
ndash Column type system array types reference typesndash Bidirectional hierarchical views of reference typesndash to handle plural relationships
bull User Study shows system usable without instruction sometimes faster than Excel (more study needed)
A Semantic Web Applicationbull Implemented as a JavaNetbeans desktop appbull Which connects to SQL databases using JDBC
bull But key contribution is interaction paradigmndash Which expects arbitrary schemas
bull Could easily be ported to the webbull Eg on Google Spreadsheets
bull So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns are used in chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don't program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
Image
HTML: <img src=…>
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
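A minimal sketch of the "one item per row, columns for properties" idea, in Python. The sheet contents, the `load_items` helper, and the `;` separator for multi-valued cells are assumptions made for illustration; they are not Exhibit's actual loading code.

```python
import csv
import io

# Hypothetical recipe spreadsheet: one item per row, one column per property.
SHEET = """title,source,date,rating,ingredients
Lentil Soup,Cook's Weekly,2009-03-01,4,lentils;onion;carrot
Apple Tart,Bake Monthly,2009-05-12,5,apple;flour;butter
"""

def load_items(text, multi_valued=("ingredients",), sep=";"):
    """Turn spreadsheet rows into a list of property dictionaries."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for prop in multi_valued:
            # A cell holding several values becomes a list (assumed convention).
            item[prop] = [v.strip() for v in item[prop].split(sep) if v.strip()]
        items.append(item)
    return items

items = load_items(SHEET)
```

Each resulting dictionary is one item; multi-valued cells are what push a plain table toward the "multi-valued table" model.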
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
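What a facet does with the data can be sketched in a few lines of Python. This is an illustration under assumptions, not Exhibit's implementation: the sample recipes and the helper names `values_of`, `facet_counts`, and `apply_facet` are hypothetical.

```python
from collections import Counter

# Hypothetical items; "ingredient" is a multi-valued property.
recipes = [
    {"title": "Lentil Soup", "ingredient": ["lentils", "onion"]},
    {"title": "Onion Tart", "ingredient": ["onion", "flour"]},
    {"title": "Apple Tart", "ingredient": ["apple", "flour"]},
]

def values_of(item, prop):
    """Return the property's values as a list, whether single or multi-valued."""
    value = item.get(prop, [])
    return value if isinstance(value, list) else [value]

def facet_counts(items, prop):
    """Count how many items carry each value: the numbers shown next to facet choices."""
    counts = Counter()
    for item in items:
        counts.update(set(values_of(item, prop)))
    return counts

def apply_facet(items, prop, selected):
    """Restrict the collection to items matching any selected value."""
    if not selected:
        return list(items)  # nothing clicked: no restriction
    return [i for i in items if selected & set(values_of(i, prop))]

counts = facet_counts(recipes, "ingredient")
onion_titles = [r["title"] for r in apply_facet(recipes, "ingredient", {"onion"})]
```

The author only names the property; counting and restriction follow from the data.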
Lenses
• Template for item
• HTML with "fill in the blanks"
• HTML: <div ex:role="lens">
          <b><div ex:content="title"></div></b>
          <div ex:content="date"></div>
        </div>
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
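To make the lens and view primitives concrete, here is a rough Python sketch of a sorted list view that renders each item through a lens. The lens representation (a list of tag/property pairs) and the function names are assumptions chosen for brevity; Exhibit itself expresses lenses as HTML templates.

```python
from html import escape

def render_lens(item, lens):
    """Fill in the blanks: lens is a list of (html_tag, property) pairs."""
    parts = []
    for tag, prop in lens:
        parts.append(f"<{tag}>{escape(str(item.get(prop, '')))}</{tag}>")
    return "<div>" + "".join(parts) + "</div>"

def render_list_view(items, sort_by, lens):
    """A sortable list view: order items by one property, render each with the lens."""
    ordered = sorted(items, key=lambda i: i[sort_by])
    return "\n".join(render_lens(i, lens) for i in ordered)

# Hypothetical data and lens.
recipes = [
    {"title": "Onion Tart", "date": "2009-05-12"},
    {"title": "Lentil Soup", "date": "2009-03-01"},
]
LENS = [("b", "title"), ("span", "date")]
page = render_list_view(recipes, sort_by="date", lens=LENS)
```

Swapping `sort_by` or the lens changes the presentation without touching the data, which is the separation the slides argue for.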
General Enough?
(Series of example sites, each annotated with the same four features)
• Text search
• Faceted Browsing
• Sorting by Properties
• Templated Items
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But this is a powerful lowest common denominator
• People's experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
• Schools/Universities 40%
• Personal/Hobby 25%
• Organization 19%
• News 6%
• Commercial 4%
• Library 4%
• Conference 2%
• Source: 50 of the top-trafficked Exhibits in our dataset
Data Model
• Graph 27%
• Multi-valued Table 32%
• Table 41%
• Cyclic 22%
• Acyclic 78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
• JSON 69%
• Google Spreadsheet 32%
• BibTeX 2%
• RDF 1%
• Excel 0.1%
• CSV 0.06%
• Freebase 0.06%
Hmm…
Single-View Exhibits
• Text-only (list, table, title) 59%
• Timeline 19%
• Map 14%
• Chart 8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel
Atomate It! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
  – Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
  – ATOM/RSS/REST APIs; end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
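The "when <something happens> do <action>" shape can be sketched as a tiny rule evaluator in Python. This is only an illustration under assumptions: the `Rule` class, the state keys (`my_location`, `weekday`, `hour`), and the 17:00 cutoff for "evening" are all hypothetical and are not Atomate's rule language or API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]  # a snapshot of the user's world (hypothetical keys)

@dataclass
class Rule:
    name: str
    when: Callable[[State], bool]  # condition over entities and their properties
    do: Callable[[State], str]     # action; here it just produces a message

def run_rules(rules: List[Rule], state: State) -> List[str]:
    """Fire every rule whose condition holds in the current state."""
    return [rule.do(state) for rule in rules if rule.when(state)]

# "Remind me to take the trash out when I get home on Tuesday evenings"
trash_rule = Rule(
    name="trash reminder",
    when=lambda s: s["my_location"] == "home"
    and s["weekday"] == "Tuesday"
    and s["hour"] >= 17,  # assumed meaning of "evening"
    do=lambda s: "Reminder: take the trash out",
)

fired = run_rules([trash_rule], {"my_location": "home", "weekday": "Tuesday", "hour": 19})
quiet = run_rules([trash_rule], {"my_location": "office", "weekday": "Tuesday", "hour": 19})
```

In a controlled natural language interface the user would compose the condition and action from guided phrases rather than write the lambdas by hand.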
After entering some simple tabular data
1st New Concept: Data Types for Worksheet Columns
2nd New Concept: Array Types
3rd New Concept: Reference Types
("Each cell in this column refers to a row in a different worksheet")
3rd New Concept: Reference Types
Reference values are displayed recursively as configured by the user in the "Show/Hide Columns" tree
4th New Concept: Relationships are bidirectional
Teleport Feature (Press Ctrl+Space)
Result: The ability to keep track of one-to-many/many-to-many relationships from within a spreadsheet-like user interface
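A small Python sketch of reference-typed columns and their reverse direction. The worksheets, keys, and the `reverse_references` helper are hypothetical examples chosen to match the course-catalog setting; they are not the system's actual data structures.

```python
from collections import defaultdict

# Two hypothetical worksheets. The "instructors" column is an array of
# references: each value names a row in the instructors worksheet.
courses = {
    "6.001": {"title": "Programming", "instructors": ["alice", "bob"]},
    "6.002": {"title": "Circuits", "instructors": ["bob"]},
}
instructors = {
    "alice": {"name": "Alice"},
    "bob": {"name": "Bob"},
}

def reverse_references(rows, column):
    """Derive the other direction of a reference column (target row -> referring rows)."""
    back = defaultdict(list)
    for key, row in rows.items():
        for target in row[column]:
            back[target].append(key)
    return dict(back)

# Many-to-many: each instructor's row can now show the courses that refer to it.
taught_by = reverse_references(courses, "instructors")
```

Storing the relationship once and deriving the reverse view is one way to keep both worksheets consistent.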
User Study
• Hypothesis: Excel-proficient users will be faster at lookup (read-only) tasks on a database stored in normalized form in our system vs. Microsoft Excel
• Mechanical Turk
• Remotely screen-recorded
• Lookup tasks on course catalog database in Excel vs. Related Worksheets
• Between-subjects study
• Initial qualification task on Excel only
Results: Demographics
Results: Correctness and Features Used
Results: Timing
• p < 0.05 for Task 4 only (41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system: array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User study shows system usable without instruction, sometimes faster than Excel (more study needed)
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g. on Google Spreadsheets
• So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can’t write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data, interaction, and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO --- Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn’t use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN’T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel
Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW ’10]
Motivation
• We’re acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs, end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
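The "when <something happens> do <action>" rule form can be sketched in a few lines. This is only an illustration of the idea, not Atomate's implementation: the `Rule` class, the `fire` function, and the state keys ("my location", "weekday", "time of day") are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A snapshot of the user's world: property name -> current value (assumed shape).
State = Dict[str, str]

@dataclass
class Rule:
    """One 'when <condition> do <action>' rule (hypothetical structure)."""
    name: str
    condition: Callable[[State], bool]   # the "when" part: a predicate over the state
    action: Callable[[State], str]       # the "do" part: here it just returns a message

def fire(rules: List[Rule], state: State) -> List[str]:
    """Run the action of every rule whose condition holds in the given state."""
    return [rule.action(state) for rule in rules if rule.condition(state)]

# "Remind me to take the trash out when I get home on Tuesday evenings"
trash_rule = Rule(
    name="trash reminder",
    condition=lambda s: (
        s["my location"] == "home"
        and s["weekday"] == "Tuesday"
        and s["time of day"] == "evening"
    ),
    action=lambda s: "Remind me: take the trash out",
)

at_home_tuesday = {"my location": "home", "weekday": "Tuesday", "time of day": "evening"}
at_home_monday = {"my location": "home", "weekday": "Monday", "time of day": "evening"}
reminders = fire([trash_rule], at_home_tuesday)
no_reminders = fire([trash_rule], at_home_monday)
```

A controlled natural language front end would sit on top of something like this, turning the sentence the user picks into the condition and the action.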
2nd New Concept: Array Types
3rd New Concept: Reference Types
(“Each cell in this column refers to a row in a different worksheet”)
• Reference values are displayed recursively, as configured by the user in the “Show/Hide Columns” tree
4th New Concept: Relationships are bidirectional
Teleport Feature (Press Ctrl+Space)
Result: The ability to keep track of one-to-many/many-to-many relationships from within a spreadsheet-like user interface
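A rough sketch of what reference types and bidirectional relationships amount to underneath. The worksheets, column names, and the `follow` and `inverse` helpers are made up for illustration; they are not taken from the Related Worksheets system.

```python
from collections import defaultdict

# Two hypothetical worksheets keyed by row id. The "instructor" column of
# `courses` is a reference type: each cell names a row in `instructors`.
courses = {
    "c1": {"title": "Algorithms", "instructor": "p1"},
    "c2": {"title": "Databases", "instructor": "p2"},
    "c3": {"title": "Networks", "instructor": "p1"},
}
instructors = {
    "p1": {"name": "Ada"},
    "p2": {"name": "Grace"},
}

def follow(row, column, target):
    """Forward direction: resolve a reference cell to the row it points at."""
    return target[row[column]]

def inverse(source, column):
    """Backward direction: for each referenced row, list the rows that point to it."""
    back = defaultdict(list)
    for row_id, row in source.items():
        back[row[column]].append(row_id)
    return dict(back)

teacher_of_c1 = follow(courses["c1"], "instructor", instructors)
courses_by_instructor = inverse(courses, "instructor")
```

The inverse index is what lets the instructor worksheet show each person's courses without the user maintaining a second column by hand, which is the one-to-many case the slide refers to.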
User Study
• Hypothesis: Excel-proficient users will be faster at lookup (read-only) tasks on a database stored in normalized form in our system vs. Microsoft Excel
• Mechanical Turk
• Remotely screen-recorded
• Lookup tasks on course catalog database in Excel vs. Related Worksheets
• Between-subjects study
• Initial qualification task on Excel only
Results: Demographics
Results: Correctness and Features Used
Results: Timing
p < 0.05 for Task 4 only (41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system: array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User study shows system usable without instruction, sometimes faster than Excel (more study needed)
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g. on Google Spreadsheets
• So is a Semantic Web application despite absence of Semantic Web technology
“I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB”
Huynh, Benson, Marcus, Karger, Miller
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
Exhibit: Lightweight Structured Data Publishing [WWW 2007]
The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
Good old days: early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
• High benefit: find the info I need; discover new things
• Low cost: one click fetch; instant availability; no application to master
Author
• High benefit: be seen; share what I know; impress people; readers’ gratitude
• Low cost: no new skills needed; easy to author
Structured Data is Better
• Easier to manage
  – Separate content (data) from presentation
  – Edit data, changes propagate to all uses of it
  – Templates help all data look consistent
• Easier to navigate
  – Sorting and filtering
  – Faceted browsing
  – Aggregate visualizations --- compare/contrast
• Easier to reuse
  – Extract from original source
  – Blend with other data
  – Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or HTML
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can’t install/operate/define a database
  – Can’t write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static info-graphic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns are used in chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don’t program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
sort
filter
search
template
Image
HTML: <img src=…
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
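To make "one item per row, columns for properties" concrete, here is a minimal sketch that reads a spreadsheet export into a list of items. The column names, the sample rows, and the ";" separator for multi-valued cells are assumptions made for this example, not anything Exhibit prescribes.

```python
import csv
import io

# Hypothetical spreadsheet export: one recipe per row, one property per column.
SHEET = """title,source,date,rating,ingredients
Pad Thai,Magazine A,2009-03-01,4,noodles;peanuts;lime
Guacamole,Magazine B,2008-07-15,5,avocado;lime;cilantro
"""

def load_items(text, multi_valued=("ingredients",), sep=";"):
    """Turn spreadsheet rows into items: dicts mapping property name to value(s)."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for prop in multi_valued:
            # A multi-valued cell becomes a list of values (assumed convention).
            item[prop] = [v.strip() for v in row[prop].split(sep) if v.strip()]
        items.append(item)
    return items

items = load_items(SHEET)
```

Each dict is one item and each key is one property, which is all the structure the views, facets, and lenses need.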
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price"></div>
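What a sortable list view does once it is bound to a property can be sketched as a one-line sort. The `list_view` function and the sample recipes are hypothetical; the only thing taken from the slide is that the view is bound to a property such as price.

```python
def list_view(items, sort_by, reverse=False):
    """Return items ordered by one property, as a sortable list view would show them."""
    return sorted(items, key=lambda item: item[sort_by], reverse=reverse)

# Made-up sample data for illustration.
recipes = [
    {"title": "Pad Thai", "price": 12.0},
    {"title": "Guacamole", "price": 5.5},
    {"title": "Risotto", "price": 9.0},
]

by_price = [r["title"] for r in list_view(recipes, sort_by="price")]
by_price_desc = [r["title"] for r in list_view(recipes, sort_by="price", reverse=True)]
```

The author only names the property; the sorting logic itself stays inside the library.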
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient"></div>
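A facet boils down to two operations: count the values of a property so the user has something to click, then restrict the collection to items matching the picked values. The sketch below assumes items are dicts and that a property may hold either a single value or a list; `facet_counts` and `apply_facet` are illustrative names, not Exhibit functions.

```python
from collections import Counter

def _values(item, prop):
    """Normalize a property to a list, whether it is single- or multi-valued."""
    value = item.get(prop, [])
    return value if isinstance(value, list) else [value]

def facet_counts(items, prop):
    """Count how many items carry each value of the property."""
    counts = Counter()
    for item in items:
        counts.update(set(_values(item, prop)))
    return counts

def apply_facet(items, prop, selected):
    """Keep only items with at least one selected value for the property."""
    if not selected:
        return list(items)  # nothing picked: the facet does not restrict
    return [item for item in items if set(_values(item, prop)) & set(selected)]

# Made-up sample data for illustration.
recipes = [
    {"title": "Pad Thai", "ingredient": ["noodles", "peanuts", "lime"]},
    {"title": "Guacamole", "ingredient": ["avocado", "lime", "cilantro"]},
]

counts = facet_counts(recipes, "ingredient")
with_avocado = [r["title"] for r in apply_facet(recipes, "ingredient", {"avocado"})]
```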
Lenses
• Template for item
• HTML with “fill in the blanks”
• HTML: <div ex:role="lens"> <b><div ex:content="title"></div></b> <div ex:content="date"></div> </div>
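The "fill in the blanks" idea behind a lens can be sketched as template substitution. Note the assumptions: the `{{name}}` placeholder syntax and the `render_lens` function are invented for this example and stand in for the attribute-based markup shown on the slide.

```python
import html
import re

# Hypothetical lens: an HTML fragment with named blanks for item properties.
LENS = '<div class="recipe"><b>{{title}}</b> <span>{{date}}</span></div>'

def render_lens(template, item):
    """Fill each {{property}} blank with the item's value, escaped for HTML."""
    def fill(match):
        value = item.get(match.group(1), "")  # missing properties render as empty
        return html.escape(str(value))
    return re.sub(r"\{\{(\w+)\}\}", fill, template)

rendered = render_lens(LENS, {"title": "Fish & Chips", "date": "2009-03-01"})
```

Because the template is applied to every item in a collection, changing it once changes how all items look.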
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
[Example exhibits, each annotated with the same four features]
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization?
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People’s experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static page becomes an interactive data visualization
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can’t install/operate/define a database
  – Can’t write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static infographic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns are used in the chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don’t program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
[Figure labels: sort, filter, search, template]
Image
• HTML: <img src=…>
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
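The "one item per row, columns for properties" layout can be sketched as follows. This is an illustrative assumption, not Exhibit's actual loader: the sample rows are made up, and the semicolon separating multiple ingredients is a convention chosen only for this example.

```python
import csv
import io

# Made-up spreadsheet export: one recipe per row, one column per property.
SHEET = """title,source,date,rating,ingredients
Pesto,Hypothetical Monthly,2009-06,4,basil;pine nuts;garlic
Hummus,Hypothetical Monthly,2009-07,5,chickpeas;garlic;tahini
"""


def rows_to_items(text, multi_valued=("ingredients",)):
    """Turn spreadsheet rows into item dicts; split multi-valued cells into lists."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for prop in multi_valued:
            item[prop] = [v.strip() for v in item[prop].split(";") if v.strip()]
        item["rating"] = int(item["rating"])
        items.append(item)
    return items


items = rows_to_items(SHEET)
print(items[0]["title"], items[0]["ingredients"])
```

Keeping each item as a plain property map is what lets the same data feed any view, facet, or lens later.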
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
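What a list view bound to a sort property does can be sketched in a few lines. The `list_view` function and the sample recipes are hypothetical; this shows the idea of binding a view to a property, not the Exhibit library's code.

```python
def list_view(items, sort_by):
    """Return items ordered by one property, as a sortable list view would."""
    return sorted(items, key=lambda item: item[sort_by])


# Made-up items with a "price" property, matching the sort key on the slide.
recipes = [
    {"title": "Hummus", "price": 3.5},
    {"title": "Pesto", "price": 6.0},
    {"title": "Toast", "price": 1.0},
]

titles_by_price = [r["title"] for r in list_view(recipes, "price")]
print(titles_by_price)  # ['Toast', 'Hummus', 'Pesto']
```

Re-sorting by another property is just another call with a different `sort_by`, which is why the author only has to name the property.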
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
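A facet can be sketched as two small operations: count the values of a property so the user has something to click, then restrict the collection to the items matching the click. Function names and data are hypothetical, chosen for illustration.

```python
from collections import Counter


def facet_counts(items, prop):
    """Count how many items carry each value of a (possibly multi-valued) property."""
    counts = Counter()
    for item in items:
        counts.update(set(item.get(prop, [])))
    return counts


def restrict(items, prop, chosen):
    """Keep only the items whose property includes the value the user clicked."""
    return [item for item in items if chosen in item.get(prop, [])]


recipes = [
    {"title": "Pesto", "ingredient": ["basil", "garlic"]},
    {"title": "Hummus", "ingredient": ["chickpeas", "garlic"]},
    {"title": "Toast", "ingredient": ["bread"]},
]

print(facet_counts(recipes, "ingredient")["garlic"])  # 2
print([r["title"] for r in restrict(recipes, "ingredient", "basil")])  # ['Pesto']
```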
Lenses
• Template for item
• HTML with “fill in the blanks”
• HTML:
  <div ex:role="lens">
    <b><div ex:content="title"></div></b>
    <div ex:content="date"></div>
  </div>
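The "fill in the blanks" behavior of a lens can be sketched as template substitution. This is an assumption-laden illustration: `render_lens` is a hypothetical helper, and the template uses empty `span` placeholders for simplicity rather than reproducing how the Exhibit library processes lenses.

```python
import html
import re


def render_lens(template, item):
    """Fill each ex:content="prop" placeholder element with the item's value."""
    pattern = re.compile(r'<span ex:content="(\w+)"></span>')
    return pattern.sub(
        lambda m: html.escape(str(item.get(m.group(1), ""))), template
    )


LENS = '<div><b><span ex:content="title"></span></b> <span ex:content="date"></span></div>'
recipe = {"title": "Pesto & Pasta", "date": "2009-06"}

print(render_lens(LENS, recipe))
# <div><b>Pesto &amp; Pasta</b> 2009-06</div>
```

Escaping the values keeps data that happens to contain markup characters from breaking the page.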
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
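Because the page only describes its views and facets, some interpreter has to discover those descriptions. A minimal sketch of that discovery step, using Python's standard `html.parser` (the `DataPageParser` class and the sample page are assumptions made for illustration; Exhibit itself is a JavaScript library running in the browser):

```python
from html.parser import HTMLParser


class DataPageParser(HTMLParser):
    """Collect elements that carry an ex:role attribute, with their settings."""

    def __init__(self):
        super().__init__()
        self.components = []

    def handle_starttag(self, tag, attrs):
        # Note: html.parser lowercases attribute names, so ex:viewClass
        # arrives here as ex:viewclass.
        settings = dict(attrs)
        role = settings.pop("ex:role", None)
        if role is not None:
            self.components.append((role, settings))


PAGE = """
<h1>Recipes</h1>
<div ex:role="facet" ex:expression="ingredient"></div>
<div ex:role="view" ex:viewClass="list" ex:sort="price"></div>
"""

parser = DataPageParser()
parser.feed(PAGE)
for role, settings in parser.components:
    print(role, settings)
```

Ordinary markup such as the heading is ignored, which is what lets data components be interleaved freely with other HTML.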
General Enough?
[Series of example sites, each labeled: Text search, Faceted Browsing, Sorting by Properties, Templated Items]
Impoverished Information Visualization?
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People’s experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities   40%
Personal/Hobby         25%
Organization           19%
News                    6%
Commercial              4%
Library                 4%
Conference              2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph                27%
Multi-valued Table   32%
Table                41%
Cyclic               22%
Acyclic              78%
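The slides do not say how exhibits were assigned to these categories. One plausible reading, sketched here purely as an assumption: a dataset is a plain table if every property is single-valued, a multi-valued table if some property holds several values, and a graph if values refer to other items; a graph is cyclic if those references loop back. All names and data below are hypothetical.

```python
def _values(value):
    """Treat a list as multiple values and anything else as one value."""
    return value if isinstance(value, list) else [value]


def classify(items):
    """Label a dataset 'table', 'multi-valued table', or 'graph'.

    `items` maps item id -> {property: value}. Any value equal to another
    item's id counts as a reference, which makes the data a graph.
    """
    ids = set(items)
    multi = False
    for props in items.values():
        for value in props.values():
            multi = multi or isinstance(value, list)
            if any(v in ids for v in _values(value)):
                return "graph"
    return "multi-valued table" if multi else "table"


def has_cycle(items):
    """Depth-first search for a cycle among item-to-item references."""
    ids = set(items)
    state = {}  # item id -> "visiting" or "done"

    def visit(item_id):
        if state.get(item_id) == "visiting":
            return True
        if state.get(item_id) == "done":
            return False
        state[item_id] = "visiting"
        refs = [v for value in items[item_id].values()
                for v in _values(value) if v in ids]
        found = any(visit(r) for r in refs)
        state[item_id] = "done"
        return found

    return any(visit(i) for i in items)


flat = {"r1": {"title": "Pesto", "rating": 4}}
multi = {"r1": {"title": "Pesto", "ingredient": ["basil", "garlic"]}}
graph = {"a": {"name": "Ann", "advisor": "b"}, "b": {"name": "Bob"}}
loop = {"a": {"friend": "b"}, "b": {"friend": "a"}}

print(classify(flat), classify(multi), classify(graph))
print(has_cycle(graph), has_cycle(loop))
```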
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON                 69%
Google Spreadsheet   32%
BibTeX                2%
RDF                   1%
Excel                 0.1%
CSV                   0.06%
Freebase              0.06%
Hmm…
Single-View Exhibits
Text-only (list, table, title)   59%
Timeline                         19%
Map                              14%
Chart                             8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can’t write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data, interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn’t use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
“I CAN’T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!”
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel
Atomate It! End-user context-sensitive automation using heterogeneous information sources on the web [WWW ’10]
Motivation
• We’re acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs, end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference, 2008.
previous work for the construction of RDF KBs and queries
express behaviors as rules:
when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
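The "when <something happens> do <action>" shape of the trash reminder can be sketched as a condition paired with an action. Everything here is a hypothetical illustration rather than Atomate's rule engine: the `Rule` class, the event dictionary and its field names, and the choice of 17:00 as the start of "evening" are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass
class Rule:
    """when <something happens> do <action>"""
    when: Callable[[dict], bool]
    do: Callable[[dict], str]


def run_rules(rules, event):
    """Fire every rule whose condition holds for this event."""
    return [rule.do(event) for rule in rules if rule.when(event)]


def arrived_home_tuesday_evening(event):
    at = event["time"]
    return (
        event["type"] == "location_changed"
        and event["place"] == "home"
        and at.weekday() == 1   # Monday is 0, so Tuesday is 1
        and at.hour >= 17       # "evening" cutoff chosen for this sketch
    )


trash_rule = Rule(
    when=arrived_home_tuesday_evening,
    do=lambda event: "Reminder: take the trash out",
)

# 27 April 2010 was a Tuesday.
tuesday_evening = {"type": "location_changed", "place": "home",
                   "time": datetime(2010, 4, 27, 18, 30)}
wednesday_evening = {"type": "location_changed", "place": "home",
                     "time": datetime(2010, 4, 28, 18, 30)}

print(run_rules([trash_rule], tuesday_evening))
print(run_rules([trash_rule], wednesday_evening))
```

A controlled natural language interface would let the user build the condition from phrases instead of writing the predicate by hand; the rule structure underneath stays the same.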
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit
Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data, interaction, and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF, but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel
Atomate It! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
bull remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen, and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs, End-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
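The "when <something happens> do <action>" rule form and the trash reminder above can be sketched in a few lines of Python. This is only an illustration of the idea, not Atomate's implementation: the `Rule` class, the `fire_rules` function, and the keys of the `state` dictionary are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical world state: a plain dict of facts about "me" right now.
State = Dict[str, str]


@dataclass
class Rule:
    """A 'when <condition> do <action>' rule (illustrative only)."""
    description: str
    condition: Callable[[State], bool]  # when <something happens>
    action: Callable[[State], str]      # do <action>


def fire_rules(rules: List[Rule], state: State) -> List[str]:
    """Run the action of every rule whose condition holds in this state."""
    return [rule.action(state) for rule in rules if rule.condition(state)]


# "Remind me to take the trash out when I get home on Tuesday evenings"
trash_rule = Rule(
    description="trash reminder",
    condition=lambda s: (
        s.get("location") == "home"
        and s.get("weekday") == "Tuesday"
        and s.get("time_of_day") == "evening"
    ),
    action=lambda s: "Reminder: take the trash out",
)

if __name__ == "__main__":
    arriving_home = {"location": "home", "weekday": "Tuesday", "time_of_day": "evening"}
    print(fire_rules([trash_rule], arriving_home))
```

The condition is a predicate over properties of entities (here, my location and the current day), which is the part a controlled natural language interface would let the user phrase in words.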
Teleport Feature (Press Ctrl+Space)
4th New Concept: Relationships are bidirectional
Result: The ability to keep track of one-to-many/many-to-many relationships from within a spreadsheet-like user interface
User Study
User Study
• Hypothesis: Excel-proficient users will be faster at lookup (read-only) tasks on a database stored in normalized form in our system vs. Microsoft Excel
User Study
• Mechanical Turk
• Remotely screen-recorded
• Lookup tasks on course catalog database in Excel vs. Related Worksheets
• Between-subjects study
• Initial qualification task on Excel only
User Study
Results: Demographics
Results: Correctness and Features Used
Results: Timing
p < 0.05 for Task 4 only
(41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system: array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User Study shows system usable without instruction, sometimes faster than Excel (more study needed)
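To make "reference types" and "bidirectional" concrete, here is a minimal sketch using a tiny in-memory model. It is not the Related Worksheets implementation (which the talk describes as a Java desktop app); the `Worksheet` class, the `link` function, and the course/instructor data are all made up for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List


class Worksheet:
    """A table whose rows are identified by a key (hypothetical model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: Dict[str, Dict[str, Any]] = {}

    def add_row(self, key: str, **cells: Any) -> None:
        self.rows[key] = dict(cells)


def link(source: Worksheet, column: str, target: Worksheet, reverse_column: str) -> None:
    """Treat `column` in `source` as an array of references to rows of `target`,
    then derive `reverse_column` on `target` so the relationship can be read
    from either side."""
    reverse: Dict[str, List[str]] = defaultdict(list)
    for key, row in source.rows.items():
        for ref in row.get(column, []):
            if ref not in target.rows:
                raise KeyError(f"{source.name}.{column} refers to unknown row {ref!r}")
            reverse[ref].append(key)
    for key, row in target.rows.items():
        row[reverse_column] = sorted(reverse.get(key, []))


# Made-up many-to-many data: courses reference their instructors.
courses = Worksheet("Courses")
courses.add_row("C1", title="Databases", instructors=["alice", "bob"])
courses.add_row("C2", title="Interfaces", instructors=["alice"])

instructors = Worksheet("Instructors")
instructors.add_row("alice")
instructors.add_row("bob")

link(courses, "instructors", instructors, "courses")

if __name__ == "__main__":
    # Each instructor row now lists its courses without a manual join.
    print(instructors.rows)
```

The array-typed `instructors` column handles the plural relationship, and the derived `courses` column is the reverse view of the same reference.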
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g. on Google Spreadsheets
• So is a Semantic Web application despite absence of Semantic Web technology
“I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB”
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
Exhibit: Lightweight Structured Data Publishing [WWW 2007]
The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
  High Benefit
  • Find the info I need
  • Discover new things
  Low Cost
  • One click fetch
  • Instant availability
  • No application to master
Author
  High Benefit
  • Be seen
  • Share what I know
  • Impress people
  • Readers' gratitude
  Low Cost
  • No new skills needed
  • Easy to author
Structured Data is Better
• Easier to manage
  – Separate content (data) from presentation
  – Edit data, changes propagate to all uses of it
  – Templates help all data look consistent
• Easier to navigate
  – Sorting and filtering
  – Faceted browsing
  – Aggregate visualizations --- compare/contrast
• Easier to reuse
  – Extract from original source
  – Blend with other data
  – Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can't install/operate/define a database
  – Can't write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 Blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static info-graphic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in Spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns used in chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don't program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
sort
filter
search
template
Image
HTML: <img src=…>
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
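The "one item per row, columns for properties" model above can be illustrated with a short Python sketch. The recipe rows are made-up sample data, and `rows_to_items` and the ';' convention for multi-valued cells are assumptions for this example, not part of Exhibit.

```python
import csv
import io
from typing import Dict, List

# Made-up recipe data standing in for a published spreadsheet.
SHEET = """title,source,date,rating,ingredients
Lentil Soup,Example Magazine,2009-01-15,4,lentils;onion;carrot
Pancakes,Example Magazine,2009-02-01,5,flour;egg;milk
"""


def rows_to_items(text: str, multi_valued: List[str]) -> List[Dict[str, object]]:
    """Turn spreadsheet text into items: one item per row, one property
    per column. Columns named in `multi_valued` are split on ';' into lists."""
    items: List[Dict[str, object]] = []
    for row in csv.DictReader(io.StringIO(text)):
        item: Dict[str, object] = dict(row)
        for column in multi_valued:
            item[column] = [v.strip() for v in row[column].split(";") if v.strip()]
        items.append(item)
    return items


ITEMS = rows_to_items(SHEET, multi_valued=["ingredients"])

if __name__ == "__main__":
    for item in ITEMS:
        print(item["title"], item["ingredients"])
```

A property such as Ingredients naturally holds several values per item, which is why the sketch treats that column as a list rather than a single cell.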
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
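As a rough illustration of a view being "bound to a property", the sketch below renders a list view sorted by whichever property it is given. It models the idea only; `list_view` and the sample items are hypothetical and do not reflect how the Exhibit library implements views.

```python
from typing import Dict, List

# Made-up items; "price" mirrors the property named in the slide's example.
ITEMS: List[Dict[str, object]] = [
    {"title": "Kite", "price": 12.5},
    {"title": "Yo-yo", "price": 3.0},
    {"title": "Model plane", "price": 40.0},
]


def list_view(items: List[Dict[str, object]], sort_by: str, label: str = "title") -> List[str]:
    """Return display strings for a list view sorted by the `sort_by` property."""
    ordered = sorted(items, key=lambda item: item[sort_by])
    return [f"{item[label]} ({sort_by}: {item[sort_by]})" for item in ordered]


if __name__ == "__main__":
    for line in list_view(ITEMS, sort_by="price"):
        print(line)
```

Changing the binding (for example `sort_by="title"`) changes the presentation without touching the data, which is the point of describing the view rather than programming it.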
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
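The facet behaviour described above (pick a property, show its values, restrict the collection to items matching the user's picks) can be sketched as follows. The function names and recipe data are illustrative assumptions, not Exhibit's internals.

```python
from collections import Counter
from typing import Dict, List, Set

Item = Dict[str, object]

# Made-up recipes with a multi-valued "ingredient" property.
RECIPES: List[Item] = [
    {"title": "Lentil Soup", "ingredient": ["lentils", "onion"]},
    {"title": "Omelette", "ingredient": ["egg", "onion"]},
    {"title": "Pancakes", "ingredient": ["egg", "flour"]},
]


def values_of(item: Item, prop: str) -> List[object]:
    """Return the values of a property as a list (single values are wrapped)."""
    value = item.get(prop, [])
    return list(value) if isinstance(value, (list, tuple, set)) else [value]


def facet_counts(items: List[Item], prop: str) -> Counter:
    """What the facet displays: each value and how many items carry it."""
    return Counter(v for item in items for v in values_of(item, prop))


def apply_facet(items: List[Item], prop: str, selected: Set[object]) -> List[Item]:
    """Keep items having at least one selected value; no selection keeps all."""
    if not selected:
        return list(items)
    return [item for item in items if selected & set(values_of(item, prop))]


if __name__ == "__main__":
    print(facet_counts(RECIPES, "ingredient"))
    print([r["title"] for r in apply_facet(RECIPES, "ingredient", {"egg"})])
```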
Lenses
• Template for item
• HTML with “fill in the blanks”
• HTML:
  <div ex:role="lens">
    <b><div ex:content="title"></div></b>
    <div ex:content="date"></div>
  </div>
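A lens is essentially a "fill in the blanks" template applied to one item. The sketch below shows that idea with Python's standard `string.Template`; the `$title`/`$date` placeholder syntax and the `render_lens` helper are assumptions for illustration and are not Exhibit's template mechanism.

```python
import html
from string import Template
from typing import Dict

# A template with blanks for two properties of an item.
LENS = Template("<div><b>$title</b> $date</div>")


def render_lens(lens: Template, item: Dict[str, object]) -> str:
    """Fill the template's blanks with the item's (HTML-escaped) property values.
    Blanks with no matching property are left in place by safe_substitute."""
    escaped = {name: html.escape(str(value)) for name, value in item.items()}
    return lens.safe_substitute(escaped)


if __name__ == "__main__":
    recipe = {"title": "Fish & Chips", "date": "2009-03-01"}  # made-up item
    print(render_lens(LENS, recipe))
```

Because the same template is applied to every item, all items in a collection come out with a consistent look.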
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
(Screenshots of example pages, each annotated with the same four features)
• Text search
• Faceted Browsing
• Sorting by Properties
• Templated Items
Impoverished Information Visualization?
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People's experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. an HTML editor
• Pure client side
  – No need to design/admin a server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns are used in the chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don't program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
[Screenshot callouts: sort, filter, search, template, image]
HTML: <img src=…>
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
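The row-per-item, column-per-property model above can be sketched in a few lines of Python. This is only an illustration, not Exhibit's loader: the sample rows, the `load_items` helper, and the ";" separator for multi-valued cells are all assumptions made for the example.

```python
import csv
import io

# Hypothetical recipe spreadsheet: one item per row, one column per property.
SHEET = """title,source,date,rating,ingredients
Lemon Tart,Magazine A,2009-06-01,4,lemon;butter;flour
Pad Thai,Magazine B,2009-03-15,5,noodles;peanut;lime
"""

def load_items(text, multi_valued=("ingredients",)):
    """Turn each spreadsheet row into an item (a dict of properties)."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for prop in multi_valued:
            # Assumption: several values in one cell are separated by ";".
            item[prop] = [v for v in item[prop].split(";") if v]
        items.append(item)
    return items

items = load_items(SHEET)
print(items[0]["title"], items[0]["ingredients"])
```

Every value other than the split ingredients stays a string here; a real loader would also need to decide how to type dates and ratings.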
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
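What "binding a view to a property" means can be sketched outside the browser. The Python below is a stand-in for the idea only; `list_view` and the sample items are hypothetical, and `price` is used simply because it is the property named in the slide's markup.

```python
# Hypothetical items; "price" matches the property named in the slide's ex:sort.
items = [
    {"title": "Pad Thai", "price": 12.0},
    {"title": "Lemon Tart", "price": 6.5},
    {"title": "Ramen", "price": 9.0},
]

def list_view(items, sort):
    """Return the collection ordered by the property the view is bound to."""
    return sorted(items, key=lambda item: item[sort])

for item in list_view(items, sort="price"):
    print(item["title"], item["price"])
```

The author only names the property; the ordering logic lives in the library, which is the "describe, don't program" split the talk argues for.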
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
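A facet does two things: it lists the values of one property, and it restricts the collection to items matching the values the user picked. A minimal sketch, assuming items are plain dicts with a multi-valued `ingredient` property (the helper names are hypothetical, not Exhibit functions):

```python
from collections import Counter

# Hypothetical items with a multi-valued "ingredient" property.
items = [
    {"title": "Lemon Tart", "ingredient": ["lemon", "butter"]},
    {"title": "Pad Thai", "ingredient": ["lime", "peanut"]},
    {"title": "Shortbread", "ingredient": ["butter", "flour"]},
]

def facet_values(items, expression):
    """Count how many items carry each value of the faceted property."""
    return Counter(v for item in items for v in item[expression])

def apply_facet(items, expression, selected):
    """Keep items with at least one selected value; no selection keeps all."""
    if not selected:
        return list(items)
    return [i for i in items if set(i[expression]) & set(selected)]

print(facet_values(items, "ingredient"))
print([i["title"] for i in apply_facet(items, "ingredient", {"butter"})])
```

Treating several picked values as "any of these" is one design choice among several; the slide does not say which semantics is intended.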
Lenses
• Template for item
• HTML with "fill in the blanks"
• HTML: <div ex:role="lens">
    <b> <div ex:content="title"> </b> <div ex:content="date">
  </div>
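The "fill in the blanks" idea behind a lens can be sketched as a small templating function. This is an illustration in Python rather than the library's behavior; `render_lens` and the output markup are assumptions made for the example.

```python
import html

def render_lens(item, properties):
    """Fill each named property's value into its slot, like a lens's blanks."""
    parts = []
    for prop in properties:
        # Missing properties render as empty slots; values are HTML-escaped.
        value = html.escape(str(item.get(prop, "")))
        parts.append(f'<div class="{prop}">{value}</div>')
    return '<div class="item">' + "".join(parts) + "</div>"

recipe = {"title": "Lemon Tart", "date": "2009-06-01", "rating": 4}
print(render_lens(recipe, ["title", "date"]))
```

The same lens applies to every item in the collection, which is why templated items look consistent across a page.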
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
[Example pages, each annotated with the same four features: Text search, Faceted Browsing, Sorting by Properties, Templated Items]
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People's experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. an HTML editor
• Pure client side
  – No need to design/admin a server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes an interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities: 40%
Personal/Hobby: 25%
Organization: 19%
News: 6%
Commercial: 4%
Library: 4%
Conference: 2%
• Source: 50 of the top-trafficked Exhibits in our dataset
Data Model
Graph: 27%
Multi-valued Table: 32%
Table: 41%
Cyclic: 22%
Acyclic: 78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON: 69%
Google Spreadsheet: 32%
BibTeX: 2%
RDF: 1%
Excel: 0.1%
CSV: 0.06%
Freebase: 0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title): 59%
Timeline: 19%
Map: 14%
Chart: 8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit; it can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit
Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel
Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
[Legend: actions, conditions, predicates, properties, entities]
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs, End-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
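The "when <something happens> do <action>" shape of such a rule can be sketched as a condition paired with an action. The Python below is not Atomate's rule language or engine: `Rule`, `run_rules`, and the state keys are hypothetical, and it tests the current state rather than detecting the moment of arriving home.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, str]  # hypothetical snapshot of the user's world

@dataclass
class Rule:
    when: Callable[[State], bool]  # condition over entity properties
    do: Callable[[State], str]     # action to take when the condition holds

def run_rules(rules: List[Rule], state: State) -> List[str]:
    """Evaluate every rule against the current state; collect fired actions."""
    return [rule.do(state) for rule in rules if rule.when(state)]

# "Remind me to take the trash out when I get home on Tuesday evenings"
trash_rule = Rule(
    when=lambda s: (s["my location"] == "home"
                    and s["day"] == "Tuesday"
                    and s["time of day"] == "evening"),
    do=lambda s: "remind me: take the trash out",
)

state = {"my location": "home", "day": "Tuesday", "time of day": "evening"}
print(run_rules([trash_rule], state))
```

A controlled natural language interface would let the user assemble the condition from familiar entities and properties instead of writing the lambda by hand.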
User Study
• Hypothesis: Excel-proficient users will be faster at lookup (read-only) tasks on a database stored in normalized form in our system vs. Microsoft Excel
User Study
• Mechanical Turk
• Remotely screen-recorded
• Lookup tasks on course catalog database in Excel vs. Related Worksheets
• Between-subjects study
• Initial qualification task on Excel only
User Study
Results: Demographics
Results: Correctness and Features Used
Results: Timing
p < 0.05 for Task 4 only (41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system: array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User study shows system usable without instruction, sometimes faster than Excel (more study needed)
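Reference-typed columns and their two-way hierarchical views can be illustrated with two small tables. This is a sketch of the concept only, not the Related Worksheets implementation; the tables, ids, and function names are made up for the example.

```python
# Hypothetical course-catalog tables in normalized form.
instructors = {
    "i1": {"name": "Ada"},
    "i2": {"name": "Grace"},
}
courses = {
    # "instructors" is an array-typed column of references into the other table.
    "c1": {"title": "Databases", "instructors": ["i1", "i2"]},
    "c2": {"title": "HCI", "instructors": ["i2"]},
}

def forward_view(courses, instructors):
    """Nest each course's referenced instructor rows under the course."""
    return {
        c["title"]: [instructors[ref]["name"] for ref in c["instructors"]]
        for c in courses.values()
    }

def reverse_view(courses, instructors):
    """Follow the same references backwards: courses under each instructor."""
    view = {row["name"]: [] for row in instructors.values()}
    for c in courses.values():
        for ref in c["instructors"]:
            view[instructors[ref]["name"]].append(c["title"])
    return view

print(forward_view(courses, instructors))
print(reverse_view(courses, instructors))
```

Both views are derived from the single `instructors` column, so a lookup in either direction needs no manual join, which is the kind of task the study compared against Excel.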
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g. on Google Spreadsheets
• So it is a Semantic Web application despite absence of Semantic Web technology
"I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB"
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
Exhibit: Lightweight Structured Data Publishing [WWW 2007]
The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
Good old days: early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivationbull Wersquore acquiring large structured data
repositories and real-time streamsbull Lots of repetitive labor involved in users
followingreacting to these streamsbull How can users author queriesrules against these
streams to reduce the drudgery
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen, and when, in terms of predicates relating the states and properties of people, places + things in their world
  Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places, and things
  ATOM/RSS/REST APIs, end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules
when <something happens> do <action>
query, statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
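The rule form "when <something happens> do <action>" can be sketched as a predicate paired with an action. This is only an illustration of the idea, not Atomate's implementation: the `Rule` class, the `run_rules` function, the state keys (`location`, `weekday`, `hour`), and the 17:00 "evening" threshold are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

State = Dict[str, Any]  # hypothetical snapshot of the user's world


@dataclass
class Rule:
    """One 'when <something happens> do <action>' rule."""
    when: Callable[[State], bool]  # predicate over the current state
    do: Callable[[State], str]     # action; here it only returns a message


def run_rules(rules: List[Rule], state: State) -> List[str]:
    """Fire every rule whose predicate holds for this state."""
    return [rule.do(state) for rule in rules if rule.when(state)]


# "Remind me to take the trash out when I get home on Tuesday evenings"
trash_rule = Rule(
    when=lambda s: s["location"] == "home"
    and s["weekday"] == "Tuesday"
    and s["hour"] >= 17,  # "evening" threshold is an assumption
    do=lambda s: "Reminder: take the trash out",
)

fired = run_rules(
    [trash_rule], {"location": "home", "weekday": "Tuesday", "hour": 19}
)
not_fired = run_rules(
    [trash_rule], {"location": "work", "weekday": "Tuesday", "hour": 19}
)
```

Keeping the predicate and the action as separate parts mirrors the slide's split between conditions and actions; a controlled natural language front end would only need to produce these two pieces.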
User Study
Results: Demographics
Results: Correctness and Features Used
Results: Timing
p < 0.05 for Task 4 only (41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system: array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User study shows system usable without instruction, sometimes faster than Excel (more study needed)
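One way to picture a reference-typed column with a plural relationship, and the two directions of a hierarchical view over it, is the sketch below. It is a guess at the general idea only; the sample tables, the `taught_by` column, and the functions `nested_view` and `inverse_view` are hypothetical and do not come from the system described in the talk.

```python
# Hypothetical data: two "sheets"; a reference column holds a list of row ids.
instructors = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
courses = {
    10: {"title": "Databases", "taught_by": [1, 2]},  # plural relationship
    11: {"title": "HCI", "taught_by": [2]},
}


def nested_view(rows, ref_col, targets):
    """Forward view: each row with its referenced rows nested beneath it."""
    return [
        {**row, ref_col: [targets[i] for i in row[ref_col]]}
        for row in rows.values()
    ]


def inverse_view(rows, ref_col, targets, label):
    """Reverse view: each target row with the rows that reference it."""
    out = []
    for target_id, target in targets.items():
        referrers = [r for r in rows.values() if target_id in r[ref_col]]
        out.append({**target, label: referrers})
    return out


by_course = nested_view(courses, "taught_by", instructors)
by_instructor = inverse_view(courses, "taught_by", instructors, "teaches")
```

Both views are derived from the same reference column, which is the sense in which the relationship can be browsed from either side without a hand-written join.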
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g. on Google Spreadsheets
• So is a Semantic Web application despite absence of Semantic Web technology
“I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB”
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
Exhibit: Lightweight Structured Data Publishing [WWW 2007]
The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
good old days: early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
• High benefit: find the info I need; discover new things
• Low cost: one-click fetch; instant availability; no application to master
Author
• High benefit: be seen; share what I know; impress people; readers’ gratitude
• Low cost: no new skills needed; easy to author
Structured Data is Better
• Easier to manage
  – Separate content (data) from presentation
  – Edit data, changes propagate to all uses of it
  – Templates help all data look consistent
• Easier to navigate
  – Sorting and filtering
  – Faceted browsing
  – Aggregate visualizations: compare/contrast
• Easier to reuse
  – Extract from original source
  – Blend with other data
  – Create alternative visualizations
[Figure labels: sort, filter, search, template]
today
Mere mortals just write text or html
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can’t install/operate/define a database
  – Can’t write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static infographic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns used in chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don’t program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
[Figure labels: sort, filter, search, template]
Image
HTML: <img src=…
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
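The "one item per row, columns for properties" model can be sketched with a few lines of code. This is a minimal illustration, not Exhibit's importer: the sample CSV contents, the `rows_to_items` function, and the use of `;` to separate multiple ingredient values are all assumptions made for this example.

```python
import csv
import io

# Hypothetical recipe spreadsheet exported as CSV; the values are made up.
SHEET = """title,source,date,rating,ingredients
Pad Thai,Example Magazine,2009-05-01,4,noodles;peanuts;lime
Guacamole,Example Magazine,2009-06-01,5,avocado;lime
"""


def rows_to_items(text, multi_valued=("ingredients",), sep=";"):
    """Turn each row into an item: a dict mapping property name to value."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for prop in multi_valued:
            # A multi-valued property becomes a list of values.
            item[prop] = item[prop].split(sep)
        items.append(item)
    return items


items = rows_to_items(SHEET)
```

Each resulting dictionary is one item, and its keys are the column headers, so a new property is added simply by adding a column.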
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
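What "binding a view to a property" means for a sortable list can be sketched as follows. This is an illustration of the idea only, not how the Exhibit library implements views; the `list_view` function and the sample items with a `price` property are hypothetical.

```python
def list_view(items, sort_by):
    """Return item titles ordered by the property the view is bound to."""
    ordered = sorted(items, key=lambda item: item[sort_by])
    return [item["title"] for item in ordered]


# Made-up items; "price" matches the property named in the slide's markup.
recipes = [
    {"title": "Risotto", "price": 12.0},
    {"title": "Toast", "price": 2.5},
    {"title": "Curry", "price": 8.0},
]

by_price = list_view(recipes, sort_by="price")
by_title = list_view(recipes, sort_by="title")
```

The author only names the property; the same view works unchanged for any data set that has a column with that name.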
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
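A facet does two things: it lists the values of a property so the user can pick, and it restricts the collection to items matching the pick. The sketch below shows both under stated assumptions; `facet_counts`, `apply_facet`, and the sample recipes are hypothetical names and data, not part of Exhibit.

```python
from collections import Counter


def facet_counts(items, prop):
    """Count how many items carry each value of the property."""
    return Counter(value for item in items for value in item[prop])


def apply_facet(items, prop, selected):
    """Keep items with at least one selected value.

    `selected` is a set of values; an empty set means no restriction.
    """
    if not selected:
        return list(items)
    return [item for item in items if selected & set(item[prop])]


# Made-up items with a multi-valued "ingredient" property.
recipes = [
    {"title": "Guacamole", "ingredient": ["avocado", "lime"]},
    {"title": "Pad Thai", "ingredient": ["noodles", "lime"]},
]

counts = facet_counts(recipes, "ingredient")
with_avocado = apply_facet(recipes, "ingredient", {"avocado"})
```

The counts are what a facet would display next to each choice, and the filtered list is what every view on the page would then show.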
Lenses
• Template for item
• HTML with “fill in the blanks”
• HTML:
  <div ex:role="lens">
    <b><div ex:content="title"></div></b>
    <div ex:content="date"></div>
  </div>
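The "fill in the blanks" idea behind a lens can be sketched with Python's standard `string.Template`. This is an analogy rather than Exhibit's mechanism: the `LENS` template text, the `render_lens` function, and the sample item are assumptions for illustration.

```python
from html import escape
from string import Template

# Hypothetical lens: placeholders name the properties to show.
LENS = Template("<div><b>$title</b> <span>$date</span></div>")


def render_lens(item, lens=LENS):
    """Fill the lens's blanks with one item's (HTML-escaped) property values."""
    safe = {key: escape(str(value)) for key, value in item.items()}
    return lens.substitute(safe)


rendered = render_lens({"title": "Fish & Chips", "date": "2009-05-01"})
```

Because the template only names properties, the same lens renders every item in the collection, and changing the layout means editing one template.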
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
[Example screenshots, each labeled: Text search, Faceted Browsing, Sorting by Properties, Templated Items]
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People’s experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities: 40%
Personal/Hobby: 25%
Organization: 19%
News: 6%
Commercial: 4%
Library: 4%
Conference: 2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph: 27%
Multi-valued Table: 32%
Table: 41%
Cyclic: 22%
Acyclic: 78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON: 69%
Google Spreadsheet: 32%
Bibtex: 2%
RDF: 1%
Excel: 0.1%
CSV: 0.06%
Freebase: 0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title): 59%
Timeline: 19%
Map: 14%
Chart: 8%
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
• Copy it, change the data
bull (Maybe change the presentation too)
Scalability
• JavaScript slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivationbull Wersquore acquiring large structured data
repositories and real-time streamsbull Lots of repetitive labor involved in users
followingreacting to these streamsbull How can users author queriesrules against these
streams to reduce the drudgery
Examples
bull remind me to take out the trash when I get home on Tuesdays
bull bug my friend who hasnrsquot replied to me in 2 daysbull send me my shopping list when I arrive at the
grocery storebull remind friends of an event Irsquom going to attendbull text me important emails when I am traveling
actionsconditionspredicatespropertiesentities
What we need
bull a way for users to express what they want to happen and when in terms of predicates relating
the states and properties of people places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
bull a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people places and things
ATOMRSSREST APIs End-user mashups + RDF
Abraham Bernstein and Esther Kaufmann and Christian Kaiser and Christoph Kiefer Ginseng A Guided Input Natural Language Search Engine for Querying Ontologies Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as ruleswhen ltsomething happensgt do ltactiongt
query statement
Controlled Natural Language Interface
Example 1bull Simple Context-Sensitive Remindingbull Remind me to take the trash out when I get
home on Tuesday evenings
Results Correctness and Features Used
Results Timing
p lt 005 for Task 4 only
(41 faster)
Conclusionbull Spreadsheets are great homebrew databases
ndash But struggle with multiple tables nesting joinsbull Enhance spreadsheet paradigm with
ndash Column type system array types reference typesndash Bidirectional hierarchical views of reference typesndash to handle plural relationships
bull User Study shows system usable without instruction sometimes faster than Excel (more study needed)
A Semantic Web Applicationbull Implemented as a JavaNetbeans desktop appbull Which connects to SQL databases using JDBC
bull But key contribution is interaction paradigmndash Which expects arbitrary schemas
bull Could easily be ported to the webbull Eg on Google Spreadsheets
bull So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People’s experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. an HTML editor
• Pure client side
  – No need to design/admin a server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
  Schools/Universities: 40%
  Personal/Hobby: 25%
  Organization: 19%
  News: 6%
  Commercial: 4%
  Library: 4%
  Conference: 2%
• Source: 50 of the top-trafficked Exhibits in our dataset
Data Model
  Graph: 27%
  Multi-valued Table: 32%
  Table: 41%
  Cyclic: 22%
  Acyclic: 78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
  JSON: 69%
  Google Spreadsheet: 32%
  BibTeX: 2%
  RDF: 1%
  Excel: 0.1%
  CSV: 0.06%
  Freebase: 0.06%
Hmm…
Single-View Exhibits
  Text-Only (list, table, title): 59%
  Timeline: 19%
  Map: 14%
  Chart: 8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can’t write HTML?
Wibit
Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data, interaction, and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO --- Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn’t use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF, but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
“I CAN’T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!”
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate It! End-user context-sensitive automation using heterogeneous information sources on the web [WWW ’10]
Motivation
• We’re acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
  – Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
  – ATOM/RSS/REST APIs, End-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
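The "when <something happens> do <action>" rule shape, applied to the trash reminder, might look like the sketch below. It is not Atomate's engine: the `Rule` class, the state keys (`my_location`, `weekday`, `time_of_day`) and `run_rules` are hypothetical, and the sketch checks the current state rather than detecting the moment of arriving home.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, str]

@dataclass
class Rule:
    name: str
    condition: Callable[[State], bool]  # the "when <something happens>" part
    action: Callable[[State], str]      # the "do <action>" part

def run_rules(rules: List[Rule], state: State) -> List[str]:
    """Fire every rule whose condition holds for the current state."""
    return [rule.action(state) for rule in rules if rule.condition(state)]

# Hypothetical encoding of the Tuesday-evening trash reminder.
trash_rule = Rule(
    name="trash reminder",
    condition=lambda s: (s.get("my_location") == "home"
                         and s.get("weekday") == "Tuesday"
                         and s.get("time_of_day") == "evening"),
    action=lambda s: "Reminder: take the trash out",
)

at_home = {"my_location": "home", "weekday": "Tuesday", "time_of_day": "evening"}
at_work = {"my_location": "office", "weekday": "Tuesday", "time_of_day": "evening"}
fired = run_rules([trash_rule], at_home)
quiet = run_rules([trash_rule], at_work)
print(fired, quiet)
```

The condition is a conjunction of predicates over properties of familiar entities (me, home, Tuesday), which is the vocabulary a controlled natural language interface would expose to the user.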
Results: Timing
p < 0.05 for Task 4 only
(41% faster)
Conclusion
• Spreadsheets are great homebrew databases
  – But struggle with multiple tables, nesting, joins
• Enhance spreadsheet paradigm with
  – Column type system: array types, reference types
  – Bidirectional hierarchical views of reference types
  – to handle plural relationships
• User study shows system usable without instruction, sometimes faster than Excel (more study needed)
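To make "array types, reference types" and "bidirectional views" concrete, here is a small sketch under stated assumptions: two hypothetical tables (courses and students), a `students` column holding an array of references to rows in the other table, and two helpers that read the same references in either direction. It illustrates the idea only and is not the system described in the talk.

```python
from collections import defaultdict

# Two hypothetical tables keyed by row id.
courses = {
    "c1": {"name": "Algorithms", "students": ["s1", "s2"]},  # array of references
    "c2": {"name": "Databases", "students": ["s2"]},
}
students = {
    "s1": {"name": "Ana"},
    "s2": {"name": "Ben"},
}

def forward_view(source, column, target):
    """Nest the referenced rows' names under each source row."""
    return {
        row["name"]: [target[ref]["name"] for ref in row[column]]
        for row in source.values()
    }

def reverse_view(source, column, target):
    """Follow the same references backwards: target row -> source rows."""
    back = defaultdict(list)
    for row in source.values():
        for ref in row[column]:
            back[target[ref]["name"]].append(row["name"])
    return dict(back)

by_course = forward_view(courses, "students", students)
by_student = reverse_view(courses, "students", students)
print(by_course, by_student)
```

Both views derive from one stored column, so a plural relationship is entered once and can be browsed from either side without a hand-written join.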
A Semantic Web Application?
• Implemented as a Java/NetBeans desktop app
• Which connects to SQL databases using JDBC
• But key contribution is interaction paradigm
  – Which expects arbitrary schemas
• Could easily be ported to the web
• E.g. on Google Spreadsheets
• So is a Semantic Web application despite absence of Semantic Web technology
“I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB”
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
Exhibit: Lightweight Structured Data Publishing [WWW 2007]
The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
Reader
• High benefit
  – Find the info I need
  – Discover new things
• Low cost
  – One click fetch
  – Instant availability
  – No application to master
Author
• High benefit
  – Be seen
  – Share what I know
  – Impress people
  – Readers’ gratitude
• Low cost
  – No new skills needed
  – Easy to author
Structured Data is Better
• Easier to manage
  – Separate content (data) from presentation
  – Edit data, changes propagate to all uses of it
  – Templates help all data look consistent
• Easier to navigate
  – Sorting and filtering
  – Faceted browsing
  – Aggregate visualizations --- compare/contrast
• Easier to reuse
  – Extract from original source
  – Blend with other data
  – Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Why?
• Professional sites implement a rich data model
  – Information stored in databases
  – Extracted using complex queries
  – Results feed into templating web frameworks
• Plain authors left behind
  – Can’t install/operate/define a database
  – Can’t write the queries to extract the data
  – Limited to unstructured text pages (and blogs, wikis)
  – Less power to communicate effectively
  – Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static info-graphic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns are used in chart
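The "items are rows, properties are columns" mapping can be sketched with Python's standard `csv` module. The sample sheet, the `sheet_to_items` helper, and the `;` separator for multi-valued cells are assumptions for illustration, not something the talk specifies.

```python
import csv
import io

# Hypothetical spreadsheet exported as CSV: one item per row, one property per column.
SHEET = """title,rating,ingredients
Lentil Soup,4,lentils;onion
Garden Salad,5,lettuce;carrot
"""

def sheet_to_items(text, multi_valued=()):
    """Turn each spreadsheet row into an item: a dict of property -> value."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for column in multi_valued:
            # Assumption: multiple values in one cell are separated by ';'.
            item[column] = [v.strip() for v in item[column].split(";") if v.strip()]
        items.append(item)
    return items

items = sheet_to_items(SHEET, multi_valued=["ingredients"])
print(items[0])
```

Once rows are items, picking a chart type and naming its columns is all the configuration a view needs.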
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don’t program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
sort
filter
search
template
Image
HTML: <img src=…
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivationbull Wersquore acquiring large structured data
repositories and real-time streamsbull Lots of repetitive labor involved in users
followingreacting to these streamsbull How can users author queriesrules against these
streams to reduce the drudgery
Examples
bull remind me to take out the trash when I get home on Tuesdays
bull bug my friend who hasnrsquot replied to me in 2 daysbull send me my shopping list when I arrive at the
grocery storebull remind friends of an event Irsquom going to attendbull text me important emails when I am traveling
actionsconditionspredicatespropertiesentities
What we need
bull a way for users to express what they want to happen and when in terms of predicates relating
the states and properties of people places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
bull a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people places and things
ATOMRSSREST APIs End-user mashups + RDF
Abraham Bernstein and Esther Kaufmann and Christian Kaiser and Christoph Kiefer Ginseng A Guided Input Natural Language Search Engine for Querying Ontologies Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as ruleswhen ltsomething happensgt do ltactiongt
query statement
Controlled Natural Language Interface
Example 1bull Simple Context-Sensitive Remindingbull Remind me to take the trash out when I get
home on Tuesday evenings
Conclusionbull Spreadsheets are great homebrew databases
ndash But struggle with multiple tables nesting joinsbull Enhance spreadsheet paradigm with
ndash Column type system array types reference typesndash Bidirectional hierarchical views of reference typesndash to handle plural relationships
bull User Study shows system usable without instruction sometimes faster than Excel (more study needed)
A Semantic Web Applicationbull Implemented as a JavaNetbeans desktop appbull Which connects to SQL databases using JDBC
bull But key contribution is interaction paradigmndash Which expects arbitrary schemas
bull Could easily be ported to the webbull Eg on Google Spreadsheets
bull So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivationbull Wersquore acquiring large structured data
repositories and real-time streamsbull Lots of repetitive labor involved in users
followingreacting to these streamsbull How can users author queriesrules against these
streams to reduce the drudgery
Examples
bull remind me to take out the trash when I get home on Tuesdays
bull bug my friend who hasnrsquot replied to me in 2 daysbull send me my shopping list when I arrive at the
grocery storebull remind friends of an event Irsquom going to attendbull text me important emails when I am traveling
actionsconditionspredicatespropertiesentities
What we need
bull a way for users to express what they want to happen and when in terms of predicates relating
the states and properties of people places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
bull a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people places and things
ATOMRSSREST APIs End-user mashups + RDF
Abraham Bernstein and Esther Kaufmann and Christian Kaiser and Christoph Kiefer Ginseng A Guided Input Natural Language Search Engine for Querying Ontologies Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as ruleswhen ltsomething happensgt do ltactiongt
query statement
Controlled Natural Language Interface
Example 1bull Simple Context-Sensitive Remindingbull Remind me to take the trash out when I get
home on Tuesday evenings
A Semantic Web Applicationbull Implemented as a JavaNetbeans desktop appbull Which connects to SQL databases using JDBC
bull But key contribution is interaction paradigmndash Which expects arbitrary schemas
bull Could easily be ported to the webbull Eg on Google Spreadsheets
bull So is a Semantic Web application despite absence of Semantic Web technology
ldquoI WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEBrdquo
Huynh Benson Marcus Karger Miller
58
WEB AUTHORING WITH STRUCTURED DATA
Huynh Benson Marcus Karger MillerExhibit Lightweight Structured Data Publishing [WWW 2007]The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]Talking about Data Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
58
SOME WEB HISTORYMotivation
good old days early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns are used in the chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don’t program
• Browser implements described visualization of and interaction with the data
Can This Be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
[Screenshot callouts: sort, filter, search, template]
Image
HTML: <img src=…>
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
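The "one item per row, columns for properties" model above can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the sample rows, the tab-separated layout, the ";" multi-value separator, and the name `rowsToItems` are all hypothetical and are not taken from Exhibit.

```javascript
// Hypothetical sample data: one recipe per row, one property per column.
const sheet = [
  "title\tsource\tdate\trating\tingredients",
  "Pad Thai\tMagazine A\t2009-03-01\t4\tnoodles;peanuts;lime",
  "Guacamole\tMagazine B\t2008-07-15\t5\tavocado;lime;cilantro",
].join("\n");

// Turn spreadsheet rows into items: plain objects keyed by column name.
// Assumption: ";" separates multiple values inside a single cell.
function rowsToItems(text) {
  const [header, ...rows] = text
    .trim()
    .split("\n")
    .map((line) => line.split("\t"));
  return rows.map((cells) => {
    const item = {};
    header.forEach((property, i) => {
      const values = (cells[i] ?? "").split(";").filter((v) => v !== "");
      // Multi-valued cells become arrays; single values stay scalar.
      item[property] = values.length > 1 ? values : values[0] ?? null;
    });
    return item;
  });
}

const items = rowsToItems(sheet);
console.log(items[0].title, items[0].ingredients);
```

Keeping every value as a string mirrors what a spreadsheet export gives you; type conversion (dates, numbers) would be a separate step.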
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
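To make "a view bound to a property" concrete, here is a minimal sketch of what a list view sorted by one property has to compute. It is not Exhibit's implementation; `listView` and the sample recipes (including their `price` values) are hypothetical.

```javascript
// Hypothetical sample items.
const recipes = [
  { title: "Pad Thai", price: 12 },
  { title: "Guacamole", price: 5 },
  { title: "Risotto", price: 9 },
];

// A "list view" bound to one property: return the items ordered by it.
function listView(items, sortProperty) {
  // Copy first so the underlying data is left untouched.
  return [...items].sort((a, b) => {
    if (a[sortProperty] < b[sortProperty]) return -1;
    if (a[sortProperty] > b[sortProperty]) return 1;
    return 0;
  });
}

const byPrice = listView(recipes, "price").map((r) => r.title);
console.log(byPrice.join(", ")); // Guacamole, Risotto, Pad Thai
```

Because the view only names the property, switching the sort order is a change to the description ("price" to "title"), not to the program.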
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
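The facet behaviour described above (list the values of a property, let the user pick some, restrict the collection to matching items) can be sketched as follows. The function names `facetCounts` and `applyFacet` and the sample data are hypothetical; this is an illustration of the idea, not Exhibit's code.

```javascript
// Hypothetical sample items; "ingredients" is multi-valued.
const recipes = [
  { title: "Pad Thai", ingredients: ["noodles", "peanuts", "lime"] },
  { title: "Guacamole", ingredients: ["avocado", "lime", "cilantro"] },
  { title: "Risotto", ingredients: ["rice", "parmesan"] },
];

// Normalize a property to a list of values (scalar, array, or missing).
const valuesOf = (item, property) => [].concat(item[property] ?? []);

// What the facet shows: each distinct value with the number of matching items.
function facetCounts(items, property) {
  const counts = new Map();
  for (const item of items) {
    for (const value of new Set(valuesOf(item, property))) {
      counts.set(value, (counts.get(value) ?? 0) + 1);
    }
  }
  return counts;
}

// What a click does: keep items having at least one selected value.
// An empty selection means "no restriction".
function applyFacet(items, property, selected) {
  if (selected.size === 0) return items;
  return items.filter((item) =>
    valuesOf(item, property).some((value) => selected.has(value))
  );
}

const withLime = applyFacet(recipes, "ingredients", new Set(["lime"]));
console.log(withLime.map((r) => r.title)); // [ 'Pad Thai', 'Guacamole' ]
```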
Lenses
• Template for item
• HTML with “fill in the blanks”
• HTML:
  <div ex:role="lens">
    <b><div ex:content="title"></div></b> <div ex:content="date"></div>
  </div>
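A lens is a "fill in the blanks" template applied to one item. The sketch below shows that idea with plain string substitution; the `{{property}}` placeholder syntax and the names `renderLens` and `escapeHtml` are assumptions made for this example and differ from the `ex:content` attributes used in the slide.

```javascript
// Escape values so item data cannot inject markup into the page.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical template syntax: {{property}} marks a blank to fill in.
// Properties missing from the item render as an empty string.
function renderLens(template, item) {
  return template.replace(/\{\{(\w+)\}\}/g, (_, property) =>
    property in item ? escapeHtml(item[property]) : ""
  );
}

const lens = "<div><b>{{title}}</b> <span>{{date}}</span></div>";
const html = renderLens(lens, { title: "Fish & Chips", date: "2009-03-01" });
console.log(html);
```

The same template renders every item in the collection, which is what keeps all items looking consistent.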
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
[Example pages, each annotated with the same callouts: text search, faceted browsing, sorting by properties, templated items]
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But a powerful lowest common denominator
• People’s experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. an HTML editor
• Pure client side
  – No need to design/admin a server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes an interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
• Schools/Universities: 40%
• Personal/Hobby: 25%
• Organization: 19%
• News: 6%
• Commercial: 4%
• Library: 4%
• Conference: 2%
• Source: 50 of the top-trafficked Exhibits in our dataset
Data Model
• Graph: 27%
• Multi-valued table: 32%
• Table: 41%
• Cyclic: 22%
• Acyclic: 78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so the sum is > 100%
• JSON: 69%
• Google Spreadsheet: 32%
• BibTeX: 2%
• RDF: 1%
• Excel: 0.1%
• CSV: 0.06%
• Freebase: 0.06%
Hmm…
Single-View Exhibits
• Text-only (list, table, title): 59%
• Timeline: 19%
• Map: 14%
• Chart: 8%
Percentage of Schema in Visualization
Oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow; not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit; it can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can’t write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data, interaction, and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn’t use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF, but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN’T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW ’10]
Motivation
• We’re acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
[Legend: actions, conditions, predicates, properties, entities]
What we need
• a way for users to express what they want to happen, and when, in terms of predicates relating the states and properties of people, places + things in their world
  – Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
  – ATOM/RSS/REST APIs; end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
Previous work for the construction of RDF KBs and queries
Express behaviors as rules: when <something happens> do <action>
[Callouts: query, statement]
Controlled Natural Language Interface
Example 1
• Simple context-sensitive reminding
• Remind me to take the trash out when I get home on Tuesday evenings
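The "when <something happens> do <action>" form, applied to the trash reminder above, might be modelled roughly like this. It is a sketch only and does not reflect how Atomate represents or evaluates rules: the context fields (`user`, `location`, `day`, `hour`), the function `fire`, and the choice of 17:00 as the start of "evening" are all assumptions.

```javascript
// A rule pairs a condition (a predicate over the current context)
// with an action to run when the condition holds.
const rules = [
  {
    name: "trash reminder",
    // Assumption: "evening" means 17:00 or later.
    when: (ctx) =>
      ctx.location === "home" && ctx.day === "Tuesday" && ctx.hour >= 17,
    then: (ctx) => `Reminder for ${ctx.user}: take out the trash`,
  },
];

// Evaluate every rule against one context snapshot and collect the
// results of the actions whose conditions are satisfied.
function fire(ruleSet, ctx) {
  return ruleSet.filter((rule) => rule.when(ctx)).map((rule) => rule.then(ctx));
}

const tuesdayEvening = { user: "me", location: "home", day: "Tuesday", hour: 19 };
console.log(fire(rules, tuesdayEvening));
```

In a real system the context would be assembled from the heterogeneous feeds mentioned earlier (location, calendar, messages), and the controlled natural language interface would generate the `when` and `then` parts from the user's sentence.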
“I WANT TO PUBLISH MY VOLUNTEER ROSTER ON THE WEB”
WEB AUTHORING WITH STRUCTURED DATA
Huynh, Benson, Marcus, Karger, Miller
• Exhibit: Lightweight Structured Data Publishing [WWW 2007]
• The web page as a WYSIWYG end-user customizable database-backed information management application [UIST 2010]
• Talking about Data: Sharing Richly Structured Information through Blogs and Wikis [ISWC 2010]
SOME WEB HISTORY
Motivation
Good old days: early 1990s
Enrico Motta
Blog
Forum
Wiki
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static info-graphic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in a spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns are used in the chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don’t program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
[Screenshot annotations: sort, filter, search, template]
Image
HTML: <img src=…>
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
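The "one item per row, columns for properties" model can be sketched in a few lines of Python. This is only an illustration, not Exhibit's own loader: the sample rows, the `load_items` helper, and the convention of separating multiple ingredients with ";" are all assumptions made for the example.

```python
import csv
import io

# Hypothetical recipe spreadsheet: one item per row, one property per column.
# Several ingredients share one cell, separated by ";" (an assumed convention).
SHEET = """title,source,date,rating,ingredients
Lentil Soup,Magazine A,2009-03-01,4,lentils;onion;carrot
Pesto Pasta,Magazine B,2009-06-15,5,basil;pasta;garlic
"""

def load_items(text, multi_valued=("ingredients",)):
    """Turn each spreadsheet row into a dict mapping property name to value."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for prop in multi_valued:
            # Split multi-valued cells into a list of clean values.
            item[prop] = [v.strip() for v in item[prop].split(";") if v.strip()]
        items.append(item)
    return items

items = load_items(SHEET)
```

Each resulting dict is one item; the column headers become its property names, which is what views, facets and lenses bind to.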
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
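As a rough sketch of what a sortable list view bound to a property does, the Python below orders a collection by one property. The `list_view` function and the sample recipes are hypothetical; the real view is configured declaratively in HTML and rendered by the JavaScript library.

```python
def list_view(items, sort_by, descending=False):
    """Return the items ordered by one property, as a sortable list view would."""
    return sorted(items, key=lambda item: item[sort_by], reverse=descending)

# Hypothetical items; "price" mirrors the property named in the slide's example.
recipes = [
    {"title": "Pesto Pasta", "price": 9.5},
    {"title": "Lentil Soup", "price": 4.0},
    {"title": "Roast Chicken", "price": 12.0},
]

titles = [r["title"] for r in list_view(recipes, sort_by="price")]
```

The author only names the property to sort by; the ordering logic stays inside the view.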
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
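Faceted filtering on a property can be sketched as two small functions: one counts how many items carry each value (what a facet lists for the user to click), the other restricts the collection to items matching the picked values. The function names and sample data are assumptions for illustration, not part of Exhibit.

```python
from collections import Counter

def facet_counts(items, prop):
    """Count how many items carry each value of a (multi-valued) property."""
    counts = Counter()
    for item in items:
        counts.update(set(item.get(prop, [])))
    return counts

def apply_facet(items, prop, selected):
    """Keep items having at least one selected value; no selection keeps all."""
    if not selected:
        return list(items)
    return [item for item in items if selected & set(item.get(prop, []))]

# Hypothetical recipes with a multi-valued "ingredient" property.
recipes = [
    {"title": "Pesto Pasta", "ingredient": ["basil", "garlic", "pasta"]},
    {"title": "Lentil Soup", "ingredient": ["lentils", "garlic", "onion"]},
    {"title": "Fruit Salad", "ingredient": ["apple", "pear"]},
]

counts = facet_counts(recipes, "ingredient")
matching = apply_facet(recipes, "ingredient", {"garlic"})
```

Treating an empty selection as "no restriction" matches the usual faceted-browsing behavior, where the full collection shows until the user clicks a value.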
Lenses
• Template for item
• HTML with “fill in the blanks”
• HTML:
  <div ex:role="lens">
    <b><div ex:content="title"></div></b>
    <div ex:content="date"></div>
  </div>
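The "fill in the blanks" idea behind a lens can be sketched with Python's standard `string.Template`: the template names the properties to show, and rendering substitutes one item's values. The `LENS` markup, the `render_lens` helper, and the sample item are hypothetical and stand in for what the browser-side library does with lens markup.

```python
from html import escape
from string import Template

# Hypothetical lens: HTML with blanks named after item properties.
LENS = Template('<div class="recipe"><b>$title</b> <span>$date</span></div>')

def render_lens(template, item):
    """Fill the template's blanks with the item's (HTML-escaped) property values."""
    safe = {name: escape(str(value)) for name, value in item.items()}
    return template.substitute(safe)

item = {"title": "Fish & Chips", "date": "2009-03-01"}
html = render_lens(LENS, item)
```

Escaping the values before substitution keeps data such as "Fish & Chips" from breaking the surrounding markup.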
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
[Series of example screenshots, each annotated with: Text search, Faceted Browsing, Sorting by Properties, Templated Items]
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People’s experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static feels becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities: 40%
Personal/Hobby: 25%
Organization: 19%
News: 6%
Commercial: 4%
Library: 4%
Conference: 2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph: 27%
Multi-valued Table: 32%
Table: 41%
Cyclic: 22%
Acyclic: 78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON: 69%
Google Spreadsheet: 32%
Bibtex: 2%
RDF: 1%
Excel: 0.1%
CSV: 0.06%
Freebase: 0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title): 59%
Timeline: 19%
Map: 14%
Chart: 8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow; not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can’t write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn’t use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN’T HANDLE MY INCOMING INFORMATION OVERLOAD: HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW ’10]
Motivation
• We’re acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
[Annotations: actions, conditions, predicates, properties, entities]
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
  – Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
  – ATOM/RSS/REST APIs, End-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
(previous work for the construction of RDF KBs and queries)
Express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
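The "when something happens, do action" rule shape can be sketched as a condition paired with an action, evaluated against the user's current state. This is a simplified illustration under stated assumptions, not Atomate's implementation: the `Rule` class, the `run_rules` function, and the state keys (`location`, `weekday`, `part_of_day`) are all hypothetical, and the sketch tests the current state rather than detecting the moment of arriving home.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, str]

@dataclass
class Rule:
    """A 'when <condition> do <action>' rule over a simple state dictionary."""
    description: str
    condition: Callable[[State], bool]
    action: Callable[[State], str]

def run_rules(rules: List[Rule], state: State) -> List[str]:
    """Run the action of every rule whose condition holds in the given state."""
    return [rule.action(state) for rule in rules if rule.condition(state)]

# Hypothetical encoding of the trash reminder from Example 1.
trash_rule = Rule(
    description="Remind me to take the trash out when I get home on Tuesday evenings",
    condition=lambda s: (
        s["location"] == "home"
        and s["weekday"] == "Tuesday"
        and s["part_of_day"] == "evening"
    ),
    action=lambda s: "Reminder: take the trash out",
)

fired = run_rules(
    [trash_rule],
    {"location": "home", "weekday": "Tuesday", "part_of_day": "evening"},
)
idle = run_rules(
    [trash_rule],
    {"location": "work", "weekday": "Tuesday", "part_of_day": "evening"},
)
```

A controlled natural language interface would let the user write the description and derive the condition and action from it; here they are written by hand to show the structure.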
The Virtuous Cycle of Web Authoring
High Benefit Low CostReader bull Find the info I need
bull Discover new thingsbull One click fetchbull Instant availabilitybull No application to master
Author bull Be seenbull Share what I knowbull Impress peoplebull Readersrsquo gratitude
bull No new skills neededbull Easy to author
Structured Data is Betterbull Easier to manage
ndash Separate content (data) from presentationndash Edit data changes propagate to all uses of itndash Templates help all data look consistent
bull Easier to navigatendash Sorting and filteringndash Faceted browsingndash Aggregate visualizations --- comparecontrast
bull Easier to reusendash Extract from original sourcendash Blend with other datandash Create alternative visualizations
sort
filter
search
template
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goal
• Give regular people tools that let them author structured data and visualizations themselves
• So they can communicate like professional web sites
  – their incentive
• And their data is available in high fidelity for combination and reuse with other data
  – social benefit
Do We Need This?
• Analyzed 21 blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static infographic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in a spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns are used in the chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don't program
• Browser implements described visualization of and interaction with the data
Can This Be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
[Figure labels: sort, filter, search, template]
Image
HTML: <img src=…>
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
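The "one item per row, columns for properties" model can be sketched in a few lines of JavaScript. This is only an illustration, not Exhibit's actual importer: the helper name `rowsToItems`, the sample recipes, and the choice of ";" to separate multiple ingredients in one cell are all assumptions made for the example.

```javascript
// Hypothetical helper (not part of Exhibit): the first row holds property
// names, and every later row becomes one item object.
function rowsToItems(rows) {
  const [header, ...body] = rows;
  return body.map((row) => {
    const item = {};
    header.forEach((property, i) => {
      item[property] = row[i];
    });
    return item;
  });
}

// Made-up recipe data for illustration only.
const recipeRows = [
  ["title", "source", "date", "rating", "ingredients"],
  ["Lentil Soup", "Example Magazine", "2009-01-15", 4, "lentils;onion;carrot"],
  ["Carrot Cake", "Example Magazine", "2009-03-02", 5, "carrot;flour;egg"],
];

const recipes = rowsToItems(recipeRows).map((item) => ({
  ...item,
  // Assumption: a multi-valued cell uses ";" as its separator.
  ingredients: item.ingredients.split(";"),
}));

console.log(recipes);
```

Keeping items as plain objects keyed by property name means the same data can feed any view, facet, or lens without conversion.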
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
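What a list view bound to a sort property has to do can be sketched as follows. This is a minimal illustration under assumptions, not Exhibit's implementation: `sortByProperty` and the sample items are hypothetical, and a real view would render HTML rather than return titles.

```javascript
// Hypothetical helper (not part of Exhibit): return a sorted copy of the
// items, ordered by the named property, leaving the input untouched.
function sortByProperty(items, property) {
  return [...items].sort((a, b) => {
    if (a[property] < b[property]) return -1;
    if (a[property] > b[property]) return 1;
    return 0;
  });
}

// Made-up items for illustration only.
const items = [
  { title: "Carrot Cake", price: 12 },
  { title: "Lentil Soup", price: 5 },
  { title: "Green Salad", price: 8 },
];

// The "list view sorted by price" reduced to its essentials.
const listView = sortByProperty(items, "price").map((item) => item.title);
console.log(listView);
```

Because the property name is just a string, the author only has to name a column; no sorting code is written per page.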
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
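The facet behavior described above (list the values of a property, let the user pick some, restrict the collection to matching items) can be sketched like this. The function names and sample data are assumptions for illustration and are not Exhibit's API.

```javascript
// Treat a property as multi-valued: a missing value is an empty list,
// a single value is a one-element list.
function valuesOf(item, property) {
  const value = item[property];
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Count how many items carry each value; a facet shows these as choices.
function facetCounts(items, property) {
  const counts = new Map();
  for (const item of items) {
    for (const value of valuesOf(item, property)) {
      counts.set(value, (counts.get(value) || 0) + 1);
    }
  }
  return counts;
}

// Keep items having at least one selected value; no selection keeps all.
function applyFacet(items, property, selected) {
  if (selected.size === 0) return items;
  return items.filter((item) =>
    valuesOf(item, property).some((value) => selected.has(value))
  );
}

// Made-up recipes for illustration only.
const recipes = [
  { title: "Lentil Soup", ingredient: ["lentils", "onion", "carrot"] },
  { title: "Carrot Cake", ingredient: ["carrot", "flour", "egg"] },
  { title: "Omelette", ingredient: ["egg", "onion"] },
];

const counts = facetCounts(recipes, "ingredient");
const matching = applyFacet(recipes, "ingredient", new Set(["carrot"]));
console.log(counts.get("carrot"), matching.map((r) => r.title));
```

Several facets compose by applying each filter in turn to the result of the previous one.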
Lenses
• Template for item
• HTML with "fill in the blanks"
• HTML:
  <div ex:role="lens">
    <b><div ex:content="title"></div></b> <div ex:content="date"></div>
  </div>
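The "fill in the blanks" idea behind a lens can be sketched as plain string templating. Note the assumptions: the `{{property}}` placeholder syntax and the `renderLens` helper are invented for this example, whereas Exhibit expresses the blanks as attributes on HTML elements, as on the slide.

```javascript
// Escape text so item values cannot inject markup into the page.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical lens renderer: each {{property}} blank is replaced by the
// item's value for that property, or by nothing if the item lacks it.
function renderLens(template, item) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, property) =>
    Object.prototype.hasOwnProperty.call(item, property)
      ? escapeHtml(item[property])
      : ""
  );
}

const lens = "<div><b>{{title}}</b> {{date}}</div>";
const html = renderLens(lens, { title: "Soup & Salad", date: "2009-01-15" });
console.log(html);
```

The same template is applied to every item in a view, which is what keeps the items looking consistent.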
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
[Example pages, each labeled: Text search, Faceted Browsing, Sorting by Properties, Templated Items]
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People's experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
  Schools/Universities  40%
  Personal/Hobby        25%
  Organization          19%
  News                   6%
  Commercial             4%
  Library                4%
  Conference             2%
• Source: 50 of the top-trafficked Exhibits in our dataset
Data Model
  Graph               27%
  Multi-valued Table  32%
  Table               41%
  Cyclic              22%
  Acyclic             78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
  JSON                69%
  Google Spreadsheet  32%
  BibTeX               2%
  RDF                  1%
  Excel              0.1%
  CSV               0.06%
  Freebase          0.06%
Hmm…
Single-View Exhibits
  Text-only (list, table, title)  59%
  Timeline                        19%
  Map                             14%
  Chart                            8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit
Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
[Figure labels: actions, conditions, predicates, properties, entities]
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
Atom/RSS/REST APIs; end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
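The "when <something happens> do <action>" rule shape can be sketched as a condition paired with an action. This is a rough illustration under assumptions, not Atomate's rule engine: the state fields (`location`, `weekday`, `hour`), the `runRules` helper, and reading "evening" as 17:00 or later are all invented for the example.

```javascript
// Hypothetical encoding of "remind me to take the trash out when I get
// home on Tuesday evenings". Assumption: "evening" means hour >= 17.
const trashRule = {
  when: (state) =>
    state.location === "home" &&
    state.weekday === "Tuesday" &&
    state.hour >= 17,
  action: () => "Reminder: take the trash out",
};

// Evaluate every rule against the current state of the user's world and
// collect the results of the actions whose conditions hold.
function runRules(rules, state) {
  return rules
    .filter((rule) => rule.when(state))
    .map((rule) => rule.action(state));
}

const fired = runRules([trashRule], {
  location: "home",
  weekday: "Tuesday",
  hour: 19,
});
const notFired = runRules([trashRule], {
  location: "home",
  weekday: "Wednesday",
  hour: 19,
});
console.log(fired, notFired);
```

In this sketch the state would be refreshed from the user's feeds, and a controlled natural language interface would let the user author the condition without writing the predicate by hand.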
today
Mere mortals just write text or html
Whybull Professional sites implement a rich data model
ndash Information stored in databasesndash Extracted using complex queriesndash Results feed into templating web frameworks
bull Plain authors left behindndash Canrsquot installoperatedefine a databasendash Canrsquot write the queries to extract the datandash Limited to unstructured text pages (and blogs wikis)ndash Less power to communicate effectivelyndash Less interest in publishing data
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
[Figure labels: sort, filter, search, template]
Image: HTML <img src=…>
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
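The "one item per row, columns for properties" model can be sketched in a few lines of Python. This is only an illustration, not Exhibit's importer: the sample rows, the column names, and the ";" separator for multi-valued cells are all made up for the example.

```python
import csv
import io

# Hypothetical recipe spreadsheet exported as CSV. The rows, the column names,
# and the ";" separator for multi-valued cells are assumptions of this sketch.
SHEET = """title,source,date,rating,ingredients
Lentil Soup,Magazine A,2009-03-01,4,lentils;onion;carrot
Tomato Salad,Magazine B,2009-07-15,5,tomato;basil;onion
"""

def rows_to_items(text, multi_valued=("ingredients",)):
    """Turn each spreadsheet row into one item: a dict mapping property to value."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for prop in multi_valued:
            # A multi-valued property becomes a list of values.
            item[prop] = item[prop].split(";")
        items.append(item)
    return items

items = rows_to_items(SHEET)
```

Every later step (views, facets, lenses) only needs this list of property dictionaries, which is why a plain spreadsheet is enough as the data source.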
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price"></div>
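What "binding a view to a property" means can be shown with a minimal Python sketch. It assumes items are plain dictionaries and borrows the `price` property from the slide's HTML; the function name `list_view` and the sample items are hypothetical, and this is not how the Exhibit library itself is written.

```python
def list_view(items, sort_by, descending=False):
    """Return the collection ordered by one property, as a sortable list view would."""
    return sorted(items, key=lambda item: item[sort_by], reverse=descending)

# Made-up items for illustration only.
catalog = [
    {"title": "Deluxe kit", "price": 40},
    {"title": "Starter kit", "price": 15},
    {"title": "Standard kit", "price": 25},
]

cheapest_first = [item["title"] for item in list_view(catalog, "price")]
priciest_first = [item["title"] for item in list_view(catalog, "price", descending=True)]
```

The author only names the property; the ordering logic lives in the shared library, which is the "describe, don't program" idea.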
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient"></div>
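A facet does two things: it lists the values of one property with counts, and it restricts the collection to items matching the values the user picked. The Python sketch below illustrates that behavior under simple assumptions (items are dictionaries, the faceted property holds a list of values); the function names and sample recipes are hypothetical.

```python
from collections import Counter

def facet_counts(items, prop):
    """Count how many items carry each value of a multi-valued property."""
    counts = Counter()
    for item in items:
        counts.update(set(item[prop]))
    return counts

def apply_facet(items, prop, selected):
    """Keep items having at least one selected value; no selection keeps everything."""
    if not selected:
        return list(items)
    return [item for item in items if selected & set(item[prop])]

# Made-up recipes for illustration only.
recipes = [
    {"title": "Lentil Soup", "ingredients": ["lentils", "onion", "carrot"]},
    {"title": "Tomato Salad", "ingredients": ["tomato", "basil", "onion"]},
]

counts = facet_counts(recipes, "ingredients")
with_basil = apply_facet(recipes, "ingredients", {"basil"})
```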
Lenses
• Template for item
• HTML with “fill in the blanks”
• HTML:
<div ex:role="lens">
  <b><div ex:content="title"></div></b> <div ex:content="date"></div>
</div>
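The "fill in the blanks" idea behind a lens can be sketched with Python's standard `string.Template`. The `$title` and `$date` placeholder syntax here is Python's, standing in for the `ex:content` attributes on the slide, and the names `LENS` and `render_lens` are hypothetical.

```python
import html
from string import Template

# A lens is a template with one blank per property to display.
LENS = Template("<b>$title</b> $date")

def render_lens(item):
    """Fill the template's blanks from one item, escaping values for HTML."""
    safe = {prop: html.escape(str(value)) for prop, value in item.items()}
    return LENS.substitute(safe)

rendered = render_lens({"title": "Fish & Chips", "date": "2009-03-01", "rating": 4})
```

Properties the lens does not mention (such as `rating` above) are simply not shown, so the same data can be presented through different lenses.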
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
[Series of example-page screenshots, each annotated with: Text search, Faceted Browsing, Sorting by Properties, Templated Items]
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People’s experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a Javascript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities  40%
Personal/Hobby        25%
Organization          19%
News                   6%
Commercial             4%
Library                4%
Conference             2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph               27%
Multi-valued Table  32%
Table               41%

Cyclic   22%
Acyclic  78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON                69%
Google Spreadsheet  32%
Bibtex               2%
RDF                  1%
Excel              0.1%
CSV               0.06%
Freebase          0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title)  59%
Timeline                        19%
Map                             14%
Chart                            8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• Javascript slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can’t write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is HTML file
• Put it in a wiki
• Combine data, interaction, and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• Wordpress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + datapress
Or Just a Document
• DIDO --- Data Interactive Document
• Javascript WYSIWYG Editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn’t use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN’T HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW ’10]
Motivation
• We’re acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
[Figure labels: actions, conditions, predicates, properties, entities]
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places, and things
ATOM/RSS/REST APIs, End-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules
when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
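The "when <something happens> do <action>" shape of a rule can be sketched in Python as a condition paired with an action. This is not Atomate's implementation or rule language. The event fields (`who`, `place`, `weekday`, `hour`), the 17:00 cutoff for "evening", and the names `Rule` and `run_rules` are all assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    when: Callable[[dict], bool]  # predicate over an incoming event
    do: Callable[[dict], str]     # action to take when the predicate holds

def run_rules(rules, event):
    """Run every rule whose condition matches the event; return the action results."""
    return [rule.do(event) for rule in rules if rule.when(event)]

# "Remind me to take the trash out when I get home on Tuesday evenings"
trash_rule = Rule(
    when=lambda e: (
        e["who"] == "me"
        and e["place"] == "home"
        and e["weekday"] == "Tuesday"
        and e["hour"] >= 17  # assumed meaning of "evening"
    ),
    do=lambda e: "Reminder: take the trash out",
)

tuesday_evening = {"who": "me", "place": "home", "weekday": "Tuesday", "hour": 18}
wednesday_evening = {"who": "me", "place": "home", "weekday": "Wednesday", "hour": 18}

fired = run_rules([trash_rule], tuesday_evening)
not_fired = run_rules([trash_rule], wednesday_evening)
```

A controlled natural language interface would let the user compose the `when` and `do` parts from readable phrases instead of writing the lambdas by hand.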
Goalbull Give regular people tools that let them author
structured data and visualizations themselvesbull So can communicate like professional web sites
ndash their incentivebull And their data is available in high fidelity for
combination and reuse with other data ndash social benefit
Do We Need Thisbull Analyzed 21 Blogs in 2009
ndash Top 10 and Trending 10 from Technoratindash Last 10 articles of each
bull 18 of 21 blogs (30 of articles) had at least one article with a collection of data itemsndash Half described in textndash Half as html table or static info-graphicndash None had interactive data
Approach
bull Publishing data is easyndash Just put a spreadsheet onlinendash Rows are items columns are properties
bull Identify key elements of interactive visualizationsndash Like spreadsheet charts
bull Add them to the HTML document vocabularyndash Insert them like images or videos today
bull Configure by binding them to underlying data
Like Spreadsheets
bull Put data in Spreadsheetbull Items are rows properties are columns
bull Pick a chart type (visualization)bull Specify which columns used in chart
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities 40%
Personal/Hobby 25%
Organization 19%
News 6%
Commercial 4%
Library 4%
Conference 2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27%
Multi-valued Table 32%
Table 41%
Cyclic 22%
Acyclic 78%
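The slide does not define its three data-model categories, so the following is only a sketch under assumed definitions: a "table" gives every property a single value, a "multi-valued table" has at least one property with several values, and a "graph" has at least one value that names another item. The `id` property and the `classify` function are hypothetical names used for illustration, not part of the study.

```python
def classify(items):
    """Label a dataset 'table', 'multi-valued table', or 'graph' (assumed definitions)."""
    ids = {item["id"] for item in items}
    multi = False
    for item in items:
        for prop, value in item.items():
            if prop == "id":
                continue
            values = value if isinstance(value, list) else [value]
            if any(v in ids for v in values):
                return "graph"  # some value points at another item
            if len(values) > 1:
                multi = True    # some property holds several values
    return "multi-valued table" if multi else "table"


flat = [{"id": "r1", "title": "Chili"}]
tagged = [{"id": "r1", "ingredient": ["beans", "tomato"]}]
linked = [{"id": "t1", "advisor": "p1"}, {"id": "p1", "name": "Advisor"}]

labels = [classify(flat), classify(tagged), classify(linked)]
```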
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON 69%
Google Spreadsheet 32%
BibTeX 2%
RDF 1%
Excel 0.1%
CSV 0.06%
Freebase 0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title) 59%
Timeline 19%
Map 14%
Chart 8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit
Collaborative Authoring in a Wiki
• Exhibit is HTML file
• Put it in a wiki
• Combine data, interaction, and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG Editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF, but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So, yes
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD! HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places, and things
ATOM/RSS/REST APIs; End-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
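The "when <something happens> do <action>" rule shape can be sketched in a few lines. This is not Atomate's implementation; the `Rule` class, the `run_rules` function, the state keys (`location`, `weekday`, `hour`) and the choice of 17:00 as the start of "evening" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    when: Callable[[dict], bool]  # predicate over the current state of the user's world
    do: Callable[[dict], str]     # action to take when the predicate holds


def run_rules(rules, state):
    """Fire every rule whose condition holds for the given state."""
    return [rule.do(state) for rule in rules if rule.when(state)]


# "Remind me to take the trash out when I get home on Tuesday evenings"
trash_rule = Rule(
    when=lambda s: s["location"] == "home"
    and s["weekday"] == "Tuesday"
    and s["hour"] >= 17,  # assumed meaning of "evening"
    do=lambda s: "Reminder: take the trash out",
)

fired = run_rules([trash_rule], {"location": "home", "weekday": "Tuesday", "hour": 18})
idle = run_rules([trash_rule], {"location": "work", "weekday": "Tuesday", "hour": 18})
```

Keeping the condition and the action as separate fields mirrors the slide's split between predicates over entities and the actions they trigger.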
Do We Need This?
• Analyzed 21 Blogs in 2009
  – Top 10 and Trending 10 from Technorati
  – Last 10 articles of each
• 18 of 21 blogs (30% of articles) had at least one article with a collection of data items
  – Half described in text
  – Half as HTML table or static info-graphic
  – None had interactive data
Approach
• Publishing data is easy
  – Just put a spreadsheet online
  – Rows are items, columns are properties
• Identify key elements of interactive visualizations
  – Like spreadsheet charts
• Add them to the HTML document vocabulary
  – Insert them like images or videos today
• Configure by binding them to underlying data
Like Spreadsheets
• Put data in Spreadsheet
• Items are rows, properties are columns
• Pick a chart type (visualization)
• Specify which columns used in chart
Example: HTML
• Standardized vocabulary for document structure
  – Paragraphs, headings, italics, quotations
• A description of the document
  – Not an imperative program for generating it
• User describes structure
  – Browser generates presentation based on it
Generalize to Data
• Identify common vocabulary describing
  – Data
  – Visualizations of that data
  – Interactions with that data
• Augment HTML to include data vocabulary
• User authors description of data, viz, interaction
  – Describe, don't program
• Browser implements described visualization of and interaction with the data
Can This be Done?
• Is there a vocabulary that is
  – Simple enough for regular people to use
  – General enough to capture a good part of what people want to do with structured data authoring/publishing
sort
filter
search
template
Image
HTML: <img src=…>
Data
• Items (Recipes)
• Each has properties
  – Title
  – Source magazine
  – Publication date
  – Rating
  – Ingredients
• Publish as spreadsheet
  – One item per row
  – Columns for properties
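As a rough illustration of "one item per row, columns for properties", the sketch below reads a small spreadsheet exported as CSV into a list of items. The sample rows, the column names, and the convention of separating multiple ingredients with ";" are assumptions for the example, not something the slides specify.

```python
import csv
import io

# Hypothetical recipe spreadsheet: one item per row, one column per property.
SHEET = """title,source,date,rating,ingredients
Pad Thai,Magazine A,2009-03-01,4,noodles;peanuts;lime
Chili,Magazine B,2008-11-15,5,beans;tomato;beef
"""


def load_items(text, multi_valued=("ingredients",), sep=";"):
    """Turn spreadsheet rows into items (dicts mapping property to value)."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        item = dict(row)
        for prop in multi_valued:
            # A single cell may hold several values; split it into a list.
            item[prop] = [v.strip() for v in row[prop].split(sep) if v.strip()]
        items.append(item)
    return items


items = load_items(SHEET)
```

Every value stays a string here; a real loader would also need to decide which columns are numbers or dates.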
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price">
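What a sortable list view does with its bound property can be sketched as follows. This models the idea only and is not Exhibit's code; `list_view` and the sample items with a `price` property are hypothetical.

```python
def list_view(items, sort_by, reverse=False):
    """Return the collection ordered by one property, as a sortable list view would."""
    return sorted(items, key=lambda item: item[sort_by], reverse=reverse)


recipes = [
    {"title": "Chili", "price": 12.0},
    {"title": "Pad Thai", "price": 9.5},
    {"title": "Salad", "price": 6.0},
]

by_price = [r["title"] for r in list_view(recipes, sort_by="price")]
priciest_first = [r["title"] for r in list_view(recipes, sort_by="price", reverse=True)]
```

The view itself holds no data: changing the bound property name is enough to re-sort the same collection.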
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient">
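A facet has two jobs: list the values of a property with their counts, and restrict the collection to items matching the values the user clicked. The sketch below illustrates both under the assumption that a selection matches an item when they share at least one value; the function names and sample data are hypothetical.

```python
from collections import Counter

recipes = [
    {"title": "Chili", "ingredient": ["beans", "tomato"]},
    {"title": "Salsa", "ingredient": ["tomato", "lime"]},
    {"title": "Pad Thai", "ingredient": ["noodles", "lime"]},
]


def facet_values(items, prop):
    """Count how many items carry each value of the faceted property."""
    return Counter(value for item in items for value in item[prop])


def apply_facet(items, prop, selected):
    """Keep items that have at least one of the values the user clicked."""
    if not selected:
        return list(items)  # nothing selected means no restriction
    return [item for item in items if set(item[prop]) & set(selected)]


counts = facet_values(recipes, "ingredient")
matching = [r["title"] for r in apply_facet(recipes, "ingredient", {"lime"})]
```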
Lenses
• Template for item
• HTML with "fill in the blanks"
• HTML:
<div ex:role="lens">
  <b> <div ex:content="title"/> </b> <div ex:content="date"/>
</div>
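The "fill in the blanks" idea behind a lens can be sketched with Python's standard `string.Template`. The template text, the `render` helper, and the sample item are assumptions for illustration; Exhibit itself reads the blanks from HTML attributes rather than `$name` placeholders.

```python
from html import escape
from string import Template

# Hypothetical lens: bold title followed by the date.
LENS = Template("<div><b>$title</b> $date</div>")


def render(lens, item):
    """Fill the lens's blanks with the item's HTML-escaped property values."""
    return lens.substitute({k: escape(str(v)) for k, v in item.items()})


html_out = render(LENS, {"title": "Fish & Chips", "date": "2009-03-01"})
```

The same lens is applied to every item in a collection, which is why all the items on a page share one layout.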
Example HTMLbull Standardized vocabulary for document structure
ndash Paragraphs headings italics quotationsbull A description of the document
ndash Not an imperative program for generating itbull User describes structure
ndash Browser generates presentation based on it
Generalize to Databull Identify common vocabulary describing
ndash Datandash Visualizations of that datandash Interactions with that data
bull Augment HTML to include data vocabularybull User authors description of data viz interaction
ndash Describe donrsquot programbull Browser implements described visualization of
and interaction with the data
Can This be Donebull Is there a vocabulary that is
ndash Simple enough for regular people to usendash General enough to capture a good part of what
people want to do with structured data authoringpublishing
sort
filter
search
template
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Views
• Show a collection
  – Bar chart
  – Sortable list (here)
  – Map
  – Thumbnail set
• Bound to properties
  – Sort by property
  – Plot which property
• HTML: <div ex:role="view" ex:viewClass="list" ex:sort="price"></div>
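What a list view bound to a sort property has to do can be sketched as follows. The function `renderListView` and the sample items are hypothetical; this is not Exhibit's implementation, only the idea of "sort by property" applied to a collection.

```javascript
// Hypothetical sketch of a "list" view: sort a copy of the collection by
// one property and produce one display line per item.
function renderListView(items, sortProperty) {
  const sorted = [...items].sort((a, b) => {
    if (a[sortProperty] < b[sortProperty]) return -1;
    if (a[sortProperty] > b[sortProperty]) return 1;
    return 0;
  });
  return sorted.map(
    (item) => `${item.title} (${sortProperty}: ${item[sortProperty]})`
  );
}

// Made-up sample data.
const viewItems = [
  { title: "Lentil Soup", price: 6 },
  { title: "Plain Toast", price: 2 },
  { title: "Roast Chicken", price: 11 },
];

const listLines = renderListView(viewItems, "price");
console.log(listLines.join("\n"));
```

Sorting a copy rather than the original array keeps the underlying data untouched, so several views can present the same collection in different orders.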
Facets
• Way to filter a collection
  – Specify a property
  – E.g. ingredient
  – User clicks to pick
  – Restrict collection to matching items
• HTML: <div ex:role="facet" ex:expression="ingredient"></div>
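The behavior a facet describes, offering the values of one property and restricting the collection to items that match the user's picks, might be sketched like this. `facetCounts`, `applyFacet`, and the sample items are illustrative names and data, not Exhibit's API.

```javascript
// Normalize a property value (missing, single, or multi-valued) to an array.
function valuesOf(item, property) {
  return [].concat(item[property] ?? []);
}

// Count how many items carry each value: these are the facet's choices.
function facetCounts(items, property) {
  const counts = new Map();
  for (const item of items) {
    for (const value of valuesOf(item, property)) {
      counts.set(value, (counts.get(value) || 0) + 1);
    }
  }
  return counts;
}

// Keep only items having at least one of the selected values.
// Assumption: an empty selection means "no restriction".
function applyFacet(items, property, selected) {
  if (selected.length === 0) return items;
  return items.filter((item) =>
    valuesOf(item, property).some((value) => selected.includes(value))
  );
}

// Made-up sample data.
const facetItems = [
  { title: "Lentil Soup", ingredient: ["lentils", "onion"] },
  { title: "Onion Tart", ingredient: ["onion", "flour"] },
  { title: "Plain Toast", ingredient: "bread" },
];

const ingredientCounts = facetCounts(facetItems, "ingredient");
const withOnion = applyFacet(facetItems, "ingredient", ["onion"]);
console.log(withOnion.map((item) => item.title).join(", ")); // prints "Lentil Soup, Onion Tart"
```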
Lenses
• Template for item
• HTML with "fill in the blanks"
• HTML:
  <div ex:role="lens">
    <b><div ex:content="title"></div></b> <div ex:content="date"></div>
  </div>
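The "fill in the blanks" idea behind a lens can be illustrated with a small string template. Note the swap: Exhibit marks blanks with attributes on HTML elements, whereas this sketch uses `{{property}}` placeholders, which are an assumption made to keep the example short. `renderLens` and `escapeHtml` are hypothetical helpers.

```javascript
// Escape text so item values cannot inject markup into the page.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical lens: replace each {{property}} blank with the item's value.
// Properties the item lacks are rendered as an empty string.
function renderLens(template, item) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, property) =>
    property in item ? escapeHtml(String(item[property])) : ""
  );
}

const lensTemplate = "<div><b>{{title}}</b> {{date}}</div>";

console.log(renderLens(lensTemplate, { title: "Lentil Soup", date: "2009-03-01" }));
// prints "<div><b>Lentil Soup</b> 2009-03-01</div>"
```

The same template is applied to every item in the collection, which is why one lens is enough to display a whole data set.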
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
General Enough?
[Series of example-page screenshots, each labeled: Text search, Faceted Browsing, Sorting by Properties, Templated Items]
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People's experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
  Schools/Universities  40%
  Personal/Hobby        25%
  Organization          19%
  News                   6%
  Commercial             4%
  Library                4%
  Conference             2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
  Graph               27%
  Multi-valued Table  32%
  Table               41%

  Cyclic              22%
  Acyclic             78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
  JSON                69%
  Google Spreadsheet  32%
  BibTeX               2%
  RDF                  1%
  Excel                0.1%
  CSV                  0.06%
  Freebase             0.06%
Hmm…
Single-View Exhibits
  Text-Only (list, table, title)  59%
  Timeline                        19%
  Map                             14%
  Chart                            8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is HTML file
• Put it in a wiki
• Combine data, interaction, and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
[Slide labels: actions, conditions, predicates, properties, entities]
What we need
• a way for users to express what they want to happen, and when, in terms of predicates relating the states and properties of people, places + things in their world
  – Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places, and things
  – ATOM/RSS/REST APIs, end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008.
• previous work for the construction of RDF KBs and queries
• express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
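The trash reminder can be read as one "when <something happens> do <action>" rule. The sketch below is not Atomate's rule engine or its controlled natural language; the state fields (`location`, `day`, `hour`), the reading of "evening" as 17:00 or later, and the function `evaluateRules` are all assumptions made for illustration.

```javascript
// Hypothetical rule in "when <something happens> do <action>" form.
// Assumption: "evening" means an hour of 17 or later.
const trashRule = {
  when: (state) =>
    state.location === "home" && state.day === "Tuesday" && state.hour >= 17,
  action: () => "Reminder: take the trash out",
};

// Fire a rule only on the transition from "condition false" to "condition
// true", so the reminder is sent once on arrival rather than on every update.
function evaluateRules(rules, previousState, currentState) {
  return rules
    .filter((rule) => rule.when(currentState) && !rule.when(previousState))
    .map((rule) => rule.action(currentState));
}

// Made-up state updates, as they might arrive from a location feed.
const atWork = { location: "work", day: "Tuesday", hour: 17 };
const arrivedHome = { location: "home", day: "Tuesday", hour: 18 };

console.log(evaluateRules([trashRule], atWork, arrivedHome).join("\n"));
// prints "Reminder: take the trash out"
```

Comparing the previous and current state is one simple way to express "when I get home" as an event rather than a standing condition.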
Image
HTMLltimgsrc=hellip
Data
bull Items (Recipes)bull Each has properties
ndash Titlendash Source magazinendash Publication datendash Ratingndash Ingredients
bull Publish as spreadsheetndash One item per rowndash Columns for properties
Viewsbull Show a collection
ndash Bar chartndash Sortable list (here)ndash Mapndash Thumbnail set
bull Bound to propertiesndash Sort by propertyndash Plot which property
bull HTML ltdiv exrole=ldquoviewrdquo
exviewClass=ldquolistrdquo exsort=ldquopricerdquogt
Facetsbull Way to filter a collection
ndash Specify a propertyndash Eg ingredientndash User clicks to pickndash Restrict collection to
matching items
bull HTML ltdiv exrole=ldquofacetrdquo exexpression=ldquoingredientrdquogt
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People’s experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. an HTML editor
• Pure client side
  – No need to design/admin a server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• A static file becomes an interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities: 40%
Personal/Hobby: 25%
Organization: 19%
News: 6%
Commercial: 4%
Library: 4%
Conference: 2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph: 27%
Multi-valued Table: 32%
Table: 41%

Cyclic: 22%
Acyclic: 78%
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON: 69%
Google Spreadsheet: 32%
BibTeX: 2%
RDF: 1%
Excel: 0.1%
CSV: 0.06%
Freebase: 0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title): 59%
Timeline: 19%
Map: 14%
Chart: 8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can’t write HTML?
Wibit
Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn’t use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF, but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So, yes
I CAN’T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW ’10]
Motivation
• We’re acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
actions / conditions / predicates / properties / entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
  – Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
  – ATOM/RSS/REST APIs, end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
• previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
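The "when <something happens> do <action>" shape can be sketched as a condition paired with an action and evaluated against the current state of the world. This is an illustrative Python sketch only: the `Rule` class, `run_rules`, and the state keys `my_location` and `weekday` are hypothetical names, and it does not reproduce Atomate's rule engine or its controlled natural language.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    when: Callable[[dict], bool]  # the query: a predicate over the current state
    do: Callable[[dict], str]     # the statement: an action, here just producing a message

def run_rules(rules, state):
    """Fire every rule whose condition holds for this state; return the action results."""
    return [rule.do(state) for rule in rules if rule.when(state)]

# "remind me to take out the trash when I get home on Tuesdays"
trash_reminder = Rule(
    when=lambda s: s["my_location"] == "home" and s["weekday"] == "Tuesday",
    do=lambda s: "Reminder: take out the trash",
)

print(run_rules([trash_reminder], {"my_location": "home", "weekday": "Tuesday"}))
```

In a real system the state would be assembled from the heterogeneous feeds mentioned above (location, calendar, mail), and the rule would be re-evaluated whenever one of those sources changes.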
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
Lensesbull Template for itembull HTML with ldquofill in the
blanksrdquo
bull HTML ltdiv exrole=ldquolensrdquo
ltbgt ltdiv excontent=ldquotitlerdquogt ltbgt ltdiv excontent=ldquodaterdquogt
ltdivgt
Key Primitives of a Data Page
bull Datandash A spreadsheet
bull Lensesndash Explain how to display a single itemndash By describing what properties should be shown and how
bull Viewsndash Ways of looking at collections of itemsndash Lists Thumbnails Maps Scatterplotsndash Specify which properties determine layout
bull Facetsndash Elements for filtering or sorting information based on its structure
General Enough
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
actions / conditions / predicates / properties / entities
What we need
• a way for users to express what they want to happen, and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs; end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
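The "when <something happens> do <action>" form can be sketched as a condition/action pair evaluated against the user's current state. This is a hypothetical illustration, not Atomate's rule engine: the `Rule` class, the `location` and `time` state keys, and the choice of 17:00 as the start of "evening" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List


@dataclass
class Rule:
    description: str
    condition: Callable[[Dict], bool]  # the "when <something happens>" part
    action: Callable[[Dict], str]      # the "do <action>" part


def run_rules(rules: List[Rule], state: Dict) -> List[str]:
    """Fire every rule whose condition holds in the current state."""
    return [rule.action(state) for rule in rules if rule.condition(state)]


def home_on_tuesday_evening(state: Dict) -> bool:
    now = state["time"]
    # weekday() == 1 is Tuesday; "evening" is assumed to start at 17:00.
    return state["location"] == "home" and now.weekday() == 1 and now.hour >= 17


trash_rule = Rule(
    description="Remind me to take the trash out when I get home on Tuesday evenings",
    condition=home_on_tuesday_evening,
    action=lambda state: "Reminder: take out the trash",
)

# 27 April 2010 was a Tuesday; 28 April 2010 was a Wednesday.
tuesday_evening = {"location": "home", "time": datetime(2010, 4, 27, 18, 30)}
wednesday_evening = {"location": "home", "time": datetime(2010, 4, 28, 18, 30)}
```

Calling `run_rules([trash_rule], tuesday_evening)` fires the reminder, while the Wednesday state produces nothing. The hard parts the slides point at, obtaining `location` from heterogeneous feeds and letting users phrase the condition in controlled natural language, are outside this sketch.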
Key Primitives of a Data Page
• Data
  – A spreadsheet
• Lenses
  – Explain how to display a single item
  – By describing what properties should be shown and how
• Views
  – Ways of looking at collections of items
  – Lists, Thumbnails, Maps, Scatterplots
  – Specify which properties determine layout
• Facets
  – Elements for filtering or sorting information based on its structure
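The four primitives above can be sketched in a few lines. This is only an illustration of the roles they play, with data as rows, a lens as a per-item template, a view as a layout over a collection, and a facet as a structural filter. The function names and the sample rows are made up for the example and are not Exhibit's vocabulary.

```python
# Data: a small "spreadsheet" of items, one dict per row (illustrative rows).
items = [
    {"label": "Apollo 11", "year": 1969, "agency": "NASA"},
    {"label": "Vostok 1", "year": 1961, "agency": "Soviet space program"},
    {"label": "Apollo 13", "year": 1970, "agency": "NASA"},
]


def lens(item, properties):
    """Lens: display a single item by naming which properties to show."""
    return ", ".join(f"{p}: {item[p]}" for p in properties if p in item)


def list_view(collection, sort_by, properties):
    """View: lay out a collection; here a list ordered by one property."""
    ordered = sorted(collection, key=lambda it: it[sort_by])
    return [lens(it, properties) for it in ordered]


def facet_values(collection, prop):
    """Facet: the distinct values of a property, with item counts."""
    counts = {}
    for it in collection:
        counts[it[prop]] = counts.get(it[prop], 0) + 1
    return counts


def apply_facet(collection, prop, selected):
    """Keep only the items whose property value is among the selected ones."""
    return [it for it in collection if it[prop] in selected]


nasa_only = apply_facet(items, "agency", {"NASA"})
rendered = list_view(nasa_only, "year", ["label", "year"])
```

Nothing here depends on the schema: swapping in a different spreadsheet only changes the property names passed to the lens, view, and facet.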
General Enough
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People's experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. an HTML editor
• Pure client side
  – No need to design/admin a server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static file becomes an interactive data visualization
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities  40%
Personal/Hobby  25%
Organization  19%
News  6%
Commercial  4%
Library  4%
Conference  2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph  27%
Multi-valued Table  32%
Table  41%
Cyclic  22%
Acyclic  78%
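One way to read these categories is sketched below. The slides do not give the study's definitions, so the ones used here are assumptions: a "table" has only single-valued properties, a "multi-valued table" has at least one property holding several values, and a "graph" has at least one value that names another item. "Cyclic" is then taken to mean that those item-to-item references form a directed cycle.

```python
def reference_edges(items):
    """Map each item label to the set of other item labels its values name."""
    labels = {it["label"] for it in items}
    edges = {it["label"]: set() for it in items}
    for it in items:
        for prop, value in it.items():
            if prop == "label":
                continue
            values = value if isinstance(value, list) else [value]
            edges[it["label"]].update(v for v in values if v in labels)
    return edges


def classify(items):
    """Return 'graph', 'multi-valued table', or 'table' (assumed definitions)."""
    if any(reference_edges(items).values()):
        return "graph"
    multi = any(
        isinstance(value, list) and len(value) > 1
        for it in items
        for value in it.values()
    )
    return "multi-valued table" if multi else "table"


def has_cycle(edges):
    """Depth-first search for a directed cycle among item references."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in edges}

    def visit(node):
        color[node] = GRAY
        for nxt in edges[node]:
            if color[nxt] == GRAY or (color[nxt] == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in edges)


# Illustrative data sets, one per category.
table = [{"label": "a", "size": 1}, {"label": "b", "size": 2}]
multi = [{"label": "a", "tags": ["x", "y"]}]
cyclic = [{"label": "alice", "knows": "bob"}, {"label": "bob", "knows": "alice"}]
acyclic = [{"label": "alice", "advisor": "bob"}, {"label": "bob"}]
```

Under these assumptions `cyclic` and `acyclic` are both graphs, and only the first contains a reference cycle.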
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON  69%
Google Spreadsheet  32%
BibTeX  2%
RDF  1%
Excel  0.1%
CSV  0.06%
Freebase  0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title)  59%
Timeline  19%
Map  14%
Chart  8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
bull Is the sameness of all these pages a good thingbull Most information presenters are not ambitiousbull Carefully designed domain- and task-specific
information interactions will always be superiorbull But powerful lowest common denominatorbull Peoplersquos experience of it makes it more powerful
ndash Leverage expectationsndash No need to learn new site
EXHIBITProof-of-concept implementation
Prototype Exhibitbull A specific dataviz HTML vocabulary extension
ndash And a Javascript library to interpret itbull Application independent
ndash Fits any tool that takes HTML eg HTML editorbull Pure client side
ndash No need to designadmin serverbull Freely interleave data with other HTML
ndash Complete control of designndash Integration with whatever other elements you like
bull Static feels becomes interactive data visualization
Usagebull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivationbull Wersquore acquiring large structured data
repositories and real-time streamsbull Lots of repetitive labor involved in users
followingreacting to these streamsbull How can users author queriesrules against these
streams to reduce the drudgery
Examples
bull remind me to take out the trash when I get home on Tuesdays
bull bug my friend who hasnrsquot replied to me in 2 daysbull send me my shopping list when I arrive at the
grocery storebull remind friends of an event Irsquom going to attendbull text me important emails when I am traveling
actionsconditionspredicatespropertiesentities
What we need
bull a way for users to express what they want to happen and when in terms of predicates relating
the states and properties of people places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
bull a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people places and things
ATOMRSSREST APIs End-user mashups + RDF
Abraham Bernstein and Esther Kaufmann and Christian Kaiser and Christoph Kiefer Ginseng A Guided Input Natural Language Search Engine for Querying Ontologies Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as ruleswhen ltsomething happensgt do ltactiongt
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
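The "when <something happens> do <action>" pattern behind Example 1 can be sketched in a few lines of Python. This is only an illustration, not Atomate's implementation: the `Rule` class, the `fire` helper, the state keys (`my_location`, `time`), and the choice of 17:00 as the start of "evening" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List

# A snapshot of the user's world; the keys used below are hypothetical.
State = Dict[str, object]


@dataclass
class Rule:
    """One 'when <something happens> do <action>' behavior."""
    name: str
    when: Callable[[State], bool]   # predicate over the current state
    do: Callable[[State], str]      # action; returns the message it would send


def fire(rules: List[Rule], state: State) -> List[str]:
    """Evaluate every rule against the state and run the ones that match."""
    return [rule.do(state) for rule in rules if rule.when(state)]


def at_home_on_tuesday_evening(state: State) -> bool:
    now = state["time"]
    return (
        state["my_location"] == "home"
        and now.weekday() == 1      # Monday is 0, so Tuesday is 1
        and now.hour >= 17          # assumption: "evening" starts at 17:00
    )


trash_rule = Rule(
    name="trash reminder",
    when=at_home_on_tuesday_evening,
    do=lambda state: "Reminder: take the trash out",
)

# 27 April 2010 was a Tuesday.
tuesday_evening = {"my_location": "home", "time": datetime(2010, 4, 27, 18, 30)}
print(fire([trash_rule], tuesday_evening))
```

Keeping the condition and the action as separate functions mirrors the split between predicates and actions on the slides; a controlled natural language front end would only need to produce these two pieces.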
Text search
Faceted Browsing
Sorting by Properties
Templated Items
Impoverished Information Visualization
• Is the sameness of all these pages a good thing?
• Most information presenters are not ambitious
• Carefully designed domain- and task-specific information interactions will always be superior
• But powerful lowest common denominator
• People's experience of it makes it more powerful
  – Leverage expectations
  – No need to learn new site
EXHIBIT
Proof-of-concept implementation
Prototype Exhibit
• A specific dataviz HTML vocabulary extension
  – And a JavaScript library to interpret it
• Application independent
  – Fits any tool that takes HTML, e.g. HTML editor
• Pure client side
  – No need to design/admin server
• Freely interleave data with other HTML
  – Complete control of design
  – Integration with whatever other elements you like
• Static feels becomes interactive data visualization
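The interactions an Exhibit page offers (text search, faceted browsing, sorting by properties) all reduce to simple operations over a small collection of schema-free items. The sketch below shows that idea in Python. It is not Exhibit's code, which is a JavaScript library driven by HTML markup; the sample items, property names, and helper functions are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set

Item = Dict[str, object]

# Hypothetical data set; the property names are made up for illustration.
items: List[Item] = [
    {"label": "Thesis A", "year": 2007, "topic": ["HCI", "Databases"]},
    {"label": "Thesis B", "year": 2009, "topic": ["HCI"]},
    {"label": "Thesis C", "year": 2009, "topic": ["Machine Learning"]},
]


def values_of(item: Item, prop: str) -> List[object]:
    """Return a property's values as a list, whether it holds one value or many."""
    value = item.get(prop)
    if value is None:
        return []
    return list(value) if isinstance(value, list) else [value]


def facet_counts(items: List[Item], prop: str) -> Counter:
    """Count how many items carry each value of a property (one facet)."""
    return Counter(v for item in items for v in set(values_of(item, prop)))


def apply_facets(items: List[Item], selected: Dict[str, Set[object]]) -> List[Item]:
    """Keep items that match at least one selected value in every facet."""
    return [
        item for item in items
        if all(set(values_of(item, prop)) & wanted for prop, wanted in selected.items())
    ]


def text_search(items: List[Item], query: str) -> List[Item]:
    """Keep items where the query appears in any property value, ignoring case."""
    q = query.lower()
    return [
        item for item in items
        if any(q in str(v).lower() for prop in item for v in values_of(item, prop))
    ]


hci_2009 = apply_facets(items, {"topic": {"HCI"}, "year": {2009}})
by_year_desc = sorted(items, key=lambda item: item["year"], reverse=True)

print(facet_counts(items, "topic"))
print([item["label"] for item in hci_2009])
print([item["label"] for item in by_year_desc])
```

Nothing here depends on a fixed schema: any property can serve as a facet or a sort key, which is the point of a "lowest common denominator" visualization.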
Usage
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these we can ask
  – What kind of data do people want to publish?
  – And how do they want to publish it?
  – (subject to limitations imposed by Exhibit)
Domains
Schools/Universities  40%
Personal/Hobby        25%
Organization          19%
News                   6%
Commercial             4%
Library                4%
Conference             2%
• Source: 50 of the top-trafficked Exhibits in our dataset
Data Model
Graph               27%
Multi-valued Table  32%
Table               41%

Cyclic   22%
Acyclic  78%
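One way to make the three data-model categories and the cyclic/acyclic split concrete is the Python sketch below. The definitions are assumptions made for illustration and are not taken from the usage study: a data set is called a "graph" when some property value names another item, a "multi-valued table" when it has no such links but some property holds several values, and a "table" otherwise. The `classify` and `has_cycle` helpers and the use of `label` as the item identifier are likewise hypothetical.

```python
from typing import Dict, List, Set

Item = Dict[str, object]


def _values(value: object) -> List[object]:
    """Treat a single value and a list of values uniformly."""
    return list(value) if isinstance(value, list) else [value]


def classify(items: List[Item], id_key: str = "label") -> str:
    """Label a data set 'table', 'multi-valued table', or 'graph' (assumed definitions)."""
    ids = {item[id_key] for item in items}
    multi_valued = False
    for item in items:
        for prop, value in item.items():
            if prop == id_key:
                continue
            values = _values(value)
            if len(values) > 1:
                multi_valued = True
            # A string value equal to some item's identifier counts as a link.
            if any(isinstance(v, str) and v in ids for v in values):
                return "graph"
    return "multi-valued table" if multi_valued else "table"


def has_cycle(items: List[Item], id_key: str = "label") -> bool:
    """Depth-first search for a cycle among item-to-item links."""
    ids = {item[id_key] for item in items}
    edges = {
        item[id_key]: [
            v
            for prop, value in item.items() if prop != id_key
            for v in _values(value)
            if isinstance(v, str) and v in ids
        ]
        for item in items
    }
    visiting: Set[object] = set()
    done: Set[object] = set()

    def visit(node: object) -> bool:
        if node in done:
            return False
        if node in visiting:        # reached a node still on the current path
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in edges[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in edges)


table = [{"label": "a", "price": 3}, {"label": "b", "price": 4}]
multi = [{"label": "a", "tags": ["x", "y"]}]
graph = [{"label": "a", "advisor": "b"}, {"label": "b", "advisor": "a"}]

print(classify(table), classify(multi), classify(graph), has_cycle(graph))
```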
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON                69%
Google Spreadsheet  32%
BibTeX               2%
RDF                  1%
Excel                0.1%
CSV                  0.06%
Freebase             0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title)  59%
Timeline                        19%
Map                             14%
Chart                            8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript slow, not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  – More effective communication
  – Easier to maintain (like CSS)
  – Creates enthusiasm for working with data
• Data is exposed as a side effect
  – Enabling reuse
  – Alternative visualizations
  – Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  – Sorting, filtering, searching
  – Lists, Maps, Timelines, Plots
  – Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG Editor included with document
• Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CANrsquoT HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek Moore Karger schraefelAtomate it end-user context-sensitive automation using heterogeneous information sources on the web [WWW lsquo10]
Motivationbull Wersquore acquiring large structured data
repositories and real-time streamsbull Lots of repetitive labor involved in users
followingreacting to these streamsbull How can users author queriesrules against these
streams to reduce the drudgery
Examples
bull remind me to take out the trash when I get home on Tuesdays
bull bug my friend who hasnrsquot replied to me in 2 daysbull send me my shopping list when I arrive at the
grocery storebull remind friends of an event Irsquom going to attendbull text me important emails when I am traveling
actionsconditionspredicatespropertiesentities
What we need
bull a way for users to express what they want to happen and when in terms of predicates relating
the states and properties of people places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
bull a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people places and things
ATOMRSSREST APIs End-user mashups + RDF
Abraham Bernstein and Esther Kaufmann and Christian Kaiser and Christoph Kiefer Ginseng A Guided Input Natural Language Search Engine for Querying Ontologies Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as ruleswhen ltsomething happensgt do ltactiongt
query statement
Controlled Natural Language Interface
Example 1bull Simple Context-Sensitive Remindingbull Remind me to take the trash out when I get
home on Tuesday evenings
Prototype: Exhibit
• A specific dataviz HTML vocabulary extension
  - And a JavaScript library to interpret it
• Application independent
  - Fits any tool that takes HTML, e.g. an HTML editor
• Pure client side
  - No need to design/admin a server
• Freely interleave data with other HTML
  - Complete control of design
  - Integration with whatever other elements you like
• A static file becomes an interactive data visualization
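As a rough illustration of the kind of data such a page consumes, the sketch below turns spreadsheet-like rows into an item list and serializes it as JSON. The `{"items": [...]}` envelope with `label` and `type` fields follows the shape commonly shown for Exhibit data files, but treat it as an assumption rather than a specification. The helper `rows_to_items`, the `;` separator and the sample columns are hypothetical. Python is used here only for brevity; Exhibit itself is JavaScript.

```python
import json


def rows_to_items(rows, item_type, multi_valued=(), sep=";"):
    """Convert spreadsheet-like rows (dicts) into a list of item records."""
    items = []
    for row in rows:
        item = {"type": item_type}
        for key, value in row.items():
            if key in multi_valued:
                # Split a delimited cell into a list of values.
                item[key] = [v.strip() for v in value.split(sep) if v.strip()]
            else:
                item[key] = value
        items.append(item)
    return items


# Hypothetical sample rows, as they might come from a spreadsheet export.
rows = [
    {"label": "Thesis A", "author": "Ann; Bob", "year": "2009"},
    {"label": "Thesis B", "author": "Cy", "year": "2010"},
]
data = {"items": rows_to_items(rows, "Thesis", multi_valued=("author",))}
print(json.dumps(data, indent=2))
```

Keeping multi-valued properties as lists is what lets one item appear under several facet values at once.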
Usage
• Deployed 2007
  - ~1900 exhibits on 800 domains
  - Millions of views
• Many data sets with no natural site on the web
EXAMPLES
Hobby Stores
Science
PhD Theses
Rental Apartments
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  - ~1900 exhibits on 800 domains
  - Millions of views
• Many data sets with no natural site on the web
• By fetching and analyzing these, we can ask
  - What kind of data do people want to publish?
  - And how do they want to publish it?
  - (subject to limitations imposed by Exhibit)
Domains
Schools/Universities  40%
Personal/Hobby        25%
Organization          19%
News                   6%
Commercial             4%
Library                4%
Conference             2%
• Source: 50 of the top-trafficked Exhibits in our dataset
Data Model
Graph               27%
Multi-valued Table  32%
Table               41%
Cyclic              22%
Acyclic             78%
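The slides do not say how each exhibit was assigned to a category, so the following is only one plausible reading, sketched in Python. It assumes a data set is a "table" when every property holds a single value, a "multi-valued table" when some property holds several values, and a "graph" when a value names another item; cycles are then detected in the resulting item-link graph. The function names and the `label` key are assumptions for illustration.

```python
def classify(items):
    """Return (model, cyclic) for a list of item dicts with unique 'label's."""
    labels = {item["label"] for item in items}
    edges = {item["label"]: set() for item in items}
    multi = False
    for item in items:
        for key, value in item.items():
            if key == "label":
                continue
            values = value if isinstance(value, list) else [value]
            if len(values) > 1:
                multi = True
            # A value that names another item is treated as a link.
            edges[item["label"]].update(v for v in values if v in labels)
    has_links = any(edges.values())
    model = "graph" if has_links else "multi-valued table" if multi else "table"
    return model, has_cycle(edges)


def has_cycle(edges):
    """Depth-first search for a back edge in the item-link graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in edges}

    def visit(node):
        color[node] = GREY
        for nxt in edges[node]:
            if color[nxt] == GREY or (color[nxt] == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in edges)
```

For example, two people who each list the other as a `friend` would come out as a cyclic graph, while theses that point at an advisor item form an acyclic one.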
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON                69%
Google Spreadsheet  32%
Bibtex               2%
RDF                  1%
Excel                0.1%
CSV                  0.06%
Freebase             0.06%
Hmm…
Single-View Exhibits
Text-Only (list, table, title)  59%
Timeline                        19%
Map                             14%
Chart                            8%
Percentage of Schema in Visualization
oops
Authoring by Copying
• HTML describes visualization
• Copy it, change the data
• (Maybe change the presentation too)
Scalability
• JavaScript is slow; not designed to implement DBs
• Fast for < 1000 items
• Some people have used 25,000 items or more
• Not a limitation per se
• Plenty of small data sets
• Wrangler [Heer et al.] can handle 1M items
Incentivizing Data
• A data-centric web page is better
  - More effective communication
  - Easier to maintain (like CSS)
  - Creates enthusiasm for working with data
• Data is exposed as a side effect
  - Enabling reuse
  - Alternative visualizations
  - Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
• Anyone who can write HTML can write a data-interactive web page
  - Sorting, filtering, searching
  - Lists, Maps, Timelines, Plots
  - Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
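The "sorting, filtering, searching" behaviour can be sketched in a few lines. This is not Exhibit's implementation (which runs in the browser in JavaScript); it is a minimal Python sketch of faceted browsing over a list of item dicts, and the function names and sample properties are assumptions.

```python
def as_list(value):
    """Treat single values and lists of values uniformly."""
    return value if isinstance(value, list) else [value]


def facet_counts(items, prop):
    """Count how many items carry each value of a property."""
    counts = {}
    for item in items:
        for v in as_list(item.get(prop, [])):
            counts[v] = counts.get(v, 0) + 1
    return counts


def apply_facets(items, selections):
    """Keep items that match every facet (any selected value per property)."""
    def matches(item):
        return all(
            set(as_list(item.get(prop, []))) & set(wanted)
            for prop, wanted in selections.items()
        )
    return [item for item in items if matches(item)]


def search(items, text):
    """Case-insensitive substring search over all property values."""
    needle = text.lower()
    return [i for i in items if any(needle in str(v).lower() for v in i.values())]


# Hypothetical rental listings.
apartments = [
    {"label": "Apt 1", "city": "Boston", "beds": 2, "rent": 2400},
    {"label": "Apt 2", "city": "Cambridge", "beds": 1, "rent": 1900},
    {"label": "Apt 3", "city": "Cambridge", "beds": 2, "rent": 2600},
]
cheapest_first = sorted(apartments, key=lambda item: item["rent"])
```

Selections within one facet are combined with OR and across facets with AND, which is the usual faceted-browsing convention.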
EXTENSIONS
What if you can't write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data, interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
  - Native data format is JSON
  - Can read RDF, but nobody does
  - Dominant data model likely to be spreadsheets
  - No inference
  - We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So, yes
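To make "generate URIs for exported items" concrete, here is a minimal Python sketch that mints a URI per item and writes each property value as an N-Triples line. The namespace `http://example.org/exhibit/`, the `item/` and `prop/` path scheme, and the choice to export every value as a plain literal are all assumptions for illustration; the slides do not describe how Exhibit's own exporter builds its URIs.

```python
from urllib.parse import quote

BASE = "http://example.org/exhibit/"  # hypothetical namespace


def item_uri(label):
    """Mint a URI from an item label by percent-encoding it."""
    return BASE + "item/" + quote(label, safe="")


def to_ntriples(items):
    """Emit one N-Triples line per property value, all as plain literals."""
    lines = []
    for item in items:
        subject = item_uri(item["label"])
        for prop, value in item.items():
            predicate = BASE + "prop/" + quote(prop, safe="")
            for v in value if isinstance(value, list) else [value]:
                # Escape backslashes, quotes and newlines inside the literal.
                literal = (
                    str(v)
                    .replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n")
                )
                lines.append(f'<{subject}> <{predicate}> "{literal}" .')
    return lines
```

Once items have stable URIs, a second site can refer to the same item instead of copying it, which is the reuse benefit the slide points at.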
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD: HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web. [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
actions / conditions / predicates / properties / entities
What we need
• a way for users to express what they want to happen, and when, in terms of predicates relating the states and properties of people, places + things in their world
  Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places, and things
  ATOM/RSS/REST APIs, end-user mashups + RDF
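One way to picture the second requirement: entries from different feeds are flattened into (subject, property, value) triples so that rules can query them uniformly. The sketch below is a Python illustration under that assumption; the source names, entry fields and both function names are hypothetical and are not taken from Atomate.

```python
def entry_to_triples(source, entry):
    """Flatten one feed entry (a dict with an 'id') into triples."""
    subject = f"{source}:{entry['id']}"
    return [(subject, key, value) for key, value in entry.items() if key != "id"]


def query(triples, prop, value):
    """Return the subjects that have the given property value."""
    return sorted({s for s, p, o in triples if p == prop and o == value})


# Hypothetical entries from two different web sources.
triples = (
    entry_to_triples("calendar", {"id": "e1", "type": "Event", "attendee": "me"})
    + entry_to_triples("location", {"id": "p1", "type": "Person", "place": "home"})
)
```

With everything in one triple store, a rule condition such as "I am at home" becomes a query like `query(triples, "place", "home")`.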
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference, 2008.
Previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
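The "when <something happens> do <action>" pattern, applied to Example 1, can be sketched as a condition/action pair evaluated against the current state of the world. This is a minimal Python illustration, not Atomate's rule engine; the `Rule` class, the shape of the state dict and the 17:00 cutoff for "evening" are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    condition: Callable[[Dict], bool]  # predicate over the current world state
    action: Callable[[Dict], str]      # produces a message when the rule fires


def evaluate(rules: List[Rule], state: Dict) -> List[str]:
    """Run the action of every rule whose condition holds in this state."""
    return [rule.action(state) for rule in rules if rule.condition(state)]


trash_rule = Rule(
    name="trash reminder",
    condition=lambda s: (
        s["me"]["location"] == "home"
        and s["now"].weekday() == 1   # Monday is 0, so Tuesday is 1
        and s["now"].hour >= 17       # "evening" cutoff is an assumption
    ),
    action=lambda s: "Reminder: take the trash out",
)

# 27 April 2010 was a Tuesday.
state = {"me": {"location": "home"}, "now": datetime(2010, 4, 27, 18, 30)}
print(evaluate([trash_rule], state))
```

A controlled natural language front end would generate the `condition` and `action` from the user's sentence instead of having them written by hand.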
Rental Apartments
Datagov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONS
What if you can’t write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data interaction and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn’t use any Semantic Web technology
  – Native data format is JSON
  – Can read RDF but nobody does
  – Dominant data model likely to be spreadsheets
  – No inference
  – We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
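The slide says only that URIs are generated for exported items and does not describe the scheme. As a rough sketch of what such an export could look like, the code below turns JSON-style items into N-Triples lines. The base URL, the `/item/` and `/property/` path layout, and the function names are all assumptions made for illustration, and every value is emitted as a plain string literal.

```python
from urllib.parse import quote


def escape_literal(text):
    """Escape a string for use inside a double-quoted N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(items, base):
    """Emit one triple per (item, property, value); list values give several triples."""
    base = base.rstrip("/")
    lines = []
    for item in items:
        # Hypothetical URI scheme: <base>/item/<percent-encoded id>
        subject = f"<{base}/item/{quote(str(item['id']), safe='')}>"
        for key, value in item.items():
            if key == "id":
                continue
            predicate = f"<{base}/property/{quote(key, safe='')}>"
            values = value if isinstance(value, list) else [value]
            for v in values:
                lines.append(f'{subject} {predicate} "{escape_literal(str(v))}" .')
    return lines
```

Calling `to_ntriples(items, "http://example.org")` on the items of a page gives a list of lines that can be written to a `.nt` file, which is one way the explicit data behind a visualization could be handed to other tools.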
I CAN’T HANDLE MY INCOMING INFORMATION OVERLOAD HELP
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW ’10]
Motivation
• We’re acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn’t replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I’m going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs, end-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
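The "when <something happens> do <action>" shape of a rule can be sketched in a few lines. This is not Atomate's implementation or its controlled natural language; it is a minimal illustration in which the `Context` fields, the `Rule` class, and the choice of 17:00 as the start of "evening" are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List


@dataclass
class Context:
    """Hypothetical snapshot of the user's world at one moment."""
    now: datetime
    location: str


@dataclass
class Rule:
    """A rule pairs a condition (when ...) with an action (do ...)."""
    description: str
    condition: Callable[[Context], bool]
    action: Callable[[Context], str]


def run_rules(rules: List[Rule], context: Context) -> List[str]:
    """Fire every rule whose condition holds and collect the action results."""
    return [rule.action(context) for rule in rules if rule.condition(context)]


# Example 1 as a rule: weekday() == 1 is Tuesday; "evening" is assumed to be 17:00 or later.
trash_rule = Rule(
    description="Remind me to take the trash out when I get home on Tuesday evenings",
    condition=lambda c: c.location == "home" and c.now.weekday() == 1 and c.now.hour >= 17,
    action=lambda c: "Reminder: take the trash out",
)
```

The condition is a predicate over the states and properties of entities (here the user's location and the current time), and the action is whatever should happen when it holds, so new rules are added by supplying another predicate and action pair.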
Data.gov
NGOs
Newspapers
Libraries
Sports
Strange Hobbyists
Usage Study
• Deployed 2007
  – ~1900 exhibits on 800 domains
  – Millions of views
Usage Studybull Deployed 2007
ndash ~1900 exhibits on 800 domainsndash Millions of views
bull Many data sets with no natural site on the web
bull By fetching and analyzing these we can askndash What kind of data do people want to publishndash And how do they want to publish itndash (subject to limitations imposed by Exhibit)
Domains
SchoolsUniversities 40
PersonalHobby 25
Organization 19
News 6
Commercial 4
Library 4
Conference 2
bull Source 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27
Multi-valued Table 32
Table 41
Cyclic 22
Acyclic 78
Schema Size (Number of Properties)
Data Format
bull Some exhibits use multiple formats so sum gt 100
JSON 69
Google Spreadsheet 32
Bibtex 2
RDF 1
Excel 01
CSV 006
Freebase 006
Hmmhellip
Single-View Exhibits
Text-Only(list table title) 59
Timeline 19Map 14Chart 8
Percentage of Schema in Visualization
oops
Authoring by Copying
bull HTML describes visualization
bull Copy it change the data
bull (Maybe change the presentation too)
Scalabilitybull Javascript slow not designed to implement DBs
bull Fast for lt 1000 itemsbull Some people have used 25000 items or more
bull Not a limitation per sebull Plenty of small data setsbull Wranger [Heer et al] can handle 1M items
Incentivizing Databull A data-centric web page is better
ndash More effective communication ndash Easier to maintain (like CSS)ndash Creates enthusiasm for working with data
bull Data is exposed as a side effectndash Enabling reusendash Alternative visualizationsndash Critiques
bull Selfish incentives lead to global benefit
DATA EXPORT
08
Summary
bull Anyone who can write HTML can write a data-interactive web pagendash Sorting filtering searchingndash Lists Maps Timelines Plotsndash Item templates
bull Post it on the web and it worksbull Data is explicit can be extracted for reusebull The visualization is the incentive
EXTENSIONSWhat if you canrsquot write HTML
WibitCollaborative Authoring in a Wiki
bull Exhibit is html filebull Put it in a wikibull Combine data
interaction and collaboration
Exhibit in a Wiki Wibit
bull Wikitext to describe Exhibit
Exhibit in a Blog Datapress
bull Wordpress pluginbull Link to data sourcebull Then WYSYWIG
your visualization
WordPress + datapress
Or Just a Document
bull DIDO --- Data Interactive Document
bull Javascript WYSIWYG Editor included with document
bull Edit data and viz in place and save
A Semantic Web Applicationbull Doesnrsquot use any Semantic Web technology
ndash Native data format is JSONndash Can read RDF but nobody doesndash Dominant data model likely to be spreadsheetsndash No inferencendash We do generate URIs for exported items
bull But focused on visualizing arbitrary schemasbull So yes
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs, end-user mashups + RDF
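As an illustration of the second need, the sketch below reads an RSS feed and reinterprets each entry as a description of a familiar thing (here, an event). It is a hedged example, not Atomate's code: the feed content, the `Event` type label, and the output field names are invented for the example.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed; a real one would be fetched from a web source.
RSS = """<rss version="2.0"><channel>
  <title>Example feed</title>
  <item><title>Team lunch</title><link>http://example.org/e/1</link>
        <pubDate>Tue, 27 Apr 2010 12:00:00 GMT</pubDate></item>
  <item><title>Paper deadline</title><link>http://example.org/e/2</link>
        <pubDate>Fri, 30 Apr 2010 23:59:00 GMT</pubDate></item>
</channel></rss>"""

def rss_to_entities(xml_text, entity_type="Event"):
    """Turn each RSS <item> into a dict describing one entity."""
    root = ET.fromstring(xml_text)
    entities = []
    for item in root.iter("item"):
        entities.append({
            "type": entity_type,                 # assumed type label
            "label": item.findtext("title"),
            "uri": item.findtext("link"),        # the link doubles as an identifier
            "when": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return entities

entities = rss_to_entities(RSS)
for e in entities:
    print(e["type"], e["label"], e["when"].date())
```

Once every source is mapped into the same entity shape, rules can be written against properties such as `when` without caring which feed supplied them.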
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference, 2008
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query, statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
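The reminder in Example 1 can be read as a rule of the form "when <something happens> do <action>". The sketch below is one hypothetical encoding in Python, not Atomate's rule engine: the `Context` fields, the 17:00 cutoff for "evening", and the `Rule` class are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List

@dataclass
class Context:
    """A snapshot of the user's world state (assumed fields)."""
    location: str
    now: datetime

@dataclass
class Rule:
    """when <condition holds> do <action>."""
    when: Callable[[Context], bool]
    do: Callable[[Context], str]

def is_tuesday_evening(ctx: Context) -> bool:
    # datetime.weekday(): Monday is 0, so Tuesday is 1.
    # "Evening" is assumed to start at 17:00.
    return ctx.now.weekday() == 1 and ctx.now.hour >= 17

trash_rule = Rule(
    when=lambda ctx: ctx.location == "home" and is_tuesday_evening(ctx),
    do=lambda ctx: "Reminder: take the trash out",
)

def run_rules(rules: List[Rule], ctx: Context) -> List[str]:
    """Fire every rule whose condition matches and collect its output."""
    return [rule.do(ctx) for rule in rules if rule.when(ctx)]

# 27 April 2010 was a Tuesday.
at_home = Context(location="home", now=datetime(2010, 4, 27, 18, 30))
at_work = Context(location="work", now=datetime(2010, 4, 27, 18, 30))
print(run_rules([trash_rule], at_home))
print(run_rules([trash_rule], at_work))
```

This version tests the current state rather than the arrival event, so it would fire on every evaluation while the user is at home; a stream-based engine would trigger on the change of location instead.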
Domains
Schools/Universities 40%
Personal/Hobby 25%
Organization 19%
News 6%
Commercial 4%
Library 4%
Conference 2%
• Source: 50 of the top trafficked Exhibits in our dataset
Data Model
Graph 27%
Multi-valued Table 32%
Table 41%
Cyclic 22%
Acyclic 78%
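The Data Model categories can be made concrete with a small classifier. This is a sketch under stated assumptions, not the method used in the survey: here a dataset counts as a "graph" if any property value names another item's label, as a "multi-valued table" if any property holds a list, and as a "table" otherwise; a graph is "cyclic" if following those references can return to an item already on the path.

```python
def as_list(value):
    """Wrap a single value in a list; leave lists unchanged."""
    return value if isinstance(value, list) else [value]

def references(items):
    """Map each item label to the labels of the items it points to."""
    labels = {it["label"] for it in items}
    return {
        it["label"]: [v for p, vals in it.items() if p != "label"
                      for v in as_list(vals) if v in labels]
        for it in items
    }

def classify(items):
    """Return 'graph', 'multi-valued table', or 'table'."""
    if any(references(items).values()):
        return "graph"
    if any(isinstance(v, list) for it in items for v in it.values()):
        return "multi-valued table"
    return "table"

def is_cyclic(items):
    """Depth-first search for a cycle among item-to-item references."""
    graph = references(items)
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY or (color[nxt] == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

# Hypothetical datasets: two items citing each other, and one contact record.
papers = [
    {"label": "P1", "cites": ["P2"]},
    {"label": "P2", "cites": ["P1"]},
]
people = [{"label": "Ann", "email": ["a@x.org", "ann@y.org"]}]
print(classify(papers), is_cyclic(papers))
print(classify(people), is_cyclic(people))
```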
Schema Size (Number of Properties)
Data Format
• Some exhibits use multiple formats, so sum > 100%
JSON 69%
Google Spreadsheet 32%
Incentivizing Data
• A data-centric web page is better
– More effective communication
– Easier to maintain (like CSS)
– Creates enthusiasm for working with data
• Data is exposed as a side effect
– Enabling reuse
– Alternative visualizations
– Critiques
• Selfish incentives lead to global benefit
DATA EXPORT
Summary
• Anyone who can write HTML can write a data-interactive web page
– Sorting, filtering, searching
– Lists, Maps, Timelines, Plots
– Item templates
• Post it on the web and it works
• Data is explicit, can be extracted for reuse
• The visualization is the incentive
EXTENSIONS
What if you can't write HTML?
Wibit: Collaborative Authoring in a Wiki
• Exhibit is an HTML file
• Put it in a wiki
• Combine data, interaction, and collaboration
Exhibit in a Wiki: Wibit
• Wikitext to describe Exhibit
Exhibit in a Blog: Datapress
• WordPress plugin
• Link to data source
• Then WYSIWYG your visualization
WordPress + Datapress
Or Just a Document
• DIDO: Data Interactive Document
• JavaScript WYSIWYG editor included with document
• Edit data and viz in place and save
A Semantic Web Application?
• Doesn't use any Semantic Web technology
– Native data format is JSON
– Can read RDF, but nobody does
– Dominant data model likely to be spreadsheets
– No inference
– We do generate URIs for exported items
• But focused on visualizing arbitrary schemas
• So yes
I CAN'T HANDLE MY INCOMING INFORMATION OVERLOAD. HELP!
END USERS PROGRAMMING INFORMATION STREAM HANDLERS
Van Kleek, Moore, Karger, schraefel. Atomate it! End-user context-sensitive automation using heterogeneous information sources on the web. [WWW '10]
Motivation
• We're acquiring large structured data repositories and real-time streams
• Lots of repetitive labor involved in users following/reacting to these streams
• How can users author queries/rules against these streams to reduce the drudgery?
Examples
• remind me to take out the trash when I get home on Tuesdays
• bug my friend who hasn't replied to me in 2 days
• send me my shopping list when I arrive at the grocery store
• remind friends of an event I'm going to attend
• text me important emails when I am traveling
actions, conditions, predicates, properties, entities
What we need
• a way for users to express what they want to happen, and when, in terms of predicates relating the states and properties of people, places + things in their world
Controlled Natural Language Interface (CNLI) for Rules
• a way to retrieve and interpret data from our many heterogeneous web sources as descriptions of these familiar people, places and things
ATOM/RSS/REST APIs, End-user mashups + RDF
Abraham Bernstein, Esther Kaufmann, Christian Kaiser, and Christoph Kiefer. Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies. Jena User Conference, 2008.
previous work for the construction of RDF KBs and queries
express behaviors as rules: when <something happens> do <action>
query statement
Controlled Natural Language Interface
Example 1
• Simple Context-Sensitive Reminding
• Remind me to take the trash out when I get home on Tuesday evenings
Abraham Bernstein and Esther Kaufmann and Christian Kaiser and Christoph Kiefer Ginseng A Guided Input Natural Language Search Engine for Querying Ontologies Jena User Conference 2008
previous work for the construction of RDF KBs and queries
express behaviors as ruleswhen ltsomething happensgt do ltactiongt
query statement
Controlled Natural Language Interface
Example 1bull Simple Context-Sensitive Remindingbull Remind me to take the trash out when I get
home on Tuesday evenings
Example 1bull Simple Context-Sensitive Remindingbull Remind me to take the trash out when I get
home on Tuesday evenings
Example 2: Travel Management
• When I’m traveling, warn people who e-mail me that I might not get back to them for a while
Inside a rule
• when (one-shot) / whenever (repeating)
• ANTECEDENT (conditions for execution)
  – predicate(subj-pathquery, obj-pathquery-or-val) AND predicate2() AND ...
• CONSEQUENT (what to do)
  – action(arg-path-or-val, arg-path-or-val)
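The rule anatomy on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Atomate's actual implementation: the `WORLD` dictionary, `resolve`, `Condition`, and `Rule` are hypothetical names invented here, and path queries are modeled as plain tuples walked over an in-memory dictionary instead of an RDF store.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical in-memory world model: entity -> property -> value.
WORLD = {
    "me": {"current location": "home"},
    "home": {"address": "500 Fayetteville St"},
}


def resolve(path):
    """Follow a path query such as ("me", "current location", "address")."""
    node = path[0]
    for prop in path[1:]:
        node = WORLD[node][prop]
    return node


def value_of(term):
    """Tuples are treated as path queries; anything else is a literal value."""
    return resolve(term) if isinstance(term, tuple) else term


@dataclass
class Condition:
    predicate: Callable[[Any, Any], bool]
    subj_path: tuple
    obj: Any  # a path query (tuple) or a literal value

    def holds(self):
        return self.predicate(resolve(self.subj_path), value_of(self.obj))


@dataclass
class Rule:
    repeating: bool                # False = "when" (one-shot), True = "whenever"
    antecedent: list               # conditions, implicitly joined by AND
    action: Callable[..., Any]     # the consequent
    args: tuple = ()
    fired: bool = False

    def step(self):
        """Run the action if every condition holds; one-shot rules fire once."""
        if self.fired and not self.repeating:
            return None
        if all(cond.holds() for cond in self.antecedent):
            self.fired = True
            return self.action(*[value_of(a) for a in self.args])
        return None


def make_reminder_rule(repeating=False):
    """'Remind me to take out the trash when I get home' as a Rule."""
    at_home = Condition(lambda a, b: a == b, ("me", "current location"), "home")
    return Rule(repeating, [at_home], lambda msg: "REMIND: " + msg,
                ("take out the trash",))
```

Calling `make_reminder_rule().step()` returns the reminder once and `None` afterwards, while the `whenever` variant keeps firing on each step for as long as the condition holds.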
Rules in constrained natural language
• possessives for path queries (e.g. “my current location’s address”)
  – path diagram: me → current location → address → “500 Fayetteville St”
• infix English verbs for predicates
  – eq(number, number) => “is”
  – near(Location, Location) => “near”
• entities represented by their label (e.g. “David Karger”, “home”, and special pronoun “me”)
Rules in constrained natural language
• variables represented with “any/new <type>”
  – x rdf:type Person => “any Person”
  – newly created Person entity: “new Person”
• bound variables with “that <type>”
  – “any Person’s birthday is today, email that person ‘happy birthday’”
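The rendering conventions on these two slides (possessives for path queries, infix verbs for predicates, "any/new/that <type>" for variables) can be illustrated with a small Python sketch. It is a guess at the general idea, not the system's real code: `Var`, `render_path`, `render_term`, and `render_condition` are hypothetical names, and only the two predicate verbs shown on the slide are included.

```python
from dataclasses import dataclass

# Predicate name -> infix English verb, taken from the slide.
INFIX_VERBS = {"eq": "is", "near": "near"}


@dataclass(frozen=True)
class Var:
    quantifier: str  # "any", "new", or "that"
    type_name: str   # e.g. "Person"


def label(term):
    """Entities are shown by their label, variables as '<quantifier> <type>'."""
    if isinstance(term, Var):
        return f"{term.quantifier} {term.type_name}"
    return str(term)


def render_path(path):
    """Render a path query as a chain of possessives."""
    head, *props = path
    if not props:
        return label(head)
    owner = "my" if head == "me" else label(head) + "'s"
    middle = [prop + "'s" for prop in props[:-1]]
    return " ".join([owner, *middle, props[-1]])


def render_term(term):
    """Tuples are path queries; everything else is an entity, variable, or value."""
    return render_path(term) if isinstance(term, tuple) else label(term)


def render_condition(predicate, subj, obj):
    """Render predicate(subj, obj) with the predicate as an infix verb."""
    return f"{render_term(subj)} {INFIX_VERBS[predicate]} {render_term(obj)}"
```

For example, `render_path(("me", "current location", "address"))` gives "my current location's address", and `render_condition("eq", (Var("any", "Person"), "birthday"), "today")` gives "any Person's birthday is today".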
Actions in constrained natural language
• actions represented as fill-in-the-blank sentences with typed blanks
  – [ “reply to”, { name: “email”, type: “schemas.Email” }, “with”, { name: “message”, type: “schemas.String” } ]
  – rendered as: reply to email with message
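A fill-in-the-blank action with typed blanks might be handled roughly as follows. This is a sketch under assumptions: the template mirrors the slide's example, but the dotted type names (`schemas.Email`, `schemas.String`), the `(type, text)` binding format, and both function names are hypothetical choices made for illustration.

```python
# Template from the slide: literal words interleaved with typed blanks.
REPLY_TEMPLATE = [
    "reply to",
    {"name": "email", "type": "schemas.Email"},
    "with",
    {"name": "message", "type": "schemas.String"},
]


def render_template(template):
    """Show the sentence with each blank displayed as [name]."""
    return " ".join(
        part if isinstance(part, str) else f"[{part['name']}]"
        for part in template
    )


def fill_template(template, bindings):
    """Fill each blank from bindings[name] = (type_name, text), checking types."""
    words = []
    for part in template:
        if isinstance(part, str):
            words.append(part)
            continue
        type_name, text = bindings[part["name"]]
        if type_name != part["type"]:
            raise TypeError(
                f"blank '{part['name']}' expects {part['type']}, got {type_name}"
            )
        words.append(text)
    return " ".join(words)
```

The type check is what lets an interface offer only valid choices for each blank; here it simply rejects a binding of the wrong type.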
Study
• Can users create rules?
• Perceived difficulty of use
• Pitfalls
• Ideas for fixing these problems
Rule creation study (method)
• Recruited over the web
• Basic demographics, sign up, 2 minute tutorial video
• 9 Rule creation exercises
  – 2 time, 3 easy, 3 medium, 1 difficult
• Short exit survey
  – On average, how difficult was it to create the rules?
  – Was there anything that was confusing/difficult?
  – How useful would such a system be to you?
  – What would you use this system for?
  – What else do you wish this system could do?
Rule creation study
• November 2009
• 33 participants recruited (26 completed)
• Ages 25-45
• 14 had some programming experience
• All experienced with the Web
Rule creation study
• Correct
  – rule expressed perfectly
• Half-correct
  – rule insufficiently specific
  – will trigger more often than intended
• Wrong
  – 1 or more incorrectly expressed clause
  – will not fire at all, or not as intended
• Missing
  – rule not completed
Average time to complete each rule
Perceived difficulty of creating rules
Perceived usefulness
What would you use atomate for?
(P4) Identifying when two locations converge (i.e. mine and a friend’s are close). This is like social networking, but moving it towards actual life. People could grant access to their friends to view their locations and thus know if people are close at a given time.
(P7) Reminding my friends and I that we have a shared event when we’re both near each other. For example, I’m often meeting with someone and both of us want to go to the same event in an hour, but we get into a coding session and we forget about the event.

What would you use atomate for?
(P15) When I send email to someone and I want a response, I can tell atomate to send them a reminder email in 3 days if they haven’t gotten back to me, or something like that.
(P24) Emailing or responding to people when I am in transit or unavailable (no network connectivity or in an event where my phone’s silenced).
Discussion
• A Semantic Web Application?
  – Yes: incorporates data in any schema
  – CNL adopts any incoming properties/values
• Inference over RDF store
  – Not “what is the most powerful inference engine?”
  – Rather “what inferrable language can users write?”
  – Lots of room to investigate/drop in better reasoners
• Contrast: If This Then That
  – Powerful site opened 2011
  – Over 1,000,000 rules created
What’s Wrong With This?
• IFTTT is hard-coding its channels and rules
  – Users can only set parameters
  – At mercy of developer, like applications of yore
• Semantic Web/Atomate vision
  – Each channel is an RDF feed
  – Rules are RDF queries
  – (must be end-user authorable, e.g. CNL)
• Power of a distributed system/web
  – Anyone can offer a new channel
  – Anyone can build a new rule engine/UI

SW Challenge: Build SWIFTTT
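The "each channel is an RDF feed, rules are RDF queries" idea can be made concrete with a toy sketch. Everything here is an assumption made for illustration: triples are plain Python tuples instead of real RDF, the channels are hypothetical functions returning sets of triples, and `match` is a naive conjunctive pattern matcher standing in for a real query engine such as a SPARQL processor.

```python
def location_channel():
    """Hypothetical channel: a feed of location triples."""
    return {("me", "current location", "home")}


def contacts_channel():
    """Hypothetical channel: a feed of contact triples."""
    return {
        ("alice", "rdf:type", "Person"),
        ("alice", "birthday", "today"),
        ("bob", "rdf:type", "Person"),
    }


# Anyone can offer a new channel by appending another feed function.
CHANNELS = [location_channel, contacts_channel]


def store():
    """Merge all channel feeds into one set of triples."""
    return {triple for channel in CHANNELS for triple in channel()}


def match(patterns, triples, binding=None):
    """Yield variable bindings satisfying every triple pattern (terms starting with '?')."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        matched = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    matched = False
                    break
            elif term != value:
                matched = False
                break
        if matched:
            yield from match(rest, triples, candidate)


# A rule's antecedent written as a query: "any Person's birthday is today".
BIRTHDAY_RULE = [("?p", "rdf:type", "Person"), ("?p", "birthday", "today")]
```

Because the rule is only a query over whatever triples arrive, adding a channel changes what the rule can match without anyone editing the rule engine.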
Summary
• 3 Semantic Web Applications
  – Supercharged spreadsheets for data management
  – Data and visualization authoring like HTML authoring
  – Automated handling of incoming information streams
• All driven by concrete end-user problems
  – Under umbrella of simplifying info management
• All assessed with user studies
• Make very little use of Semantic Web technology
• But all share key “open schema” paradigm
Whither ESWC?

ESWC Topics
• Information extraction/mining
• Ontology alignment
• Inference
• Query languages
• Plenty of “semantic” but what “web”?
• Work already had a place: AAAI, KDD, SIGMOD
• What did we need a new field for?
Semantic Web
• A step on the road to artificial intelligence
• A study of the processes of cognition
  – Knowledge representation
  – Classification
  – Logical Inference
  – Probabilistic reasoning
  – Analogy (someday)
• Web is secondary, just a platform
• Long range
• Perspective is well represented at ISWC/ESWC
Semantic Web
• Improving human-information interaction
• Drawing on insights gained from the web
  – View source/copy/tweak
  – Tolerate inconsistency
  – Lightweight interactions
  – Standards
  – Power of the crowd
• But also HCI, DB, IR, ML
• Opportunity to rapidly and significantly improve the human condition
Where Are All the Intelligent Agents?
Where Are All the Intelligent Agents?
Bringing Intelligence to Applications
• Until we solve AI, we’ll have to make do with Artificial AI, i.e. humans
• Good at things that are hard for computers
  – Entity extraction and disambiguation
  – Inference
  – Alignment
• Hate the drudgery of repetitive simple tasks
  – Which is exactly what computers can already do
  – Moving data between applications
  – Reissuing numerous variants on same query
Where’s the Science?
• Human Factors
• Must understand what people are good/bad at
• Design tools that address strengths/weaknesses
• These tools are experiments
• Not enough to build; must evaluate
  – By formulating hypotheses about users and usage
  – And testing in (controlled) lab and (uncontrolled) field
Choose Your Motivation Wisely
• Late 90s: a brief flurry of work on “algorithms and data-structures for faulty memories”
  – Algorithms that work well even if sometimes what you read isn’t what you wrote
• Generally preceded by an argument that as memories grow, such faults become pervasive
• In fact, right solution is ECC RAM (now standard)
• Flurry was stimulated by a purchaser at Google trying to cover up a bad (non ECC RAM) purchasing decision
Hammers vs Nails
• Don’t define work/field by particular technology
• Start with the problem that needs to be solved
• Then find any technology necessary to solve it
• Don’t forget the original motivation
  – It might become obsolete
• How do you describe your tool to end users?
  – They care about what new thing it enables/simplifies
  – Not about how cleverly it does what it does
Karger’s Best ESWC Papers
• A Session-based Approach for Aligning Large Ontologies
• Broadening the Scope of Nanopublications
• Multilingual semantic wiki based on Attempto Controlled English and Grammatical Framework
• Personalized Concept-based Search and Exploration on the Web of Data using Results Categorization
• Collecting Links Between Entities Ranked by Human Association Strengths
• Guiding the evolution of a multilingual ontology in a concrete setting
• Connecting the Smithsonian American Art Museum to the Linked Data Cloud
Semantic Web Apps at ESWC
• ESWC does offer Semantic Web Apps
  – Mashup challenge
  – Demo session
• Why aren’t these appearing as papers?
• The missing piece: evaluation
  – What happens when users try to use it?
  – What happens when the schema changes?
Semantic Web Applications
• “Semantic” is a modifier on “Web”
• What was so new/wonderful about the web?
• Fetching remote pages?

Semantic Web Applications
• “Semantic” is a modifier on “Web”
• What was so new/wonderful about the web?
  – Could always author & view docs on our computers
  – Could always access them with ftp
• “Minor” workflow changes
  – URL: canonical way for a doc to reference other docs
  – click: instant access to what’s at the link
  – browser: staying inside one application
The web did not make new things possible.
It made old things simple.
Can SW do the same?
Conclusion
• A key innovation opportunity in the Semantic Web is making it easier for end users to produce, share, and consume structured data
• An urgent need and immediate opportunity for tools that take the drudgery out of data work
• We have to study end users to understand their needs, build tools to meet those needs, and assess how well those tools work
• Don’t ask what you can do for the Semantic Web; ask what the Semantic Web can do for you
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding “Things to Do” Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- “I want my spreadsheet database to work better”
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-Many/Many-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- “I Want To Publish My Volunteer Roster on the Web”
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Data.gov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Can’t Handle My Incoming Information Overload Help
- End UserS Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2: Travel Management
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- What’s Wrong With This
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Where’s the Science
- Choose Your Motivation Wisely
- Hammers vs Nails
- Karger’s Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
-
possessives for path queries(eg my current locationrsquos address)
infix english verbs for predicates
eq(numbernumber) =gt ldquoisrdquonear(LocationLocation) =gt ldquonearrdquo
entities represented by their label
(eg ldquoDavid Kargerrdquo ldquohomerdquo and special pronoun ldquomerdquo)
ldquo500 Fayetteville Strdquoaddresscurrent
location
me
Rules in constrained natural language
variables represented with ldquoanynew lttypegtrdquo
x rdftype Person =gt ldquoany Personrdquonewly created Person entity ldquonew Personrdquo
bound variables with ldquothat lttypegtrdquo
ldquoany Personrsquos birthday is today email that person lsquohappy birthdayrsquordquo
Rules in constrained natural language
actions represented as fill-in-the-blank sentences with typed blanks
[ldquoreply tordquo namerdquoemailrdquo typerdquoschemasEmailrdquo ldquowithrdquo name ldquomessagerdquo typerdquoschemasStringrdquo ]
reply to email with message
Actions in constrained natural language
Studybull Can users create rulesbull Perceived difficulty of use bull Pitfalls bull Ideas for fixing these problems
Rule creation study (method)bull Recruited over the webbull Basic demographics sign up 2 minute tutorial
videobull 9 Rule creation exercises
ndash 2 time 3 easy 3 medium 1 difficult bull Short exit survey
ndash On average how difficult was it to create the rulesndash Was there anything that was confusingdifficultndash How useful would such a system be to youndash What would you use this system forndash What else do you wish this system could do
Rule creation studybull November 2009
bull 33 participants recruited (26 completed)bull Ages 25-45bull 14 had some programming experiencebull All experienced with the Web
Rule creation study
bull Correctndash rule expressed perfectly
bull half-correctndash rule insufficiently specificndash will trigger more often than intended
bull Wrongndash 1 or more incorrectly expressed clause ndash will not fire at all or not as intended
bull Missingndash rule not completed
Average time to complete each rule
Perceived difficulty of creating rules
Perceived usefulness
(P4) Identifying when two locations converge (ie mine and a friends are close) This is like social networking but moving it towards actual life People could grant access to their friends to view their locations and thus know if people are close at a given time (P7) Reminding my friends and I that we have a shared event when were both near each other For example Im often meeting with someone and both of us want to go to the same event in an hour but we get into a coding session and we forget about the event
What would you use atomate for
(P15) When I send email to someone and I want a response I can tell atomate to send them a reminder email in 3 days if they havent gotten back to me or something like that
(P24) Emailing or responding to people when I am in transit or unavailable (no network connectivity or in an event where my phones silenced)
What would you use atomate for
Discussionbull A Semantic Web Application
ndash Yes incorporates data in any schemandash CNL adopts any incoming propertiesvalues
bull Inference over RDF storendash Not ldquowhat is the most powerful inference enginerdquondash Rather ldquowhat inferrable language can users writerdquondash Lots of room to investigatedrop in better reasoners
bull Contrast If This Then Thatndash Powerful site opened 2011ndash Over 1000000 rules created
Whatrsquos Wrong With Thisbull IFTTT is hard-coding its channels and rules
ndash Users can only set parametersndash At mercy of developer like applications of yore
bull Semantic WebAtomate visionndash Each channel is an RDF feedndash Rules are RDF queriesndash (must be end-user authorable eg CNL)
bull Power of a distributed systemwebndash Anyone can offer a new channelndash Anyone can build a new rule engineUI
SW Challenge Build SWIFTTT
Summarybull 3 Semantic Web Applications
ndash Supercharged spreadsheets for data managementndash Data and visualization authoring like HTML authoringndash Automated handling of incoming information streams
bull All driven by concrete end-user problemsndash Under umbrella of simplifying info management
bull All assessed with user studies
bull Make very little use of Semantic Web technologybull But all share key ldquoopen schemardquo paradigm
Whither ESWC
ESWC Topicsbull Information extractionminingbull Ontology alignmentbull Inferencebull Query languages
bull Plenty of ldquosemanticrdquo but what ldquowebrdquobull Work already had a place AAAI KDD SIGMODbull What did we need a new field for
Semantic Web
bull A step on the road to artificial intelligencebull A study of the processes of cognition
ndash Knowledge representationndash Classificationndash Logical Inferencendash Probabilistic reasoningndash Analogy (someday)
bull Web is secondary just a platformbull Long rangebull Perspective is well represented at ISWCESWC
Semantic Webbull Improving human-information interactionbull Drawing on insights gained from the web
ndash View sourcecopytweakndash Tolerate inconsistencyndash Lightweight interactionsndash Standardsndash Power of the crowd
bull But also HCI DB IR MLbull Opportunity to rapidly and significantly improve
the human condition
Where Are All the Intelligent Agents
Where Are All the Intelligent Agents
Bringing Intelligence to Applicationsbull Until we solve AI wersquoll have to make do with
Artifical AI --- ie humansbull Good at things that are hard for computers
ndash Entity extraction and disambiguationndash Inferencendash Alignment
bull Hate the drudgery of repetitive simple tasksndash Which is exactly what computers can already dondash Moving data between applicationsndash Reissuing numerous variants on same query
Wherersquos the Sciencebull Human Factorsbull Must understand what people are goodbad atbull Design tools that address strengthsweaknessesbull These tools are experimentsbull Not enough to build must evaluate
ndash By formulating hypotheses about users and usagendash And testing in (controlled) lab and (uncontrolled) field
Choose Your Motivation Wiselybull Late 90s a brief flurry of work on ldquoalgorithms
and data-structures for faulty memoriesrdquondash Algorithms that work well even if sometimes what
you read isnrsquot what you wrotebull Generally preceded by an argument that as
memories grow such faults become pervasivebull In fact right solution is ECC RAM (now standard)bull Flurry was stimulated by a purchaser at Google
trying to cover up a bad (non ECC RAM) purchasing decision
Hammers vs Nailsbull Donrsquot define workfield by particular technologybull Start with the problem that needs to be solvedbull Then find any technology necessary to solve itbull Donrsquot forget the original motivation
ndash It might become obsolete
bull How do you describe your tool to end usersndash They care about what new thing it enablessimplifiesndash Not about how cleverly it does what it does
Kargerrsquos Best ESWC Papersbull A Session-based Approach for Aligning Large Ontologiesbull Broadening the Scope of Nanopublicationsbull Multilingual semantic wiki based on Attempto Controlled
English and Grammatical Frameworkbull Personalized Concept-based Search and Exploration on
the Web of Data using Results Categorizationbull Collecting Links Between Entities Ranked by Human
Association Strengthsbull Guiding the evolution of a multilingual ontology in a
concrete settingbull Connecting the Smithsonian American Art Museum to
the Linked Data Cloud
Semantic Web Apps at ESWCbull ESWC does offer Semantic Web Apps
ndash Mashup challengendash Demo session
bull Why arenrsquot these appearing as papersbull The missing piece evaluation
ndash What happens when users try to use itndash What happens when the schema changes
Semantic Web Applicationsbull ldquoSemanticrdquo is a modifier on ldquoWebrdquobull What was so newwonderful about the webbull Fetching remote pages
Semantic Web Applicationsbull ldquoSemanticrdquo is a modifier on ldquoWebrdquobull What was so newwonderful about the web
ndash Could always author amp view docs on our computersndash Could always access them with ftp
bull ldquoMinorrdquo workflow changesndash URL canonical way for a doc to reference other docsndash click instant access to whatrsquos at the linkndash browser staying inside one application
The web did not make new things possible
It made old things simple
Can SW do the same
Conclusionbull A key innovation opportunity in the Semantic
Web is making it easier for end users to produce share and consume structured data
bull An urgent need and immediate opportunity for tools that take the drudgery out of data work
bull We have to study end users to understand their needs build tools to meet those needs and assess how well those tools work
bull Donrsquot ask what you can do for the Semantic Web ask what the Semantic Web can do for you
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding ldquoThings to Dordquo Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- ldquoI want my spreadsheet database to work betterrdquo
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-ManyMany-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- ldquoI Want To Publish My Volunteer Roster on the Webrdquo
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Datagov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Canrsquot Handle My Incoming Information Overload Help
- End UserS Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2 Travel Mangement
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- Whatrsquos Wrong With This
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Wherersquos the Science
- Choose Your Motivation Wisely
- Hammers vs Nails
- Kargerrsquos Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
-
variables represented with ldquoanynew lttypegtrdquo
x rdftype Person =gt ldquoany Personrdquonewly created Person entity ldquonew Personrdquo
bound variables with ldquothat lttypegtrdquo
ldquoany Personrsquos birthday is today email that person lsquohappy birthdayrsquordquo
Rules in constrained natural language
actions represented as fill-in-the-blank sentences with typed blanks
[ldquoreply tordquo namerdquoemailrdquo typerdquoschemasEmailrdquo ldquowithrdquo name ldquomessagerdquo typerdquoschemasStringrdquo ]
reply to email with message
Actions in constrained natural language
Studybull Can users create rulesbull Perceived difficulty of use bull Pitfalls bull Ideas for fixing these problems
Rule creation study (method)bull Recruited over the webbull Basic demographics sign up 2 minute tutorial
videobull 9 Rule creation exercises
ndash 2 time 3 easy 3 medium 1 difficult bull Short exit survey
ndash On average how difficult was it to create the rulesndash Was there anything that was confusingdifficultndash How useful would such a system be to youndash What would you use this system forndash What else do you wish this system could do
Rule creation studybull November 2009
bull 33 participants recruited (26 completed)bull Ages 25-45bull 14 had some programming experiencebull All experienced with the Web
Rule creation study
bull Correctndash rule expressed perfectly
bull half-correctndash rule insufficiently specificndash will trigger more often than intended
bull Wrongndash 1 or more incorrectly expressed clause ndash will not fire at all or not as intended
bull Missingndash rule not completed
Average time to complete each rule
Perceived difficulty of creating rules
Perceived usefulness
(P4) Identifying when two locations converge (ie mine and a friends are close) This is like social networking but moving it towards actual life People could grant access to their friends to view their locations and thus know if people are close at a given time (P7) Reminding my friends and I that we have a shared event when were both near each other For example Im often meeting with someone and both of us want to go to the same event in an hour but we get into a coding session and we forget about the event
What would you use atomate for
(P15) When I send email to someone and I want a response I can tell atomate to send them a reminder email in 3 days if they havent gotten back to me or something like that
(P24) Emailing or responding to people when I am in transit or unavailable (no network connectivity or in an event where my phones silenced)
What would you use atomate for
Discussionbull A Semantic Web Application
ndash Yes incorporates data in any schemandash CNL adopts any incoming propertiesvalues
bull Inference over RDF storendash Not ldquowhat is the most powerful inference enginerdquondash Rather ldquowhat inferrable language can users writerdquondash Lots of room to investigatedrop in better reasoners
bull Contrast If This Then Thatndash Powerful site opened 2011ndash Over 1000000 rules created
Whatrsquos Wrong With Thisbull IFTTT is hard-coding its channels and rules
ndash Users can only set parametersndash At mercy of developer like applications of yore
bull Semantic WebAtomate visionndash Each channel is an RDF feedndash Rules are RDF queriesndash (must be end-user authorable eg CNL)
bull Power of a distributed systemwebndash Anyone can offer a new channelndash Anyone can build a new rule engineUI
SW Challenge Build SWIFTTT
Summarybull 3 Semantic Web Applications
ndash Supercharged spreadsheets for data managementndash Data and visualization authoring like HTML authoringndash Automated handling of incoming information streams
bull All driven by concrete end-user problemsndash Under umbrella of simplifying info management
bull All assessed with user studies
bull Make very little use of Semantic Web technologybull But all share key ldquoopen schemardquo paradigm
Whither ESWC
ESWC Topicsbull Information extractionminingbull Ontology alignmentbull Inferencebull Query languages
bull Plenty of ldquosemanticrdquo but what ldquowebrdquobull Work already had a place AAAI KDD SIGMODbull What did we need a new field for
Semantic Web
bull A step on the road to artificial intelligencebull A study of the processes of cognition
ndash Knowledge representationndash Classificationndash Logical Inferencendash Probabilistic reasoningndash Analogy (someday)
bull Web is secondary just a platformbull Long rangebull Perspective is well represented at ISWCESWC
Semantic Webbull Improving human-information interactionbull Drawing on insights gained from the web
ndash View sourcecopytweakndash Tolerate inconsistencyndash Lightweight interactionsndash Standardsndash Power of the crowd
bull But also HCI DB IR MLbull Opportunity to rapidly and significantly improve
the human condition
Where Are All the Intelligent Agents
Where Are All the Intelligent Agents
Bringing Intelligence to Applicationsbull Until we solve AI wersquoll have to make do with
Artifical AI --- ie humansbull Good at things that are hard for computers
ndash Entity extraction and disambiguationndash Inferencendash Alignment
bull Hate the drudgery of repetitive simple tasksndash Which is exactly what computers can already dondash Moving data between applicationsndash Reissuing numerous variants on same query
Wherersquos the Sciencebull Human Factorsbull Must understand what people are goodbad atbull Design tools that address strengthsweaknessesbull These tools are experimentsbull Not enough to build must evaluate
ndash By formulating hypotheses about users and usagendash And testing in (controlled) lab and (uncontrolled) field
Choose Your Motivation Wisely
• Late 90s: a brief flurry of work on “algorithms and data-structures for faulty memories”
  – Algorithms that work well even if sometimes what you read isn’t what you wrote
• Generally preceded by an argument that as memories grow such faults become pervasive
• In fact, right solution is ECC RAM (now standard)
• Flurry was stimulated by a purchaser at Google trying to cover up a bad (non-ECC RAM) purchasing decision
Hammers vs. Nails
• Don’t define work/field by particular technology
• Start with the problem that needs to be solved
• Then find any technology necessary to solve it
• Don’t forget the original motivation
  – It might become obsolete
• How do you describe your tool to end users?
  – They care about what new thing it enables/simplifies
  – Not about how cleverly it does what it does
Karger’s Best ESWC Papers
• A Session-based Approach for Aligning Large Ontologies
• Broadening the Scope of Nanopublications
• Multilingual semantic wiki based on Attempto Controlled English and Grammatical Framework
• Personalized Concept-based Search and Exploration on the Web of Data using Results Categorization
• Collecting Links Between Entities Ranked by Human Association Strengths
• Guiding the evolution of a multilingual ontology in a concrete setting
• Connecting the Smithsonian American Art Museum to the Linked Data Cloud
Semantic Web Apps at ESWC
• ESWC does offer Semantic Web Apps
  – Mashup challenge
  – Demo session
• Why aren’t these appearing as papers?
• The missing piece: evaluation
  – What happens when users try to use it?
  – What happens when the schema changes?
Semantic Web Applications
• “Semantic” is a modifier on “Web”
• What was so new/wonderful about the web?
• Fetching remote pages
Semantic Web Applications
• “Semantic” is a modifier on “Web”
• What was so new/wonderful about the web?
  – Could always author & view docs on our computers
  – Could always access them with ftp
• “Minor” workflow changes
  – URL: canonical way for a doc to reference other docs
  – click: instant access to what’s at the link
  – browser: staying inside one application
The web did not make new things possible.
It made old things simple.
Can SW do the same?
Conclusion
• A key innovation opportunity in the Semantic Web is making it easier for end users to produce, share, and consume structured data
• An urgent need and immediate opportunity for tools that take the drudgery out of data work
• We have to study end users to understand their needs, build tools to meet those needs, and assess how well those tools work
• Don’t ask what you can do for the Semantic Web; ask what the Semantic Web can do for you
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding “Things to Do” Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- “I want my spreadsheet database to work better”
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-Many/Many-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- “I Want To Publish My Volunteer Roster on the Web”
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Data.gov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Can’t Handle My Incoming Information Overload Help
- End Users Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2: Travel Management
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- What’s Wrong With This?
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Where’s the Science?
- Choose Your Motivation Wisely
- Hammers vs Nails
- Karger’s Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
Actions in constrained natural language
• Actions represented as fill-in-the-blank sentences with typed blanks
[ “reply to” name “email” type “schemasEmail” “with” name “message” type “schemasString” ]
reply to email with message
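The fill-in-the-blank idea on this slide can be sketched in a few lines of Python. This is only an illustration, not Atomate's actual implementation: the `Blank` class and the `render` and `fill` helpers are hypothetical names, and the type labels are copied from the slide as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blank:
    """A typed slot in an action sentence (hypothetical helper type)."""
    name: str
    type: str


# The slide's example: literal words interleaved with typed blanks.
REPLY = [
    "reply to", Blank("email", "schemasEmail"),
    "with", Blank("message", "schemasString"),
]


def render(template):
    """Show the template as the sentence a user reads, blanks shown by name."""
    return " ".join(p if isinstance(p, str) else p.name for p in template)


def fill(template, values):
    """Fill each blank with (type, text), rejecting mismatched types."""
    words = []
    for part in template:
        if isinstance(part, str):
            words.append(part)
            continue
        type_name, text = values[part.name]
        if type_name != part.type:
            raise TypeError(
                f"blank '{part.name}' expects {part.type}, got {type_name}"
            )
        words.append(text)
    return " ".join(words)
```

Calling `render(REPLY)` yields the sentence shown on the slide, "reply to email with message". The type attached to each blank is what would let an interface offer only values of the matching kind when the user fills it in.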
Study
• Can users create rules?
• Perceived difficulty of use
• Pitfalls
• Ideas for fixing these problems
Rule creation study (method)
• Recruited over the web
• Basic demographics, sign up, 2 minute tutorial video
• 9 rule creation exercises
  – 2 time, 3 easy, 3 medium, 1 difficult
• Short exit survey
  – On average, how difficult was it to create the rules?
  – Was there anything that was confusing/difficult?
  – How useful would such a system be to you?
  – What would you use this system for?
  – What else do you wish this system could do?
Rule creation study
• November 2009
• 33 participants recruited (26 completed)
• Ages 25-45
• 14 had some programming experience
• All experienced with the Web
Rule creation study
• Correct
  – rule expressed perfectly
• Half-correct
  – rule insufficiently specific
  – will trigger more often than intended
• Wrong
  – 1 or more incorrectly expressed clause
  – will not fire at all or not as intended
• Missing
  – rule not completed
Average time to complete each rule
Perceived difficulty of creating rules
Perceived usefulness
What would you use Atomate for?
(P4) Identifying when two locations converge (i.e., mine and a friend’s are close). This is like social networking, but moving it towards actual life. People could grant access to their friends to view their locations, and thus know if people are close at a given time.
(P7) Reminding my friends and I that we have a shared event when we’re both near each other. For example, I’m often meeting with someone and both of us want to go to the same event in an hour, but we get into a coding session and we forget about the event.
What would you use Atomate for?
(P15) When I send email to someone and I want a response, I can tell atomate to send them a reminder email in 3 days if they haven’t gotten back to me, or something like that.
(P24) Emailing or responding to people when I am in transit or unavailable (no network connectivity, or in an event where my phone’s silenced).
Discussion
• A Semantic Web Application?
  – Yes: incorporates data in any schema
  – CNL adopts any incoming properties/values
• Inference over RDF store
  – Not “what is the most powerful inference engine?”
  – Rather “what inferrable language can users write?”
  – Lots of room to investigate/drop in better reasoners
• Contrast: If This Then That
  – Powerful site opened 2011
  – Over 1,000,000 rules created
What’s Wrong With This?
• IFTTT is hard-coding its channels and rules
  – Users can only set parameters
  – At mercy of developer, like applications of yore
• Semantic Web/Atomate vision
  – Each channel is an RDF feed
  – Rules are RDF queries
  – (must be end-user authorable, e.g., CNL)
• Power of a distributed system/web
  – Anyone can offer a new channel
  – Anyone can build a new rule engine/UI
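To make "each channel is a feed of triples, each rule is a query" concrete, here is a minimal sketch in plain Python. It is not Atomate's code and uses no RDF library: the feeds, the predicate names (`friendOf`, `locatedAt`), and the `match`/`query` helpers are all assumptions made up for illustration. The rule mirrors the "friend is nearby" use case that study participants described.

```python
def match(pattern, triple, bindings):
    """Unify one (s, p, o) pattern with one triple; '?x' terms are variables."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns together."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions


# Two hypothetical channels, each just a feed of triples.
location_feed = [
    ("alice", "locatedAt", "cafe"),
    ("bob", "locatedAt", "cafe"),
    ("carol", "locatedAt", "library"),
]
social_feed = [
    ("me", "friendOf", "alice"),
    ("me", "friendOf", "carol"),
    ("me", "locatedAt", "cafe"),
]

# Offering a new channel only means adding more triples to the store.
store = location_feed + social_feed

# Rule condition: one of my friends is at the same place as I am.
friend_nearby = [
    ("me", "friendOf", "?friend"),
    ("me", "locatedAt", "?place"),
    ("?friend", "locatedAt", "?place"),
]

nearby = [b["?friend"] for b in query(friend_nearby, store)]
```

Because the rule only names properties, it keeps working when a feed with a different schema is added to the store; that is the "open schema" point the slides make against hard-coded channels. A real system would presumably use an RDF store and a query language such as SPARQL behind an end-user interface.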
SW Challenge: Build SWIFTTT
Summary
• 3 Semantic Web Applications
  – Supercharged spreadsheets for data management
  – Data and visualization authoring like HTML authoring
  – Automated handling of incoming information streams
• All driven by concrete end-user problems
  – Under umbrella of simplifying info management
• All assessed with user studies
• Make very little use of Semantic Web technology
• But all share key “open schema” paradigm
Whither ESWC?
ESWC Topics
• Information extraction/mining
• Ontology alignment
• Inference
• Query languages
• Plenty of “semantic” but what “web”?
• Work already had a place: AAAI, KDD, SIGMOD
• What did we need a new field for?
Rule creation study (method)bull Recruited over the webbull Basic demographics sign up 2 minute tutorial
videobull 9 Rule creation exercises
ndash 2 time 3 easy 3 medium 1 difficult bull Short exit survey
ndash On average how difficult was it to create the rulesndash Was there anything that was confusingdifficultndash How useful would such a system be to youndash What would you use this system forndash What else do you wish this system could do
Rule creation studybull November 2009
bull 33 participants recruited (26 completed)bull Ages 25-45bull 14 had some programming experiencebull All experienced with the Web
Rule creation study
bull Correctndash rule expressed perfectly
bull half-correctndash rule insufficiently specificndash will trigger more often than intended
bull Wrongndash 1 or more incorrectly expressed clause ndash will not fire at all or not as intended
bull Missingndash rule not completed
Average time to complete each rule
Perceived difficulty of creating rules
Perceived usefulness
(P4) Identifying when two locations converge (ie mine and a friends are close) This is like social networking but moving it towards actual life People could grant access to their friends to view their locations and thus know if people are close at a given time (P7) Reminding my friends and I that we have a shared event when were both near each other For example Im often meeting with someone and both of us want to go to the same event in an hour but we get into a coding session and we forget about the event
What would you use atomate for
(P15) When I send email to someone and I want a response I can tell atomate to send them a reminder email in 3 days if they havent gotten back to me or something like that
(P24) Emailing or responding to people when I am in transit or unavailable (no network connectivity or in an event where my phones silenced)
What would you use atomate for
Discussionbull A Semantic Web Application
ndash Yes incorporates data in any schemandash CNL adopts any incoming propertiesvalues
bull Inference over RDF storendash Not ldquowhat is the most powerful inference enginerdquondash Rather ldquowhat inferrable language can users writerdquondash Lots of room to investigatedrop in better reasoners
bull Contrast If This Then Thatndash Powerful site opened 2011ndash Over 1000000 rules created
Whatrsquos Wrong With Thisbull IFTTT is hard-coding its channels and rules
ndash Users can only set parametersndash At mercy of developer like applications of yore
bull Semantic WebAtomate visionndash Each channel is an RDF feedndash Rules are RDF queriesndash (must be end-user authorable eg CNL)
bull Power of a distributed systemwebndash Anyone can offer a new channelndash Anyone can build a new rule engineUI
SW Challenge Build SWIFTTT
Summarybull 3 Semantic Web Applications
ndash Supercharged spreadsheets for data managementndash Data and visualization authoring like HTML authoringndash Automated handling of incoming information streams
bull All driven by concrete end-user problemsndash Under umbrella of simplifying info management
bull All assessed with user studies
bull Make very little use of Semantic Web technologybull But all share key ldquoopen schemardquo paradigm
Whither ESWC?
ESWC Topics
• Information extraction/mining
• Ontology alignment
• Inference
• Query languages
• Plenty of “semantic”, but what “web”?
• Work already had a place: AAAI, KDD, SIGMOD
• What did we need a new field for?
Semantic Web
• A step on the road to artificial intelligence
• A study of the processes of cognition
  – Knowledge representation
  – Classification
  – Logical inference
  – Probabilistic reasoning
  – Analogy (someday)
• Web is secondary, just a platform
• Long range
• Perspective is well represented at ISWC/ESWC
Semantic Web
• Improving human-information interaction
• Drawing on insights gained from the web
  – View source/copy/tweak
  – Tolerate inconsistency
  – Lightweight interactions
  – Standards
  – Power of the crowd
• But also HCI, DB, IR, ML
• Opportunity to rapidly and significantly improve the human condition
Where Are All the Intelligent Agents?
Where Are All the Intelligent Agents?
Bringing Intelligence to Applications
• Until we solve AI, we’ll have to make do with Artificial AI, i.e. humans
• Good at things that are hard for computers
  – Entity extraction and disambiguation
  – Inference
  – Alignment
• Hate the drudgery of repetitive simple tasks
  – Which is exactly what computers can already do
  – Moving data between applications
  – Reissuing numerous variants on same query
Where’s the Science?
• Human Factors
• Must understand what people are good/bad at
• Design tools that address strengths/weaknesses
• These tools are experiments
• Not enough to build; must evaluate
  – By formulating hypotheses about users and usage
  – And testing in (controlled) lab and (uncontrolled) field
Choose Your Motivation Wisely
• Late 90s: a brief flurry of work on “algorithms and data-structures for faulty memories”
  – Algorithms that work well even if sometimes what you read isn’t what you wrote
• Generally preceded by an argument that as memories grow, such faults become pervasive
• In fact, right solution is ECC RAM (now standard)
• Flurry was stimulated by a purchaser at Google trying to cover up a bad (non-ECC RAM) purchasing decision
Hammers vs. Nails
• Don’t define work/field by particular technology
• Start with the problem that needs to be solved
• Then find any technology necessary to solve it
• Don’t forget the original motivation
  – It might become obsolete
• How do you describe your tool to end users?
  – They care about what new thing it enables/simplifies
  – Not about how cleverly it does what it does
Karger’s Best ESWC Papers
• A Session-based Approach for Aligning Large Ontologies
• Broadening the Scope of Nanopublications
• Multilingual semantic wiki based on Attempto Controlled English and Grammatical Framework
• Personalized Concept-based Search and Exploration on the Web of Data using Results Categorization
• Collecting Links Between Entities Ranked by Human Association Strengths
• Guiding the evolution of a multilingual ontology in a concrete setting
• Connecting the Smithsonian American Art Museum to the Linked Data Cloud
Semantic Web Apps at ESWC
• ESWC does offer Semantic Web Apps
  – Mashup challenge
  – Demo session
• Why aren’t these appearing as papers?
• The missing piece: evaluation
  – What happens when users try to use it?
  – What happens when the schema changes?
Semantic Web Applications
• “Semantic” is a modifier on “Web”
• What was so new/wonderful about the web?
• Fetching remote pages
Semantic Web Applications
• “Semantic” is a modifier on “Web”
• What was so new/wonderful about the web?
  – Could always author & view docs on our computers
  – Could always access them with ftp
• “Minor” workflow changes
  – URL: canonical way for a doc to reference other docs
  – click: instant access to what’s at the link
  – browser: staying inside one application
The web did not make new things possible.
It made old things simple.
Can SW do the same?
Conclusion
• A key innovation opportunity in the Semantic Web is making it easier for end users to produce, share, and consume structured data
• An urgent need and immediate opportunity for tools that take the drudgery out of data work
• We have to study end users to understand their needs, build tools to meet those needs, and assess how well those tools work
• Don’t ask what you can do for the Semantic Web; ask what the Semantic Web can do for you
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding “Things to Do” Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- “I want my spreadsheet database to work better”
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-Many/Many-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- “I Want To Publish My Volunteer Roster on the Web”
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Data.gov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Can’t Handle My Incoming Information Overload Help
- End UserS Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2 Travel Management
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- What’s Wrong With This?
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Where’s the Science?
- Choose Your Motivation Wisely
- Hammers vs Nails
- Karger’s Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
Rule creation study
• November 2009
• 33 participants recruited (26 completed)
• Ages 25-45
• 14 had some programming experience
• All experienced with the Web
Rule creation study
• Correct
  – rule expressed perfectly
• Half-correct
  – rule insufficiently specific
  – will trigger more often than intended
• Wrong
  – 1 or more incorrectly expressed clause
  – will not fire at all, or not as intended
• Missing
  – rule not completed
Average time to complete each rule
Perceived difficulty of creating rules
Perceived usefulness
Average time to complete each rule
Perceived difficulty of creating rules
Perceived usefulness
(P4) Identifying when two locations converge (ie mine and a friends are close) This is like social networking but moving it towards actual life People could grant access to their friends to view their locations and thus know if people are close at a given time (P7) Reminding my friends and I that we have a shared event when were both near each other For example Im often meeting with someone and both of us want to go to the same event in an hour but we get into a coding session and we forget about the event
What would you use atomate for
(P15) When I send email to someone and I want a response I can tell atomate to send them a reminder email in 3 days if they havent gotten back to me or something like that
(P24) Emailing or responding to people when I am in transit or unavailable (no network connectivity or in an event where my phones silenced)
What would you use atomate for
Discussionbull A Semantic Web Application
ndash Yes incorporates data in any schemandash CNL adopts any incoming propertiesvalues
bull Inference over RDF storendash Not ldquowhat is the most powerful inference enginerdquondash Rather ldquowhat inferrable language can users writerdquondash Lots of room to investigatedrop in better reasoners
bull Contrast If This Then Thatndash Powerful site opened 2011ndash Over 1000000 rules created
Whatrsquos Wrong With Thisbull IFTTT is hard-coding its channels and rules
ndash Users can only set parametersndash At mercy of developer like applications of yore
bull Semantic WebAtomate visionndash Each channel is an RDF feedndash Rules are RDF queriesndash (must be end-user authorable eg CNL)
bull Power of a distributed systemwebndash Anyone can offer a new channelndash Anyone can build a new rule engineUI
SW Challenge Build SWIFTTT
Summarybull 3 Semantic Web Applications
ndash Supercharged spreadsheets for data managementndash Data and visualization authoring like HTML authoringndash Automated handling of incoming information streams
bull All driven by concrete end-user problemsndash Under umbrella of simplifying info management
bull All assessed with user studies
bull Make very little use of Semantic Web technologybull But all share key ldquoopen schemardquo paradigm
Whither ESWC
ESWC Topicsbull Information extractionminingbull Ontology alignmentbull Inferencebull Query languages
bull Plenty of ldquosemanticrdquo but what ldquowebrdquobull Work already had a place AAAI KDD SIGMODbull What did we need a new field for
Semantic Web
bull A step on the road to artificial intelligencebull A study of the processes of cognition
ndash Knowledge representationndash Classificationndash Logical Inferencendash Probabilistic reasoningndash Analogy (someday)
bull Web is secondary just a platformbull Long rangebull Perspective is well represented at ISWCESWC
Semantic Web
• Improving human-information interaction
• Drawing on insights gained from the web
  – View source/copy/tweak
  – Tolerate inconsistency
  – Lightweight interactions
  – Standards
  – Power of the crowd
• But also HCI, DB, IR, ML
• Opportunity to rapidly and significantly improve the human condition
Where Are All the Intelligent Agents?
Where Are All the Intelligent Agents?
Bringing Intelligence to Applications
• Until we solve AI we’ll have to make do with Artificial AI, i.e., humans
• Good at things that are hard for computers
  – Entity extraction and disambiguation
  – Inference
  – Alignment
• Hate the drudgery of repetitive simple tasks
  – Which is exactly what computers can already do
  – Moving data between applications
  – Reissuing numerous variants on same query
Where’s the Science?
• Human Factors
• Must understand what people are good/bad at
• Design tools that address strengths/weaknesses
• These tools are experiments
• Not enough to build, must evaluate
  – By formulating hypotheses about users and usage
  – And testing in (controlled) lab and (uncontrolled) field
Choose Your Motivation Wisely
• Late 90s: a brief flurry of work on “algorithms and data-structures for faulty memories”
  – Algorithms that work well even if sometimes what you read isn’t what you wrote
• Generally preceded by an argument that as memories grow such faults become pervasive
• In fact, right solution is ECC RAM (now standard)
• Flurry was stimulated by a purchaser at Google trying to cover up a bad (non-ECC RAM) purchasing decision
Hammers vs. Nails
• Don’t define work/field by particular technology
• Start with the problem that needs to be solved
• Then find any technology necessary to solve it
• Don’t forget the original motivation
  – It might become obsolete
• How do you describe your tool to end users?
  – They care about what new thing it enables/simplifies
  – Not about how cleverly it does what it does
Karger’s Best ESWC Papers
• A Session-based Approach for Aligning Large Ontologies
• Broadening the Scope of Nanopublications
• Multilingual semantic wiki based on Attempto Controlled English and Grammatical Framework
• Personalized Concept-based Search and Exploration on the Web of Data using Results Categorization
• Collecting Links Between Entities Ranked by Human Association Strengths
• Guiding the evolution of a multilingual ontology in a concrete setting
• Connecting the Smithsonian American Art Museum to the Linked Data Cloud
Semantic Web Apps at ESWC
• ESWC does offer Semantic Web Apps
  – Mashup challenge
  – Demo session
• Why aren’t these appearing as papers?
• The missing piece: evaluation
  – What happens when users try to use it?
  – What happens when the schema changes?
Semantic Web Applications
• “Semantic” is a modifier on “Web”
• What was so new/wonderful about the web?
• Fetching remote pages?
Semantic Web Applications
• “Semantic” is a modifier on “Web”
• What was so new/wonderful about the web?
  – Could always author & view docs on our computers
  – Could always access them with ftp
• “Minor” workflow changes
  – URL: canonical way for a doc to reference other docs
  – click: instant access to what’s at the link
  – browser: staying inside one application
The web did not make new things possible.
It made old things simple.
Can SW do the same?
Conclusion
• A key innovation opportunity in the Semantic Web is making it easier for end users to produce, share, and consume structured data
• An urgent need and immediate opportunity for tools that take the drudgery out of data work
• We have to study end users to understand their needs, build tools to meet those needs, and assess how well those tools work
• Don’t ask what you can do for the Semantic Web; ask what the Semantic Web can do for you
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding “Things to Do” Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- “I want my spreadsheet database to work better”
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-Many/Many-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- “I Want To Publish My Volunteer Roster on the Web”
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Data.gov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Can’t Handle My Incoming Information Overload Help
- End UserS Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2 Travel Management
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- What’s Wrong With This
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Where’s the Science
- Choose Your Motivation Wisely
- Hammers vs Nails
- Karger’s Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
Perceived difficulty of creating rules
Perceived usefulness
(P4) Identifying when two locations converge (i.e., mine and a friend’s are close). This is like social networking, but moving it towards actual life. People could grant access to their friends to view their locations and thus know if people are close at a given time.
(P7) Reminding my friends and I that we have a shared event when we’re both near each other. For example, I’m often meeting with someone and both of us want to go to the same event in an hour, but we get into a coding session and we forget about the event.
What would you use atomate for?
(P15) When I send email to someone and I want a response, I can tell atomate to send them a reminder email in 3 days if they haven’t gotten back to me, or something like that.
(P4) Identifying when two locations converge (ie mine and a friends are close) This is like social networking but moving it towards actual life People could grant access to their friends to view their locations and thus know if people are close at a given time (P7) Reminding my friends and I that we have a shared event when were both near each other For example Im often meeting with someone and both of us want to go to the same event in an hour but we get into a coding session and we forget about the event
What would you use atomate for
(P15) When I send email to someone and I want a response I can tell atomate to send them a reminder email in 3 days if they havent gotten back to me or something like that
(P24) Emailing or responding to people when I am in transit or unavailable (no network connectivity or in an event where my phones silenced)
What would you use atomate for
Discussionbull A Semantic Web Application
ndash Yes incorporates data in any schemandash CNL adopts any incoming propertiesvalues
bull Inference over RDF storendash Not ldquowhat is the most powerful inference enginerdquondash Rather ldquowhat inferrable language can users writerdquondash Lots of room to investigatedrop in better reasoners
bull Contrast If This Then Thatndash Powerful site opened 2011ndash Over 1000000 rules created
Whatrsquos Wrong With Thisbull IFTTT is hard-coding its channels and rules
ndash Users can only set parametersndash At mercy of developer like applications of yore
bull Semantic WebAtomate visionndash Each channel is an RDF feedndash Rules are RDF queriesndash (must be end-user authorable eg CNL)
bull Power of a distributed systemwebndash Anyone can offer a new channelndash Anyone can build a new rule engineUI
SW Challenge Build SWIFTTT
Summarybull 3 Semantic Web Applications
ndash Supercharged spreadsheets for data managementndash Data and visualization authoring like HTML authoringndash Automated handling of incoming information streams
bull All driven by concrete end-user problemsndash Under umbrella of simplifying info management
bull All assessed with user studies
bull Make very little use of Semantic Web technologybull But all share key ldquoopen schemardquo paradigm
Whither ESWC
ESWC Topicsbull Information extractionminingbull Ontology alignmentbull Inferencebull Query languages
bull Plenty of ldquosemanticrdquo but what ldquowebrdquobull Work already had a place AAAI KDD SIGMODbull What did we need a new field for
Semantic Web
bull A step on the road to artificial intelligencebull A study of the processes of cognition
ndash Knowledge representationndash Classificationndash Logical Inferencendash Probabilistic reasoningndash Analogy (someday)
bull Web is secondary just a platformbull Long rangebull Perspective is well represented at ISWCESWC
Semantic Webbull Improving human-information interactionbull Drawing on insights gained from the web
ndash View sourcecopytweakndash Tolerate inconsistencyndash Lightweight interactionsndash Standardsndash Power of the crowd
bull But also HCI DB IR MLbull Opportunity to rapidly and significantly improve
the human condition
Where Are All the Intelligent Agents
Where Are All the Intelligent Agents
Bringing Intelligence to Applicationsbull Until we solve AI wersquoll have to make do with
Artifical AI --- ie humansbull Good at things that are hard for computers
ndash Entity extraction and disambiguationndash Inferencendash Alignment
bull Hate the drudgery of repetitive simple tasksndash Which is exactly what computers can already dondash Moving data between applicationsndash Reissuing numerous variants on same query
Wherersquos the Sciencebull Human Factorsbull Must understand what people are goodbad atbull Design tools that address strengthsweaknessesbull These tools are experimentsbull Not enough to build must evaluate
ndash By formulating hypotheses about users and usagendash And testing in (controlled) lab and (uncontrolled) field
Choose Your Motivation Wisely
- Late 90s: a brief flurry of work on "algorithms and data-structures for faulty memories"
  - Algorithms that work well even if sometimes what you read isn't what you wrote
- Generally preceded by an argument that as memories grow, such faults become pervasive
- In fact, right solution is ECC RAM (now standard)
- Flurry was stimulated by a purchaser at Google trying to cover up a bad (non-ECC RAM) purchasing decision
Hammers vs Nails
- Don't define work/field by particular technology
- Start with the problem that needs to be solved
- Then find any technology necessary to solve it
- Don't forget the original motivation
  - It might become obsolete
- How do you describe your tool to end users?
  - They care about what new thing it enables/simplifies
  - Not about how cleverly it does what it does
Karger's Best ESWC Papers
- A Session-based Approach for Aligning Large Ontologies
- Broadening the Scope of Nanopublications
- Multilingual semantic wiki based on Attempto Controlled English and Grammatical Framework
- Personalized Concept-based Search and Exploration on the Web of Data using Results Categorization
- Collecting Links Between Entities Ranked by Human Association Strengths
- Guiding the evolution of a multilingual ontology in a concrete setting
- Connecting the Smithsonian American Art Museum to the Linked Data Cloud
Semantic Web Apps at ESWC
- ESWC does offer Semantic Web Apps
  - Mashup challenge
  - Demo session
- Why aren't these appearing as papers?
- The missing piece: evaluation
  - What happens when users try to use it?
  - What happens when the schema changes?
Semantic Web Applications
- "Semantic" is a modifier on "Web"
- What was so new/wonderful about the web?
- Fetching remote pages
Semantic Web Applications
- "Semantic" is a modifier on "Web"
- What was so new/wonderful about the web?
  - Could always author & view docs on our computers
  - Could always access them with ftp
- "Minor" workflow changes
  - URL: canonical way for a doc to reference other docs
  - Click: instant access to what's at the link
  - Browser: staying inside one application
The web did not make new things possible.
It made old things simple.
Can SW do the same?
Conclusion
- A key innovation opportunity in the Semantic Web is making it easier for end users to produce, share, and consume structured data
- An urgent need and immediate opportunity for tools that take the drudgery out of data work
- We have to study end users to understand their needs, build tools to meet those needs, and assess how well those tools work
- Don't ask what you can do for the Semantic Web; ask what the Semantic Web can do for you
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding "Things to Do" Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- "I want my spreadsheet database to work better"
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-Many/Many-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- "I Want To Publish My Volunteer Roster on the Web"
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Data.gov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Can't Handle My Incoming Information Overload. Help!
- End Users Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2: Travel Management
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- What's Wrong With This
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Where's the Science
- Choose Your Motivation Wisely
- Hammers vs Nails
- Karger's Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
What would you use Atomate for?
- (P15) "When I send email to someone and I want a response, I can tell Atomate to send them a reminder email in 3 days if they haven't gotten back to me, or something like that."
- (P24) "Emailing or responding to people when I am in transit or unavailable (no network connectivity, or in an event where my phone's silenced)."
Discussion
- A Semantic Web Application?
  - Yes: incorporates data in any schema
  - CNL adopts any incoming properties/values
- Inference over RDF store
  - Not "what is the most powerful inference engine?"
  - Rather "what inferrable language can users write?"
  - Lots of room to investigate/drop in better reasoners
- Contrast: If This Then That
  - Powerful site, opened 2011
  - Over 1,000,000 rules created
What's Wrong With This?
- IFTTT is hard-coding its channels and rules
  - Users can only set parameters
  - At mercy of developer, like applications of yore
- Semantic Web/Atomate vision
  - Each channel is an RDF feed
  - Rules are RDF queries
  - (must be end-user authorable, e.g. CNL)
- Power of a distributed system/web
  - Anyone can offer a new channel
  - Anyone can build a new rule engine/UI
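The idea that each channel is a feed of triples and each rule is a query over them can be illustrated with a small sketch. This is not Atomate's or IFTTT's implementation and uses no real RDF library: plain Python tuples stand in for RDF triples, a conjunctive pattern match stands in for an RDF query, and every name (`email_channel`, `calendar_channel`, `awaitingReplyDays`, `reminder_rule`) is a hypothetical chosen for illustration, loosely modeled on participant P15's reminder example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]   # (subject, property, value), standing in for an RDF triple
Bindings = Dict[str, str]       # variable name -> bound value


def match_pattern(pattern: Triple, triple: Triple, bindings: Bindings) -> Optional[Bindings]:
    """Unify one pattern with one triple; terms starting with '?' are variables."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None      # constant does not match
    return result


def query(patterns: Sequence[Triple], triples: Sequence[Triple]) -> List[Bindings]:
    """Return every variable binding that satisfies all patterns at once."""
    solutions: List[Bindings] = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match_pattern(pattern, triple, bindings)) is not None
        ]
    return solutions


# Two hypothetical channels; anyone could publish another one as more triples.
email_channel: List[Triple] = [
    ("msg1", "type", "SentEmail"),
    ("msg1", "to", "alice"),
    ("msg1", "awaitingReplyDays", "3"),
    ("msg2", "type", "SentEmail"),
    ("msg2", "to", "bob"),
    ("msg2", "awaitingReplyDays", "1"),
]
calendar_channel: List[Triple] = [("alice", "status", "inOffice")]

# Hypothetical rule: remind anyone whose reply has been outstanding for 3 days.
reminder_rule: List[Triple] = [
    ("?msg", "type", "SentEmail"),
    ("?msg", "to", "?person"),
    ("?msg", "awaitingReplyDays", "3"),
]

reminders = [
    f"remind {b['?person']} about {b['?msg']}"
    for b in query(reminder_rule, email_channel + calendar_channel)
]
print(reminders)
```

Because the rule only mentions property names, a new channel can be added by concatenating its triples, with no change to the rule engine. That is the contrast with hard-coded channels drawn on the slide.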
SW Challenge: Build SWIFTTT
Summary
- 3 Semantic Web Applications
  - Supercharged spreadsheets for data management
  - Data and visualization authoring like HTML authoring
  - Automated handling of incoming information streams
- All driven by concrete end-user problems
  - Under umbrella of simplifying info management
- All assessed with user studies
- Make very little use of Semantic Web technology
- But all share key "open schema" paradigm
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding ldquoThings to Dordquo Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- ldquoI want my spreadsheet database to work betterrdquo
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-ManyMany-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- ldquoI Want To Publish My Volunteer Roster on the Webrdquo
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Datagov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Can’t Handle My Incoming Information Overload Help
- End UserS Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2 Travel Management
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- What’s Wrong With This
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Where’s the Science
- Choose Your Motivation Wisely
- Hammers vs Nails
- Karger’s Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
SW Challenge Build SWIFTTT
Summary
• 3 Semantic Web Applications
  – Supercharged spreadsheets for data management
  – Data and visualization authoring like HTML authoring
  – Automated handling of incoming information streams
• All driven by concrete end-user problems
  – Under umbrella of simplifying info management
• All assessed with user studies
• Make very little use of Semantic Web technology
• But all share key “open schema” paradigm
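A minimal sketch of what an “open schema” can look like in practice, assuming items are plain property bags with no table definition fixed in advance. The item data and the helper names (`values`, `facet_counts`, `filter_items`) are hypothetical illustrations, not code from any of the three applications in the talk.

```python
from collections import Counter

# Hypothetical items: each one is just a bag of properties.
# No schema is declared up front, and items need not share properties.
items = [
    {"label": "Alice", "type": "Volunteer", "skill": ["driving", "cooking"]},
    {"label": "Bob", "type": "Volunteer", "city": "Boston"},
    {"label": "Food drive", "type": "Event", "city": "Boston"},
]


def values(item, prop):
    """Return the values of prop on item as a list (empty if absent)."""
    v = item.get(prop, [])
    return v if isinstance(v, list) else [v]


def facet_counts(items, prop):
    """Count how many times each value of prop occurs across the items."""
    return Counter(v for item in items for v in values(item, prop))


def filter_items(items, prop, wanted):
    """Keep the items that list wanted among their values for prop."""
    return [item for item in items if wanted in values(item, prop)]


if __name__ == "__main__":
    print(facet_counts(items, "city"))
    print([i["label"] for i in filter_items(items, "skill", "cooking")])
```

Adding a new property to one item requires no change to the others or to the helpers, which is the point of the paradigm: the same generic faceting and filtering keeps working as the user's schema evolves.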
Whither ESWC?
ESWC Topics
• Information extraction/mining
• Ontology alignment
• Inference
• Query languages
• Plenty of “semantic” but what “web”?
• Work already had a place: AAAI, KDD, SIGMOD
• What did we need a new field for?
Semantic Web
• A step on the road to artificial intelligence
• A study of the processes of cognition
  – Knowledge representation
  – Classification
  – Logical Inference
  – Probabilistic reasoning
  – Analogy (someday)
• Web is secondary, just a platform
• Long range
• Perspective is well represented at ISWC/ESWC
Semantic Web
• Improving human-information interaction
• Drawing on insights gained from the web
  – View source/copy/tweak
  – Tolerate inconsistency
  – Lightweight interactions
  – Standards
  – Power of the crowd
• But also HCI, DB, IR, ML
• Opportunity to rapidly and significantly improve the human condition
Where Are All the Intelligent Agents?
Where Are All the Intelligent Agents?
Bringing Intelligence to Applications
• Until we solve AI, we’ll have to make do with Artificial AI, i.e. humans
• Good at things that are hard for computers
  – Entity extraction and disambiguation
  – Inference
  – Alignment
• Hate the drudgery of repetitive simple tasks
  – Which is exactly what computers can already do
  – Moving data between applications
  – Reissuing numerous variants on same query
Where’s the Science?
• Human Factors
• Must understand what people are good/bad at
• Design tools that address strengths/weaknesses
• These tools are experiments
• Not enough to build, must evaluate
  – By formulating hypotheses about users and usage
  – And testing in (controlled) lab and (uncontrolled) field
Choose Your Motivation Wisely
• Late 90s: a brief flurry of work on “algorithms and data-structures for faulty memories”
  – Algorithms that work well even if sometimes what you read isn’t what you wrote
• Generally preceded by an argument that as memories grow, such faults become pervasive
• In fact, right solution is ECC RAM (now standard)
• Flurry was stimulated by a purchaser at Google trying to cover up a bad (non-ECC RAM) purchasing decision
Hammers vs Nails
• Don’t define work/field by particular technology
• Start with the problem that needs to be solved
• Then find any technology necessary to solve it
• Don’t forget the original motivation
  – It might become obsolete
• How do you describe your tool to end users?
  – They care about what new thing it enables/simplifies
  – Not about how cleverly it does what it does
Karger’s Best ESWC Papers
• A Session-based Approach for Aligning Large Ontologies
• Broadening the Scope of Nanopublications
• Multilingual semantic wiki based on Attempto Controlled English and Grammatical Framework
• Personalized Concept-based Search and Exploration on the Web of Data using Results Categorization
• Collecting Links Between Entities Ranked by Human Association Strengths
• Guiding the evolution of a multilingual ontology in a concrete setting
• Connecting the Smithsonian American Art Museum to the Linked Data Cloud
Semantic Web Apps at ESWC
• ESWC does offer Semantic Web Apps
  – Mashup challenge
  – Demo session
• Why aren’t these appearing as papers?
• The missing piece: evaluation
  – What happens when users try to use it?
  – What happens when the schema changes?
Semantic Web Applications
• “Semantic” is a modifier on “Web”
• What was so new/wonderful about the web?
• Fetching remote pages
Semantic Web Applications
• “Semantic” is a modifier on “Web”
• What was so new/wonderful about the web?
  – Could always author & view docs on our computers
  – Could always access them with ftp
• “Minor” workflow changes
  – URL: canonical way for a doc to reference other docs
  – click: instant access to what’s at the link
  – browser: staying inside one application
The web did not make new things possible
It made old things simple
Can SW do the same?
Conclusion
• A key innovation opportunity in the Semantic Web is making it easier for end users to produce, share, and consume structured data
• An urgent need and immediate opportunity for tools that take the drudgery out of data work
• We have to study end users to understand their needs, build tools to meet those needs, and assess how well those tools work
• Don’t ask what you can do for the Semantic Web; ask what the Semantic Web can do for you
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding ldquoThings to Dordquo Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- ldquoI want my spreadsheet database to work betterrdquo
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-ManyMany-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- ldquoI Want To Publish My Volunteer Roster on the Webrdquo
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Data.gov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Can’t Handle My Incoming Information Overload Help
- End Users Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2 Travel Management
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- What’s Wrong With This
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Where’s the Science
- Choose Your Motivation Wisely
- Hammers vs Nails
- Karger’s Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
Semantic Web
• A step on the road to artificial intelligence
• A study of the processes of cognition
  – Knowledge representation
  – Classification
  – Logical inference
  – Probabilistic reasoning
  – Analogy (someday)
• Web is secondary, just a platform
• Long range
• Perspective is well represented at ISWC/ESWC
Semantic Web
• Improving human-information interaction
• Drawing on insights gained from the web
  – View source/copy/tweak
  – Tolerate inconsistency
  – Lightweight interactions
  – Standards
  – Power of the crowd
• But also HCI, DB, IR, ML
• Opportunity to rapidly and significantly improve the human condition
Where Are All the Intelligent Agents
Bringing Intelligence to Applications
• Until we solve AI, we’ll have to make do with Artificial AI, i.e., humans
• Good at things that are hard for computers
  – Entity extraction and disambiguation
  – Inference
  – Alignment
• Hate the drudgery of repetitive simple tasks
  – Which is exactly what computers can already do
  – Moving data between applications
  – Reissuing numerous variants on same query
Where’s the Science?
• Human Factors
• Must understand what people are good/bad at
• Design tools that address strengths/weaknesses
• These tools are experiments
• Not enough to build; must evaluate
  – By formulating hypotheses about users and usage
  – And testing in (controlled) lab and (uncontrolled) field
Choose Your Motivation Wisely
• Late 90s: a brief flurry of work on “algorithms and data-structures for faulty memories”
  – Algorithms that work well even if sometimes what you read isn’t what you wrote
• Generally preceded by an argument that as memories grow such faults become pervasive
• In fact, right solution is ECC RAM (now standard)
• Flurry was stimulated by a purchaser at Google trying to cover up a bad (non-ECC RAM) purchasing decision
Hammers vs Nails
• Don’t define work/field by particular technology
• Start with the problem that needs to be solved
• Then find any technology necessary to solve it
• Don’t forget the original motivation
  – It might become obsolete
• How do you describe your tool to end users?
  – They care about what new thing it enables/simplifies
  – Not about how cleverly it does what it does
Karger’s Best ESWC Papers
• A Session-based Approach for Aligning Large Ontologies
• Broadening the Scope of Nanopublications
• Multilingual semantic wiki based on Attempto Controlled English and Grammatical Framework
• Personalized Concept-based Search and Exploration on the Web of Data using Results Categorization
• Collecting Links Between Entities Ranked by Human Association Strengths
• Guiding the evolution of a multilingual ontology in a concrete setting
• Connecting the Smithsonian American Art Museum to the Linked Data Cloud
Semantic Web Apps at ESWC
• ESWC does offer Semantic Web Apps
  – Mashup challenge
  – Demo session
• Why aren’t these appearing as papers?
• The missing piece: evaluation
  – What happens when users try to use it?
  – What happens when the schema changes?
Bringing Intelligence to Applicationsbull Until we solve AI wersquoll have to make do with
Artifical AI --- ie humansbull Good at things that are hard for computers
ndash Entity extraction and disambiguationndash Inferencendash Alignment
bull Hate the drudgery of repetitive simple tasksndash Which is exactly what computers can already dondash Moving data between applicationsndash Reissuing numerous variants on same query
Where's the Science
• Human Factors
• Must understand what people are good/bad at
• Design tools that address strengths/weaknesses
• These tools are experiments
• Not enough to build; must evaluate
  – By formulating hypotheses about users and usage
  – And testing in (controlled) lab and (uncontrolled) field
Choose Your Motivation Wisely
• Late 90s: a brief flurry of work on "algorithms and data-structures for faulty memories"
  – Algorithms that work well even if sometimes what you read isn't what you wrote
• Generally preceded by an argument that as memories grow such faults become pervasive
• In fact, right solution is ECC RAM (now standard)
• Flurry was stimulated by a purchaser at Google trying to cover up a bad (non-ECC RAM) purchasing decision
Hammers vs Nails
• Don't define work/field by particular technology
• Start with the problem that needs to be solved
• Then find any technology necessary to solve it
• Don't forget the original motivation
  – It might become obsolete
• How do you describe your tool to end users?
  – They care about what new thing it enables/simplifies
  – Not about how cleverly it does what it does
Karger's Best ESWC Papers
• A Session-based Approach for Aligning Large Ontologies
• Broadening the Scope of Nanopublications
• Multilingual semantic wiki based on Attempto Controlled English and Grammatical Framework
• Personalized Concept-based Search and Exploration on the Web of Data using Results Categorization
• Collecting Links Between Entities Ranked by Human Association Strengths
• Guiding the evolution of a multilingual ontology in a concrete setting
• Connecting the Smithsonian American Art Museum to the Linked Data Cloud
Semantic Web Apps at ESWC
• ESWC does offer Semantic Web Apps
  – Mashup challenge
  – Demo session
• Why aren't these appearing as papers?
• The missing piece: evaluation
  – What happens when users try to use it?
  – What happens when the schema changes?
Semantic Web Applications
• "Semantic" is a modifier on "Web"
• What was so new/wonderful about the web?
• Fetching remote pages
Semantic Web Applications
• "Semantic" is a modifier on "Web"
• What was so new/wonderful about the web?
  – Could always author & view docs on our computers
  – Could always access them with ftp
• "Minor" workflow changes
  – URL: canonical way for a doc to reference other docs
  – click: instant access to what's at the link
  – browser: staying inside one application
The web did not make new things possible
It made old things simple
Can SW do the same?
Conclusion
• A key innovation opportunity in the Semantic Web is making it easier for end users to produce, share, and consume structured data
• An urgent need and immediate opportunity for tools that take the drudgery out of data work
• We have to study end users to understand their needs, build tools to meet those needs, and assess how well those tools work
• Don't ask what you can do for the Semantic Web; ask what the Semantic Web can do for you
- End UserS Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2 Travel Mangement
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- Whatrsquos Wrong With This
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Wherersquos the Science
- Choose Your Motivation Wisely
- Hammers vs Nails
- Kargerrsquos Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
-
Semantic Web Applicationsbull ldquoSemanticrdquo is a modifier on ldquoWebrdquobull What was so newwonderful about the webbull Fetching remote pages
Semantic Web Applicationsbull ldquoSemanticrdquo is a modifier on ldquoWebrdquobull What was so newwonderful about the web
ndash Could always author amp view docs on our computersndash Could always access them with ftp
bull ldquoMinorrdquo workflow changesndash URL canonical way for a doc to reference other docsndash click instant access to whatrsquos at the linkndash browser staying inside one application
The web did not make new things possible
It made old things simple
Can SW do the same
Conclusionbull A key innovation opportunity in the Semantic
Web is making it easier for end users to produce share and consume structured data
bull An urgent need and immediate opportunity for tools that take the drudgery out of data work
bull We have to study end users to understand their needs build tools to meet those needs and assess how well those tools work
bull Donrsquot ask what you can do for the Semantic Web ask what the Semantic Web can do for you
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding ldquoThings to Dordquo Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- ldquoI want my spreadsheet database to work betterrdquo
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-ManyMany-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- ldquoI Want To Publish My Volunteer Roster on the Webrdquo
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Datagov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Canrsquot Handle My Incoming Information Overload Help
- End UserS Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2 Travel Mangement
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- Whatrsquos Wrong With This
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Wherersquos the Science
- Choose Your Motivation Wisely
- Hammers vs Nails
- Kargerrsquos Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
-
Semantic Web Applicationsbull ldquoSemanticrdquo is a modifier on ldquoWebrdquobull What was so newwonderful about the web
ndash Could always author amp view docs on our computersndash Could always access them with ftp
bull ldquoMinorrdquo workflow changesndash URL canonical way for a doc to reference other docsndash click instant access to whatrsquos at the linkndash browser staying inside one application
The web did not make new things possible
It made old things simple
Can SW do the same
Conclusionbull A key innovation opportunity in the Semantic
Web is making it easier for end users to produce share and consume structured data
bull An urgent need and immediate opportunity for tools that take the drudgery out of data work
bull We have to study end users to understand their needs build tools to meet those needs and assess how well those tools work
bull Donrsquot ask what you can do for the Semantic Web ask what the Semantic Web can do for you
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding ldquoThings to Dordquo Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- ldquoI want my spreadsheet database to work betterrdquo
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-ManyMany-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- ldquoI Want To Publish My Volunteer Roster on the Webrdquo
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Datagov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Canrsquot Handle My Incoming Information Overload Help
- End UserS Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2 Travel Mangement
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- Whatrsquos Wrong With This
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Wherersquos the Science
- Choose Your Motivation Wisely
- Hammers vs Nails
- Kargerrsquos Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
-
The web did not make new things possible
It made old things simple
Can SW do the same
Conclusionbull A key innovation opportunity in the Semantic
Web is making it easier for end users to produce share and consume structured data
bull An urgent need and immediate opportunity for tools that take the drudgery out of data work
bull We have to study end users to understand their needs build tools to meet those needs and assess how well those tools work
bull Donrsquot ask what you can do for the Semantic Web ask what the Semantic Web can do for you
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding ldquoThings to Dordquo Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- ldquoI want my spreadsheet database to work betterrdquo
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-ManyMany-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- ldquoI Want To Publish My Volunteer Roster on the Webrdquo
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Datagov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Canrsquot Handle My Incoming Information Overload Help
- End UserS Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2 Travel Mangement
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- Whatrsquos Wrong With This
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Wherersquos the Science
- Choose Your Motivation Wisely
- Hammers vs Nails
- Kargerrsquos Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
-
Conclusionbull A key innovation opportunity in the Semantic
Web is making it easier for end users to produce share and consume structured data
bull An urgent need and immediate opportunity for tools that take the drudgery out of data work
bull We have to study end users to understand their needs build tools to meet those needs and assess how well those tools work
bull Donrsquot ask what you can do for the Semantic Web ask what the Semantic Web can do for you
- End User Semantic Web Applications
- Conclusion
- Back Story
- Problem-Driven Agenda
- Haystack
- Slide 6
- Writing a Brain Research Paper
- Adding ldquoThings to Dordquo Region
- Revised Environment
- Role of Semantic Web
- Semantic Web Applications
- Role of Semantic Web (2)
- Rest of Talk
- Homebrew Databases
- ldquoI want my spreadsheet database to work betterrdquo
- Supercharging Spreadsheets for Data Management
- Spreadsheets
- Spreadsheets (2)
- Alternative Related Worksheets
- One-to-ManyMany-to-Many Relationships
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- User Study
- User Study (2)
- User Study (3)
- User Study (4)
- Results Demographics
- Results Correctness and Features Used
- Results Timing
- Conclusion (2)
- A Semantic Web Application
- ldquoI Want To Publish My Volunteer Roster on the Webrdquo
- Web Authoring With Structured Data
- Some Web History
- Slide 45
- Slide 46
- The Virtuous Cycle of Web Authoring
- Structured Data is Better
- Slide 49
- Slide 50
- Slide 51
- Why
- Goal
- Do We Need This
- Approach
- Like Spreadsheets
- Example HTML
- Generalize to Data
- Can This be Done
- Slide 60
- Slide 61
- Slide 62
- Data
- Views
- Facets
- Lenses
- Key Primitives of a Data Page
- General Enough
- General Enough (2)
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Impoverished Information Visualization
- Exhibit
- Prototype Exhibit
- Usage
- Examples
- Slide 81
- Slide 82
- Slide 83
- Hobby Stores
- Science
- PhD Theses
- Rental Apartments
- Datagov
- NGOs
- Newspapers
- Libraries
- Sports
- Strange Hobbyists
- Slide 94
- Usage Study
- Domains
- Data Model
- Schema Size (Number of Properties)
- Data Format
- Single-View Exhibits
- Percentage of Schema in Visualization
- Authoring by Copying
- Scalability
- Incentivizing Data
- DATA EXPORT
- Slide 106
- Slide 107
- Slide 108
- Summary
- EXTENSIONS
- Wibit Collaborative Authoring in a Wiki
- Exhibit in a Wiki Wibit
- Exhibit in a Blog Datapress
- WordPress + datapress
- Or Just a Document
- A Semantic Web Application (2)
- I Canrsquot Handle My Incoming Information Overload Help
- End UserS Programming Information Stream Handlers
- Motivation
- Examples (2)
- What we need
- Controlled Natural Language Interface
- Example 1
- Example 2 Travel Mangement
- Inside a rule
- Rules in constrained natural language
- Rules in constrained natural language (2)
- Actions in constrained natural language
- Study
- Rule creation study (method)
- Slide 131
- Slide 132
- Rule creation study
- Rule creation study
- Slide 135
- Average time to complete each rule
- Perceived difficulty of creating rules
- Perceived usefulness
- What would you use atomate for
- What would you use atomate for (2)
- Discussion
- Slide 142
- Slide 143
- Slide 144
- Slide 145
- Slide 146
- Slide 147
- Slide 148
- Slide 149
- Slide 150
- Slide 151
- Whatrsquos Wrong With This
- SW Challenge Build SWIFTTT
- Summary (2)
- Whither ESWC
- ESWC Topics
- Semantic Web
- Semantic Web (2)
- Where Are All the Intelligent Agents
- Where Are All the Intelligent Agents (2)
- Bringing Intelligence to Applications
- Wherersquos the Science
- Choose Your Motivation Wisely
- Hammers vs Nails
- Kargerrsquos Best ESWC Papers
- Semantic Web Apps at ESWC
- Semantic Web Applications (2)
- Semantic Web Applications (3)
- Slide 169
- Conclusion (3)
-