best practices for publishing data
DESCRIPTION
A presentation given by Hjalmar Gislason, founder and CEO of DataMarket (http://datamarket.com/), at the Strata Conference in London, October 2012
TRANSCRIPT
FIND AND UNDERSTAND DATA
October 2012
Hjalmar Gislason, founder & CEO - [email protected]
Best Practices for Publishing Data
Founder and CEO
Hjalmar Gislason
Twitter: @datamarket
Slides: http://blog.datamarket.com/
Heavy Data Consumers
Providers of Data Delivery Technology
| BEST PRACTICES for PUBLISHING DATA | Hjalmar Gislason, [email protected] | October 2012
Computers
• Structure
Humans
• Search
• Visualization
• Download
1. Simple formats
2. Indexes, unique IDs and meta-data
3. FAQs and feedback channels
Publishing for Computers
"Don't anthropomorphize computers - they hate it."
- Unknown
Simple Formats
Simple formats: Tim Berners-Lee’s Five Stars
Simple formats: You lost me at “Semantics”
Standards will emerge and there will be more and more of them
• RDF
• OData vs. GData
• DSPL
Indexes, unique IDs and meta-data
Indexes, unique IDs and meta-data
• Must: Unique ID, Title, Last updated
• Should: Meta-data
• Why?
  • No need for scraping
  • Less load on your end
  • Ensures full coverage
  • Ensures content removal and updates
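A minimal sketch of why an index with these three "must" fields removes the need for scraping: a consumer can compare two snapshots of the index and learn what was added, updated or removed without fetching any dataset. The index layout, the field names (`id`, `title`, `last_updated`) and the sample entries are assumptions made for illustration, not a format from the talk.

```python
from datetime import datetime

# Hypothetical index snapshots, one entry per dataset, carrying only the
# three "must" fields: unique ID, title, last updated.
index_yesterday = [
    {"id": "ds-001", "title": "Population by region", "last_updated": "2012-09-01T00:00:00"},
    {"id": "ds-002", "title": "Fish catch by species", "last_updated": "2012-09-15T00:00:00"},
]
index_today = [
    {"id": "ds-001", "title": "Population by region", "last_updated": "2012-10-01T00:00:00"},
    {"id": "ds-003", "title": "Energy prices", "last_updated": "2012-10-02T00:00:00"},
]


def diff_index(old, new):
    """Return (added, updated, removed) dataset IDs between two snapshots."""
    old_by_id = {entry["id"]: entry for entry in old}
    new_by_id = {entry["id"]: entry for entry in new}

    added = sorted(new_by_id.keys() - old_by_id.keys())
    removed = sorted(old_by_id.keys() - new_by_id.keys())
    # A dataset counts as updated when its timestamp moved forward.
    updated = sorted(
        ds_id
        for ds_id in new_by_id.keys() & old_by_id.keys()
        if datetime.fromisoformat(new_by_id[ds_id]["last_updated"])
        > datetime.fromisoformat(old_by_id[ds_id]["last_updated"])
    )
    return added, updated, removed


added, updated, removed = diff_index(index_yesterday, index_today)
print("added:", added)      # datasets the consumer has never seen
print("updated:", updated)  # datasets worth re-fetching
print("removed:", removed)  # datasets to drop on the consumer side
```

Only the datasets listed under `added` and `updated` need to be downloaded, which is where the reduced load on the publisher comes from.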
Indexes, unique IDs and meta-data
• Hard to emphasize enough!
• Unique IDs for everything: Datasets, columns, entities, ...
• Why?
  • Continuity: A small change for a man = a giant leap for a computer
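The continuity point can be illustrated with a small sketch. Two hypothetical releases of one dataset differ only in a tidied column label; a consumer that looks the column up by label breaks, while one that uses the unique ID does not. The structure, IDs and labels below are invented for the example.

```python
# Two hypothetical releases of the same dataset. Between them the publisher
# only reworded a column label, a small change for a human reader.
release_1 = {
    "dataset_id": "ds-001",
    "columns": [{"id": "col-pop", "label": "Population"}],
    "rows": [{"col-pop": 1000}],
}
release_2 = {
    "dataset_id": "ds-001",
    "columns": [{"id": "col-pop", "label": "Population (total)"}],
    "rows": [{"col-pop": 1010}],
}


def column_id_by_label(release, label):
    """Fragile lookup: depends on the human-readable label."""
    for column in release["columns"]:
        if column["label"] == label:
            return column["id"]
    return None


def column_by_id(release, column_id):
    """Stable lookup: depends only on the unique ID."""
    for column in release["columns"]:
        if column["id"] == column_id:
            return column
    return None


label_lookup_v1 = column_id_by_label(release_1, "Population")
label_lookup_v2 = column_id_by_label(release_2, "Population")  # label changed, lookup fails
id_lookup_v2 = column_by_id(release_2, "col-pop")              # ID unchanged, lookup works

print(label_lookup_v1, label_lookup_v2, id_lookup_v2["label"])
```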
Indexes, unique IDs and meta-data
• Any relevant contextual information
• URL(s), descriptions, methodology, next update, authors, keywords, units, license information, ...
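As a rough sketch, the contextual fields listed above could travel with each dataset as a plain record, and a publisher could check which of them are still missing before release. The field names, the sample record and the `missing_metadata` helper are assumptions for illustration only; the talk does not prescribe a schema.

```python
# Field names are illustrative, derived from the list on the slide.
RECOMMENDED_FIELDS = [
    "url", "description", "methodology", "next_update",
    "authors", "keywords", "units", "license",
]


def missing_metadata(record):
    """Return the recommended fields that are absent or empty in a record."""
    return [field for field in RECOMMENDED_FIELDS if not record.get(field)]


# Hypothetical dataset record with only some of the meta-data filled in.
record = {
    "id": "ds-001",
    "title": "Population by region",
    "last_updated": "2012-10-01",
    "url": "http://example.org/datasets/ds-001",
    "description": "Resident population by region, annual figures.",
    "units": "persons",
    "license": "CC BY 3.0",
}

missing = missing_metadata(record)
print("missing meta-data:", missing)
```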
FAQs and feedback channels
#1 reason for not publishing data:
“There are errors in the data and I don't want others to discover them”
FAQs and feedback channels
#1 reason for not publishing data:
“There are errors in the data and I do want others to discover them”
1. Simple formats
2. Indexes, unique IDs and meta-data
3. FAQs and feedback channels
Publishing for Computers
FIND AND UNDERSTAND DATA
Twitter: @datamarket · Facebook: DataMarket · E-mail: [email protected]
Hjalmar Gislason, founder & CEO