A Practical Example of Data Archiving
Eleni Castro > IQSS, Harvard
SHARE 2014 Fall Meeting, October 14, 2014
Image: https://flic.kr/p/7vu434s



Data Science Team

Find out more: http://datascience.iq.harvard.edu

Introduction to Dataverse

Software framework for publishing, citing, and preserving research data (open source on GitHub for others to install)

Provides incentives for researchers to share:
• Recognition & credit via data citations
• Control over data & branding
• Fulfill journal data availability and funder requirements

Harvard Dataverse (open to all; general repository instance at Harvard)

Why did we launch Dataverse?

Replication Standard (King 1995)

Virtual Data Center (1999-2006)

Who Uses Dataverse?

Worldwide Dataverse Installations

Institutions can set up/host their own Dataverse installation (UNC ODUM, Fudan Univ., Scholars Portal, DANS, etc.) and within them can have dataverses for a variety of users (across all research domains): Researchers, Projects, Journals, etc.

Example of a Scholar's Dataverse

Research Center Dataverse

Example of an Institute Dataverse

Example of a Journal Dataverse

Dataverse for Teaching Replication

Dataverse Best Practices (1)

• Standard Metadata Schemas
  – DDI (great for social science data) & DC
  – Coming in 4.0: DataCite 3.0, ISA-Tab (biomedical) and VO Resource (astronomy)
• Formal Data Citation (Altman & King 2007)
  – Endorse + comply w/ Joint Declaration of Data Citation Principles (incl. Crosas)
• Persistent IDs: Handles & DOI (DataCite/EZID)
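As a rough illustration of the data citation and persistent ID bullets above, the sketch below assembles a citation string from author, year, title, a persistent identifier, and the distributor. The `DataCitation` class, the field order, and the sample values are assumptions made for illustration (the DOI is a made-up placeholder); this is not the exact format Dataverse emits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCitation:
    """Hypothetical container for the elements of a formal data citation."""
    authors: List[str]
    year: int
    title: str
    doi: str                       # persistent ID (placeholder value below)
    distributor: str
    version: Optional[str] = None
    unf: Optional[str] = None      # fixity fingerprint for tabular data, if any


def format_citation(c: DataCitation) -> str:
    """Join the citation elements into one human-readable string."""
    parts = [
        "; ".join(c.authors),
        str(c.year),
        f'"{c.title}"',
        f"https://doi.org/{c.doi}",  # resolver URL form is an assumption
        c.distributor,
    ]
    # Optional elements are appended only when present.
    if c.version:
        parts.append(c.version)
    if c.unf:
        parts.append(c.unf)
    return ", ".join(parts)


example = DataCitation(
    authors=["Doe, Jane"],
    year=2014,
    title="Replication data for: An Example Study",
    doi="10.5072/FK2/EXAMPLE",     # made-up identifier, illustration only
    distributor="Harvard Dataverse",
    version="V1",
)
print(format_citation(example))
```

Keeping the persistent ID inside the citation itself is what lets a reader resolve the reference to the dataset even if the hosting URL changes.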

   

Dataverse Best Practices (2)

• Fixity
  – UNF (King & Altman) for tabular data
  – MD5 checksums for other files
• Open Data (+ metadata) Licenses (CC0) or custom
• OAI-PMH: harvesting metadata (DC, DDI, …)
• LOCKSS (replication of files) → Data-PASS
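To make the fixity bullet concrete, here is a minimal sketch of an MD5 fixity check: compute a checksum when a file is deposited, then recompute it later and compare. The function names and the sample file are hypothetical, and this is not Dataverse's own code. UNF, which fingerprints the content of tabular data rather than the bytes of a file, is not shown.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, expected_md5: str) -> bool:
    """Compare a freshly computed checksum with the one recorded at deposit."""
    return md5_checksum(path) == expected_md5.lower()


# Demo on a throwaway file (hypothetical name and content).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "codebook.txt"
    sample.write_bytes(b"hello")
    recorded = md5_checksum(sample)                 # stored at deposit time
    unchanged_ok = verify_fixity(sample, recorded)  # file is untouched
    sample.write_bytes(b"hello!")                   # simulate corruption
    changed_ok = verify_fixity(sample, recorded)    # checksum no longer matches

print(unchanged_ok, changed_ok)
```

Reading in chunks keeps memory use flat for large files. MD5 is adequate for detecting accidental corruption, though it is not a defense against deliberate tampering.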

Towards An Integrated Publishing Workflow

API Integration with Dataverse

Data Deposit API (metadata + data w/ SWORDv2)
For depositing datasets into Dataverse via API. See the OJS-Dataverse Journal Integration Project: http://projects.iq.harvard.edu/ojs-dvn/home
Also: dvn R package, OSF Dataverse Add-on, etc.

Data Sharing API
For searching/downloading Dataverse datasets (metadata + data) via API. See Thomas Leeper's dvn R package.
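Since the Data Deposit API is built on SWORDv2, a deposit's metadata travels as an Atom entry carrying Dublin Core terms. The sketch below only builds such an entry; it makes no network call. The helper name, the choice of fields, and the sample values are assumptions, and the fields a given Dataverse installation requires may differ.

```python
import xml.etree.ElementTree as ET
from typing import List

ATOM_NS = "http://www.w3.org/2005/Atom"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Serialize Atom as the default namespace and DC terms with a "dcterms" prefix.
ET.register_namespace("", ATOM_NS)
ET.register_namespace("dcterms", DCTERMS_NS)


def build_atom_entry(title: str, creators: List[str], description: str) -> bytes:
    """Build a minimal Atom entry with Dublin Core metadata (hypothetical helper)."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{DCTERMS_NS}}}title").text = title
    for name in creators:
        ET.SubElement(entry, f"{{{DCTERMS_NS}}}creator").text = name
    ET.SubElement(entry, f"{{{DCTERMS_NS}}}description").text = description
    return ET.tostring(entry, encoding="utf-8", xml_declaration=True)


payload = build_atom_entry(
    title="Replication data for: An Example Study",  # sample values only
    creators=["Doe, Jane", "Roe, Richard"],
    description="Survey data and analysis scripts.",
)
print(payload.decode("utf-8"))

# In a real deposit this payload would be POSTed, with an Atom entry content
# type, to a collection URL taken from the server's SWORD service document.
# The URL and credentials are installation-specific and are not shown here.
```

Data files are uploaded in a separate request once the entry has been created. The entry itself carries only descriptive metadata.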

         

Future of Dataverse

• Dataverse 4.0 (try beta)
• Based on usability testing
• WorldMap Integration (geospatial viz w/ GeoConnect)
• Sharing Privacy Sensitive Data
• Secure Dataverse
• DataTags (questionnaires based on privacy laws)

Longer-Term
• Provenance Registry (data citation & provenance w/ SEAS (NSF))
• ORCID Integration (API)
• Large-scale datasets (efficient storage) → iRODS w/ ODUM
• Ensuring long-term preservation for more file formats (e.g. Archivematica)
• Integrate with more Publishing Systems

Thank you

Contact: ecastro@fas.harvard.edu

More information: http://datascience.iq.harvard.edu
Twitter: @thedataorg
