datashare for uc campuses

DataShare for the UCs 6 February 2014

Upload: university-of-california-curation-center

Post on 11-Nov-2014


TRANSCRIPT

Page 1: DataShare for UC Campuses

DataShare for the UCs

6 February 2014

Page 2: DataShare for UC Campuses

From Flickr by Leo Hidalgo

Where we’re going

• Background
• Demo of UCSF DataShare
• Technical details
• Other details
• Future plans
• Q&A

Page 3: DataShare for UC Campuses
Page 4: DataShare for UC Campuses

Goal
Catalyze widespread research data sharing

How
Develop a system that lowers data sharing barriers and builds an engaged user community

Page 5: DataShare for UC Campuses

Survey of users by Angela Rizk-Jackson (n = 114)

[Chart: "Has your research group provided public access to data?" Responses: No, Yes.
Why? Journal required, Funder required, Other.
How? Repository, Website, Other.]

Page 6: DataShare for UC Campuses

Repository  choices…  

Page 7: DataShare for UC Campuses

Repository choices…

Repositories for data
• General content
• Non-institutional
• Publishers/for-profits
• Short-term projects
• Institutional
• Discipline-specific

Page 8: DataShare for UC Campuses

Repository choices…

Institutional
• All data associated with a paper
• Tells a story
• Clearinghouse for researcher’s works

Discipline-specific
• Some of the data for a given paper
• Discoverable
• Integrated systems
• Collection policies

Which should a researcher use? Both

Which is more important? Depends

Page 9: DataShare for UC Campuses

Institutional
• All data associated with a paper
• Tells a story
• Clearinghouse for researcher’s works

Page 10: DataShare for UC Campuses

IRs are SO 2002.

From Flickr by Colin ZHU

From Flickr by Ludie Cochrane

From Flickr by johnsons531

From Flickr by Kapil Karekar

Page 11: DataShare for UC Campuses

Last year…

… “Federal agencies investing in research and development (more than $100 million in annual expenditures) must have clear and coordinated policies for increasing public access to research products.”

Page 12: DataShare for UC Campuses

IR

From Flickr by wiccked

Page 13: DataShare for UC Campuses

From Flickr by jackcheng

But…

Not always self-service

Sometimes complicated

Data?

“Old” user interfaces

Page 14: DataShare for UC Campuses

Simplify data deposit for UC researchers

• Simple metadata
• Self-service upload and download
• Branded for campus

Most Important: Institutional Control Over Data

Page 15: DataShare for UC Campuses

From Flickr by Leo Hidalgo

• Background
• Demo of UCSF DataShare
• Technical details
• Other details
• Future plans
• Q&A

Page 16: DataShare for UC Campuses

From Flickr by Leo Hidalgo

• Background
• Demo of UCSF DataShare
• Technical details
• Other details
• Future plans
• Q&A

Page 17: DataShare for UC Campuses

Technical goals

• Easy submission
• Persistent citation
• Preservation assurance
• Effective discovery
• Control over terms of use
• All the benefits of a centrally hosted service, while maintaining campus branding and identity

From www.dimensionsinfo.com

From Flickr by Eric Peacock

Page 18: DataShare for UC Campuses

System components

• Easy submission
• Persistent citation
• Preservation assurance
• Effective discovery
• Control over terms of use
• All the benefits of a centrally hosted service, while maintaining campus branding and identity

Components shown alongside the goals:
• UCSF drag-n-drop client
• DNS, Apache, CSS, and campus Shibboleth IdPs (datashare.berkeley.edu, datashare.ucdavis.edu, datashare.uci.edu, datashare.ucla.edu, …)
• Data use agreements (DUAs)

Page 19: DataShare for UC Campuses

Deposit interactions

[Diagram: deposit workflow.
Components: Researcher (data producer); Campus IdP; Drag-n-drop client; DataShare portal (datashare.campus.edu, CSS); Merritt; SDSC cloud (preservation storage); EZID; DataCite; Discovery (XTF); Data Citation Index; Primo.
Protocol labels: Atom, Shib.
Labelled interactions: Authenticate with campus credentials; Assemble dataset; Add metadata; Submit to Merritt; Request DOI; Assign DOI; Register metadata; Populate XTF index; Harvest for A&I discovery; Data use agreement.]
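The "Request DOI" and "Register metadata" steps in the deposit diagram involve EZID, but the slides do not show what such a request looks like. As a rough sketch only, assuming EZID's plain-text ANVL request format and its DataCite-profile element names (`_target`, `datacite.creator`, and so on), a request body could be assembled as follows. The helper names `anvl_escape` and `build_anvl_body`, the landing-page URL, and all sample values are hypothetical, and no network call is made.

```python
import re


def anvl_escape(text: str, is_name: bool = False) -> str:
    """Percent-encode characters that would break an ANVL line.

    '%', CR and LF are always encoded; ':' is encoded in element names only,
    because the first colon on a line separates the name from the value.
    """
    pattern = r"[%:\r\n]" if is_name else r"[%\r\n]"
    return re.sub(pattern, lambda m: "%%%02X" % ord(m.group(0)), text)


def build_anvl_body(metadata: dict) -> str:
    """Serialize metadata as one 'name: value' pair per line."""
    return "\n".join(
        f"{anvl_escape(name, is_name=True)}: {anvl_escape(value)}"
        for name, value in metadata.items()
    )


# Hypothetical dataset record; none of these values come from the slides.
record = {
    "_target": "https://datashare.campus.edu/example-dataset",
    "datacite.creator": "Researcher, Example",
    "datacite.title": "Example dataset",
    "datacite.publisher": "Example Campus Library",
    "datacite.publicationyear": "2014",
}

if __name__ == "__main__":
    print(build_anvl_body(record))
```

Escaping matters mainly for free-text fields such as titles, where a stray line break would otherwise be read as the start of a new element.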

Page 20: DataShare for UC Campuses

Download interactions

[Diagram: download workflow.
Components: Researcher (data consumer); Campus IdP; Drag-n-drop client; DataShare portal (datashare.campus.edu, CSS); Merritt; SDSC cloud; EZID; DataCite; Discovery (XTF); Data Citation Index; Primo.
Labelled interactions: Faceted search / browse; Data use agreement; Accept DUA terms; Retrieve data; Download data.]

Download data: synchronous for small datasets; asynchronous for large (> 500 MB)
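The download rule on this slide (synchronous for small datasets, asynchronous above 500 MB) can be sketched as a simple size check. This is an illustration, not the portal's code: the function name `delivery_mode` is hypothetical, and the slide does not say whether "MB" means 10^6 or 2^20 bytes, so decimal megabytes are assumed here.

```python
# Assumption: 1 MB = 1,000,000 bytes; the slide does not specify.
ASYNC_THRESHOLD_BYTES = 500 * 1000 * 1000


def delivery_mode(dataset_size_bytes: int) -> str:
    """Return how a dataset of the given size would be delivered."""
    if dataset_size_bytes < 0:
        raise ValueError("dataset size cannot be negative")
    # Strictly greater than 500 MB is treated as "large", matching "> 500 MB".
    if dataset_size_bytes > ASYNC_THRESHOLD_BYTES:
        return "asynchronous"
    return "synchronous"


if __name__ == "__main__":
    for size in (10_000_000, 500_000_000, 2_000_000_000):
        print(size, delivery_mode(size))
```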

Page 21: DataShare for UC Campuses

From Flickr by Leo Hidalgo

• Background
• Demo of UCSF DataShare
• Technical details
• Other details
• Future plans
• Q&A

Page 22: DataShare for UC Campuses

Roles

Campus Library
• Delivers service to community
• Shapes user interface, URL, branding
• Customizes key components
• Develops help, training

UC3 / CDL
• Guides the campus
• Preserves content in Merritt
• Connects to EZID
• Deploys XTF for discovery
• Works with vendors

SDSC
• Maintains production storage infrastructure
• Holds three independent copies of content

Page 23: DataShare for UC Campuses

Branding & Customization

• Logo
• URL
• Contact information
• Other…?

From Flickr by Diorama Sky

Page 24: DataShare for UC Campuses

Cost

• EZID accounts
  – Existing campus memberships provide unlimited DOIs
• Merritt recharge proposal (awaiting UCOP approval)
  – Pay-as-you-go: $0.40/GB/year
  – Paid-up (for 10 years): $2.93/GB
  – Threshold pricing: 100, 200, 500 GB; 1, 2, 5, 10, 20, 50, 100 TB

From Flickr by Maura Teague
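To see how the two proposed Merritt rates compare over the 10-year paid-up term, the arithmetic can be sketched as below. The rates are the ones on this slide; the function names and the 100 GB example are illustrative assumptions, and threshold rounding is ignored here.

```python
from decimal import Decimal

PAY_AS_YOU_GO_PER_GB_YEAR = Decimal("0.40")  # proposed rate from the slide
PAID_UP_PER_GB = Decimal("2.93")             # proposed one-time rate covering 10 years


def pay_as_you_go_cost(gigabytes: int, years: int) -> Decimal:
    """Total pay-as-you-go cost for holding `gigabytes` for `years`."""
    return PAY_AS_YOU_GO_PER_GB_YEAR * gigabytes * years


def paid_up_cost(gigabytes: int) -> Decimal:
    """One-time paid-up cost for holding `gigabytes` for 10 years."""
    return PAID_UP_PER_GB * gigabytes


def break_even_years() -> Decimal:
    """Years of pay-as-you-go after which paid-up would have been cheaper."""
    return PAID_UP_PER_GB / PAY_AS_YOU_GO_PER_GB_YEAR


if __name__ == "__main__":
    print(pay_as_you_go_cost(100, 10))  # 100 GB for 10 years, pay-as-you-go
    print(paid_up_cost(100))            # 100 GB, paid-up
    print(break_even_years())
```

Under these rates, 100 GB held for 10 years costs $400 pay-as-you-go versus $293 paid-up, and the two options break even a little past seven years.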

Page 25: DataShare for UC Campuses

Cost

Anticipated cost of providing all campus ladder-track faculty with 5 GB for 10 years

Campus           Faculty   Threshold   Paid-up cost
Berkeley           1,260       10 TB        $29,300
Davis              1,240       10 TB        $29,300
Irvine             1,051       10 TB        $29,300
Los Angeles        1,701       10 TB        $29,300
Merced               159        1 TB         $2,930
Riverside            561        5 TB        $14,650
San Diego          1,109       10 TB        $29,300
San Francisco        366        2 TB         $5,860
Santa Barbara        746        5 TB        $14,650
Santa Cruz           485        5 TB        $14,650

Source: http://legacy-its.ucop.edu/uwnews/stat/headcount_fte/oct2013/welcome.html
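The table's figures appear to follow from three steps: multiply faculty headcount by 5 GB, round up to the next pricing threshold, and multiply that threshold by the paid-up rate of $2.93/GB. A minimal sketch of that calculation, assuming 1 TB = 1,000 GB (which is what the table's dollar amounts imply) and using hypothetical function names:

```python
from decimal import Decimal

GB_PER_TB = 1000  # assumption: decimal terabytes, consistent with the table's totals
THRESHOLDS_GB = [100, 200, 500] + [tb * GB_PER_TB for tb in (1, 2, 5, 10, 20, 50, 100)]
PAID_UP_PER_GB = Decimal("2.93")  # proposed 10-year paid-up rate


def threshold_for(required_gb: int) -> int:
    """Smallest pricing threshold (in GB) that covers the required storage."""
    for threshold in THRESHOLDS_GB:
        if threshold >= required_gb:
            return threshold
    raise ValueError("requirement exceeds the largest listed threshold")


def campus_paid_up_cost(faculty: int, gb_per_faculty: int = 5):
    """Return (threshold in GB, paid-up cost in dollars) for one campus."""
    threshold = threshold_for(faculty * gb_per_faculty)
    return threshold, PAID_UP_PER_GB * threshold


if __name__ == "__main__":
    for campus, faculty in [("Berkeley", 1260), ("Merced", 159), ("San Francisco", 366)]:
        threshold_gb, cost = campus_paid_up_cost(faculty)
        print(f"{campus}: {threshold_gb} GB threshold, ${cost}")
```

For example, Berkeley's 1,260 faculty need 6,300 GB, which rounds up to the 10 TB threshold and gives $29,300.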

Page 26: DataShare for UC Campuses

Governance & Agreements

Goal: Simplify & Scale Data Use & Deposit Agreements

Page 27: DataShare for UC Campuses

Governance & Agreements

[Diagram: agreements among CDL, UC Campus, Data Depositor, and Data User. Agreement labels: Terms of service (shown twice); ODL or similar (shown twice).]

Page 28: DataShare for UC Campuses

From Flickr by Leo Hidalgo

• Background
• Demo of UCSF DataShare
• Technical details
• Other details
• Next steps & future plans
• Q&A

Page 29: DataShare for UC Campuses

Who Decides?

• CDL to work with each campus to implement & shape service
• Campus-to-campus interaction
• Group meetings as needed
• SAG1 check-ins
• Communication (…)

Page 30: DataShare for UC Campuses

This is a group project

From Flickr by Mischievous One

Page 31: DataShare for UC Campuses

Two heads are better than one!

From Flickr by Alice Bartlett

Page 32: DataShare for UC Campuses

• eScholarship connection
• ORCID
• Altmetrics
• Solr/Blacklight for discovery
• Expand metadata options
• Embargoes
• Restricted access for peer review
• Annotations
• Export to citation managers
• Staging area
• Private storage
• Mapping metadata/GIS support

From Flickr by Emil Nordén

Page 33: DataShare for UC Campuses

Communication

Google Groups Web Forum

Page 34: DataShare for UC Campuses

Communication

UC3 Confluence site: confluence.ucop.edu/display/Curation/DataShare+for+UCs

Page 35: DataShare for UC Campuses

Communication

• Listserv?
• Twitter @DataShareOrg
• …?

From Flickr by gsagostinho

Page 36: DataShare for UC Campuses

Communication

github.com/CDLUC3/datashare

Page 37: DataShare for UC Campuses

DASH: Helping Community Repositories

What Makes DASH Unique:
• Modern, intuitive user interface for superior user experience
• Freely available code for download and use by anyone
• User-friendly API(s) to ensure interoperability with existing repositories (e.g., SWORD for deposit; Atom, OAI-PMH, ResourceSync for populating the discovery index)
• Customizable interfaces that can be altered easily to reflect service provider branding
• Authentication via institutional Identity Management Systems

To be revised

Page 38: DataShare for UC Campuses

Next Steps – Next 2 Weeks

• details to be established
  – who’s interested
  – tech contact for interested campuses
  – communication lines

From Flickr by Themactep

Page 39: DataShare for UC Campuses

Next Steps – Next 2 Months

• get DataShare up and running
  – Shibboleth configuration & other authentication
  – Domains/URLs established
  – Customizations – logos etc.

From Flickr by Themactep

Page 40: DataShare for UC Campuses

Next Steps – Longer term

• in-person meeting?
• CDL camp?
• communication/outreach?

From Flickr by Themactep

Page 41: DataShare for UC Campuses

Acknowledgements

• Geoffrey Boushey
• Julia Kochi
• Megan Laurence

• Stephen Abrams
• Trisha Cruse
• Carly Strasser
• Perry Willett

• Anirvan Chatterjee
• Angela Rizk-Jackson
• Maninder Kahlon