Bren - UCSB - Spooky spreadsheets

Posted on 26-Jan-2015

DESCRIPTION

Talk for Jim Frew's grad class at Bren School, UC Santa Barbara. Oct 31, 2013. All about things you can do wrong (and right) with spreadsheets.

TRANSCRIPT

  • 1. Spooky Spreadsheets. Carly Strasser | California Digital Library. UCSB/Bren, Oct 2013. (From Flickr by JeGolden)
  • 2. Roadmap: 1. Background; 2. Best practices; 3. Toolbox
  • 3. Scientists are bad at data management. (From Flickr by robertpaulyoung)
  • 4. Many tables
  • 5. Embedded figures
  • 6. myspreadsheet: No headings
  • 7. myspreadsheet
  • 8. myspreadsheet
  • 9. ?
  • 10. NO: Reproducibility, Transparency, Reuse. Didn't share the data. Didn't document the data (metadata). Didn't document provenance/workflow. (www.petshaming.net)
  • 11. Why should I care? (From Flickr by johntrainor)
  • 12. Because they care: (From Flickr by Redden-McAllister)
  • 13. Best Practices ...ent data managem... (From Flickr by BigSwedeGuy)
  • 14. Plan before data collection (From Flickr by MarkSardella)
  • 15. Planning: Design sample naming scheme. Create a key (data dictionary). Make sure names are unique. Define codes. (From Flickr by zebbie)
  • 16. Planning: Design file naming scheme (PhDcomics.com)
  • 17. Planning: Design file naming scheme. Use descriptive file names*: unique; reflect contents. Bad: Mydata.xls, 2001_data.csv, bestversion.txt. Better: Eaffinis_nanaimo_2010_counts.xls (study organism, site name, year, what was measured). *Not for everyone. (From R Cook, ESA Best Practices Workshop 2010)
  • 18. Planning: Design file organization (From S. Hampton)
  • 19. Planning: Design file organization. Folders and files: Biodiversity; Lake; Experiments (Biodiv_H20_heatExp_2005to2008.csv, Biodiv_H20_predatorExp_2001to2003.csv); Fieldwork (Biodiv_H20_PlanktonCount_2001toActive.csv, Biodiv_H20_ChlAprofiles_2003.csv); Grassland. Consider: Dependencies? File formats? Time of collection? Order of analysis? Workflows! (From S. Hampton)
  • 20. Planning: Design your spreadsheet. Constrain entries. Atomize. Break down spreadsheets. (From Flickr by Ulleskelf)
  • 21. Planning: Consider a database. A relational database is: a set of tables; relationships among the tables; a language to specify & query the tables. A RDB provides: scalability (millions+ records); features for sub-setting, querying, sorting; reduced redundancy & entry errors. (From Mark Schildhauer)
  • 22. Planning: Consider a database. You should invest time in learning databases if: your datasets are large or complex. Consider investing time in learning databases if: your data are small and humble; you ever intend to share your data; you are