Taxonomy and Scratchpads

Vincent Smith & Simon Rycroft Taxonomy & Scratchpads

Upload: vincent-smith

Post on 18-May-2015



DESCRIPTION

Presentation by Vince Smith and Simon Rycroft given at the Encyclopedia of Life Code Sprint hosted at the BioSynC facility at the Chicago Field Museum.

TRANSCRIPT

Page 1: Taxonomy and Scratchpads

Vincent Smith & Simon Rycroft

Taxonomy & Scratchpads

Page 2: Taxonomy and Scratchpads

Biological taxonomy
Findability, relationships (ontology) & tagging

• Scale
• Metadata

Page 3: Taxonomy and Scratchpads

Scratchpads
http://scratchpads.eu/

• Multi-host Drupal (5) site
• Drupal customized for taxonomists
• Communities apply for a site
• 65+ sites, 750+ users, 130k nodes
• Taxonomy central to many features

Page 4: Taxonomy and Scratchpads

Import / Export
Excel Template (CSV file) & uBio (XML feed)

http://www.ubio.org/webservices/classificationbank/search.php?classification=sp2000&node=Pediculus

(Our) Taxonomy Import
http://svn.scratchpads.eu/viewvc/trunk/sites/all/modules/taxonomy_import/
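As a rough illustration of what a CSV-based import involves, the sketch below reads rows of term/parent pairs into a parent-to-children map. The column names `term` and `parent`, the helper names, and the sample rows are assumptions made for this example; they are not taken from the Scratchpads Excel template or the taxonomy_import module.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout: one term per row with the name of its parent.
# An empty parent column marks a root term.
SAMPLE_CSV = """term,parent
Pediculidae,
Pediculus,Pediculidae
Pediculus humanus,Pediculus
Pediculus schaeffi,Pediculus
"""

def read_hierarchy(csv_text):
    """Return a dict mapping each parent name to a list of child names."""
    children = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        parent = row["parent"].strip() or None  # None stands for "no parent"
        children[parent].append(row["term"].strip())
    return dict(children)

def print_tree(children, parent=None, depth=0):
    """Print the hierarchy as an indented outline, roots first."""
    for name in children.get(parent, []):
        print("  " * depth + name)
        print_tree(children, name, depth + 1)

hierarchy = read_hierarchy(SAMPLE_CSV)
print_tree(hierarchy)
```

Keying children by parent name keeps the example short; a real import would more likely key on stable identifiers, since names can repeat across a classification.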

Page 5: Taxonomy and Scratchpads

Management
Taxonomy Manager & Taxonomy Core

http://drupal.org/project/taxonomy_manager

• Principle good, but no one uses it
• Confusing and slow (HCI issues)
• Major cross-browser issues (Firefox)
• Requires a number of “fixes”…

• Flexible metadata on terms (core)
• Treat synonyms as full terms (core)
• Link nodes as term attributes (e.g. biblio)
• Improve manager HCI (drag-and-drop)
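One way to picture the first three requests (flexible metadata, synonyms as full terms, nodes linked as term attributes) is the small data-structure sketch below. The `Term` class, its field names, and the taxon names are all invented for illustration and do not reflect Drupal's taxonomy schema or any Scratchpads module.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Term:
    """Hypothetical term record; not Drupal's actual term structure."""
    name: str
    metadata: dict = field(default_factory=dict)       # free-form key/value pairs
    accepted: Optional["Term"] = None                   # set when this term is a synonym
    linked_nodes: list = field(default_factory=list)    # e.g. ids of biblio nodes

    def resolve(self):
        """Follow synonym links until the accepted term is reached."""
        term = self
        while term.accepted is not None:
            term = term.accepted
        return term

# Made-up names: the synonym is a full Term with its own metadata and links.
valid = Term("Examplea alba", {"rank": "species"}, linked_nodes=[101])
synonym = Term("Examplea candida", {"rank": "species"}, accepted=valid)

print(synonym.resolve().name)
```

Because the synonym is an ordinary `Term`, it can carry its own metadata and linked nodes while still pointing at the accepted name.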

Page 6: Taxonomy and Scratchpads

Search & Browse
Navigation for finding tagged content

TinyTax

Automatically creates a mini-menu (block) of a vocabulary that is configurable for default term

• Intuitive
• Small footprint
• Integrates with a term’s page

http://drupal.org/project/tinytax

TaxTab

Augments default search with a tab for term searches (includes term autocomplete)

http://svn.scratchpads.eu/viewvc/trunk/sites/all/modules/taxtab/
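Term autocomplete is essentially a prefix lookup over the vocabulary. The sketch below shows one in-memory way to do that with a sorted list and binary search. It is not how TaxTab implements it; the function names and the sample terms are assumptions for illustration only.

```python
import bisect

def build_index(term_names):
    """Return (lowercased key, original name) pairs sorted by key."""
    return sorted((name.lower(), name) for name in term_names)

def autocomplete(index, prefix, limit=10):
    """Return up to `limit` term names whose lowercased form starts with `prefix`."""
    key = prefix.lower()
    # The 1-tuple (key,) sorts just before any (key..., name) pair,
    # so this finds the first candidate that could match the prefix.
    start = bisect.bisect_left(index, (key,))
    matches = []
    for lowered, original in index[start:start + limit]:
        if not lowered.startswith(key):
            break
        matches.append(original)
    return matches

index = build_index(["Pediculus", "Pediculus humanus", "Pthirus", "Pedicinus"])
print(autocomplete(index, "pedicu"))
```

A sorted index keeps each lookup logarithmic in the vocabulary size. A database-backed site would more likely express the same idea as an indexed prefix query.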

Page 7: Taxonomy and Scratchpads

Autotagging
Automated tagging of content

• Quick & intuitive
• Two-step submission
• Fast (but could be quicker)
• Encourages tagging

Untagged node

Use or ignore discovered tags (drag & drop or add)

http://drupal.org/project/autotag
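The two-step flow described on this slide (discover candidate tags, then let the user use or ignore them) can be sketched as below. This is a simplified stand-in, not the autotag module's algorithm; `find_tags`, `apply_choices`, and the sample text are hypothetical.

```python
import re

def find_tags(text, vocabulary):
    """Step 1: return vocabulary terms that occur in `text` as whole words or phrases."""
    found = []
    for term in vocabulary:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found

def apply_choices(suggested, ignored):
    """Step 2: keep the suggested tags the user did not choose to ignore."""
    return [term for term in suggested if term not in ignored]

vocabulary = ["Pediculus", "Pediculus humanus", "Pthirus"]
text = "Specimens of Pediculus humanus were collected from the host."

suggested = find_tags(text, vocabulary)
accepted = apply_choices(suggested, ignored={"Pediculus"})
print(suggested, accepted)
```

Scanning the text once per term is fine for a small vocabulary. With a very large vocabulary a different matching strategy would be needed, which may relate to the "could be quicker" remark.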

Page 8: Taxonomy and Scratchpads

Mega-Vocabularies
Sites with a million plus terms

Current Taxonomy Problems

• PHP requires too much memory for large hierarchies
• Very slow, especially above 50k terms
• Some sites with 300k terms (unusable)
• 1.8 million known species (6-80M est.)

Possible Solution

http://drupal.org/project/leftandright

• Taxonomy LeftandRight module
• Implements nested sets
• Overrides 3 taxonomy core functions
  - taxonomy_get_tree
  - taxonomy_overview_terms
  - taxonomy_select_nodes
• Very fast (in use with 2 million terms)
• Solves insertion problem with decimals

e.g. http://catlife.myspecies.info (2 million+ terms)
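The nested-set idea can be sketched as follows: each term stores a left and a right bound, and a term's descendants are exactly the terms whose bounds fall inside its own, so a subtree becomes a range test rather than a recursive walk. The insertion step shows one possible reading of "solves insertion problem with decimals", namely placing a new term at fractional values in an existing gap so no other term is renumbered. This is an assumption about the approach, not the LeftandRight module's code; all function names are hypothetical, and `Fraction` stands in for a decimal column.

```python
from fractions import Fraction

def number_tree(children, root):
    """Assign (left, right) bounds to every term by depth-first traversal."""
    bounds = {}
    counter = 0

    def visit(name):
        nonlocal counter
        counter += 1
        left = counter
        for child in children.get(name, []):
            visit(child)
        counter += 1
        bounds[name] = (Fraction(left), Fraction(counter))

    visit(root)
    return bounds

def descendants(bounds, name):
    """Return all terms nested inside `name`, ordered by left bound."""
    left, right = bounds[name]
    inside = [n for n, (lo, hi) in bounds.items() if left < lo and hi < right]
    return sorted(inside, key=lambda n: bounds[n][0])

def insert_last_child(bounds, parent, name):
    """Add `name` under `parent` using fractional bounds; no other term changes."""
    p_left, p_right = bounds[parent]
    # Highest bound already used inside the parent (or its own left if it is a leaf).
    used = [hi for (lo, hi) in bounds.values() if p_left < lo and hi < p_right]
    lower = max(used, default=p_left)
    gap = p_right - lower
    bounds[name] = (lower + gap / 3, lower + 2 * gap / 3)

children = {
    "Pediculidae": ["Pediculus"],
    "Pediculus": ["Pediculus humanus", "Pediculus schaeffi"],
}
bounds = number_tree(children, "Pediculidae")
insert_last_child(bounds, "Pediculus", "Pediculus mjobergi")
print(descendants(bounds, "Pediculus"))
```

With integer-only bounds, an insertion near the start of the tree forces every later bound to shift, which is costly at millions of terms. Fractional bounds trade that cost for gradually shrinking gaps.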

Page 9: Taxonomy and Scratchpads

Sprint Expectations
What we are looking to achieve

• Import and export of terms (TCS-XML?) from a repository
• Improved & flexible term metadata
• Handle synonyms as full terms
• Link nodes as attributes of terms
• Term and metadata management
• Permissions on terms (low priority?)

Page 10: Taxonomy and Scratchpads

Questions?

Page 11: Taxonomy and Scratchpads
Page 12: Taxonomy and Scratchpads

Search & Browse 2
Split Layout TreeMaps

e.g. http://scratchpads.eu/progress

• Intuitive
• Small footprint
• Integrates with a term’s page
• Potentially integrates multi-site content