2. Carole Goble - University of Manchester
TRANSCRIPT
![Page 1: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/1.jpg)
Provisioning bioinformatics using open source software: plate spinning, wave riding and friendships.
Professor Carole Goble FREng FBCS CITP
University of Manchester, [email protected]
![Page 2: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/2.jpg)
Cytoscape
Bio* BioBabel
![Page 3: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/3.jpg)
Social collaboration environments for sharing, curating and cataloguing personal, group and community contributed scientific assets. BSD
5000+ registered users, 56 countries
1600+ workflows, 1700+ services
Scientific workflow management system for accessing open, public data services, assembling data processing and analysis pipelines and recording provenance. LGPL
361 organisations, 48 countries
70,000+ binary downloads, ~4000 source
http://www.mygrid.org.uk
Handy tools for data management tasks in bioinformatics. BSD
![Page 4: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/4.jpg)
The Taverna Open Source Suite of Tools
Client User Interfaces
GUI Workbench
Workflow Repository
Service Catalogue
Third Party Tools
Programming and APIs
Web Portals
Activity and Service Plug-in Manager
Provenance Store
Workflow Server
Open Provenance Model
Secure Service Access
Workflow Engine
e-Galaxy
Virtual Machine
![Page 5: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/5.jpg)
Scientific workflows, scripts and pipelines
Now also neuroscience, music and numerical analysis
Developed with Oxford and Southampton
Web-based Software & Sharing Services
“Mobilising the long tail of scientists for all our benefit”
Common Ruby on Rails platform
Common and exchanged codebases
Systems Biology models, data and protocols
Adopted by 4 EU-wide consortia and 4 UK sites
Developed with HITS and Stellenbosch
Crowd-sourced curated Web services
Adopted by EdUnify and ELDA education projects
Developed with EBI and EMBRACE network
Find experts, advice, scripts, variable sets
Towards interface for UK Data Archives
Developed with NIBHI
![Page 6: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/6.jpg)
BioPortal
Controlled vocabulary restrictions
Rightfield: Wired in Annotation
http://www.rightfield.org.uk
![Page 7: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/7.jpg)
![Page 8: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/8.jpg)
New Bioinformatics. Like the Old Bioinformatics? But bigger.
• Large scale data pipelines & Next Gen Sequencing
– Cloud Analytics
• Sharing pipelines & expertise
• Service reuse & curation
• Data/model sharing
• Metadata annotation
• Data integration
• Trained 950+ bioinformaticians
![Page 9: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/9.jpg)
Pharmacogenomics
Association study of Nevirapine-induced skin rash in Thai Population
HIV and TB research in South Africa
Tryps in African Cattle
Astronomy & HelioPhysics
Library Document Preservation
Systems Biology for BioFuels and Crop research
Observing Systems Simulation Experiments
JPL, NASA
User base – I’ll come back to this
![Page 10: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/10.jpg)
Sharing Platform Trusted Service
Curation
Incentives, Content
Standards & Content
Governance & Policy
Metadata standards
Data sharing policy
Rules of curation
Methodology
Local data
Centralised data
Preservation & Publication Platform
Gateway
Public data banks
Software & Tools
Open source
PALS
Advice, Consultancy, Training
Knowledge Network Skills & Community Building
![Page 11: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/11.jpg)
OSS Benefits
Academic, Industry & Society Consumers
South Africa, South America, Thailand
• Free software and resources.
• Developer and user ramps.
• Transparent access to code.
• Constant innovation.
• No vendor lock-in.
• Capacity building in academia and small biotech
• Open content (workflow) authoring community
• Open developer community.
![Page 12: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/12.jpg)
OSS Benefits
Academic, Industry & Society Producers
myGrid projects and the team
• Reputation.
• Adoption & Capacity Building
• Collaborations
• Help to do more with smart people
– Open content community
– Open developer community
– Constant innovation and improvement.
– Customisation and extensions.
• Sustainability paths – more funds
– Impact metrics
![Page 13: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/13.jpg)
Where does open source come from?
Affects governance and approach
• Industry
– Eclipse, Apache: Foundations
• Community
– Bio*: Open Bioinformatics Foundation
• Projects
– Taverna, myExperiment, BioCatalogue, SEEK, Galaxy, KNIME, EMBOSS, Bioclipse, GMOD, BioConductor, Cytoscape, SADI, Copasi ….
– Project based, sometimes foundations, often GitHub
• Individual
– PhD student (3 year cycle) & Independents
– Stuck on SourceForge
![Page 14: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/14.jpg)
Provisioning Bioinformatics using Open Source Software
• Free
– Funding models
• Open Source
– License models
• Open Development
– Contribution models
Provisioning
Open Source Software
• Sustained & Documented
• Service & Preserved
• Support & Community
![Page 15: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/15.jpg)
Plate spinning:
• Staying relevant to new challenges (but stable for current ones)
• Supporting & sustaining software and communities
• Balancing research & engineering
Friendships:
• Coordinated Core foundation
• Collaborations + Contributions
• Activity and staff streams
Wave riding:
• Opportunities for funding and collaborations
• Readiness to adapt to be adopted
![Page 16: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/16.jpg)
Core Team
24 post-docs, 8 organisations, 4 countries, 12 projects, 5 research councils, 28 PALs, £1.5 million per annum
Bioinformaticians
Software Engineers
Comp Sci Researchers
System Admins
Tool developers
Bench scientists
Service providers
Infrastructure providers
![Page 17: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/17.jpg)
![Page 18: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/18.jpg)
All Collaborative Projects
and frequently international
This makes transitioning to open source development easier.
![Page 19: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/19.jpg)
Cooperative SCRUM & Community Intelligence
Keeping it real: Act local, think and look global. Partner.
• The PALS
– Advocacy and Alerts.
• Agile development
– Team building.
– Continual release.
– “Web 2.0” style.
– Discard at least once.
• Lots of collaborations.
[1] De Roure, D. and Goble, C. "Software Design for Empowering Scientists," IEEE Software, vol. 26, no. 1, pp. 88-95, Jan/Feb 2009
[2] http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
![Page 20: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/20.jpg)
Friendships: Software Contributors
Contributor Licence Agreement (Apache style)
Collaborate with
Work with
Don’t know
Know
Virtual Liver
TavernaPBS
TavernaLC
ChemTaverna
EdUnify
eGalaxy
OPALTavernaCDK-Taverna
SADI
R Shell
Friends
Family
Acquaintances Strangers
Core Team
CVRG
caBIG
![Page 21: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/21.jpg)
Production
Research
Applications
Seed Project
SageCite
![Page 22: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/22.jpg)
A Critical Mass of resource streams
• Research Councils & Industry
– Hurrah for EU FP7, JISC and the BBSRC!
• Generic software: Applications
– AstroPhysics, Astronomy, Chemistry, Document Preservation, Social Science, BioDiversity.
– Cookie cutter, Cross development
In kind contributions
• Other projects and international partners
• Students
![Page 23: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/23.jpg)
Coordinated plate-spinning Egosystem
• Resource planning
• Matrix management
• Deliverable balancing
• Agenda balancing
• Stakeholder balancing
• Complexity drift
• Core slow down
On the other hand….
• Resilient
• Planned for
• Skunky
• Community Coordination
• What open source is
![Page 24: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/24.jpg)
Every project brings its own deliverables and slows down the core roadmap
Makes coordination even tougher
Funding dips, funding fashions and cycles, funding gaps
Critical mass, top slicing subsidises, collaborators & friends, fund raisers and “platform awards”
![Page 25: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/25.jpg)
(Funding) Fashions and Agile Wave riding
• Predicting the next wave
• Making the wave
• Spotting the wave and repositioning
• Riding out the lulls and cycles
• Workflows -> e-Labs
• New models of publishing
• Semantics yo-yo
– Was Semantic Web, now Linked Data
• Grid -> Cloud
![Page 26: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/26.jpg)
Coordination
Sustainability
Interoperability
Adoption
Critical Mass Community
Software
The Open Source Software Polo Mint Model
![Page 27: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/27.jpg)
Sustainability of the Core
Software is Free like Puppies are Free
BIS RCUK Expert Panel report on e-infrastructure
Novelty
• Funding agencies (excl BBSRC BBR and T&R)
• Self-promotion
Research vs Production Confusion
• Claiming research has a user base
• Claiming production is research
Entropy
• Curating services
• Service Decay
• Bit Rot
![Page 28: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/28.jpg)
Adoption: academic not so different to commercial. Risk Management
• Fit for adoption (& customisation)
– Not just stuck on SourceForge.
– Engineering Quality and Documentation.
– Plan for adoption & exploitation.
– Release cycles
• Help stakeholders adopt it.
– The last mile, ramps
– Engineering and Documentation.
– Expertise: Support desk, SLAs
– Community self-help
– Reward not hinder adoption.
![Page 29: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/29.jpg)
Strategic Interoperability and Flocking
Amplify adoption, capability and usefulness.
• Taverna + CDK, RShell, EMBOSS
• Taverna + Bioclipse
• Taverna + Galaxy
• Galaxy + Cytoscape
• Galaxy + GMOD
• Used Standards & Formats
• Simple and suitable APIs
• Common frameworks, e.g. OSGi
• Compatible Licensing
– OSSWatch
![Page 30: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/30.jpg)
• Drive long-term vision and secure resources.
• Best practice, governance, policing.
• Coordinate contributions & co-shaping.
Models:
• Benign dictator
• Community democracy
• Independent market place
A Foundation?
• Export legal risk and admin overhead
• IP assignment and due diligence
• Community building
• Succession: Benevolent Dictator for Life?
Coordination. A core. A leader.
![Page 31: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/31.jpg)
Industry
Commercialisation Business Models
• Dual Support: customisation and service
• Eagle Genomics support partnership!
• Indemnify against potential IP infringement
Full blown commercialisation
• Slow down
• Open Source starvation
• Flow back guarantees
• Different priorities
![Page 32: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/32.jpg)
Training
• Tutorials and Training
• Summer schools
• Developer and User Days
• Annotation Jamborees
• Undergraduate and Postgraduate Bioinformatics
Software ● Services ● Content ● Skills ● Community ●
![Page 33: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/33.jpg)
Provisioning Bioinformatics using Open Source Software. Going to get more so.
Provisioning
Open Source Software
Challenge is securing sustainability
of software
of skills
of community
![Page 34: 2. Carole Goble - University of Manchester](https://reader035.vdocuments.net/reader035/viewer/2022062303/555011deb4c90535638b4a90/html5/thumbnails/34.jpg)
Further Information
• myGrid
– http://www.mygrid.org.uk
• Taverna
– http://www.taverna.org.uk
• myExperiment
– http://www.myexperiment.org
• BioCatalogue
– http://www.biocatalogue.org
• SysMO-SEEK
– http://www.sysmo-db.org
• MethodBox
– http://www.methodbox.org.uk
• OMII-UK
– http://www.omii.ac.uk
• Software Sustainability Institute
– http://www.software.ac.uk