![Page 1: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/1.jpg)
Sandra Gesing [email protected]
cHiPSet Training School 2016 22 September 2016
Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability
![Page 2: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/2.jpg)
University of Notre Dame
Sandra Gesing 2
• In the middle of nowhere of northern Indiana (1.5 h from Chicago)
• 4 undergraduate colleges
• ~35 research institutes and centers
• ~12,000 students
![Page 3: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/3.jpg)
Modeling and Simulations
• Genomics • Proteomics • Metabolomics • Immunomics • System biology • Molecular simulations • Docking • Epidemiology • …
Black Swallowtail – larvae and butterfly
![Page 4: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/4.jpg)
The Genomics Boom
February 16, 2001 biotech company Celera
February 15, 2001 The Human Genome Project
![Page 5: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/5.jpg)
The Genomics Boom
Craig Venter (left) and Francis Collins (right)
![Page 6: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/6.jpg)
Big Data
• Explosion in the quantity, variety and complexity of data
• Questions can be answered that were impossible to even ask about 10 years ago
• Costs far reduced (e.g., Human Genome project, 15 years, ~$2 billion; today ~3 days, $1000)
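The cost comparison above can be turned into rough reduction factors. This is a minimal sketch that uses only the approximate figures quoted on the slide; the variable names are illustrative and the 365-day year is an assumption.

```python
# Approximate figures quoted on the slide.
hgp_cost_usd = 2_000_000_000   # Human Genome Project, ~$2 billion
hgp_duration_days = 15 * 365   # ~15 years (365-day year is an assumption)
today_cost_usd = 1_000         # ~$1000 per genome today
today_duration_days = 3        # ~3 days

# Ratio of then to now, for cost and for elapsed time.
cost_factor = hgp_cost_usd / today_cost_usd
time_factor = hgp_duration_days / today_duration_days

print(f"Cost reduced by a factor of about {cost_factor:,.0f}")
print(f"Time reduced by a factor of about {time_factor:,.0f}")
```

Both inputs are order-of-magnitude estimates, so the resulting factors should be read the same way.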
![Page 7: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/7.jpg)
Big Data
http://www.genome.gov/images/content/cost_per_genome_oct2015.jpg
![Page 8: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/8.jpg)
Modeling and Simulations
![Page 9: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/9.jpg)
Workflows
12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt 12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt 12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct 12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt 12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt 12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt 12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg 12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga 12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc 12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa 12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa
Slide copied from: Stuart Owen „Workflows with Taverna“
A sequence of connected steps in a defined order based on their control and data dependencies
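The definition above can be sketched as a dependency graph whose steps are ordered so that each one runs after the steps it depends on. The step names below are hypothetical and not taken from Taverna or any other workflow system; the sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
dependencies = {
    "fetch_sequence": set(),
    "quality_filter": {"fetch_sequence"},
    "align": {"quality_filter"},
    "annotate": {"align"},
    "report": {"align", "annotate"},
}


def execution_order(deps):
    """Return one valid order in which every step follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A real workflow engine would also move data between steps and handle failures; the sketch only covers the ordering part of the definition.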
![Page 10: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/10.jpg)
Workflow Systems
• Different workflow concepts • Different workflow languages • Different workflow constructs
Taverna
![Page 11: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/11.jpg)
Workflow Editors
• Different technologies (workbenches, web-based)
• Different look-and-feel
![Page 12: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/12.jpg)
State of the Art
Data and compute-intensive problems
High-speed networks
Users generally not IT specialists
Tools and workflow engines
Web-based agile frameworks
Distributed data and computing infrastructures
![Page 13: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/13.jpg)
Challenge for Developers
Data and compute-intensive problems
High-speed networks
Tools and workflow engines
Web-based agile frameworks
Distributed data and computing infrastructures
Users generally not IT specialists
Need for intuitive and self-explanatory user interfaces!
![Page 14: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/14.jpg)
Challenge for Developers
Data and compute-intensive problems
High-speed networks
Tools and workflow engines
Web-based agile frameworks
Distributed data and computing infrastructures
Users generally not IT specialists
![Page 15: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/15.jpg)
Usability
“After all, usability really just means making sure that something works well: that a person … can use the thing - whether it's a Web site, a fighter jet, or a revolving door - for its intended purpose without getting hopelessly frustrated.” (Steve Krug in “Don't make me think!: A Common Sense Approach to Web Usability”, 2005)
![Page 16: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/16.jpg)
Reusability
“The key to productivity is reusability. The easiest way to produce code is obviously to have it already!” (John R. Bourne in “Object-oriented Engineering: Building Engineering Systems Using Smalltalk-80”, 1992)
![Page 17: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/17.jpg)
Reproducibility
“The closeness of agreement between independent results obtained with the same method on identical test material but under different conditions (different operators, different apparatus, different laboratories and/or after different intervals of time)…” (IUPAC (International Union of Pure and Applied Chemistry, iupac.org) Gold Book)
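One common way to put a number on "closeness of agreement" is the spread of results across laboratories. This is a minimal sketch with invented measurements; the lab names and values are purely illustrative, and the standard deviation is used here as an assumption, not as the measure the slide prescribes.

```python
from statistics import mean, stdev

# Hypothetical results for the same test material from three laboratories.
results_by_lab = {
    "lab_a": 10.2,
    "lab_b": 10.5,
    "lab_c": 9.9,
}

values = list(results_by_lab.values())
spread = stdev(values)                   # sample standard deviation across labs
relative_spread = spread / mean(values)  # spread relative to the mean result

print(f"mean={mean(values):.2f}, stdev={spread:.2f}, relative={relative_spread:.1%}")
```

A smaller spread under otherwise different conditions indicates better reproducibility in this simplified view.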
![Page 18: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/18.jpg)
![Page 19: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/19.jpg)
Reusability vs. Reproducibility
![Page 20: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/20.jpg)
Efficiency
• Time • Computational resources • Money
![Page 21: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/21.jpg)
Science Gateways
science gateway /sī′ əns gāt′ wā′/ n.
1. an online community space for science and engineering research and education.
2. a Web-based resource for accessing data, software, computing services, and equipment specific to the needs of a science or engineering discipline.
![Page 22: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/22.jpg)
Why are Science Gateways Important?
• Increased complexity of
– today’s research questions
– hardware and software
– skills required
• Greater need for openness and reproducibility
– Science increasingly driving policy questions
• Opportunity to integrate research with teaching
– Better workforce preparation
We need interfaces that provide broad access to advanced resources and allow all to tackle today’s challenging science questions.
![Page 23: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/23.jpg)
Science Gateways
![Page 24: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/24.jpg)
Science Gateways
![Page 25: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/25.jpg)
Science Gateways
It’s a Science Gateway
It’s a Research Portal
It’s a Collaboratory
It’s a Cyberinfrastructure
It’s e-Science eResearch
It’s a Virtual Lab
![Page 26: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/26.jpg)
Frameworks and APIs
Re-inventing is not always necessary...
![Page 27: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/27.jpg)
Frameworks and APIs
... and users should get more features easily...
![Page 28: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/28.jpg)
Frameworks and APIs
... but the model should fit the demands of the community
![Page 29: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/29.jpg)
Bioinformatic Infrastructure Survey
Questions about frustrations and limitations in using
• Bioinformatic software
• Bioinformatic resources
• HPC and cloud infrastructures
and about challenges in training students in bioinformatics
Answers often address
• Hurdles to using bioinformatic resources because of command-line access or unavailable software
• Quality of software documentation
• Need for parsers and converters for diverse data formats
• Long waiting times for support, or even a lack of support
![Page 30: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/30.jpg)
Bioinformatic Infrastructure Survey
• Nick Loman (Birmingham, UK)
• Thomas Connor (Cardiff, UK)
• October 2015
• 272 answers
https://drive.google.com/drive/folders/0B7KZv1TRi06fLUJCU1BYM3JScjg
![Page 31: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/31.jpg)
Bioinformatic Infrastructure Survey
![Page 32: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/32.jpg)
Bioinformatic Infrastructure Survey
[Bar chart: Where do bioinformaticians do most of their work? Categories: Cloud, Institution-wide resource, Local resource, Personal computer. Axis: 0 to 120.]
![Page 33: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/33.jpg)
Bioinformatic Infrastructure Survey
[Bar chart: Where do bioinformaticians do most of their work? Categories: Cloud, Institution-wide resource, Local resource, Personal computer. Axis: 0 to 120.]
[Bar chart: Why do bioinformaticians use the software they use? Categories: Best for job, Good documentation, Word of mouth recommendation, Used in similar analysis, Quickest, Already installed on server, Other, Graphical interface. Axis: 0% to 90%.]
![Page 34: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/34.jpg)
Bioinformatic Infrastructure Survey
![Page 35: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/35.jpg)
A Typical Life Cycle
Early adopters
Publicity
Wider adoption
Funding ends
Scientists disillusioned
New project
prototype
Gateways enable research, but are not research projects themselves… Sustainability is a problem…
![Page 36: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/36.jpg)
Science Gateways
A new era…
• Novel developments of web-based agile frameworks
• Infrastructure providers report that science gateways are used more than command lines
![Page 37: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/37.jpg)
Science Gateways
A new era…
Gateways
Login
https://www.xsede.org/
![Page 38: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/38.jpg)
Science Gateways
A new era…
• Novel developments of web-based agile frameworks
• Infrastructure providers report that science gateways are used more than command lines
But there are always new challenges…
• Novel infrastructures
• Novel data sources such as NGS sequencing machines and telescopes such as the Square Kilometre Array (SKA), which will create exascale data rates
→ Support of developers necessary
![Page 39: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/39.jpg)
Science Gateways Community Institute
http://sciencegateways.org
• Diverse expertise on demand
• Longer term support engagements
• Software and visibility for gateways
• Information exchange in a community environment
• Student opportunities and more stable career paths
![Page 40: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/40.jpg)
Science Gateway Survey 2014
• 29,000-person survey
• 4957 responses from across domains
![Page 41: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/41.jpg)
Science Gateway Survey 2014
n of application types=7,805, by 2,756 creators (out of 2,819); mean=2.8 application types per application creator
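The reported mean follows directly from the two counts, and the same counts give the share of application creators who answered this question. A minimal sketch using only the figures quoted on the slide; the variable names are illustrative.

```python
# Counts quoted on the survey slide.
application_types = 7_805
responding_creators = 2_756
all_creators = 2_819

# Mean number of application types per responding creator.
mean_types_per_creator = application_types / responding_creators

# Fraction of all application creators who answered.
share_of_creators = responding_creators / all_creators

print(f"mean application types per creator: {mean_types_per_creator:.1f}")
print(f"share of application creators responding: {share_of_creators:.0%}")
```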
![Page 42: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/42.jpg)
Science Gateway Survey 2014
[Bar chart by role. Categories: Usability Consultant, Graphic Designer, Community Liaison/Evangelist, Project Manager, Professional Software Developer, Security Expert, Quality Assurance and Testing Expert. Series: “Wished we had this”, “Yes, we had this”. Axis: 0% to 50%.]
n=2,756 respondents or 98% of application creators
![Page 43: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/43.jpg)
Science Gateway Survey 2014
What services would be helpful?
![Page 44: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/44.jpg)
Science Gateway Technologies
• Content management systems (Drupal)
• Libraries for implementation (Django)
• Portal frameworks (Liferay)
• Science gateway frameworks (WS-PGRADE, Galaxy)
– Static layout
– Layout extendable
– Workflow-enabled
• APIs for implementation (Apache Airavata, Agave)
![Page 45: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/45.jpg)
Drupal
![Page 46: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/46.jpg)
VectorBase - Example for Drupal
![Page 47: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/47.jpg)
VectorBase
![Page 48: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/48.jpg)
VectorBase
![Page 49: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/49.jpg)
Django
![Page 50: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/50.jpg)
VecNet – Example for Django
![Page 51: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/51.jpg)
VecNet
![Page 52: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/52.jpg)
VecNet
![Page 53: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/53.jpg)
VecNet
![Page 54: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/54.jpg)
Liferay
Portal framework
• Authentication (e.g., OpenSSO, CAS)
• Authorization
• Standards compliant
– JSR 168/286
– Web services
– Web 2.0 websites
• Web Publishing and Shared Workspaces
• Collaboration
• Social Networking
![Page 55: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/55.jpg)
WS-PGRADE
User Interface WS-PGRADE
Liferay
DCI Resources Middleware Layer
High-Level Middleware Service Layer
gUSE
![Page 56: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/56.jpg)
MoSGrid as WS-PGRADE Example
Molecular Simulation Grid
• Science gateway integrated with underlying compute and data management infrastructure
• Distributed workflow management
• Data repository
• Metadata management
![Page 57: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/57.jpg)
MoSGrid
![Page 58: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/58.jpg)
MoSGrid
![Page 59: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/59.jpg)
MoSGrid
![Page 60: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/60.jpg)
MoSGrid
![Page 61: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/61.jpg)
MoSGrid
![Page 62: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/62.jpg)
MoSGrid
![Page 63: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/63.jpg)
MoSGrid
![Page 64: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/64.jpg)
MoSGrid
Molecular Dynamics
• Study and simulation of molecular motion
Quantum Chemistry
• Study and simulation of the electronic behavior of molecules in relation to their chemical reactivity
Docking
• Main focus on evaluation of ligand-receptor interactions (e.g., for drug design)
![Page 65: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/65.jpg)
MoSGrid - Metadata
• Molecular Simulation Markup Language (MSML)
• CML compliant
• Template for each and every workflow
– Molecular input
– Domain specific tools
– Job configuration
– Optimized structures, trajectories, energies, …
• Semantic search (Apache Lucene)
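The idea of one metadata template per workflow, searchable afterwards, can be illustrated with a toy record type. This is only a sketch under stated assumptions: the `WorkflowMetadata` class, its field names, and the example values are hypothetical and do not reflect the MSML schema, and the plain keyword matching stands in for, and is far simpler than, the Apache Lucene search named above.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowMetadata:
    """Hypothetical record with one field per template section."""
    workflow: str
    molecular_input: str
    tools: list[str]
    job_configuration: dict[str, str]
    results: dict[str, str] = field(default_factory=dict)

    def text(self) -> str:
        """Flatten all fields into one lowercase string for keyword matching."""
        parts = [self.workflow, self.molecular_input, *self.tools,
                 *self.job_configuration.values(), *self.results.values()]
        return " ".join(parts).lower()


def search(records, query):
    """Return the records whose flattened text contains every query term."""
    terms = query.lower().split()
    return [r for r in records if all(t in r.text() for t in terms)]


# Invented example records.
records = [
    WorkflowMetadata("geometry optimization", "water.xyz", ["NWChem"],
                     {"basis": "6-31G*"}, {"energy": "filled in by the job"}),
    WorkflowMetadata("molecular dynamics", "protein.pdb", ["md_engine"],
                     {"steps": "1000"}),
]

hits = search(records, "nwchem optimization")
print([r.workflow for r in hits])
```

A production gateway would index structured fields and rank results; the sketch only shows how a per-workflow template makes results findable by their metadata.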
![Page 66: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/66.jpg)
MoSGrid - Metadata
![Page 67: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/67.jpg)
MoSGrid - Metadata
![Page 68: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/68.jpg)
MoSGrid – Visualization
Testing of ChemDoodle and MolCAD
web.chemdoodle.com molcad.de
![Page 69: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/69.jpg)
MoSGrid – Basic Workflow
Job Definition
Application Input
Execution
Meta-processing
Job Submission
Application Output
Post-processing Output
Portal User-Input
Grid Resource
![Page 70: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/70.jpg)
MoSGrid – QC Portlet
• Specialised interface for quantum chemistry software (Gaussian, NWChem, ORCA)
• Basic workflows
• Easy generation or uploading of input files
• Parsing of result files
![Page 71: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/71.jpg)
MoSGrid – MD Portlet
![Page 72: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/72.jpg)
MoSGrid – Docking Portlet
![Page 73: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/73.jpg)
MoSGrid – Docking Portlet
![Page 74: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/74.jpg)
Galaxy
![Page 75: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/75.jpg)
Galaxy
![Page 76: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/76.jpg)
RNA-Seq Analysis
![Page 77: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/77.jpg)
Apache Airavata
• Airavata is a general-purpose distributed system software framework built on micro-service and component-based architecture principles
• Airavata provides capabilities to compose, manage, execute and monitor large-scale applications and workflows on distributed computing resources
• Airavata supports executions on local clusters, national grids, and academic and commercial clouds
• Airavata is inherently multi-tenanted
![Page 78: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/78.jpg)
Apache Airavata
![Page 79: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/79.jpg)
Apache Airavata
• External clients interact with the Airavata API (based on Apache Thrift)
• Internally, components interact with each other through Component Programming Interfaces (Thrift-based CPIs)
![Page 80: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/80.jpg)
Apache Airavata
Clean way to define IDLs with richer data structures
![Page 81: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/81.jpg)
SciGaP – Example for Apache Airavata
Science Gateway Platform as a Service
![Page 82: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/82.jpg)
Apache Airavata
Science Gateway Platform as a Service (SciGaP)
[Architecture diagram; recoverable labels:]
• Science gateways: CIPRES, Neuro Science, Ultrascan, BioVLAB, GAAMP, DES, SimWG, Param Chem
• Interfaces: Application Programmer Interface, graphical interfaces, admin dashboards
• Services: User Identity Management; Information, Monitoring & Auditing; Data & Provenance Management; Job & Workflow Management
• Resources: XSEDE, OSG, Future Grid, data nets, campus clusters, academic & commercial clouds, international grids
• Qualities: scalable, secure, load balanced, configurable, fault tolerant, maintainable, performance
![Page 83: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/83.jpg)
Apache Airavata
Community Hangout
Mailing lists: [email protected] [email protected] [email protected]
Extend Airavata from your project or extend your project from Airavata
![Page 84: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/84.jpg)
Agave API
Agave is a Science-as-a-Service web API platform
Run scientific codes
• your own or community provided codes
...on HPC, HTC, or cloud resources
• your own, shared, or commercial systems
...and manage your data
• reliable, multi-protocol, async data movement
...from the web
• webhooks, REST, JSON, CORS, OAuth2
...and remember how you did it
• deep provenance, history, and reproducibility built in
![Page 85: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/85.jpg)
Agave API
• Multitenant
• Hosted identity management
• Supports multiple IdP
• OAuth2/OIDC server
• API management
• Hosted or on premise
• Vertical SSO
• Analytics and reporting
• Developer resources
• Multiple SDK & CLI
• Reference gateway
• White labeled
• 100% open source
![Page 86: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/86.jpg)
Agave API
Used to power web & mobile applications
![Page 87: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/87.jpg)
Agave API
Used to extend existing processes
![Page 88: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/88.jpg)
Agave API
(Re)Introducing the Micro App Paradigm
![Page 89: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/89.jpg)
Agave API
Agave Delivers Process-as-a-Service
![Page 90: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/90.jpg)
Agave API
![Page 91: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/91.jpg)
iPlant – Example for Agave API
![Page 92: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/92.jpg)
Agave API - Tutorials
![Page 93: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/93.jpg)
Agave API - Tutorials
![Page 94: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/94.jpg)
Collaboration on Science Gateways
Crucial Topics
• Close collaboration with user communities
• Knowledge about available technical solutions
Sounds easy but…
• Requirements of user communities often not so clear
• Technologies sometimes still under development for certain building blocks
→ Slow uptake of solutions
→ Larger effort for creating science gateways
![Page 95: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/95.jpg)
New Science Gateways - Checklist
[Diagram: DISCUSSION between developers and domain experts, covering organizational aspects, technical aspects, and domain-specific aspects]
![Page 96: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/96.jpg)
New Science Gateways - Checklist
Domain-specific aspects:
• Goal, target area and target users
• Visions/demands on the layout
• Priorities of features and options, e.g., a list from must-have to great-to-have options
• Integration of existing applications or development of applications
• Technologies of the applications
• Visualization
• Security demands
• Workflows
![Page 97: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/97.jpg)
New Science Gateways - Checklist
Organizational aspects:
• Time constraints for the development, agreement on a (maybe even rough) project plan with milestones
• Agreement on alpha or beta testers
• Regular meetings
![Page 98: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/98.jpg)
New Science Gateways - Checklist
Technical aspects:
• Experience with existing frameworks and programming languages
• Available infrastructure including security infrastructure and resources
• Available support of suitable technologies
• Scalability of suitable technologies
• Effort for extending existing technologies compared to novel developments
• Synergy effects with other science gateway projects
![Page 99: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/99.jpg)
Challenges
A world-wide research computing infrastructure
• Transparent service selection
  • e.g., Docker could be part of the solution
• Access to data irrespective of location
• Options to share data efficiently
• Appropriate privacy and security measures
• Optimized usage of resources
  • e.g., optimized usage of cloud computing and their business models
![Page 100: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/100.jpg)
Researchers
~7 million researchers worldwide
http://chartsbin.com/view/1124
![Page 101: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/101.jpg)
High-Speed Network
![Page 102: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/102.jpg)
Challenges
Integration of data sources and instruments
• Different data formats
• Different interfaces
• Different hardware and technologies
… from small ones to the big ones…
![Page 103: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/103.jpg)
Challenges
Software searchability, reproducibility and reusability
• Science gateways step in the right direction but … much more work necessary on searchability…
Not only finding any data for a research area but finding the right data
• Metadata approaches
• Dictionaries
• More involvement of librarians
![Page 104: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/104.jpg)
Challenges
Software searchability, reproducibility and reusability
• Science gateways step in the right direction but … much more work necessary on reproducibility and reusability…
• studies in medicine and pharmacology: 11% or 6% of the analysed research was reproducible
• myExperiment: only 20% of workflows reusable because of dependencies on hardware, local or distributed data, software versions
![Page 105: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/105.jpg)
Challenges
Software searchability, reproducibility and reusability
• Science gateways and workflow systems step in the right direction but … much more work necessary on reproducibility and reusability…
• Containerization approaches
• Migration approaches
• Combination of both
![Page 106: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/106.jpg)
Challenges – Novel and Old...
… require novel solutions!
![Page 107: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/107.jpg)
Projects - OSF
• Big Data
• Reproducibility
Open Access to Data and Projects could solve parts of the problems…
![Page 108: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/108.jpg)
Workflow Enhancements
• Logical level: Meta-workflows
Herres-Pawlis, S., Hoffmann, A., Rösener, T., Krüger, J., Grunzke, R., and Gesing, S. “Multi-layer Meta-metaworkflows for the Evaluation of Solvent and Dispersion Effects in Transition Metal Systems Using the MoSGrid Science Gateways”, Science Gateways (IWSG), 2015 7th International Workshop on, pp. 47-52, 3-5 June 2015, IEEE Xplore, doi: 10.1109/IWSG.2015.13
• System level: Combination of strengths of workflow systems
Hazekamp, N., Sarro, J., Choudhury, O., Gesing, S., Emrich, S., and Thain, D. “Scaling Up Bioinformatics Workflows with Dynamic Job Expansion: A Case Study Using Galaxy and Makeflow”, e-Science (e-Science), 2015 IEEE 11th International Conference on, pp. 332-341, Aug. 31 2015-Sept. 4 2015
• Prediction: Model for optimization of tasks and threads
Choudhury, O., Rajan, D., Hazekamp, N., Gesing, S., Thain, D., and Emrich, S. “Balancing Thread-level and Task-level Parallelism for Data-Intensive Workloads on Clusters and Clouds”, Cluster Computing (CLUSTER), 2015 IEEE International Conference on, pp. 390-393, 8-11 Sept. 2015, doi: 10.1109/CLUSTER.2015.60
![Page 109: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/109.jpg)
Science Case: Polymerisation catalysts
![Page 110: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/110.jpg)
Translation into Workflows
![Page 111: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/111.jpg)
Translation into Workflows
![Page 112: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/112.jpg)
Meta-Workflows
![Page 113: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/113.jpg)
Translation into Meta-Workflows
![Page 114: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/114.jpg)
Scaling Up Workflows
[Figure: data partitioning; labels: # Machines, # Cores, Data Partitioning, Save]
![Page 115: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/115.jpg)
Scaling Up Workflows
Galaxy
![Page 116: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/116.jpg)
Genome Sequencing
• Finding the precise order of nucleotides within a DNA molecule
• A (adenine), G (guanine), C (cytosine), and T (thymine) (the human genome has over 3 billion nucleotides)
![Page 117: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/117.jpg)
Genome Sequencing
Let’s imagine a party game. It is a guessing game, and here is how it is played:
You are thinking of a number and the group has to guess it. The tricky part is that the number is 200 digits long. You are reading the digits of the number in your head without making a sound. Every so often a person interrupts you, and you tell them the single digit you were just thinking of and where it is in the sequence of 200. Each time you are interrupted, you have to start again. You leave after a few hours and the group has to figure out the 200-digit number. They have to piece together the information you gave them, for example the 25th digit was 5, the 40th digit was 0, and so on. Using the information from their interruptions, they can reconstruct the number you were thinking of.
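The party game can be simulated in a few lines. This is only a sketch of the analogy, not a sequencing algorithm: the function names, the random interruption model, and the seed are assumptions made for illustration. Positions nobody asked about stay unknown (`?`), which mirrors why many overlapping observations are needed before the full number can be pieced together.

```python
import random


def play_guessing_game(secret: str, interruptions: int, seed: int = 0) -> str:
    """Rebuild `secret` from randomly sampled (position, digit) observations.

    Positions that were never observed are rendered as '?'.
    """
    rng = random.Random(seed)
    known = {}  # position -> digit reported at that interruption
    for _ in range(interruptions):
        position = rng.randrange(len(secret))
        known[position] = secret[position]
    return "".join(known.get(i, "?") for i in range(len(secret)))


def coverage(reconstruction: str) -> float:
    """Fraction of positions that have been observed at least once."""
    return 1 - reconstruction.count("?") / len(reconstruction)


# A hypothetical 200-digit secret number.
secret_number = "".join(random.Random(42).choice("0123456789") for _ in range(200))

few = play_guessing_game(secret_number, interruptions=50)
many = play_guessing_game(secret_number, interruptions=5000)
print(f"coverage after 50 interruptions:   {coverage(few):.2f}")
print(f"coverage after 5000 interruptions: {coverage(many):.2f}")
```

With only a few interruptions large parts of the number remain unknown; with many more interruptions than digits, every position is eventually observed.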
![Page 118: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/118.jpg)
Scaling Up Workflows
Simple Workflow in Galaxy
Problem: As Size increases so does Time
![Page 119: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/119.jpg)
Scaling Up Workflows
Workflow with Parallelism added in Galaxy
Problem: Tools must be updated with every change in parallelism; this relies on the scientist
![Page 120: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/120.jpg)
Scaling Up Workflows
Workflow Dynamically Expanded behind Galaxy
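The idea of expanding one workflow step behind the scenes can be sketched as partition, run in parallel, merge. This is a minimal illustration under stated assumptions, not the Galaxy or Makeflow implementation: `partition`, `gc_count`, and `run_expanded` are hypothetical names, the "tool" is a stand-in that counts G and C bases, and local threads take the place of cluster workers.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_parts):
    """Split records into at most n_parts contiguous chunks of near-equal size."""
    n_parts = max(1, min(n_parts, len(records)))
    size, extra = divmod(len(records), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def gc_count(reads):
    """Stand-in for a real tool: count G and C bases in a chunk of reads."""
    return sum(read.count("G") + read.count("C") for read in reads)


def run_expanded(reads, n_parts, tool=gc_count, merge=sum):
    """Run `tool` on each partition in parallel, then merge the partial results."""
    chunks = partition(reads, n_parts)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        partial_results = list(pool.map(tool, chunks))
    return merge(partial_results)


reads = ["GATTACA", "CCGG", "ATAT", "GCGC", "TTTT"]
print(run_expanded(reads, n_parts=3))
```

The user still sees a single tool invocation; only the number of partitions changes with the data size, so the tool definition itself does not need to be edited.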
![Page 121: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/121.jpg)
Scaling Up Workflows
![Page 122: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/122.jpg)
Scaling Up Workflows
![Page 123: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/123.jpg)
Scaling Up Workflows
Makeflow
• Task structure: OUTPUTS : INPUTS, followed by the COMMAND on the next line
• Directed Acyclic Graph (DAG)
• Programmatically generated
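A programmatically generated workflow can be sketched as a small generator that prints rules. Makeflow rules follow the layout of Make: the output files, a colon, the input files, and the command on the next, tab-indented line. Everything else here is an assumption made for illustration: the file names, the `./split_reads` and `./align` executables, and the split, align, merge shape of the DAG.

```python
def makeflow_rule(outputs, inputs, command):
    """Format one rule: 'outputs : inputs' followed by a tab-indented command."""
    return f"{' '.join(outputs)} : {' '.join(inputs)}\n\t{command}\n"


def generate_workflow(n_parts, tool="./align"):
    """Emit a split -> align (n_parts tasks) -> merge DAG as rule text.

    All file and program names are hypothetical placeholders.
    """
    part_files = [f"reads.{i}.fq" for i in range(n_parts)]
    out_files = [f"hits.{i}.sam" for i in range(n_parts)]

    rules = [
        makeflow_rule(part_files, ["reads.fq", "./split_reads"],
                      f"./split_reads reads.fq {n_parts}")
    ]
    # One independent task per partition; these can run in parallel.
    for part, out in zip(part_files, out_files):
        rules.append(makeflow_rule([out], [part, tool], f"{tool} {part} > {out}"))
    # A final task that depends on every partial result.
    rules.append(makeflow_rule(["hits.sam"], out_files,
                               f"cat {' '.join(out_files)} > hits.sam"))
    return "\n".join(rules)


workflow_text = generate_workflow(3)
print(workflow_text)
```

Because the rules are generated, changing the degree of parallelism is a matter of calling the generator with a different `n_parts` rather than editing the workflow by hand.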
![Page 124: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/124.jpg)
Scaling Up Workflows
![Page 125: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/125.jpg)
Scaling Up Workflows
![Page 126: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/126.jpg)
Scaling Up Workflows
Job Sandbox – Log file creation for cleanup
![Page 127: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/127.jpg)
Scaling Up Workflows
Dynamic Job Expansion
• Work Queue: we utilized 100s of cores from a Condor pool
• Cleaning sandbox using knowledge of intermediates and logging
• Explored methods to transmit needed environments such as executables and Java
61.5X speed-up on 32 GB dataset utilizing these methods
![Page 128: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/128.jpg)
Thread-level and Task-level Parallelism
• Develop predictive performance models for an application domain
• Achieve acceptable performance the first time
• Optimize resource utilization
  • Execution time
  • Memory usage
![Page 129: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/129.jpg)
Thread-level and Task-level Parallelism
• WorkQueue master-worker framework
• Sun Grid Engine (SGE) batch system
![Page 130: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/130.jpg)
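Thread-level and Task-level Parallelism
1. Application-level model for time: $T(R, Q, N) = \beta_1 \frac{RQ}{N} + \beta_2$
2. Application-level model for memory: $M(R, N) = \gamma_1 R + \gamma_2 N$
3. System-level model for time: $T_{Total} = \eta_1 \frac{QK}{D} + \eta_2 \left( \frac{Q}{B} + \frac{RKN}{BC} \right) + \eta_3 \, T\!\left(R, \frac{Q}{K}, N\right) \cdot \frac{KN}{MC} + \eta_4 \frac{O}{B} + \eta_5 \frac{OK}{D}$
4. System-level model for memory: $M_{Master}(R, Q) = \phi_1 R + \phi_2 Q$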
![Page 131: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/131.jpg)
Thread-level and Task-level Parallelism
[Figure: model training workflow. Data collection: 7 data points each for R, Q, and N, giving 343 data points, split into training data and testing data. Training: the regression model yields regression coefficients. Testing: accuracy test using MAPE.]
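The training step can be sketched for the application-level time model T(R, Q, N) = β1·RQ/N + β2, which is linear in the single feature x = RQ/N and can therefore be fitted with ordinary least squares. This is an illustrative sketch only: the 7 × 7 × 7 grid values, the coefficients used to synthesize the data, the alternating train/test split, and all function names are assumptions, not the measurements or code behind the slides.

```python
from itertools import product


def fit_time_model(samples):
    """Least-squares fit of T = b1 * (R*Q/N) + b2 from (R, Q, N, T) samples."""
    xs = [r * q / n for r, q, n, _ in samples]
    ts = [t for *_, t in samples]
    mean_x = sum(xs) / len(xs)
    mean_t = sum(ts) / len(ts)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    b1 = sxt / sxx
    b2 = mean_t - b1 * mean_x
    return b1, b2


def predict_time(b1, b2, r, q, n):
    """Evaluate the fitted model for one configuration."""
    return b1 * r * q / n + b2


# Hypothetical coefficients, used only to synthesize noise-free data.
TRUE_B1, TRUE_B2 = 0.002, 5.0


def make_grid():
    """7 x 7 x 7 = 343 synthetic (R, Q, N, T) samples; all values are made up."""
    r_values = [1, 2, 3, 4, 5, 6, 7]
    q_values = [10, 20, 30, 40, 50, 60, 70]
    n_values = [1, 2, 4, 8, 16, 32, 64]
    return [(r, q, n, TRUE_B1 * r * q / n + TRUE_B2)
            for r, q, n in product(r_values, q_values, n_values)]


samples = make_grid()
training, testing = samples[::2], samples[1::2]  # simple alternating split
b1, b2 = fit_time_model(training)
worst_error = max(abs(predict_time(b1, b2, r, q, n) - t) for r, q, n, t in testing)
print(f"b1={b1:.6f}, b2={b2:.6f}, worst test error={worst_error:.2e}")
```

With real, noisy measurements the fitted coefficients would not match exactly, which is why a held-out test set and an error measure are needed.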
![Page 132: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/132.jpg)
Thread-level and Task-level Parallelism
Avg. MAPE = 3.1
MAPE = Mean Absolute Percentage Error
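For reference, the mean absolute percentage error averages the relative deviation |actual − predicted| / |actual| over all test points and expresses it in percent. A minimal sketch, with made-up runtimes purely for illustration:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical measured vs. predicted runtimes in minutes.
measured = [100.0, 200.0]
modelled = [110.0, 190.0]
print(f"MAPE = {mape(measured, modelled):.1f}%")  # relative errors of 10% and 5%
```

A lower value means the predictions track the measurements more closely; a value of zero means a perfect fit on the test data.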
![Page 133: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/133.jpg)
Thread-level and Task-level Parallelism
For the given dataset, K* = 90, N* = 4
![Page 134: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/134.jpg)
Result
| # Cores/Task | # Tasks | Predicted Time (min) | Speedup | Estimated EC2 Cost ($) | Estimated Azure Cost ($) |
|---|---|---|---|---|---|
| 1 | 360 | 70 | 6.6 | 50.4 | 64.8 |
| 2 | 180 | 38 | 12.3 | 25.2 | 32.4 |
| 4 | 90 | 24 | 19.5 | 18.9 | 32.4 |
| 8 | 45 | 27 | 17.3 | 18.9 | 32.4 |
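The result table can be read programmatically to pick a configuration. The sketch below copies the table values as given and selects the row with the shortest predicted time. It also shows that every row uses the same total number of cores, and it derives an implied baseline runtime as predicted time × speedup; that derivation assumes the speedup column is measured against one common baseline, which the slides do not state.

```python
# (cores_per_task, tasks, predicted_minutes, speedup, ec2_cost_usd, azure_cost_usd)
ROWS = [
    (1, 360, 70, 6.6, 50.4, 64.8),
    (2, 180, 38, 12.3, 25.2, 32.4),
    (4, 90, 24, 19.5, 18.9, 32.4),
    (8, 45, 27, 17.3, 18.9, 32.4),
]

# Configuration with the shortest predicted time.
fastest = min(ROWS, key=lambda row: row[2])

# Total cores per configuration (cores per task times number of tasks).
total_cores = {row[0] * row[1] for row in ROWS}

# Implied baseline runtime in minutes, assuming a common baseline for the speedup column.
implied_baselines = [row[2] * row[3] for row in ROWS]

print(f"fastest: {fastest[0]} cores/task with {fastest[1]} tasks")
print(f"total cores used: {sorted(total_cores)}")
print("implied baselines (min):", [round(b, 1) for b in implied_baselines])
```

The fastest row, 4 cores per task with 90 tasks, is also among the cheapest in both cost columns, so time and cost point to the same choice for this dataset.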
![Page 135: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/135.jpg)
Information on Science Gateways
• Science Gateway Institute http://sciencegateways.org
• Science Gateway Workshops
  Europe: IWSG - http://iwsg.info
  USA: GCE - http://sciencegateways.org
  Australasia: IWSG-A - http://iwsg.info
• IEEE Technical Area on Science Gateways http://ieeesciencegateways.org
• XSEDE Science Gateways https://www.xsede.org/gateways-overview
• CRC Science Gateways https://crc.nd.edu/index.php/research/gateways
![Page 136: Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability](https://reader031.vdocuments.net/reader031/viewer/2022030222/588449891a28aba8438b6983/html5/thumbnails/136.jpg)
Exercises
Questions and exercises at http://bit.ly/2dlkySW
Data at http://bit.ly/2cTwKaN