Cyberinfrastructure and its Applications in the Czech Republic
Petr Holub, Luděk Matyska
I2 Fall meeting, 1.10.2012
Czech Cyberinfrastructure
Three cyberinfrastructure institutions in the Roadmap of large infrastructure for Research and Development, approved by the Czech Government
I CESNET
  I National Research and Education Network (NREN) provider
  I National Grid Infrastructure (NGI) coordinator
  I Moving into the basic data provisioning
  I Independent legal body owned by public universities and the Academy of Sciences of the Czech Republic
I Centre CERIT-SC (CERIT Scientific Cloud)
  I Largest Grid and Cloud provider
  I Flexible experimental compute and storage facility for the development of new algorithms, tools, and applications
  I Part of the Masaryk University in Brno
I Centre IT4Innovations
  I Supercomputing centre, under setup (no services provided yet)
  I Part of the Technical University in Ostrava
Holub, Matyska (CESNET & CERIT-SC) Czech Cyberinfrastructure October 1, 2012 2 / 15
Cyberinfrastructure Parameters
I Optical network connecting all the major Czech cities and all the public and most private universities
  I Multi 10Gbps backbone with several 100Gbps lines planned
  I 10Gbps line to Geant (EU) and 5Gbps to I2
  I Shared traffic, but dedicated research lines/lambdas available
I National Grid Infrastructure
  I Almost 6000 CPUs in total coordinated through CESNET
  I 2200 CPUs provided by CERIT-SC
    I With plans to almost double the figure next year
I Data facilities
  I More than 5 PB in early deployment (HSM and MAID at CERIT-SC)
  I Additional 5–8 PB (HSM) in the pipeline
I Supercomputing facilities
  I Some 4000+ CPUs next year
  I 32 thousand CPUs in 2014
Collaboration and Partnerships
I CESNET and CERIT-SC share part of the workforce
  I NGI originally conceived at Masaryk University
  I Complementary activities in data provisioning and development
  I Common Cloud Task Force
  I Common Identity Management and AAI infrastructure
  I Complementary high level application activities
I The following information is valid for both
Collaboration with third parties
I Partnership, not just provider and user relationship
  I Joint activities
  I Joint projects
  I Strong involvement of postgraduate students in the process
    I Both from the Computer Science and the Application Area
I Currently mostly in the areas of computing and data manipulation and processing
I Examples follow
Brain Neurology Models
I Brain dynamic causal models based on intracranial EEG
  I Intracranial electrodes, up to 128 signal channels from one electrode, sampling frequency up to 1 kHz
  I Analysis of complex systems made from coupled systems
  I Correlation of signals
  I Causal (directional) relationship
  I Bi- and multivariate analyses
I Study of anatomic connectivity of brain tissue through diffusion tensor imaging
  I Provides anatomical model of the brain (brain threads)
  I Very computationally demanding
I Faculty of Medicine and University Hospital, Masaryk University, Institute of Scientific Instruments
Bivariate Models
I Floating window
  I FFT, power spectrum, Hilbert transform, waves
  I Separate repetition for each frequency channel
I Synchronization indexes
  I Many variants: regression (R², h²), Shannon entropy ...
  I Always only two channels analyzed
\[
R^2 = \max_{\tau} \frac{\mathrm{cov}^2\bigl(x(t),\, y(t+\tau)\bigr)}{\mathrm{var}\bigl(x(t)\bigr)\,\mathrm{var}\bigl(y(t+\tau)\bigr)}
\]
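
The regression index above can be sketched for one window of two sampled channels. This is a minimal illustration, not the authors' implementation: the function name `lagged_r2`, the restriction of the lag to a finite range `max_lag`, and the use of NumPy are all assumptions made here.

```python
import numpy as np

def lagged_r2(x, y, max_lag):
    """Return (best R^2, best lag) over integer lags tau in [-max_lag, max_lag].

    For each lag, the overlapping samples x(t) and y(t + tau) are compared.
    Both inputs are assumed to have the same length.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    best_r2, best_lag = 0.0, 0
    for tau in range(-max_lag, max_lag + 1):
        if tau >= 0:
            xs, ys = x[:n - tau], y[tau:]
        else:
            xs, ys = x[-tau:], y[:n + tau]
        var_x, var_y = xs.var(), ys.var()
        if var_x == 0.0 or var_y == 0.0:
            continue  # R^2 is undefined for a constant segment
        cov = np.mean((xs - xs.mean()) * (ys - ys.mean()))
        r2 = cov * cov / (var_x * var_y)
        if r2 > best_r2:
            best_r2, best_lag = r2, tau
    return best_r2, best_lag
```

Calling `lagged_r2(x, y, 20)` on a sine wave `x` and a copy `y` delayed by five samples should report a lag of 5 with R² close to 1. In a floating-window analysis the function would be called once per window and per pair of channels.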
Multivariate Models
For more than two channels
I Multivariate methods
  I One possibility is to use bivariate methods for all the combinations of channels
  I Visualization problems, correctness
I Covariance matrix and eigenvalues
  I Covariance matrix over all signals for each window
  I Only eigenvalues are visualized
  I Approximate, but gives proper impression
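
The covariance-and-eigenvalues approach can be sketched as follows. This is a hedged illustration under stated assumptions: the function name `windowed_eigenvalues`, the `window` and `step` parameters, and the choice of NumPy are introduced here and do not come from the slides. At least two channels are assumed.

```python
import numpy as np

def windowed_eigenvalues(signals, window, step):
    """Eigenvalues of the channel covariance matrix in each sliding window.

    signals: array of shape (channels, samples), with at least two channels.
    Returns an array of shape (n_windows, channels); each row holds the
    eigenvalues for one window, sorted in descending order.
    """
    signals = np.asarray(signals, dtype=float)
    n_channels, n_samples = signals.shape
    rows = []
    for start in range(0, n_samples - window + 1, step):
        segment = signals[:, start:start + window]
        cov = np.cov(segment)           # channels x channels, rows are variables
        eig = np.linalg.eigvalsh(cov)   # ascending order for symmetric matrices
        rows.append(eig[::-1])
    return np.array(rows)
```

Each row of the result is what would be plotted for one window. When a few eigenvalues dominate, the channels are strongly coupled; a flat spectrum suggests largely independent signals.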
Global Climate Change
Tree reconstruction from a laser scan
I Search for a new algorithm for the reconstruction of a tree from a cloud of 3D points from a laser scan
I Tree scanned by a laser scanner (LIDAR)
  I Output is a 4D map of XYZ coordinates plus reflection intensity (different for a trunk, leaves, ...)
  I On the order of a thousand points per tree
I Expected output
  I Tree structure
  I In a format for further digital processing
I Institute of Global Climate Change
Tree Reconstruction
I Major problems
  I Data quality
    I Combination of scans from different angles: precision, movement due to the wind, ...
  I Overlaps ⇒ gaps in data
I Adjacency graph → independent reconstruction of promising identified areas → combination of reconstructed areas into a tree
I Use of neural networks to fill in gaps in data
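
The slides do not say how the adjacency graph is built. One possible reading, sketched below purely as an assumption, links points that lie within a fixed radius of each other and treats each connected component as an area that can be reconstructed independently. The function names and the `radius` parameter are hypothetical.

```python
from collections import deque

def adjacency_graph(points, radius):
    """Link every pair of 3D points at most `radius` apart (brute force, O(n^2))."""
    r2 = radius * radius
    neighbours = {i: [] for i in range(len(points))}
    for i, (xi, yi, zi) in enumerate(points):
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            if (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2 <= r2:
                neighbours[i].append(j)
                neighbours[j].append(i)
    return neighbours

def connected_components(neighbours):
    """Group point indices into connected components via breadth-first search."""
    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(sorted(component))
    return components
```

A gap in the scan wider than `radius` splits the cloud into separate components, which is where a gap-filling step would have to reconnect them. For clouds much larger than a few thousand points, a spatial index would replace the brute-force pair loop.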
Virtual Microscope
http://atlases.muni.cz/
I Collection of tissue scans with a resolution up to 170000x140000 pixels (gigapixel range)
I More than 3000 samples in more than 150 million files
  I Currently more than 30 thousand tiles (= independent scans) split over 1 million images per picture
I Accessible through web interface
I Simulation of a real microscope
  I Fine-grained focus
I JPEG2000 version under development
  I GPGPU (CUDA) accelerated processing
  I Interesting research in the perceived picture quality of JPEG versus JPEG2000
I Institute of Pathology, Faculty of Medicine, Masaryk University
Molecular Modeling
Haptic models of interaction of a large biomolecule with a smaller agent
I Energy gradient is mapped on the haptic force field
I Needs fast response (1 kHz)
I Realistic simulation → computationally intensive
Electric charges at atoms in a molecule
I Extremely computationally intensive for large molecules
I Electronegativity equalization
I Based on ab initio parameters
Large multipoint interactions
I Long distance interaction of millions of particles
I Necessary for realistic molecular dynamics simulation of large biomolecules
NCBR (CEITEC)
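
Electronegativity equalization reduces the charge calculation to one linear system. The sketch below assumes the common textbook form, in which every atom i satisfies A_i + B_i q_i + κ Σ_{j≠i} q_j / R_ij = χ̄ and the charges sum to the total molecular charge. The function name and all parameter values are hypothetical; real A, B and κ values would be fitted to ab initio data, which the slides do not provide.

```python
import numpy as np

def eem_charges(coords, A, B, kappa=1.0, total_charge=0.0):
    """Solve the electronegativity equalization equations for atomic charges.

    coords: (n, 3) atomic positions; A, B: per-atom parameters (hypothetical here).
    Returns (charges, equalized_electronegativity).
    """
    coords = np.asarray(coords, dtype=float)
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    n = len(A)
    # Pairwise interatomic distances R_ij
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    with np.errstate(divide="ignore"):
        block = kappa / dist          # diagonal is inf here, overwritten below
    np.fill_diagonal(block, B)
    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = block
    M[:n, n] = -1.0                   # column for the unknown electronegativity
    M[n, :n] = 1.0                    # charge conservation row
    rhs = np.append(-A, total_charge)
    solution = np.linalg.solve(M, rhs)
    return solution[:n], solution[n]
```

The dense solve costs on the order of n³ operations for n atoms, which is consistent with the slide's remark that the calculation becomes extremely demanding for large molecules.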
ELIXIR and ELIXIR_CZ
I European Bioinformatics Infrastructure
  I Extremely large amount of data
    I Thousands of genomic sequencers foreseen in Europe
    I Each capable of producing petabyte(s) of data yearly
I Participation at the ELIXIR_CZ node setup
  I Collaboration on data storage, management (including access control), processing and long term preservation
  I Combined with computationally intensive simulations
I Many institutions throughout the country, coordinated through Institute of Organic Chemistry and Biochemistry
  I Cyberinfrastructure institutions are founding members
BBMRI_CZ
I National biobanking infrastructure
  I Distributed infrastructure: both from the geographic and the organizational perspective
  I Gathering of anonymized data about the stored samples
BBMRI-CZ: National Biobanking Infrastructure in the Czech Republic
Dalibor Valík (1), Petr Holub (2), Kristína Greplová (1), Dana Knoflíčková (1)
(1) RECAMO, Masaryk Memorial Cancer Institute, Žlutý kopec 7, 656 53 Brno
(2) CERIT-SC, Institute of Computer Science, Masaryk University, Botanická 68a, 602 00 Brno
[Figure: BBMRI-CZ data flow diagram. Hospital: hospital information system, data anonymization. Biobank: sample storage information, export to central storage, biobank administrator, biobank monitoring system. Central BBMRI-CZ infrastructure: central BBMRI-CZ index, search interface, sample request/approval interface. Researcher: approved research projects, sample request, approval/denial.]
IT Infrastructure for BBMRI-CZ
Data gathering as well as sample requesting by the researchers in BBMRI-CZ infrastructure is a distributed process that spans several independent institutions and involves patients' data, thus requiring complex IT infrastructure to support and protect it. While the biobanks will collect samples and store them in cryoboxes, the IT infrastructure will collect metadata for each sample from several heterogeneous sources (hospital information systems, biobanks themselves, national oncology register), anonymize the metadata and index it in order to allow researchers to find samples of their interest using both simple and complex queries. The IT infrastructure will leverage European and Czech e-infrastructures: CESNET2 high-speed backbone network, distributed storage technologies and computing systems provided by CERIT-SC and authentication and authorization infrastructures based on federation principles on national level. The IT infrastructure is being developed jointly by RECAMO and CERIT-SC partner projects and is scheduled for deployment in 2012.
About BBMRI-CZ
BBMRI-CZ is a national biobank project committed to providing researchers with material for medical and biological research, focusing mainly on oncology. Coordinated by Masaryk Memorial Cancer Institute in Brno, the geographically distributed facility will consist of at least 5 biobanks (Brno, Prague, Olomouc, Hradec Králové, and Plzeň). The project is part of the BBMRI European Research Infrastructure Consortium (ERIC).
[Figure: map of the CESNET2 network nodes across the Czech Republic, with a legend of link capacities (DWDM, 10 Gb/s, 1–2.5 Gb/s, 100 Mb/s, <100 Mb/s) and external connections labelled GÉANT, AMS-IX, NIX, SANET, ACONET, PIONIER, and Internet.]
BBMRI-CZ (green dots) on top of CESNET2 national research and educational network.
BBMRI_CZ
I National biobanking infrastructure
I CERIT-SC helps to build the underlying IT infrastructure
I R&D problems:
  I coherent data gathering from two layers of institutions
  I distributed pseudonymization architecture (bijective)
  I k-anonymization for rare cases/diseases
  I extraction of data from Hospital Information Systems
  I data protection during transmission and storage
  I long-term data preservation
  I use of distributed AAI
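
The k-anonymization concern for rare diseases can be illustrated with a simple check: a data set is k-anonymous when every combination of quasi-identifier values is shared by at least k records. The sketch below is an assumption-laden illustration, not the BBMRI-CZ design; the function name and the field names in the usage note are hypothetical.

```python
from collections import Counter

def k_anonymity_violations(records, quasi_identifiers, k):
    """Return the quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}
```

With hypothetical fields such as `diagnosis` and `age_band`, a single patient with a rare diagnosis forms a group of size one and is reported as a violation for any k of 2 or more. Such records would need to be generalized or suppressed before they enter a central index.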
Conclusions
I Strong Cyberinfrastructure in the Czech Republic
  I Coverage from network up to application layers
I Research work based on a partnership with other communities
  I Bringing students into the process
I Intensive international collaboration
  I Networks through Geant
  I Grids through EGI
  I Supercomputing through PRACE
  I Also additional more narrowly focused projects
    I Including collaboration with the US, although without external funding