
Proposal Version: v0.40, 2016.09.27 – 14:00

DEEP-EST: “DEEP - Extreme Scale Technologies”

Call: H2020-FETHPC-01-2016
Topic: Co-design of HPC systems and applications
Type of action: Research and Innovation Action

Principal Investigator: Prof. Dr. Dr. Thomas Lippert
E-Mail: [email protected]

List of participants

Participant No. | Participant organisation name | Short name | Country
1 (Coordinator) | Forschungszentrum Juelich GmbH | JUELICH | Germany
2 | Intel Deutschland GmbH | Intel | Germany
3 | Bayerische Akademie der Wissenschaften | BADW-LRZ | Germany
4 | Barcelona Supercomputing Center – Centro Nacional de Supercomputacion | BSC | Spain
5 | Aurora S.r.l. | ETH-Aurora | Italy
6 | Megware Computer Vertrieb und Service GmbH | Megware | Germany
7 | Ruprecht-Karls-Universitaet Heidelberg | UHEI | Germany
8 | EXTOLL GmbH | EXTOLL | Germany
9 | The University of Edinburgh | UEDIN | United Kingdom
10 | Fraunhofer Gesellschaft zur Foerderung der Angewandten Forschung e.V. | FHG-ITWM | Germany
11 | Katholieke Universiteit Leuven | KULeuven | Belgium
12 | Stichting Astron, Netherlands Institute For Radio Astronomy | ASTRON | Netherlands
13 | Association National Centre For Supercomputing Applications | NCSA | Bulgaria
14 | Norges Miljo-Og Biovitenskaplige Universitet | NMBU | Norway
15 | Haskoli Islands | UoI | Iceland
16 | European Organisation for Nuclear Research | CERN | Switzerland


Table of contents

ABSTRACT 3
1 EXCELLENCE 3
1.1 Objectives 3
1.2 Relation to the work programme 4
1.3 Concept and methodology 6
1.3.1 Motivation 6
1.3.2 Concept 6
1.3.3 Technology readiness 9
1.3.4 Links to other research and innovation activities 9
1.3.5 Methodology 10
1.3.6 Gender analysis 14
1.4 Ambition 14
1.4.1 Advances beyond the state-of-the-art 14
1.4.2 Innovation potential 15
2 IMPACT 15
2.1 Expected impacts 15
2.1.1 Contribution to the SRA realisation 15
2.1.2 Proof-of-concept 16
2.1.3 Covering broader segments and emerging HPC markets 17
2.1.4 Impact on standard bodies and international research programs 17
2.2 Measures to maximise impact 18
2.2.1 Dissemination and exploitation of results 18
2.2.2 Communication activities 22
3 IMPLEMENTATION 24
3.1 Work plan — Work packages, deliverables 24
3.1.1 Overall structure of the work plan 24
3.1.2 Timing of the different Work Packages and their components 25
3.1.3 Detailed work description 26
3.1.4 Graphical representation of interdependencies 56
3.2 Management structure, milestones and procedures 57
3.2.1 Management bodies 57
3.2.2 Quality management 60
3.2.3 Innovation management 60
3.2.4 Internal communication 60
3.2.5 Milestones 61
3.2.6 Risk management 61
3.2.7 Conflict management 63
3.3 Consortium as a whole 63
3.3.1 Complementary and completeness 63
3.3.2 Industrial and commercial involvement 65
3.4 Resources to be committed 66
3.4.1 Person months distribution 66
3.4.2 Budget overview 67
3.4.3 Other Direct Costs and subcontracting 68
4 MEMBERS OF THE CONSORTIUM 69
4.1 Participants 69
4.2 Third parties involved in the project 107
5 ETHICS AND SECURITY 113
5.1 Ethics 113
5.2 Security 113
6 GLOSSARY 114


Abstract

The DEEP-EST (“DEEP - Extreme Scale Technologies”) project will create a first incarnation of the Modular Supercomputer Architecture (MSA) and demonstrate its benefits. In the spirit of the DEEP [1] and DEEP-ER [2] projects, the MSA integrates compute modules with different performance characteristics into a single heterogeneous system. Each module is a parallel, clustered system of potentially large size, and a federated network connects the module-specific interconnects. The MSA brings substantial benefits for heterogeneous applications and workflows: each part can be run on an exactly matching system, improving time to solution and energy use. This is ideal for supercomputer centres running heterogeneous application mixes, as it increases throughput and energy efficiency. It also offers valuable flexibility to compute providers, allowing the set of modules and their respective sizes to be tailored to actual usage.

The DEEP-EST prototype will include three modules: a general-purpose Cluster Module and an Extreme Scale Booster supporting the full range of HPC applications, and a Data Analytics Module specifically designed for high-performance data analytics (HPDA) workloads. Proven programming models and APIs from HPC (combining MPI and OmpSs) and HPDA will be extended and combined with a significantly enhanced resource management and scheduling system to enable straightforward use of the new architecture and to achieve the highest system utilisation and performance. Scalability projections will be given up to the Exascale performance class. The DEEP-EST prototype will be defined in close co-design between applications, system software and system component architects. Its implementation will employ European integration, network and software technologies. Six ambitious and highly relevant European applications from the HPC and HPDA domains will drive the co-design, serving to evaluate the DEEP-EST prototype and demonstrate the benefits of its innovative Modular Supercomputer Architecture.

[1] DEEP: FP7-ICT-287530, www.deep-project.eu
[2] DEEP-ER: FP7-ICT-610476, www.deep-er.eu

1 Excellence

1.1 Objectives

The DEEP-EST project has the following objectives:

1. Develop an energy efficient system architecture that fits High Performance Computing (HPC) and High Performance Data Analytics (HPDA) workloads, and satisfies the requirements of end-users and e-infrastructure operators: the “Modular Supercomputer Architecture” (MSA) combines diverse compute and service modules at the system level to form a single heterogeneous machine. Each module matches the needs of a certain class of algorithms. An MSA system can be built from arbitrary combinations of such modules. Since each module can be sized appropriately, such a system can be configured to exactly match the workload mix of a computing centre. This in turn benefits end-users, whose applications will run at near-optimal performance, and compute centres, whose workload throughput and energy efficiency will be maximised.

2. Build a fully working MSA system prototype: the DEEP-EST prototype will be defined by co-design and will implement three modules: an Extreme Scale Booster (for highly scalable algorithms that do not require top single-thread performance and that profit from high-bandwidth memory (HBM) and data parallelism/SIMD), a Data Analytics Module (to support HPDA algorithms that require high-capacity non-volatile memory and that profit from tightly integrated FPGA acceleration), and a Cluster Module (to support algorithms that require high single-thread performance and are limited in scalability).

3. Foster European technologies: DEEP-EST will support and strongly influence the evolution of the EXTOLL interconnect both as a highly scalable intra-module fabric and as a means to bridge dissimilar interconnects. These capabilities will be demonstrated in the DEEP-EST prototype. Additionally, the integrators ETH-Aurora and Megware and the cluster software provider ParTec (linked third party from JUELICH) will be strengthened by enhancing their products with the innovations developed in DEEP-ER.

4. Build a resource management and scheduling system fully supporting the MSA: it will be able to schedule heterogeneous workloads onto matching combinations of module resources, and build and orchestrate near-optimal schedules for realistic mixes of homogeneous and heterogeneous workloads.

5. Enhance and optimise programming models: the DEEP/-ER [3] programming models for heterogeneous applications based on MPI and OmpSs will be enhanced and optimised to best leverage the DEEP-EST prototype. In addition, adapted HPDA APIs and frameworks will fulfil the requirements of HPDA applications and make the capabilities of the prototype available to them. Integration of HPDA models into the HPC programming environment will be explored.

6. Validate the full hardware (HW) / software (SW) stack with relevant HPC and extreme data workloads and demonstrate the benefits of the MSA: a set of carefully selected, important European HPC and extreme data applications will drive a cyclic co-design approach and validate the functionality and performance of the DEEP-EST prototype, demonstrating the benefits of the MSA compared to conventional homogeneous and heterogeneous systems.

[3] The term “DEEP/-ER” is used when speaking jointly of the DEEP and DEEP-ER projects.

1.2 Relation to the work programme

DEEP-EST specifically addresses each challenge of the FETHPC-01-2016 call “Co-design of HPC systems and applications”, as shown in Table 1. Furthermore, the project leverages existing European areas of strength (such as scientific applications, middleware and software development) and supports the build-up of technology and expertise in fields that are critical for establishing a European HPC value chain (e.g. interconnects, system integration, and energy efficiency). It builds on the results of European-funded R&D projects, primarily those of DEEP and DEEP-ER.

Call text: Achieve world-class extreme scale, power-efficient and highly resilient HPC platforms through a strong co-design approach […] driven by a mix of ambitious applications […]
DEEP-EST way to address it: DEEP-EST will use co-design to define the “Modular Supercomputer Architecture” (MSA). Key design objectives are power efficiency and resiliency. Requirements of six high-impact applications, representing well both the HPC and HPDA domains, will shape the DEEP-EST system and its software environment via co-design (see Section 1.3.5.1). Co-design activities also occur between the applications and the SW parts of the project, as well as between the SW and HW. These discussions are channelled through the Design and Development Group (DDG), which has proven to be a valuable vehicle for discussions in DEEP/-ER.


Call text: Achieve the full range of technological capabilities needed for delivering a broad spectrum of extreme scale HPC systems.
DEEP-EST way to address it: DEEP-EST develops both the prototype hardware and the software stack, making sure that the unique system capabilities can be exploited by applications, and that extreme scale systems based on this concept can be efficiently and reliably operated. The DEEP-EST modules (see Section 1.3.2) to be prototyped cover a wide range of technologies. This innovative “mix and match” approach enables HPC centres to configure systems that perfectly fit their requirements. The modular nature allows both the addition of new classes of modules using disruptive technologies and their replacement with updated versions.

Call text: The designs of these systems must […] support for various classes of applications […]. Special attention should be given to extreme data processing requirements.
DEEP-EST way to address it: The MSA by itself supports the full range of HPC and HPDA applications, and will achieve benefits in terms of throughput and energy efficiency for heterogeneous workload mixes. At least five of the six selected applications include significant and large-scale data processing or analytics components, and their requirements will shape the architecture and prototype implementation. The Data Analytics Module in particular will be designed to best support extreme data processing. Its integration into an HPC system contributes to closing the gap between the two communities (HPC and HPDA).

Call text: […] innovative and ground-breaking approaches to system architectures […]
DEEP-EST way to address it: The MSA itself, the implementation of the cross-module interconnect, the novel Data Analytics Module, the Network Attached Memory (NAM) and the Global Collective Engine (GCE) (see Section 1.3.5.2) are innovations commensurate with the call challenge. The resource management and scheduling plus API extensions will make the balanced system capabilities fully accessible to applications.

Call text: […] show how their proposed solution improves energy efficiency and demonstrate the reduced energy-to-solution for the selected applications. […]
DEEP-EST way to address it: In an MSA system, a diverse group of applications uses the computing resources optimally, so that the application results obtained per Watt are maximised for the single application and for the full portfolio. Additionally, energy efficient processor, network and memory technologies, dense integration of components and direct warm-water cooling will make the DEEP-EST prototype highly energy efficient, and facilitate free cooling or the re-use of waste heat. This prototype will demonstrate a first set of energy improvements, and the project will use modelling technologies to extrapolate to the Exascale future.

Call text: […] address the problem of maintaining reliability […] of an HPC system that is able of extreme scaling; […]
DEEP-EST way to address it: The proven distributed checkpoint/restart mechanisms of DEEP-ER (which support both dynamic task-based and conventional MPI programming models) will be adapted to support and take full advantage of the DEEP-EST architecture. Integration of cache-line-addressable Non-Volatile Memory (NVM) into the concept will accelerate checkpoint creation and recovery.

Call text: […] provide analytical or simulation models that allow to extrapolate the sustained performance […]
DEEP-EST way to address it: Based on a very thorough analysis of the co-design applications with the Extrae/Paraver and Dimemas tools from BSC, detailed performance models will be created. These models, trace-based replay techniques and the efficiency-factor analysis pioneered in DEEP/-ER will enable extrapolations of system performance for different technologies and scales.


Call text: The target system architectures must scale to at least 100 PFlops and, for compute-centric workloads, a target of 15 MW for 250 PFlops peak performance in 2019 is suggested.
DEEP-EST way to address it: The DEEP-EST prototype will achieve on the order of 1 PFlop/s, and the design power will be below 150 kW. The 15 GF/W computational efficiency will be targeted with the Extreme Scale Booster module. The predecessor projects DEEP and DEEP-ER have shown a steady improvement of the computational efficiency. A preliminary assessment gives about 7 GF/W peak for the DEEP-ER Booster’s energy efficiency. By continuing the use of direct water cooling and the selection of energy efficient many-core processors, the computational efficiency of the DEEP-EST Extreme Scale Booster will double compared to DEEP-ER, reaching the target value by year 2019.

Call text: […] all application-aspects impacting the underlying system design are included in this topic.
DEEP-EST way to address it: A full, optimised software stack will be created, building on results from DEEP/-ER. The stack comprises system software, network management software, resource management, job scheduling, file system, programming models, performance analysis and modelling tools. Programmability and usability of the system, as well as application portability, are ensured by extending standard and commonly accepted programming models and system software components.

Call text: […] demonstrate their achievements in integrated pre-exascale prototypes.
DEEP-EST way to address it: The DEEP-EST prototype and its complete software stack are targeted to demonstrate scalability and performance at the Exascale level for traditional HPC and upcoming HPDA applications.

Table 1: Specific call text and how DEEP-EST addresses it

1.3 Concept and methodology

1.3.1 Motivation

One of the main challenges raised by the convergence between HPC and HPDA is to find an architecture that can match the requirements of both application fields. Traditional HPC applications are usually iterative and rely heavily on a small number of numerical algorithmic classes (like the original Berkeley “seven dwarfs” [4]) that operate on relatively small data sets and accrue very high numbers of floating point operations across iterations. HPC systems have been optimised according to these requirements, and it seemed justified to rank these machines purely on their Flop/s performance for DGEMM [5]. Over the years this has led to rather monolithic systems in which the amount of memory per core is steadily decreasing. However, both the complexity and the memory requirements of real-world HPC codes are ever increasing, leading to a dissonance with these traditional systems. In addition, the desire to support the HPDA workloads rapidly emerging from the “Big Data” community clearly requires a change in system architecture, since these workloads exhibit lower arithmetic intensity and require additional classes of algorithms to work well (see e.g. advanced deep-learning neural network algorithms). Moreover, some scientific fields like brain research are expected to make use of both technologies – HPC and HPDA – to the same degree in the future.

[4] https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf
[5] https://www.top500.org/

1.3.2 Concept

A new breed of HPC systems is needed to support the computation and data processing requirements of both traditional HPC and emerging HPDA workloads. The Cluster-Booster concept first implemented by DEEP broke with the traditional system architecture approach (based on replicating many identical compute nodes, possibly integrating heterogeneous processing resources within one node) by integrating heterogeneous computing resources in a modular way at the system level. More precisely, it connected a standard HPC Cluster based on general-purpose processors with a cluster of many-core processors (the “Booster”) by way of a highly efficient and high-speed network. No constraints are put on the combination of Cluster and Booster nodes that an application may select, and resources are reserved dynamically. This has two important effects: firstly, each application can run on a near-optimal combination of resources and achieve excellent performance; secondly, all the resources can be put to good use by a system-wide resource manager that combines the set of applications in a complementary way, increasing throughput and efficiency of use for the whole system.

The DEEP prototype was designed to support mainly HPC applications. The DEEP-ER project integrated additional non-volatile memory (NVM) layers and prototyped network-attached memory technologies. Both extensions enable truly scalable high-performance I/O and resiliency capabilities.

The “Modular Supercomputer Architecture” (MSA) introduced now with DEEP-EST takes the Cluster/Booster architecture to the next logical step (see Figure 1), and the project fully embraces HPDA applications. While DEEP and DEEP-ER combined just two modules dedicated to HPC applications, DEEP-EST will support the combination of multiple modules of different functionality or characteristics. In an MSA system, several “system modules” – each one tuned to best match the needs of a certain class of algorithms – are connected to each other at the system level. An optimised resource manager enables assembling arbitrary combinations of these resources according to the application workload requirements.

Figure 1: Logical view of the Modular Supercomputer architecture

Examples of components in an MSA system are:

Cluster Module (CM) that addresses the needs of applications (or parts thereof) requiring high single-thread performance and a modest amount of memory, which are typically the less scalable ones.


Extreme Scale Booster (ESB) that addresses the needs of highly scalable and compute intensive HPC codes (or code-constituents). Energy efficiency, packaging density and hardware scalability are also important aspects in the ESB design.

Data Analytics Module (DAM) to cover the specific requirements of data-intensive and HPDA applications (e.g. high memory capacity, specific operations like data streaming, bit manipulation etc.).

Further modules for specific communities or with innovative technologies (e.g. neuromorphic devices) may be added to this concept as well. Their inclusion within the project life-time will be decided taking application requirements, technology maturity and budget constraints into account.

Attached to the computing modules mentioned above, “Service Modules” are added to provide additional functionality:

The Scalable Storage Service Module (SSSM) provides the global storage capabilities for the whole system and all the running applications.

All (computing and service) modules are connected to each other by the:

Network Federation (NF), which provides high-speed connectivity between the modules. An inherent element of the MSA concept is that different network technologies might be required for the various modules in the system. The NF comprises all these network technologies, plus the hardware and communication protocols that seamlessly bridge between them. This type of switching support has already been successfully demonstrated in the DEEP project [6].

An advanced orchestrator of the heterogeneous resources is key for the efficiency and usability of this architecture. Resource management and job scheduling will be able to determine optimal resource allocations for each combination of workloads, support adaptive scheduling and enable dynamic reservation of the resources. Added to the modularity of the system itself, this guarantees maximum usage of the overall system, since no component is “blocked” for an application that does not utilise it. Furthermore, the DEEP-EST unified programming environment, based on standard components such as MPI and OmpSs but also on the emerging programming paradigms of the data-analytics field like map-reduce, will provide a model that fully supports applications in using combinations of heterogeneous nodes, and enables developers to easily adapt and optimise their codes while keeping them fully portable. For example, applications that chain various steps of their simulation/analysis in a workflow present an inherent modularity, mapping perfectly onto an MSA system while providing the opportunity to optimise the data flow between the steps. Resiliency features at low level (network) and high level (programming environment) provide the best possible response to failures of individual system components.

With this modular approach, DEEP-EST provides a very flexible architecture that can match the requirements of totally different classes of applications, making use of the modules according to their respective needs. The six HPC and HPDA applications in DEEP-EST have been chosen carefully to allow for testing a variety of different scenarios. Monolithic applications may well use only one of the modules, but more complex, multi-physics or multi-scale applications will distribute their code-constituents among several modules of the system and achieve better scalability and efficiency. This also offers the opportunity for more complex workflows or to conduct simulation and data analysis/visualisation concurrently, with the high-speed connection between different modules facilitating the necessary data transfers. A code may for instance run on the ESB, while the analysis of the generated data is performed on the DAM and its visualisation on the CM. Indeed, with an adapted runtime environment, such a constellation would even be appropriate for “interactive supercomputing”: analysing/visualising the status of the simulation in a different module while the simulation is advancing, and enabling the end-user to control the simulation accordingly.

[6] N. Eicker, et al.: “Bridging The DEEP Gap - Implementation of an Efficient Forwarding Protocol”, Technical Report FZJ-2014-05536, Intel European Exascale Labs, 2014, http://juser.fz-juelich.de/record/1719822013

From the data centre operator’s point of view, this approach has several advantages, too. First of all, the better and more efficient system utilisation made possible by the MSA will immediately benefit the data centre operator, leading to a higher return on investment (ROI). Secondly, to maximise this effect it is possible to optimise the system configuration, i.e. the number and characteristics of the modules, to best match the specific requirements of the centre and its application portfolio and mix. Additionally, maintenance of individual modules is possible without disturbing the rest of the system, reducing the overall down-times of the machine. Furthermore, the long-term sustainability is improved: new modules can be added to an existing system and old ones substituted at different points in time, keeping the rest of the system and its central resources (e.g. storage) for a much longer lifetime. At some point this may involve bridging between older and newer network generations. Finally, procurement processes, which typically depend on different funding sources (e.g. regional, national, project-bound, etc.), can be split and handled individually for the independent modules much more easily.

1.3.3 Technology readiness

The DEEP-EST project will design, build, and deploy a prototype system of the MSA. Within the predecessor projects DEEP and DEEP-ER, the consortium has demonstrated its capability to build systems of similar complexity and put them into production-like operation. The goal for DEEP-EST is to deploy a system at Technology Readiness Level [7] (TRL) 7 during the runtime of the project, which will mature to TRL 8 or higher by the end of the project. Most of the components (hardware and software) constituting the prototype will be at TRL 8 or higher, but for some (e.g. the Global Collective Engine (GCE), currently at TRL 2, or the Network Attached Memory, at TRL 4) the project will be a vehicle to push their TRL beyond 7. This mixture of technology readiness levels of the individual building blocks will help to tackle the challenge of creating an Exascale-ready system using novel technologies, while at the same time mitigating the risk of delivering a system of too low a TRL to be reasonably usable for day-to-day workloads.

1.3.4 Links to other research and innovation activities

DEEP-EST builds upon the results achieved in the FP7 projects DEEP and DEEP-ER, and will further develop technologies and concepts first created therein. Their hardware prototypes at JUELICH will also be used for software development and application porting. DEEP is already completed, and DEEP-ER will finish in March 2017 (before DEEP-EST starts). There is therefore no risk of missing results needed for DEEP-EST’s success.

Links to other research and innovation activities are at the level of experience sharing. For example, links to other FP7 projects (e.g. Mont-Blanc) were established by DEEP and DEEP-ER through the “European Exascale Projects” (EEP) initiative. This kind of collaboration, now driven by H2020 coordination actions such as the EXDCI [8] project, will be continued by DEEP-EST (see Section 2.2.2). Though the overlap in time with EXDCI is small, a follow-up support action is expected to come out of the FETHPC-03-2017-a call.

Also with PRACE, participation in and co-organisation of training workshops and dissemination activities will be sought. At the time of writing, DEEP-ER is preparing a memorandum of understanding (MoU) with PRACE to open the use of the DEEP-ER prototype to PRACE members. If the collaboration is fruitful, the same approach will be followed by DEEP-EST. Similar scenarios will be considered for the Centres of Excellence and other EU projects, including the HBP.

[7] http://ec.europa.eu/research/participants/data/ref/h2020/other/wp/2016_2017/annexes/h2020-wp1617-annex-g-trl_en.pdf
[8] www.exdci.eu


Furthermore, through the partners’ participation in international research and innovation collaborations and initiatives, the DEEP-EST project will share experiences and gain visibility in the wider HPC community. Examples are the Joint Laboratory for Extreme Scale Computing [9] (JLESC), in which JUELICH and BSC are members, and the Big Data and Extreme-Scale Computing [10] (BDEC) initiative, in which BSC, UEDIN, ETH-Aurora, JUELICH, Intel, KULeuven, BADW-LRZ, and JUELICH’s linked third party ParTec participate.

Finally, through its contribution to the convergence of HPC and HPDA, DEEP-EST also supports the European Cloud Initiative [11], in particular the European Open Science Cloud [12]. Future projects in this and other areas will profit from the results achieved in DEEP-EST.

[9] https://jlesc.github.io/
[10] http://www.exascale.org/bdec/
[11] https://ec.europa.eu/digital-single-market/en/european-cloud-initiative
[12] https://ec.europa.eu/research/openscience/index.cfm?pg=open-science-cloud

1.3.5 Methodology

1.3.5.1 Co-design

The DEEP-EST project is driven by co-design activities, at the core of which is an ambitious mix of co-design applications shaping the implementation of the proposed architecture. Six ambitious, high-impact scientific applications from the fields of Neuroscience (NEST), Molecular Dynamics (GROMACS), Radio Astronomy (SKA data analysis), Space Weather (iPIC3D), Earth Science (HPDBSCAN, piSVM, DNN), and High Energy Physics (CMS data analysis pipeline) have been selected as the guiding codes, ensuring that the MSA developed in DEEP-EST will be useful for real-world applications. All but GROMACS contain significant HPDA parts. Their code structure (constituents) and their requirements in terms of compute, memory, network, storage, and functionality will be carefully analysed to define the characteristics of the individual hardware modules. Each application will set its own specific requirements to achieve best performance. The co-design effort will be considered successful not only when a specific code is able to run faster or with less energy on the DEEP-EST prototype than on other systems, but also when the varied group of six applications runs more efficiently as a portfolio – e.g. thanks to the system modularity and its improved resource management and allocation functionality – optimally exploiting the available hardware resources. To achieve this, the co-design applications will be adapted.

DEEP-EST includes all layers of the SW stack in the co-design activities. For instance, there is a co-design relationship between the system SW and the hardware (e.g. on monitoring sensors) as well as between the programming models/APIs and the hardware (e.g. I/O and/or storage capabilities for checkpoint/restart). Within the SW layers, the programming models/APIs will take into account the application requirements. Finally, the influence always goes in both directions – the hardware, for instance, influences fundamental design decisions of the application and middleware developers, and applications will be adapted to make use of programming model innovations.

The Design and Development Group (DDG) – constituted by technical experts from the hardware, software and application domains – will drive the co-design discussions and, based on their outcome, take the most important design decisions in DEEP-EST. This process will continue throughout the whole project, adapting the requirements where necessary as applications mature, hardware specifications concretise, and system software evolves.

1.3.5.2 The DEEP-EST hardware prototype

The DEEP-EST prototype will be a mid-size system with a total combined peak performance of around 1 PFlop/s, containing the modules depicted in Figure 2. This initial set of modules shall be sufficient to demonstrate the modular concept, to explore several of its use cases, and to test the orchestration capabilities developed in the course of the project.


The exact configuration of the different modules in terms of processor and memory technology, and the performance and amount of both, will be decided in the initial phase of the project based on a comprehensive co-design effort (see Section 1.3.5.1).

Figure 2 describes the high-level hardware characteristics of the DEEP-EST prototype:

The Cluster Module (CM) shall be built with high-end general-purpose processors and a significant amount of memory per Cluster Node (CN).

The Extreme Scale Booster (ESB), with highly energy-efficient many-core processors (e.g. third generation Intel Xeon Phi, code-named Knights Hill) as Booster Nodes (BN) with a certain amount of on-package memory per core at high bandwidth.

The Data Analytics Module (DAM), with nodes (DN) based on general-purpose processors and a huge amount of (DRAM + non-volatile) memory per core. Depending on the application needs, integrated CPU+FPGA chips and PCIe-attached GPGPUs will be considered.

The Scalable Storage Service Module (SSSM) provides the global storage capabilities for the whole system and all the running applications. It will also serve as the interface to the large-scale external storage pool used at JUELICH for the production systems.

The Network Federation (NF) will leverage the flexibility of the European EXTOLL interconnect technology to efficiently bridge to other interconnect technologies that might be used in other modules (e.g. Intel’s Omni-Path, Mellanox’s InfiniBand, etc.).

Figure 2: High level hardware configuration of the DEEP-EST prototype

Two additional components will be included in the DEEP-EST prototype.

Network Attached Memory (NAM): providing high-capacity, wire-speed accessible, non-volatile memory, e.g. as scratchpad or checkpoint-restart space. For its implementation, integrated CPU+FPGA processing elements are planned, and technology options for the non-volatile memory will be considered (e.g. NV-DIMM or SSD-replacement devices based on technologies like NAND or 3D XPoint). This takes the shared-memory capabilities of the DEEP-ER NAM to its logical next level.

Global Collective Engine (GCE): a processing element (such as an FPGA) combined with volatile memory and directly attached to the EXTOLL fabric, which will accelerate MPI collective operations. The potential of such acceleration is expected to be substantial with the advent of non-blocking collectives in the MPI-3.0 standard.
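For illustration, the sketch below shows the MPI-3 non-blocking collective pattern that an engine like the GCE could accelerate: the reduction is started, local work proceeds, and the result is picked up afterwards. It is a generic mpi4py example (assuming mpi4py and NumPy are available), not GCE or EXTOLL code:

```python
# Sketch of the MPI-3 non-blocking collective pattern a GCE could accelerate:
# the reduction proceeds (potentially in hardware) while the ranks keep computing.
# Requires mpi4py and NumPy; run e.g. with: mpiexec -n 4 python overlap.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
local = np.random.rand(1_000_000)
partial = np.array([local.sum()])
total = np.zeros(1)

req = comm.Iallreduce(partial, total, op=MPI.SUM)  # start the collective
unrelated = np.sqrt(local).sum()                   # overlap with local work
req.Wait()                                         # result is now in `total`

if comm.rank == 0:
    print("global sum:", total[0])
```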

In addition to the DEEP-EST prototype, hardware platforms such as evaluators and software development vehicles (SDVs) will be deployed early in the project to develop system-level and application software. These will make component technologies available as soon as possible, utilising commodity off-the-shelf (COTS) technologies in combination with engineering samples. The use of these initial systems will also provide very valuable information for the actual development of the prototype.

1.3.5.3 Energy efficiency

As the DEEP-EST project strives to develop a scalable design for the 100 PFlop/s class, energy efficiency will be one of the most important design criteria in order to make it a sustainable and affordable HPC system. The MSA is inherently energy efficient, as it allows finding a close match between application requirements and hardware capabilities. This ensures efficient code execution and thus reduces the application runtime and its energy-to-solution. This approach will be further supported by using the latest available hardware technology to build the prototype system. Additionally, highly efficient direct warm-water cooling will be used, which eliminates the need for energy-hungry mechanical chillers and facilitates the re-use of waste heat. An extensive set of sensors will be deployed in the prototype to closely monitor the power consumption of individual components. In combination with sophisticated analysis tools and energy models, this will help to identify inefficiencies and devise potential angles for optimisation.
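As a purely illustrative aid (not project tooling), energy-to-solution can be obtained from such power monitoring by integrating the sampled power over the runtime; the helper below does this with the trapezoidal rule on made-up sample data:

```python
# Illustrative only: energy-to-solution from a stream of (timestamp, watt)
# samples, as a power-monitoring infrastructure like the one described above
# might deliver. Trapezoidal integration of power over time gives energy.
def energy_to_solution(samples):
    """samples: list of (time_in_seconds, power_in_watts), sorted by time.
    Returns the consumed energy in joules."""
    energy = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        energy += 0.5 * (p0 + p1) * (t1 - t0)
    return energy

# Hypothetical 10-minute run sampled once a minute at roughly 120 kW:
run = [(60.0 * i, 120_000.0 + 500.0 * (i % 3)) for i in range(11)]
joules = energy_to_solution(run)
print(f"{joules / 3.6e6:.2f} kWh")   # convert J -> kWh
```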

1.3.5.4 Resource management and job scheduling

Resource management and job scheduling will play a pivotal role in the success of the overall MSA. Existing job schedulers guarantee efficient use of monolithic supercomputers. The MSA, however, requires capabilities to manage heterogeneous resources, to enable co-scheduling of resource sets across modules, and to handle dynamically varying resource profiles. The project will extend SLURM’s [13] capabilities to provide this functionality in a flexible way, and to significantly enhance the scalability of the scheduler itself. While the expected size of the prototype might allow using SLURM as is, the significantly larger sizes of future production systems would create major bottlenecks.

At the same time, similar new challenges arise on the side of resource management, which has to be ready for heterogeneous platforms and provide sufficient scalability for future production systems. The project will use the psslurm component of the ParaStation process management, which has proven its suitability for daily use on JUELICH’s production supercomputer JURECA. To meet the new demands, this software will be extended and improved towards scalability and support of heterogeneous platforms. In addition, new capabilities for dynamic resource management will be implemented.
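The following toy sketch (hypothetical, not SLURM or psslurm code; module names and node counts are invented) condenses the basic co-scheduling decision such a system has to make: check that a heterogeneous job fits into the free nodes of every module it requests, and reserve those nodes across modules:

```python
# Hypothetical sketch (not SLURM/psslurm code) of the basic co-scheduling
# question: can a heterogeneous job be placed on the currently free nodes
# of every module it asks for, and if so, reserve them across modules.
free_nodes = {"CM": 50, "ESB": 400, "DAM": 12}   # toy system state

def try_schedule(job_request, free):
    """job_request maps module name -> nodes needed, e.g. {"ESB": 256, "DAM": 4}.
    Returns True and reserves the nodes if the job fits on all modules."""
    if any(free.get(m, 0) < n for m, n in job_request.items()):
        return False                      # at least one module lacks capacity
    for m, n in job_request.items():      # reserve on every requested module
        free[m] -= n
    return True

print(try_schedule({"ESB": 256, "DAM": 4}, free_nodes))   # True  -> reserved
print(try_schedule({"ESB": 200}, free_nodes))             # False -> only 144 ESB nodes left
```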

1.3.5.5 Programming environment

The “classical” field of HPC applications is dominated by the use of MPI to express distributed-memory parallelism and OpenMP for thread parallelism. The dynamic, task-based OmpSs [14] model has shown definite benefits, and serves as a “breadboard model” to drive the future evolution of OpenMP. MPI itself is being extended by features that materially improve scalability, and it looks poised to keep its position for a significant time. For DEEP-EST, MPI’s full and integrated support for heterogeneous systems is important. Therefore, the project will continue the development of OmpSs and ParaStation MPI to support the MSA, and to interact with the resource management and scheduler. The project will also investigate how MPI and OmpSs can help implement HPDA frameworks or applications, improving performance and efficiency on the DEEP-EST system and/or ensuring scalability of HPDA applications. However, it is highly unlikely that all users from this field will adapt to the MPI+OpenMP programming models of the HPC community. Therefore, an extensive evaluation of the available HPDA programming models will be done, and those fitting the needs of the DEEP-EST applications will be integrated with the resource management, the job scheduler and the HPC programming models (see Section 1.3.5.6).

[13] http://slurm.schedmd.com/
[14] https://pm.bsc.es/ompss
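As a rough illustration of module-spanning execution with standard MPI (generic mpi4py code under invented names, not ParaStation MPI or OmpSs functionality), a front-end part can spawn worker ranks via the MPI dynamic-process interface and collect a reduced result from them; a resource manager could place such spawned ranks on a different module:

```python
# Hypothetical sketch (not ParaStation MPI or OmpSs code) of MPI dynamic-process
# offload: a front-end ("Cluster-side") part spawns worker ranks -- which a
# resource manager could place on another module -- and reduces a result over
# them.  Run with:  mpiexec -n 1 python offload.py
from mpi4py import MPI
import numpy as np
import sys

if len(sys.argv) > 1 and sys.argv[1] == "worker":
    # Worker ("Booster-side") part: contribute this rank's number and leave.
    parent = MPI.Comm.Get_parent()
    value = np.array([float(MPI.COMM_WORLD.rank)])
    parent.Reduce(value, None, op=MPI.SUM, root=0)
    parent.Disconnect()
else:
    # Front-end part: spawn 4 worker ranks running this same script.
    workers = MPI.COMM_SELF.Spawn(sys.executable,
                                  args=[__file__, "worker"], maxprocs=4)
    total = np.zeros(1)
    workers.Reduce(None, total, op=MPI.SUM, root=MPI.ROOT)
    print("sum contributed by the workers:", total[0])   # 0+1+2+3 = 6.0
    workers.Disconnect()
```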

DEEP-EST will also address the topics of I/O and resiliency. The efficient management of data between different modules poses a great challenge for I/O systems such as BeeGFS and SIONlib, which will leverage new non-volatile memory technologies to cope with this new scenario. Moreover, traditional checkpointing libraries (e.g. FTI or SCR) will be enhanced with new features and a simpler interface to deal with new application requirements.
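To keep the checkpointing libraries' real APIs out of scope, the sketch below illustrates the general multi-level idea in library-agnostic form (all paths and names are invented, and this is not the FTI or SCR interface): write the checkpoint to a fast local tier first, then stage it to shared storage in the background:

```python
# Library-agnostic sketch of two-level checkpointing (this is NOT the FTI or
# SCR API): write the checkpoint to a fast local tier first, then stage it
# asynchronously to a slower, globally visible tier.
import os, pickle, shutil, threading
from pathlib import Path

# Stand-ins for the two tiers; on a real system these would point at
# node-local non-volatile memory and at the parallel file system.
LOCAL_TIER  = Path(os.environ.get("CKPT_LOCAL",  "/tmp/ckpt-local"))
SHARED_TIER = Path(os.environ.get("CKPT_SHARED", "/tmp/ckpt-shared"))

def checkpoint(step, state):
    LOCAL_TIER.mkdir(parents=True, exist_ok=True)
    SHARED_TIER.mkdir(parents=True, exist_ok=True)
    local_file = LOCAL_TIER / f"step-{step}.pkl"
    with open(local_file, "wb") as f:          # level 1: fast local write
        pickle.dump(state, f)
    # level 2: copy to shared storage in the background, overlapping compute
    t = threading.Thread(target=shutil.copy2, args=(local_file, SHARED_TIER))
    t.start()
    return t                                   # caller may join() before the next checkpoint

# usage: handle = checkpoint(42, {"field": [0.0] * 1024}); ...; handle.join()
```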

1.3.5.6 Data Analytics

In general, data analytics is characterised by widespread use of high-level frameworks that hide I/O operations, support the prevalent data analytics parallel processing patterns, and implement increasingly complex machine learning algorithms. Examples are Hadoop [15] and Spark [16] for the first two areas, and TensorFlow [17], Caffe [18], and Theano [19] for the latter. These frameworks are often underpinned by lower-level data processing and numerical libraries, such as Intel’s Data Analytics Acceleration Library [20]. DEEP-EST clearly cannot support all of these frameworks within the project’s time frame and budget. The first application analysis and co-design phase will decide which data access, coordination and machine learning frameworks will be covered in DEEP-EST, with a priority on supporting the application needs. The results will directly influence the Data Analytics Module (DAM) architecture in WP3 and WP4, as well as the software architecture work in WP5 (e.g. for interaction with the resource management and scheduler) and the programming model work in WP6. The latter will focus on enabling use of the selected frameworks in heterogeneous applications, and on porting and optimising their implementations to reach highest efficiency on the DAM and make good use of the NAM functionality. For these ports, the benefits of using the DEEP-EST HPC programming models will be investigated. In addition, application developers will be supported in their use of said frameworks, and advice will be given on how to best evolve the applications to make optimal use of the DEEP-EST system.

[15] http://hadoop.apache.org/releases.html
[16] http://spark.apache.org/
[17] https://www.tensorflow.org/
[18] http://caffe.berkeleyvision.org/
[19] http://deeplearning.net/software/theano/
[20] https://software.intel.com/en-us/blogs/daal

1.3.5.7 Benchmarking and modelling

In the last phase of the project, once the prototype is up and running, the software layers will be tested under production conditions. Applications adapted to the system characteristics and its programming environment will be benchmarked utilising the whole machine, and the results will be compared with those obtained on other systems. The six full-fledged applications participating in the project (see Section 1.3.5.1) will be used to verify the overall DEEP-EST concept.

Benchmarking and modelling complement the application activities. Synthetic benchmarks enable testing innovative technologies at early stages of the project, which after qualification can be selected for integration into the prototype platform. The hardware and software developed within the DEEP-EST project will also be evaluated with benchmarks and applications. These tools, combined with software to model the system and its behaviour, will allow extrapolating the obtained results into the range of hundreds of PFlop/s.
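As an illustration of such an extrapolation (not the Extrae/Paraver/Dimemas methodology, and with made-up timings), one can fit a simple strong-scaling model T(p) = t_serial + t_parallel/p to a handful of measured runs and evaluate it at much larger node counts:

```python
# Illustrative only (not the Extrae/Paraver/Dimemas methodology): fit the
# simple strong-scaling model T(p) = t_serial + t_parallel / p to a few
# measured runtimes and extrapolate to node counts beyond the prototype.
import numpy as np

nodes    = np.array([16, 32, 64, 128])              # measured configurations
runtimes = np.array([410.0, 215.0, 118.0, 70.0])    # seconds (made-up numbers)

# Linear least squares in x = 1/p:  T = t_serial + t_parallel * x
A = np.vstack([np.ones_like(nodes, dtype=float), 1.0 / nodes]).T
t_serial, t_parallel = np.linalg.lstsq(A, runtimes, rcond=None)[0]

for p in (1024, 16384):
    print(f"{p:6d} nodes -> predicted {t_serial + t_parallel / p:7.1f} s")
```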

The MSA provides a unique opportunity to achieve performance at Exascale for both compute-intensive HPC applications, and data-intensive HPDA applications. The benchmarking, application, and modelling results to be achieved in the DEEP-EST project will serve to demonstrate this.

1.3.6 Gender analysis

DEEP-EST recognises the necessity of improving gender diversity, and diversity in general, in the STEM fields (science, technology, engineering and mathematics), which of course include HPC. This applies both to the developers and to the users of systems like the one proposed in the project.

Partner UEDIN was instrumental in founding the Women in HPC [21] network, which has now grown to be a global network championing and supporting gender diversity in high performance computing. Furthermore, UEDIN has also launched the Diversity in HPC [22] initiative to highlight and improve diversity in high performance computing and computational simulation. Other partners also have their own diversity initiatives, such as JUELICH’s Office for Equal Opportunities [23].

[21] http://www.womeninhpc.org
[22] http://www.hpc-diversity.ac.uk
[23] http://www.fz-juelich.de/bfc/DE/Home/home_node.html

DEEP-EST will build on, and engage with these initiatives to highlight and address diversity in the project, within its user base, and in the wider community. The project will organise, contribute to, and attend events, and follow recommendations that these organisations propose. The DEEP-EST female technical staff will be encouraged to be role models for the next generation of women HPC developers and users. Furthermore, the project aims at having an equal number of male and female Work Package (WP) leaders, demonstrating through the diverse technical leadership a range of role models for early career researchers to follow.

1.4 Ambition

1.4.1 Advances beyond the state-of-the-art

The project will for the first time implement the Modular Supercomputer Architecture (MSA), and it will conduct a thorough evaluation of such a system with six ambitious applications that combine HPC simulation and large-scale HPDA. DEEP-EST aims to improve the usability, flexibility and sustained performance of future supercomputers significantly beyond what can be provided by today’s monolithic systems. Across several technology fields, significant progress beyond the state of the art is required in order to attain the flexibility to combine modules optimised for specific application areas:

Data Analytics Module (DAM) incorporating cutting-edge acceleration and non-volatile memory technology and designed to best support large-scale HPDA applications.

Combination of modules with different interconnects: the project will demonstrate the Network Federation between e.g. Intel’s Omni-Path technology and the EXTOLL interconnect. A highly optimised bridging protocol between these interconnects will be developed, significantly extending the results from the DEEP project.

Efficient management of heterogeneous resources. The ParaStation Resource Management will be extended to handle arbitrary combinations of resources across the modules.

Efficient co-scheduling of resources spread across different modules: the project will extend the SLURM scheduler to allow for flexible and scalable operation of an MSA system, both for applications utilising different modules at the same time and for workflow-oriented approaches.

Data Analytics frameworks and programming models that fully leverage the clustered DAM and support scale-out of large data analytics problems, integrated with HPC programming models, where it makes sense.

1.4.2 Innovation potential

The MSA is a novel approach to overcome the fundamental limitations of today’s monolithic supercomputer architectures with respect to flexibly and efficiently serving heterogeneous applications like multi-physics simulations, combinations of classical HPC simulations with HPDA, and the large mixes of applications with different resource requirements typically run by a supercomputer centre. The architecture combines innovative hardware and software elements, and DEEP-EST will build a fully working prototype out of cutting-edge component technology.

The potential of this innovation is immense: if successful, throughput for a centre would increase, and TCO and energy required for the above mentioned applications would be substantially reduced. Looking further into the future, the results would greatly influence the way of harnessing von Neumann architectures and make them accessible to users in science and industry.

In addition, the component modules and the Network Federation will drive innovation and achieve scalability and efficiency gains over the best of breed systems today. Of particular interest are the advances in interconnect technology (EXTOLL) and the Network Federation, the acceleration of collective communication (GCE), the provision of globally accessible memory resources (NAM) and the novel clustered Data Analytics Module (DAM) itself.

In combination, the innovations driven by DEEP-EST will show a way to achieve highest scalability, performance and energy efficiency while supporting the full range of HPC and HPDA applications, and ensuring that HPC centres can make full use of their installed system capabilities and capacity.

2 Impact

2.1 Expected impacts

2.1.1 Contribution to the SRA realisation

Objectives and actions proposed for the DEEP-EST project will support key parts of the “Strategic Research Agenda” (SRA) [24] as published by the ETP4HPC. The discussion below refers to the latest version of this document, labelled “2015 update”. It is organised by “technology” area as defined in the SRA.

2.1.1.1 HPC System Architecture and Components

Work packages 3 and 4 will contribute to the milestones M-ARCH-1 (CM, DAM and ESB node architecture), M-ARCH-2 and M-ARCH-3 (use of on-package, high-bandwidth memory for ESB and NVDIMM for ESB and DAM), M-ARCH-5 and M-ARCH-6 (EXTOLL network federation and NAM). M-ARCH-4 will be addressed by the development of a new EXTOLL ASIC outside of the project. Progress towards M-ARCH-7 will be made for the ESB (substantial improvement with respect to the current Intel Xeon Phi “KNL” efficiency) and for the DAM (by the use of an integrated FPGA). The introduction of the GCE as a network-integrated acceleration engine falls under M-ARCH-10.

24 http://www.etp4hpc.eu/en/sra.html


2.1.1.2 System Software and Management

The generalised programming models developed in WP6 directly support M-SYS-OS-6, and the included support for using deep memory hierarchies fits under M-SYS-OS-3. The network management and federation work in WP5 will realise M-SYS-IC2.

The focus of DEEP-EST on advanced resource management and scheduling in WP5 will significantly support M-SYS-RM1 (scalability) and M-SYS-RM2 (adaptability). The scalable monitoring and RAS system specified and implemented across WP3, WP4 and WP5 is a critical ingredient for realising M-SYS-CL2. The MSA and its initial implementation in DEEP-EST will provide the infrastructure for implementing the M-SYS-Vis-xx milestones, although no specific implementation work is scheduled in the project.

2.1.1.3 Programming Environment

The specific WP6 activities on MPI and OmpSs will contribute to M-PROG-API-2 and M-PROG-API-3. The work on data analytics programming models will provide important input to M-PROG-API-5, M-PROG-API-LIB-3 and potentially M-PROG-API-6 and M-PROG-API-7.

2.1.1.4 Energy and Resiliency

The application analysis and modelling activities in WP2 will clearly benefit M-ENR-ES1, and together with scheduling enhancements at the system (WP5) and application (OmpSs) level will further M-ENR-ES2. Provision of on-package memory (ESB) and of NVDIMM (ESB, DAM) plus the novel NAM will contribute directly to M-ENR-AR4. Resiliency improvements in WP6 target M-ENR-FT10 – here, the focus of DEEP-EST is on restarting failed computation efficiently rather than preventing failures.

2.1.1.5 Balance Compute, I/O and Storage Performance

The planned use of NVDIMM in some modules (ESB, DAM) and the integration of the NAM into the Network Federation will implement M-BIO-1 and contribute to M-BIO-3. Extensions of BeeGFS and SIONlib target M-BIO-4 and M-BIO-8, and several WP2 applications realise the vision of M-BIO-5.

2.1.1.6 Big data and HPC Usage Models

DEEP-EST drives its co-design from a set of applications that include substantial and advanced data processing and analytics. As a consequence, the DEEP-EST prototype will directly implement M-BDUM-DIFFUSIVE-1 and M-BDUM-DIFFUSIVE-2. The analysis work in WP2 will contribute to M-BDUM-METRICS-1, M-BDUM-METRICS-2, and M-BDUM-METRICS-3. WP6 will address M-BDUM-PROG-1 and parts of M-BDUM-PROG-2 as needed by the HPDA co-design applications. Finally, progress will be made towards M-BDUM-MEM1 and M-BDUM-MEM2, again determined by actual application requirements.

2.1.2 Proof-of-concept

The DEEP-EST project will provide the hardware, system software, programming environment, tools, and application experience to prove the MSA concept. The DEEP-EST prototype (see Section 1.3.5.2) will be the first system designed as an MSA-system, even though the DEEP/-ER systems have already shown some of the expected characteristics. The hardware and software configuration will be designed to support the addition of further modules, remaining technology-agnostic and focusing on scalability, programmability, energy efficiency and application portability. To guarantee the achievement of these ambitious goals, a set of six applications – representing the HPC and HPDA sectors and a variety of research fields – participate in the co-design effort that will shape the precise configuration of the DEEP-EST prototype (see Section 1.3.5.1) and serve to demonstrate its capabilities.


2.1.3 Covering broader segments and emerging HPC markets

The DEEP-EST technology addresses the needs of existing medium- and large-scale supercomputer centres, which all run a large variety of applications with widely differing resource requirements. This corresponds to the “Supercomputer” and “Divisional” segments as defined by IDC25, which in 2015 made up roughly half of the HPC market.

The more rapidly growing “Departmental” class of HPC system operators will profit if they run heterogeneous workloads or workflows with different steps showing different characteristics. It is hard to estimate a reliable percentage, yet the increasing take-up of multi-physics simulations and complex end-to-end workloads including data pre-processing and analysis indicates that a sizeable part of such HPC operators can be addressed.

Furthermore, the project specifically addresses two important emerging markets connected to HPC:

Intersection of HPC and HPDA: Five of the six DEEP-EST applications combine simulations and advanced data analytics, and the co-design approach will ensure that DEEP-EST technology fits the needs of this market segment. In addition, very large-scale data analytics problems will require the use of clustered resources, for which the DAM and the associated SW stack will provide a good solution.

HPC providers in the Cloud: IDC reports growing uptake of Cloud HPC services, and conversely a growing share of HPC system operators that provide such services. The latter are a target market for DEEP-EST, since they will see a wide variety of applications and workloads, and they will have to put a premium on highly efficient utilisation of their systems.

2.1.4 Impact on standard bodies and international research programs

The software used in the DEEP-EST project is in most cases distributed as Open Source (see Table 3). The partners responsible for the different components invest effort in standardising their software, and will do the same for the extensions and enhancements implemented within DEEP-EST. Examples of such standardisation efforts are given in the topics below:

OpenMP forum: BSC is an active member of the OpenMP community and pushes OmpSs developments into the OpenMP standard.

MPI Forum: Findings concerning MPI-related extensions to ParaStation will be communicated by JUELICH’s linked third party ParTec to the MPI community, giving feedback to the MPI Forum as the standard setter.

SLURM: BSC and JUELICH will use their existing contacts within the SLURM community to push the DEEP-EST extensions into SLURM’s main branch.

Big Data Value Association (BDVA): Intel, BSC, and FHG are members and will promote the application and system results of DEEP-EST with an eye on influencing standardisation of HPDA use-cases and interfaces.

Community-specific impacts from the co-design applications. For earth sciences: DEEP-EST partners are active in the big data computing technology and earth science informatics areas. Findings of the DEEP-EST project in the field of earth science data analysis and analytics will be provided as input to the sessions of the relevant conference series, which often drive “de-facto standards” in the community.

2.2 Measures to maximise impact

2.2.1 Dissemination and exploitation of results

The dissemination and exploitation of results achieved in DEEP-EST is one of the project’s main objectives and will be pursued in a concerted action by all partners. Measures will be taken to maximise the project’s impact in the European and global HPC community and beyond. The basis for all these actions will be a comprehensive and concerted dissemination and communication campaign, addressing mainly the following target groups:

25 Joseph, E. et al., “IDC HPC Update at ISC16”, presented at the International Supercomputing Conference 2016, Frankfurt (Germany)

HPC community (HW and SW developers, special interest groups like openHPC, the EEHPC working group, young and female HPC researchers and students).

Data centres as potential operators of future MSA-systems.

Domain scientists and industrial HPC customers as potential users.

Standardisation bodies (see Section 2.1.4) and lobby groups (e.g. ETP4HPC).

Multipliers like journalists and social media influencers.

Political decision makers and the general public.

2.2.1.1 Dissemination of results: Draft dissemination plan

Disseminating the project results will be a key element of the DEEP-EST communication strategy (see Section 2.2.2). The dissemination activities will be based on four pillars:

Conferences and other events: Members will actively and regularly take part in relevant (scientific) conferences, trade fairs, workshops and trainings to present their results as well as to share lessons learned and discuss best practices.

Publications: The DEEP-EST researchers will aim at publishing their results in peer-reviewed journals and conference proceedings following the Open Access (OA) policy for EU-funded projects (see Section 2.2.1.3.1 for details).

Technical documentation: Towards the end of the project, the technical documentation will be made available and an overview of DEEP-EST software (see Section 2.2.1.2.1) will be made public via the website.

Website and media channels: The project website will be the central hub for collecting and sharing dissemination material: once accepted, all non-confidential deliverables will be published via the website. Publications (scientific and non-scientific) as well as presentations and talks will also be included. The project’s social media channels will link to the website content to increase its visibility.

Dissemination activities will continue after the end of the project, as results achieved during the last phase of the project will likely be published after its finalisation. The DEEP-EST prototype will remain accessible to the application developers and its continued use will be encouraged. Additionally, the consortium will strive to make the prototype available to external users, e.g. through contacts with PRACE or other projects (see Section 1.3.4). Such efforts will already start during the project with an early access programme directed at interested academic and industrial users.

The visibility of EU funding will be ensured: all scientific publications as well as all other dissemination and communication material (e.g. posters, slides, flyers) will include an acknowledgement sentence referring to the funding source and the specific project number, and will carry the European flag. Dissemination material resulting from the early access programme will be reviewed by the project before publication to make sure it follows the same standards.

2.2.1.2 Exploitation of results

European industry partners and SMEs in the computing sector – such as ETH-Aurora, Megware, EXTOLL, and JUELICH’s linked third party ParTec – intend to commercialise the results they achieve in DEEP-EST. Academic partners (JUELICH, BADW-LRZ, BSC, FHG-ITWM, UEDIN, UHEI) will strive for Open Source release of their software developments (see Section 2.2.1.2.1) and/or commercialisation through Third Parties or spin-off companies. Currently it is expected that exploitation will be possible for the fields mentioned in the following table.

Area | Foreseen impact and exploitation | Target audience


Table 2: Areas in which DEEP-EST will have an impact and foreseen exploitation.

Several tools will be employed to maximise the impact of the items mentioned in the table above. For items related to hardware development, architecture and system integration, the main vehicles will be trade fairs (e.g. SC and ISC), media relations, press releases, lobbying (e.g. via ETP4HPC), the partners’ own marketing efforts, and the early access programme. On the software and applications side, paper publications, trainings, workshops, the case study booklet and best practices & lessons learned guides will additionally be used to spread the word on the project developments and raise the interest of new users.

Depending on the project results, the DEEP-EST partners intend to deploy MSA-systems in their next-generation (pre-)Exascale production machines, installed in the time frame 2020-2025. The dissemination of the results by the PRACE consortium, with 25 member states, and through ETP4HPC among the HPC user industry will considerably increase the potential market for the DEEP-EST hardware and software technologies.

2.2.1.2.1 Open Source

The majority of the software components used in the DEEP-EST project are readily available as Open Source. Table 3 describes the main ones: their current licence, the partner responsible for the package, the type of support provided, and the extensions to the software done within DEEP-EST. The licences under which the DEEP-EST extensions to the existing software packages will be published shall be equally or more open than the original licence. In the case of new developments independent of the existing software packages, or for combinations of packages that currently carry different kinds of licences, the use of Open Source licences will be encouraged.


Product name | Current licence | Partner responsible | Support type | Extensions done in DEEP-EST
ParaStation MPI | Q Public License v1.0 | ParTec + JUELICH | Commercial from ParTec | ParaStation MPI (communication library and process management) extended to support MPI jobs distributed over a wide variety of resources.
OmpSs | GPLv3 | BSC | Community effort led by BSC | Generalisation of the DEEP offload model and improvements on the resiliency strategies first implemented in DEEP-ER.
SLURM | GPLv2 | BSC + JUELICH + Intel | Community effort. Commercial from SchedMD. | Resource manager and job scheduler extended to support the MSA.
BeeGFS | GPLv2 (client), BeeGFS EULA26 (server) | FHG-ITWM | Community led by FHG-ITWM. Commercial from ThinkParQ. | Implement storage pools and integrate into the SLURM resource manager to support the MSA. Plug-in architecture to support different storage backends.
libNAM | LGPLv2 | UHEI + JUELICH | Community effort led by UHEI. Commercial from EXTOLL. | Extensions to support the capacity-scaled and non-volatile NAM developed in DEEP-EST. Additional data operations.
Extrae/Paraver/Dimemas | LGPLv2.1 | BSC | Community effort led by BSC. | Improvements in efficiency modelling.
JUBE | GPL | JUELICH | Community effort led by JUELICH. | Integration of new benchmarks in the suite. If needed, extension of functionalities to support the DEEP-EST prototype and software stack.
DCDB27 | GPLv2 | BADW-LRZ | Community effort led by BADW-LRZ. | Web-based frontend for visualisation. Data analysis.
piSVM, HPDBSCAN, DNN | GPL | UoI | Community effort led by UoI and JUELICH. | Codes will be ported to the new DEEP-EST prototype in order to leverage the Data Analytics Module and NAM.

Table 3: List of software produced or enhanced in the DEEP-EST project

The project website will promote the DEEP-EST software during and after the end of the project. A short description of each of the software components of the DEEP-EST environment will be given, detailing the kind of licence and support model applying to each of them. Links to the websites and/or repositories of each of those components – typically available at the websites of the responsible partners – will be given.

26 http://www.beegfs.com/docs/BeeGFS_EULA.txt
27 Data Center DataBase


2.2.1.3 Management and protection of intellectual property and knowledge

Procedures and rules to manage the Intellectual Property (IP) will be detailed in the Consortium Agreement (CA), which will be based on the DESCA template. Currently it is expected that IP will be created in the fields mentioned in Table 2. This list is not exhaustive and may not be used to set a precedent for any additional regulation later defined in the CA. The partners’ adherence to the CA will be overseen by the Coordinator.

The articles in the CA will be formulated with three goals:

Protect the IP, and with it the commercial interests of the partners.

Guarantee that access is given to the IP required to achieve the project goals.

Facilitate the exploitation of the project results within and beyond the project duration.

Regarding the publication of project results and the protection of the partners’ knowledge, the partners commit to informing the rest of the consortium of planned publications and communication activities containing project content before their submission to any journal, conference or workshop. A mailing list will facilitate the distribution of such information.

2.2.1.3.1 Open Access and self-archiving

The consortium fully supports the open access strategy demanded for EC-funded projects and will strive for open access in all publications. As experience shows, however, it will be more feasible to aspire to a mix of the ‘gold’ and ‘green’ open access models. Budget has already been allocated within the proposed dissemination budget (see Section 3.4.2) to pursue a gold open access strategy and make scientific, peer-reviewed articles resulting from the project accessible upon publication. In cases where a journal only allows for the ‘green’ access model, the researchers themselves will be responsible for self-archiving their articles, as most institutions in the consortium already maintain their own repositories. Should a consortium member lack its own repository, the articles will be made available via open access repositories (e.g. Zenodo). The official project website will announce all DEEP-EST publications, linking to the corresponding repositories.

Open Access guidelines will be collected in a fact sheet and made accessible to all partners, including information on which journals are open access or hybrid and how to handle negotiations with publishers. The DEEP-EST partners will inform WP7 when they have submitted project-related papers for publication and of the acceptance or rejection of these. This will allow WP7 to keep the list of publications on the project website up to date.

With respect to the Open Research Data pilot started for H2020 projects, the DEEP-EST consortium will monitor the scientific datasets produced within the project and evaluate whether they are relevant for this pilot. It is important to note here that the innovation in DEEP-EST is of a technical nature. The project itself will not generate scientific research data that contributes to advancing science in specific domains. Rather, the project aims at developing new technical resources that shall help to advance science in the future. For this reason, DEEP-EST opts out of the pilot programme. Still, in case relevant scientific datasets emerge, the project will strive to make these publicly accessible and the exploitation plan will be extended with a proper data management plan.

2.2.2 Communication activities

Outreach activities by the DEEP-EST project will be grounded on a solid communication strategy, which is broader in scope than the mere dissemination activities listed in Section 2.2.1.1. The main objective of the communication efforts is to bring the overall MSA concept, including its unique selling points, the project’s results, and lessons learned to the awareness of the defined target audiences (see Section 2.2.1). A consistent communication approach will be applied, based on DEEP-EST’s own strategy and creating synergy effects with other European Commission (EC) funded FET-HPC projects.


DEEP-EST will follow the communication strategy taken in DEEP/-ER, executed via a mix of owned, earned and paid channels. The project’s owned channels will include the project website (www.deep-projects.eu, web domain already reserved) and social media channels. Earned channels will be based on strong relations to journalists from traditional media as well as social media influencers. These stakeholders will help increase the visibility and impact of the DEEP-EST project by mentioning, referring to, and reporting about the project, its objectives and outcomes. Last but not least, when deemed necessary and beneficial to the project, fees will be paid for publication (in media outlets as well as for scientific publications).

Concrete activities with regards to communication will include:

Branding and messaging: DEEP-EST will be positioned as a further member of the “DEEPprojects” family (following and extending DEEP and DEEP-ER). Logos have already been designed for DEEP-EST and the DEEPprojects brand, following the established corporate design line. Key messages will be in line with the DEEP/-ER messaging, especially when talking about overall concepts. Specific DEEP-EST messages will be defined and all key messages will be adapted to the various stakeholders.

Website and social media channels: The website will be the central information hub about the DEEP-EST project. The DEEP-ER project website (www.deep-er.eu) will be continued and converted into a DEEPprojects website (www.deep-projects.eu). In the same way, the existing social media channels will be extended, especially the Twitter handle. New channels might be evaluated and set up. Decisions on that topic will be taken in the first three months of the project and specifications given in the first deliverable of WP7.

Creation of information material: In addition to classic material such as project flyers, a case study booklet is planned, as well as final activities potentially including a final project brochure. Audio-visual material will be evaluated according to its cost-benefit ratio.

Media relations: Over the years, the DEEP/-ER projects have built a network of media contacts to key outlets especially within the HPC media landscape – including e.g. insideHPC, Scientific Computing, ScienceNode, HPCwire or The Next Platform. This list of contacts will be leveraged by DEEP-EST, and extended to include in particular media channels outside the HPC environment. Outreach to media will be done via classic PR tools like press releases, interviews, background talks etc.

Conferences, trade fairs and events: One-to-one encounters at conferences, trade fairs and events will be exploited. On top of classic HPC conferences like Supercomputing (SC), the International Supercomputing Conference (ISC) and the European HPC Summit Week, further events will be identified. Possible activities include participation in the technical programmes via BoF sessions, workshops, tutorials, and research papers, or via self-organised events like the planned user awareness days or roundtables.

Collaboration Activities: DEEP-EST will seek close cooperation with other projects in the HPC area – depending on the future HPC project landscape, cooperation will be established with relevant support actions and other R&D projects. The scope of the collaboration to be defined with the interested projects potentially includes joint communication and outreach activities, presence in conferences (e.g. joint booths or BoFs), and trainings and workshops.


3 Implementation

3.1 Work plan — Work packages, deliverables

3.1.1 Overall structure of the work plan

The project is organised into Work Packages (WPs) defined to group similar and closely related activities. Technical discussions between WPs are channelled through the Design and Development Group (DDG), which drives co-design and takes the most important design decisions, guaranteeing a coherent development of hardware and software in the project.

The overall structure of DEEP-EST is as follows:

Applications, benchmarking and modelling:

WP1 “Applications” analyses the application codes, provides their requirements as co-design input, defines use-cases and traces for benchmarks and modelling, and prepares the codes to optimally run on the DEEP-EST modular prototype. WP1 will demonstrate the usability of the MSA concept and its software environment.

WP2 “Benchmarking and modelling” combines the application use cases of WP1 with additional benchmarks to evaluate the MSA. It provides the modelling tools and techniques to simulate the system and predicts the expected performance at larger scale and different configurations, including aspects like scheduling policies and energy consumption.

Modular architecture, design and prototype development:

WP3 “System architecture” designs the MSA, defining the high-level system specifications. A key aspect is the collection of technical requirements in the co-design cycle. WP3 also verifies that the DEEP-EST prototype built in WP4 fulfils the identified requirements. It evaluates new technologies deploying small-scale evaluators and Software Development Vehicles (SDVs), reports on the lessons learnt in the architecture development, and gives a final assessment of the achieved functionality and performance.

WP4 “Prototype development” designs all the hardware modules of the DEEP-EST prototype based on the high level specifications defined by WP3. It manufactures, tests, and integrates these modules to build up the DEEP-EST prototype. Preparation of the local infrastructure, prototype installation, bring-up and maintenance are also responsibility of WP4.

System software, management and programming environment:

WP5 “System software and management” designs the complete software stack and develops its scheduling, management, file system, monitoring, and resiliency parts. The goal is to optimally support the MSA, enabling a mix of distinct applications to share the system and run distributed amongst modules, while optimising the use of the resources, with special attention to data management.

WP6 “Programming environment” develops the upper layers of the software stack, allowing the applications to map their communication and I/O patterns onto the modular system topology. Building upon the work done in DEEP/-ER, the chosen programming model combines ParaStation MPI with OmpSs, and uses BeeGFS and SIONlib to handle I/O operations. Optimisations and further developments will be done to support the DEEP-EST prototype. Additionally, frameworks for HPDA applications will be supported and integrated into the DEEP-EST software stack.

Dissemination and management:

Finally, dissemination and project management will be accomplished by WP7 “Dissemination” and WP8 “Coordination”. The leaders of these WPs (BADW-LRZ and JUELICH) will be supported by all other partners, who will provide input for progress and midterm reports, organise trainings, and disseminate their work through workshops, conferences and publications.


3.1.2 Timing of the different Work Packages and their components

In the following Gantt diagram the project is assumed to begin on 1st July 2017. If the start time is different, all Tasks will be shifted accordingly, without changes in duration or dependencies.


Figure 3: Gantt diagram


[Gantt chart: DEEP-EST project schedule from July 2017 (M1) to June 2020 (M36), showing the tasks of WP1–WP8 with their durations, milestones MS1–MS10 and deliverables; dependencies between tasks are marked with arrows, and the colour legend distinguishes Applications, System Hardware, System Software and Other activities.]


3.1.3 Detailed work description

3.1.3.1 Work Package 1: Applications

Work package number: 1 | Lead beneficiary: JUELICH
Work package title: Applications
Participant number: 1 | 2 | 4 | 11 | 12
Short name of participant: JUELICH | Intel | BSC | KULeuven | ASTRON
Person/month per participant: 45 | 8 | 18 | 36 | 36
Participant number: 13 | 14 | 15 | 16
Short name of participant: NCSA | NMBU | UoI | CERN
Person/month per participant: 36 | 36 | 36 | 36
Start Month: 1 | End Month: 36

Objectives

Gather the requirements of six applications and drive the co-design of the DEEP-EST prototype architecture and system.

Provide specific application use-cases for benchmarking and modelling activities.

Identify how to best match the co-design applications or parts of them to the DEEP-EST prototype modules.

Provide adapted applications that are also optimised for the target modules to the WP2 benchmarking activities.

Assess the MSA as an architecture and the DEEP-EST prototype with regards to performance, ease of use, and portability for the co-design codes.

Description of work

Six important and ambitious scientific codes have been selected as the DEEP-EST co-design applications. They come from the HPC and HPDA areas, and represent a group of research fields relevant for the European Research Area. With the exception of GROMACS (which uses a multi-physics approach), all combine HPC computation with advanced data processing and analytics. Thus, they consist of multiple parts with different resource requirements and will be eminently suitable to assess the potential of the MSA and the DEEP-EST prototype.

This work package consists of a support Task (Tk1.1), and one Task for each of the co-design applications (Tk1.2–Tk1.7). The former will provide training and support and also coordinate the work of the latter.

In the first four months focus will be on analysing the applications and collecting the detailed requirements of each for the DEEP-EST architecture and prototype. This will factor in the application roadmaps, anticipating the planned evolution of codes, and also cover requirements for HPDA programming models. The requirements will be collected and reported in D1.1 as the foundation of the DEEP-EST co-design effort.

Close interaction with WP2 will be established to prepare the comprehensive benchmarking and evaluation activities planned there. Important aspects are the definition of specific use cases, their integration with the JUBE benchmarking environment28, and the generation of event traces for the WP2 analysis and prediction tools. D1.2 will report on these activities.

28 http://www.fz-juelich.de/ias/jsc/EN/Expertise/Support/Software/JUBE/_node.html


The next step is for each application to find out how to best map it to the modules of the DEEP-EST prototype, potentially splitting the application into separate parts (e.g. HPC computation and data analytics). The analysis results created above will help significantly. D1.3 will document the partitioning and mapping strategies.

After M12, the application codes will be adapted to the target hardware, using existing systems and the SDVs provided by WP3. During this phase co-design discussions will continue with work packages 4, 5 and 6 as more detailed design decisions are taken and the codes get early access to hardware that is representative of the technology used for the DEEP-EST prototype. D1.4 will report the results achieved.

In the third year the codes will be ported to the actual DEEP-EST prototype. Performance will be measured and compared with the code performance at the beginning of the project, and also to numbers obtained on other HPC platforms. D1.5 will report on the results.

Task 1.1: Support team (JUELICH, Intel, BSC; M1-M36)29

This Task coordinates the work of the other WP1 tasks (it sets the working schedule and coordinates the writing of reports and deliverables) and supports the application developers in their code analysis and porting efforts to speed up the development process. Furthermore, this Task will drive the co-design cycle between WP1 and WPs 3, 4, 5, and 6.

Support will be offered to the application partners throughout all the steps outlined in the introduction above. A systematic training program to be conducted with WP7 will cover all angles of the DEEP-EST hardware, system software and programming environments. User support will be given on the SDVs and on the DEEP-EST prototype whenever necessary.

The partners involved in this Task are BSC, Intel and JUELICH. BSC will provide training for OmpSs and Extrae/Paraver. Intel will support the porting efforts to the CM, ESB and DAM. JUELICH will provide support in analysing, porting and tuning applications first to the DEEP-ER prototype and later to the software development vehicles and the DEEP-EST prototype.

Task 1.2: Neuroscience (NMBU; M1-M36)

NEST30 is a widely-used, publicly available simulation software for spiking neural network models, scaling up to the full size of Petascale computers. NEST focuses on the dynamics, size and structure of neural systems rather than on the exact morphology of individual neurons. The internal dynamics of model neurons in NEST are simple, described by a small number of linear ordinary differential equations, and neurons communicate via discrete events (spikes).

There is growing scientific demand to use NEST also to model networks of neurons with more complex internal dynamics, e.g. ones that are highly non-linear. This will require much greater computing power. Another challenging aspect is the very large amount of data generated by large-scale network simulations. Today, these data are usually written to file and analysed offline later, but this approach is often inefficient.

Through DEEP-EST, NEST will be tuned to fully exploit the power of hybrid systems for models including non-linear subthreshold dynamics. A leapfrogging strategy that interleaves spike delivery on the CM with neuron dynamics updates on the ESB will be attempted. Furthermore, this approach will be combined with real-time data analysis in the DAM, to either prepare data for real-time visualisation or perform a dimension reduction by, e.g., computing correlations or detecting spike patterns. The SSSM will be used to store the generated simulation data.
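The split that this Task exploits, cheap linear subthreshold dynamics punctuated by discrete spike events, can be pictured with a generic leaky integrate-and-fire loop as sketched below. This is a textbook toy model, not NEST code or its API, and all parameter values are invented.

```python
import numpy as np

# Minimal leaky integrate-and-fire sketch: linear subthreshold dynamics plus
# discrete spike events. Generic textbook model, not NEST code or its API.

def simulate(n_neurons=1000, steps=200, dt=0.1, tau=10.0, v_th=1.0, v_reset=0.0):
    rng = np.random.default_rng(0)
    v = np.zeros(n_neurons)                                   # membrane potentials
    spikes = []
    for t in range(steps):
        i_ext = rng.normal(0.12, 0.05, n_neurons)             # external drive (invented)
        v += dt * (-v / tau + i_ext)                          # linear ODE update
        fired = np.where(v >= v_th)[0]                        # discrete spike events
        spikes.extend((t * dt, idx) for idx in fired)         # "spike delivery"
        v[fired] = v_reset
    return spikes

if __name__ == "__main__":
    print(f"{len(simulate())} spikes emitted")
```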

29 For each task, in brackets: the partners involved, underlining the task leader; the time frame of the task in project months.
30 Plesser, H.E. et al., “NEST: the Neural Simulation Tool”, in “Encyclopedia of Computational Neuroscience”, p. 1849-1852, Springer New York, 2015, doi: 10.1007/978-1-4614-6675-8_258


Task 1.3: Molecular dynamics (NCSA; M1-M36)

GROMACS31,32 is one of the most widely-used computational chemistry tools. This free, Open Source software depicts complex chemical systems in terms of a realistic atomic model in order to predict the materials’ macroscopic properties. GROMACS is a molecular dynamics toolbox providing a rich set of calculation types as well as preparation and analysis tools. It is highly computation-intensive and runs best on systems with many-core processors (with fast RAM access and large SIMD registers for calculating particle-pair interactions) and a low-latency interconnect. The long-range electrostatic interactions are calculated by means of FFTs utilising all-to-all communication. Therefore, a fast communication network is of crucial importance in order not to compromise application scalability.

In this Task GROMACS will be tuned to optimally exploit the MSA and provide commensurate input to the co-design effort that will shape the DEEP-EST modules and prototype. A GROMACS use case with multi-physics characteristics will be identified. During the initial phase of the project, detailed code analysis will be done to reach an optimal code distribution over the DEEP-EST modules (most probably ESB and CM). Based on this analysis GROMACS will be adapted and tuned. It is important that the domain decomposition procedure minimises all-to-all communication and thus improves application scalability.
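Why the FFT-based long-range part constrains scalability can be seen from a back-of-envelope model of the all-to-all exchange behind a distributed 3D FFT: as the rank count grows, each rank sends ever more but ever smaller messages, so latency eventually dominates. The sketch below is such a model only; the latency, bandwidth and grid-size figures are invented assumptions and this is not GROMACS code.

```python
# Back-of-envelope model of the all-to-all exchange behind a distributed 3D FFT
# (as used for long-range electrostatics). Not GROMACS code; latency, bandwidth
# and grid size are illustrative assumptions only.

def alltoall_time(ranks, grid_bytes, latency=1.5e-6, bandwidth=10e9):
    """Time for one FFT transpose: each rank re-distributes its grid slab
    to every other rank as (ranks - 1) small messages."""
    per_rank = grid_bytes / ranks
    msg = per_rank / max(ranks - 1, 1)
    return (ranks - 1) * (latency + msg / bandwidth)

if __name__ == "__main__":
    grid = 512**3 * 8                       # one 512^3 double-precision grid
    for p in (64, 256, 1024, 4096):
        print(f"{p:5d} ranks: {alltoall_time(p, grid)*1e3:6.2f} ms per transpose")
```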

Final measurements with molecular dynamics simulations of scientific interest on the DEEP-EST prototype will demonstrate the performance and scalability improvements achieved by GROMACS, and will be given back to the scientific community.

Task 1.4: Radio astronomy (ASTRON; M1-M36)

The Square Kilometre Array33 (SKA), the next-generation radio telescope to be built in 2020, will have Exascale compute and PB/s bandwidth requirements, four orders of magnitude higher than current radio telescopes. These high data rates and processing requirements make efficient algorithms and HPC platforms indispensable to maximise scientific outcome and to minimise costs and energy usage. In this Task, key algorithms used by radio telescopes will be adapted to the DEEP-EST modular system.

ASTRON will combine the modular approach with algorithmic improvements in the processing pipelines that are necessary to fully exploit the scientific capabilities of the SKA. Somewhat simplified algorithms implemented for GPUs, Digital Signal Processors, and the Intel Xeon Phi exist. The goal is to move to more complex algorithms and new architectures. This will allow comparing performance, energy efficiency, programming models and effort for these platforms.

This Task will focus on the computationally most expensive parts of the processing pipeline, which includes the imaging application. A novel algorithm, Image-Domain Gridding, promises to create sky images more efficiently and accurately than the current state-of-the-art, as it applies corrections for direction-dependent and wide-field effects much more efficiently. Additional applications are a flagger (that detects and rejects data that is affected by interference) and calibration. Together, these form the main components of the imaging pipeline.
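For illustration, a naive convolutional gridding kernel of the kind such imaging pipelines spend much of their time in is sketched below: visibilities are accumulated onto a uv-grid with a small support kernel and the grid is Fourier-transformed into a dirty image. This is the conventional baseline rather than the Image-Domain Gridding algorithm itself, and all sizes and the Gaussian kernel are illustrative assumptions.

```python
import numpy as np

# Naive convolutional gridding: accumulate visibilities onto a uv-grid with a
# small support kernel. Conventional baseline, not the Image-Domain Gridding
# algorithm; all sizes and the Gaussian kernel are illustrative assumptions.

def grid_visibilities(u, v, vis, grid_size=256, support=3):
    grid = np.zeros((grid_size, grid_size), dtype=complex)
    ax = np.arange(-support, support + 1)
    kernel = np.exp(-0.5 * (ax[:, None]**2 + ax[None, :]**2))   # separable Gaussian
    for ui, vi, val in zip(u, v, vis):
        iu, iv = int(round(ui)), int(round(vi))
        grid[iu - support:iu + support + 1,
             iv - support:iv + support + 1] += val * kernel
    return grid

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 10_000
    u = rng.uniform(10, 245, n)             # keep samples away from the grid edges
    v = rng.uniform(10, 245, n)
    vis = rng.normal(size=n) + 1j * rng.normal(size=n)
    dirty = np.fft.fftshift(np.fft.ifft2(grid_visibilities(u, v, vis)))
    print("dirty image peak:", float(np.abs(dirty).max()))
```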

Within DEEP-EST, these algorithms will be implemented and analysed on the different modules. Metrics like floating-point performance, execution times, and energy usage will be used to compare the HPC modules (and, where possible, other architectures).

Co-design efforts are supported both ways: ensuring that the modules support the application efficiently, and optimising the pipeline to run efficiently on the modules. Software developed in the project is expected to be used first in the LOFAR telescope upgrade and will demonstrate how to minimise costs and energy usage of the SKA Science Data Processor.

31 Abraham, M.J. et al., “High performance molecular simulations through multi-level parallelism from laptops to supercomputers”, SoftwareX, Volumes 1–2, 2015, Pages 19–25
32 www.prace-ri.eu/IMG/pdf/Performance_Analysis_and_Petascaling_Enabling_of_GROMACS.pdf
33 https://www.skatelescope.org/

Task 1.5: Space Weather (KULeuven; M1-M36)

Space weather studies the physics of the Sun’s plasma ejections, their propagation through the solar system, and their effects on the Earth’s atmosphere and on human life and technology. It has high societal relevance for its impact on the satellite and electricity distribution industries.

During the DEEP/-ER projects, KULeuven developed the particle-in-cell code iPic3D, which models the interaction of the solar plasma with the Earth. iPic3D uses the Cluster-Booster architecture: the highly scalable and highly vectorisable particle operations run on the ESB, while the more communication-intensive electromagnetic field solver runs concurrently on the CM.

In DEEP-EST, iPic3D will be the last stage in a pipeline that covers the whole propagation of the solar plasma from the Sun to the Earth. The system pipeline will consist of three parts:

1. General Learning Algorithm (GLA): Trains a Neural Network (NN) to forecast energetic events on the Sun using Data Analysis.

2. Magneto-hydrodynamics (MHD) code Slurm34: Calculates the propagation of the coronal mass ejections (CMEs) provided by the NN to forecast the solar wind conditions at Earth’s orbit.

3. Based on this information, iPic3D is launched to observe the effects on the magnetosphere of the planet.

The full pipeline requires continuous interaction between the GLA, the MHD and the iPic3D codes. This chaining of codes will fully exploit the modularity of the DEEP-EST prototype: Satellite images will be retrieved and stored on the SSSM. They will be used to train the GLA in the DAM. The CM will run the MHD code, as well as the field operations of iPic3D, while its particle operations will run concurrently on the ESB.
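Conceptually, the chaining can be viewed as a small workflow in which each stage declares the module it targets and consumes the previous stage's output, as in the hypothetical sketch below; the functions, formulas and numbers are placeholders, not the actual GLA, MHD or iPic3D codes nor the DEEP-EST workflow interface.

```python
# Toy workflow chaining for the Sun-to-Earth pipeline: each stage declares the
# module it targets and consumes the previous stage's output. Hypothetical
# sketch of the idea only, not the actual codes or the DEEP-EST scheduler API.

def gla_forecast(images):
    """Data-analytics stage (DAM): turn satellite images into a CME forecast."""
    return {"cme_speed_km_s": 800 + 10 * len(images)}          # placeholder model

def mhd_propagation(forecast):
    """MHD stage (CM): propagate the CME to Earth orbit (placeholder formula)."""
    travel_h = 1.5e8 / forecast["cme_speed_km_s"] / 3600
    return {"arrival_h": travel_h, "solar_wind_speed": forecast["cme_speed_km_s"] * 0.7}

def ipic3d_simulation(wind):
    """Kinetic stage (CM fields + ESB particles): magnetosphere response."""
    return f"magnetosphere run driven by {wind['solar_wind_speed']:.0f} km/s wind"

PIPELINE = [("DAM", gla_forecast), ("CM", mhd_propagation), ("CM+ESB", ipic3d_simulation)]

if __name__ == "__main__":
    data = ["img_%03d" % i for i in range(12)]      # stand-in for SSSM input
    for module, stage in PIPELINE:
        data = stage(data)
        print(f"[{module}] -> {data}")
```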

The overall pipeline will be optimised for the DEEP-EST prototype; it will challenge the scheduling and resource management systems, and contribute to the co-design of the compute modules. Most of the code-development will be on GLA, which is foreseen to exploit the DAM, while the iPic3D and the MHD codes are already available and in an advanced stage of development. The final goal of this Task is a highly modular pipeline capable of simulating the full Sun-Earth System running efficiently on the DEEP-EST prototype.

Task 1.6: Data analytics in Earth Science (UoI; M1-M36)

The continuous progress in the remote sensor resolution of Earth observation platforms generates large quantities of hyperspectral data for the mapping and monitoring of natural and man-made land covers. Current Synthetic Aperture Radar missions35 – with high spatial resolution and frequent repeat passes – raise huge requirements for the analysis of satellite time series data. This enables the observation and analysis of dynamic processes involving natural landscapes and built-up sites with significant socio-economic, environmental, and geopolitical impact. Similarly, 3D point cloud datasets in the earth sciences created by 3D laser scanners drive data growth, up to scans of whole countries.

Three data analytics methods are used by UoI in order to extract knowledge: clustering (HPDBSCAN36), classification (piSVM37), and a variety of Deep Learning frameworks (e.g. TensorFlow, Theano, Caffe, CNTK, and Torch).

34 Slurm is an MHD code developed by KULeuven, not to be confused with the SLURM Workload Manager.
35 R. Torres et al., “GMES Sentinel-1 mission”, Remote Sensing of Environment, Volume 120, pp. 9-24, May 2015, doi: 10.1016/j.rse.2011.05.028
36 M. Goetz, C. Bodenstein, M. Riedel, “HPDBSCAN – Highly Parallel DBSCAN”, Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC2015), Machine Learning in HPC Environments (MLHPC) Workshop, Austin, 2015
37 G. Cavallaro, M. Riedel, M. Richerzhagen, J.A. Benediktsson, A. Plaza, “On Understanding Big Data Impacts in Remotely Sensed Image Classification Using Support Vector Machine Methods”, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Issue 99, pp. 1-13, 2015

This Task will explore innovative parallel I/O methods using the NAM devices, and will adapt UoI’s analytics codes to leverage the MSA. A port of a suitable Deep Learning network will be performed as a joint activity with Tk6.3, with the DAM being the main target. The requirements of the three principal HPDA techniques mentioned above will be injected into the co-design effort.

The three algorithms (HPDBSCAN, piSVM, and deep neural network (DNN) learning) will utilise the DAM. The hyperspectral remote sensing data will be sent to the NAM, while the overall satellite dataset and the large point cloud data will reside on the SSSM.

The expected outcome is reduced time to solution for classification, clustering and deep learning applications, including the search for correct parameters through cross-validation, and a significant speed-up with respect to standard parallel versions due to the adoption of the MSA. The innovative integrated computing and data architecture of DEEP-EST will contribute to cutting edge knowledge discovery with unprecedented effectiveness and efficiency.
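The shape of this workflow, density-based clustering, classification, and a cross-validated parameter search, can be illustrated on a toy dataset with scikit-learn as below. The production codes in this Task are the MPI-parallel HPDBSCAN, piSVM and DNN frameworks, so scikit-learn serves here only as a stand-in.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy illustration of the analytics workflow (clustering, classification and a
# cross-validated parameter search). The project codes are the MPI-parallel
# HPDBSCAN and piSVM; scikit-learn is used here only as a stand-in.

X, y = make_blobs(n_samples=600, centers=3, cluster_std=1.2, random_state=0)

labels = DBSCAN(eps=1.0, min_samples=10).fit_predict(X)      # density-based clustering
print("DBSCAN found", len(set(labels) - {-1}), "clusters")

# Parameter search via cross-validation: the costly step the DAM is meant to speed up
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}, cv=5)
search.fit(X, y)
print("best SVM parameters:", search.best_params_, "accuracy:", round(search.best_score_, 3))
```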

Task 1.7: High Energy Physics (CERN; M1-M36)

The LHC experiments at CERN collect enormous amounts of data, which need to be pre-processed, treated and then analysed to extract the scientific information physicists look for. This makes the codes developed for the LHC prime examples of HPDA applications.

In DEEP-EST, CERN will investigate a new model for deploying improvements for the analysis of data created by the CMS instrument. This focuses on the instrument code (which specifies how the instrument “sees” events) and its calibration. Currently, new code must be available before starting the actual data processing. CERN will explore the dynamic reprocessing of objects when the instrument code or calibration changes. To test this new concept, a large high-performance processing centre with excellent integrated storage is required. The MSA could be an ideal platform.

At least three of the DEEP-EST modules will be used: the SSSM will contain the input data, the CM will be used for data refresh, and the DAM for data reduction. The CM and DAM modules are likely to be stressed by this application, which will define its requirements within the co-design phase of the project. At a later stage of the project, porting to the ESB will be investigated. Derived data will reside in a large object store (Spark/HDFS or CEPH). As selections are made, the object provenance is checked and an update of the object will be triggered when needed. The concept will be demonstrated on a flagship analysis that is sensitive to changes in calibrations: the Higgs decay to two gamma photons is an obvious choice. The degree of thread parallelism in the reconstruction algorithms determines their efficiency on an HPC platform, and effort will be devoted to algorithm modernisation.

Architectures like the MSA are under investigation as a potential solution for the High-Luminosity LHC (HL-LHC appears in the 2016 ESFRI roadmap as a landmark Research Infrastructure), and the ability to exercise a proof-of-concept would be a great benefit to CMS.
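The dynamic-reprocessing idea can be sketched as a provenance check against an object store: a derived object is recomputed only when the calibration or instrument-code version it was produced with no longer matches the current one. The snippet below is a hypothetical illustration with invented keys and versions, not CMS software.

```python
# Sketch of provenance-triggered reprocessing: a derived object is recomputed
# only when the calibration or instrument-code version it was produced with no
# longer matches the current one. Hypothetical illustration, not CMS software.

CURRENT = {"calibration": "2017-03", "instrument_code": "v8"}

object_store = {   # stand-in for a large object store (e.g. HDFS or Ceph)
    "event_42/photons": {"payload": [21.3, 57.9],
                         "provenance": {"calibration": "2016-11", "instrument_code": "v8"}},
    "event_43/photons": {"payload": [48.1],
                         "provenance": dict(CURRENT)},
}

def reprocess(key):
    print(f"re-deriving {key} with calibration {CURRENT['calibration']}")
    return {"payload": [], "provenance": dict(CURRENT)}   # placeholder recomputation

def fetch(key):
    obj = object_store[key]
    if obj["provenance"] != CURRENT:          # provenance check on every selection
        object_store[key] = obj = reprocess(key)
    return obj["payload"]

if __name__ == "__main__":
    for k in list(object_store):
        fetch(k)
```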

Deliverables:

D1.1 Application co-design input [JUELICH; all WP1-Tasks] (M4)
Documents requirements of all applications for co-design; includes all data analytics SW requirements, compute, memory & communication performance, footprints, communication patterns, etc.

D1.2 Application use cases and traces [JUELICH; all WP1-Tasks] (M9)
Use cases integrated in JUBE and Extrae trace files provided to WP2.

D1.3 Application distribution strategy [JUELICH; all WP1-Tasks] (M12)
Details the use of modules by the applications, including partitioning and mapping of the different parts to modules.

D1.4 Initial application ports [JUELICH; all WP1-Tasks] (M24)
Reports on optimisation done in each code part to be ready for running on the DEEP-EST prototype. Results of application runs on equivalent hardware (SDVs) will also be detailed, comparing performance with previous versions of the code and other platforms.

D1.5 Final report on applications experience [JUELICH; all WP1-Tasks] (M36)
Reports performance improvements over the project, compares with other platforms, and details porting efforts and ease-of-use.

38 http://cms.web.cern.ch/


Relevant Milestones | Lead Partner | Date

MS3 Initial co-design input collected | JUELICH | M4
MS5 DEEP-EST benchmark suites defined | JUELICH | M9
MS9 Prototype delivered to Jülich, SW and applications ready for deployment | ETH-Aurora | M24



3.1.3.2 Work Package 2: Benchmarking and modelling

Work package number: 2 | Lead beneficiary: BSC
Work package title: Benchmarking and modelling
Participant number: 1 | 2 | 3 | 4 | 9
Short name of participant: JUELICH | Intel | BADW-LRZ | BSC | UEDIN
Person/month per participant: 18 | 12 | 18 | 48 | 12
Start Month: 1 | End Month: 36

Objectives

Test and evaluate the DEEP-EST prototype with WP1 and third-party applications; validate the benefits of the MSA and the DEEP-EST HW/SW implementation.

Evaluate the benefits of co-allocation and advanced, parameterised scheduling policies (created in Tk5.5), and the scalability of these.

Produce models for the WP1 applications and leverage them to predict performance and energy usage on systems with different technology or scale.

Description of work

This WP will evaluate the DEEP-EST prototype and measure the benefits that it brings for application end-users and operators of compute infrastructure. To this end, a benchmark suite composed of the WP1 applications, third-party applications, synthetic benchmarks and workload mixes of these will be used. Measurable benefits include time-to-solution, utilisation of resources, energy and energy efficiency, system throughput, and job waiting time. The benefits measured will partly come from the innovative MSA, and partly from the implementation of the compute modules. Care will be taken to attribute the effects to both causes in a reasonably precise way, e.g. by running applications both across modules (thus leveraging heterogeneity and improved module technology) and on a single module alone (assessing the implementation of that module). For the workload mixes, the effects of using advanced scheduling policies and co-scheduling will be analysed.

To assess the scalability of the DEEP-EST prototype (and of MSA in general), an in-depth performance analysis of the benchmarks will be used to create performance models. These will enable prediction of application and workload performance along the vectors of system scale (enhancing the techniques used in the DEEP-ER project) and system performance characteristics (including computation, communication and I/O performance). In addition, existing modelling techniques for energy consumption will be adapted and improved, using machine learning methods.

Both techniques will be used to generate predictions for performance and energy use/efficiency of scaled-up versions of the DEEP-EST prototype. Likewise, the effects of advances in component and fabric technology on MSA-systems will be forecasted.

Task 2.1: Benchmarking (JUELICH, Intel, BSC, UEDIN; M1-M36)

Working closely with WP1, this Task will investigate which additional third-party and/or synthetic benchmarks should be included to ensure that DEEP-EST covers all HPC and HPDA areas important to Europe, and to enable detailed measurement of all important performance aspects of the planned DEEP-EST prototype. The result will be two benchmark suites, which will be integrated in JUBE:

1. Application benchmarks, consisting of the WP1 codes (available after M9 in initial and later in optimised versions) and of a set of 3rd-party application-level benchmarks in fields not adequately covered by WP1.


2. Synthetic benchmarks that focus on specific performance aspects of an HPC or HPDA system, including compute, communication and I/O performance. These will allow comparison of the DEEP-EST systems with existing and future platforms without having to port and run the full application suite everywhere, and can guide the module and system development in WP4 at a much earlier stage. They will also feed into the creation of models in Tk2.3.
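For illustration only, the sketch below shows the kind of minimal synthetic kernel meant here, in this case a memory-bandwidth micro-benchmark; the actual suites will be selected in Tk2.1 and integrated in JUBE, and all sizes and names below are hypothetical.

```python
# Minimal, hypothetical memory-bandwidth micro-benchmark sketch.
# The real DEEP-EST synthetic suites are defined in Tk2.1 and run via JUBE.
import time
import numpy as np

def stream_triad(n=50_000_000, repeats=5, scalar=3.0):
    """Return the best sustained bandwidth (GB/s) of a STREAM-triad-like kernel."""
    a = np.zeros(n)
    b = np.random.rand(n)
    c = np.random.rand(n)
    best = 0.0
    for _ in range(repeats):
        t0 = time.perf_counter()
        a[:] = b + scalar * c          # triad: one store, two loads per element
        elapsed = time.perf_counter() - t0
        bytes_moved = 3 * a.nbytes     # read b, read c, write a
        best = max(best, bytes_moved / elapsed / 1e9)
    return best

if __name__ == "__main__":
    print(f"Sustained triad bandwidth: {stream_triad():.1f} GB/s")
```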

For both benchmark suites, baseline measurements will be obtained with existing large systems, starting with the readily available synthetic and 3rd-party codes, including where possible the original versions of WP1 applications. On the evaluation platforms and SDVs from WP3, additional benchmark results will be obtained. As optimised WP1 applications become available, these will be taken up, and the testing will move to the full DEEP-EST prototype. Workload measurements will commence on multi-module SDVs and then transition to the prototype system. Comparisons will be made to results of workload tests on conventional systems available to the consortium at that time.

The workloads (i.e. mixes of applications) as defined by Tk2.2 will be used for measuring system-wide metrics and assessing the efficiency benefits of the DEEP-EST prototype for computer centres.

Task 2.2: Modelling and validation of scheduling policies (BSC, JUELICH, Intel; M1-M36)

This Task will first define a set of DEEP-EST workloads that is representative of how a MSA-system will be used. Component applications will come from the benchmark suites of Tk2.1. Arrival times and dependencies between application instances will be specified, and a mix of realistic application sizes will be defined. Where it makes sense, this Task will work with Tk2.1 to extend applications to take advantage of the new Tk5.5 co-allocation functionality.

The performance prediction data generated in Tk2.3 will be used to evaluate the potential of the policies proposed in Tk5.5 when used in extreme scalable systems. These performance predictions will allow us to detect problems and scalability limits.

The main outcome of this Task is a deep parametric evaluation of how the configurable scheduling policy developed in Tk5.5 performs on the above-mentioned workloads. Co-allocation should improve system utilisation and average response time, yet the achieved results will depend on the exact scheduling policy parameters in complex ways. The best-case results obtained will be compared against those achieved by traditional scheduling policies without co-allocation.

Task 2.3: Performance modelling and extrapolation (BSC, Intel; M1-M36)

The BSC performance tools and modelling infrastructure (BSC efficiency model) are the basis of this Task: Extrae/Paraver measure the use of compute and memory resources, the interconnect, and the I/O subsystem, while Dimemas provides predictions for architecture or application modifications and the BSC efficiency model yields accurate scaling predictions (as already demonstrated in the DEEP/-ER projects).

In DEEP-EST, these tools will be used to create models of all WP1 applications. These models will rely on basic performance parameters extracted from synthetic benchmarks, and will allow predicting the efficiency of the codes for different system configurations and extrapolating them to Exascale dimensions. Furthermore, the BSC efficiency model itself will be improved, in order to compute upper and lower efficiency boundaries. Performance predictions will also be used in Tk2.2 to do a pre-evaluation of scheduling policies.

The BSC tools can automatically identify different phases of applications, and this capability will serve to build specific models for each of the phases, thus improving precision compared to conventional approaches that treat applications as a single unit. The above activities will factor in the characteristics of the module-to-module communication, as well as, to the extent possible, the effects of multi-level memory hierarchies (DRAM combined with faster high-bandwidth memory or slower non-volatile memory) and of the I/O subsystem characteristics.
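To make the modelling approach concrete, the following minimal sketch shows a multiplicative efficiency decomposition of the kind used in BSC/POP-style analyses, computed here from made-up per-rank useful-computation times; the actual model operates on Extrae/Paraver traces and is refined with Dimemas.

```python
# Illustrative sketch of a POP-style efficiency decomposition from per-rank
# "useful computation" times; the real analysis uses Extrae/Paraver/Dimemas data.
def efficiency_metrics(useful_per_rank, runtime):
    """useful_per_rank: seconds of useful computation per MPI rank;
    runtime: elapsed time of the parallel region."""
    avg_useful = sum(useful_per_rank) / len(useful_per_rank)
    max_useful = max(useful_per_rank)
    load_balance = avg_useful / max_useful       # how evenly work is spread
    comm_eff = max_useful / runtime              # losses due to communication/waiting
    parallel_eff = load_balance * comm_eff       # equals avg_useful / runtime
    return {"load_balance": load_balance,
            "communication_efficiency": comm_eff,
            "parallel_efficiency": parallel_eff}

# Example with invented numbers for 4 ranks:
print(efficiency_metrics([9.0, 8.5, 9.5, 7.0], runtime=11.0))
```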

Task 2.4: Energy modelling (BADW-LRZ, UEDIN; M1-M36)

Modelling the energy and power usage of applications is clearly important for directing system design and application optimisations. It can also lead to better scheduling decisions when running a workload, either reducing energy consumption or maintaining performance within a fixed power envelope.

This Task will build on analytic models that BADW-LRZ have developed to predict the energy consumption of applications based on CPU performance counters, and on work done in the Adept project39 where UEDIN were involved in benchmarking and simulating power and energy consumption for applications on a range of hardware. The analytic models will be extended to use the full range of counters of the DEEP-EST prototype, and be calibrated using a subset of the Tk2.1 benchmarks. Machine learning techniques will serve to further improve accuracy, and the result will be an accurate set of models for the power and energy requirements of a wide range of workloads on the DEEP-EST prototype.
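As an illustration of the intended approach, the sketch below fits power draw as a function of CPU performance counters, first with a linear model and then with a machine-learning regressor; the counters, sample values and model choices are placeholders, and the actual models will use the full counter and sensor set of the DEEP-EST prototype.

```python
# Hypothetical sketch: fit node power as a function of CPU performance counters.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor

# Each row: [instructions/s, cache_misses/s, memory bandwidth (GB/s)]; target: power (W).
X = np.array([[2.1e9, 1.2e7, 20.0],
              [3.5e9, 0.8e7, 35.0],
              [1.0e9, 3.0e7, 60.0],
              [2.8e9, 1.5e7, 45.0]])
y = np.array([180.0, 230.0, 210.0, 250.0])

linear = LinearRegression().fit(X, y)               # analytic baseline model
ml = GradientBoostingRegressor(n_estimators=50).fit(X, y)  # non-linear refinement

sample = np.array([[2.5e9, 1.0e7, 30.0]])
print("linear prediction [W]:", linear.predict(sample))
print("ML prediction     [W]:", ml.predict(sample))
```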

Deliverables:
D2.1 DEEP-EST benchmark suites [JUELICH; Tk2.1 and Tk2.2] (M9): Describes the selected application and low-level benchmarks, and the workloads to be used by Tk2.1 and Tk2.2.
D2.2 Initial application analysis and models [BSC; all WP2-Tasks] (M18): Initial results of the application analysis, modelling approach and first set of models created.
D2.3 Benchmarking, evaluation and prediction report [BSC; all WP2-Tasks] (M36): Report detailing the benchmarking results of Tk2.1 and Tk2.2, as well as the final application models and Exascale predictions of Tk2.3 and Tk2.4.

Relevant Milestones (Lead Partner, Date):
MS3 Initial co-design input collected (JUELICH, M4)
MS5 DEEP-EST benchmark suites defined (JUELICH, M9)
MS7 Initial application analysis models and first full implementation of system software and programming environment available (BSC, M18)

39 http://www.esfri.eu/esfri_roadmap2016/roadmap-2016.php


3.1.3.3 Work Package 3: System architecture

Work package number: 3    Lead beneficiary: Intel
Work package title: System architecture
Participants (person-months): JUELICH (23), Intel (26), BADW-LRZ (3), ETH-Aurora (8), Megware (6), UHEI (6), EXTOLL (6)
Start Month: 1    End Month: 36

Objectives
- Define the system architecture and create the high-level specification for the DEEP-EST prototype in close co-design collaboration.
- Manage technical project risks via a sequence of evaluation platforms, and provide early access to new technology for SW developers.
- Assess fulfilment of the high-level system and module specifications in light of WP4 results.
- Report on achieved results and lessons learnt and propose an evolution of the MSA towards Exascale.

Description of work
The MSA foresees multiple dissimilar (compute) modules to be integrated into a consistent heterogeneous system. Each module is assumed to be a homogeneous composition of a large number of nodes with an interconnect fabric establishing high-speed communication for the application layer. Across the modules these interconnects might be different.

The main innovative elements of the DEEP-EST prototype architecture are:

Three specific modules: Cluster Module, Extreme Scale Booster and Data Analytics Module. They are connected to an external storage system (the Scalable Storage Service module).

Network Federation enabling inter-module communication, orchestration and management, in effect binding all modules together.

Network-attached Memory (NAM) units that are attached to the NF and provide large amounts of storage-class memory and compute-near-memory resources to all modules.

Global Collective Engines (GCE) units which accelerate collective MPI operations across the EXTOLL network.

RAS plane which facilitates highly scalable, high-frequency monitoring of system status and performance parameters, as well as control of system operation.

This WP will define the system-level architecture and the high level specifications of the modules, which are then taken over by WP4 for the prototype implementation. Evaluators and software development vehicles will be deployed and used by other WPs. In the second and third years of the project WP3 will assess the fulfilment of the high-level specifications, defining and running functionality and performance tests once the prototype is deployed. Finally, it will summarise the lessons learned from the architecture perspective.

Task 3.1: System architecture in co-design (JUELICH, Intel, BADW-LRZ, ETH-Aurora, Megware, UHEI, EXTOLL; M1-M12)

This Task will define the overall system architecture in the first six months of the project, which constitutes the first intense co-design and definition phase, and will involve WP1 (Applications), WP5 (System SW) and WP6 (Programming Models). The (qualitative and quantitative) requirements identified by application and system software developers will shape system architecture and high-level specification of the DEEP-EST prototype system. An in-depth study of available technology options will be conducted, driving the architecture decisions to find the best match between SW needs and HW options.

Decisions will include aspects such as the technologies used for the complete system and modules, the relative size of the modules, the way to bridge between the different module networks, and the attachment of globally accessible memory (NAM).

Besides the specific SW requirements, the system architecture will accommodate the need for highest performance, scalability, energy efficiency, and resiliency. A monitoring and RAS architecture based on the DEEP/-ER results will serve to achieve the two latter performance aspects. WP4 partners are involved in the architectural discussions, analysing their technical and economic feasibility taking into account the technology roadmap of suppliers and availability of components.

After this initial co-design phase, the Task will proceed to a more detailed high level design of the DEEP-EST prototype system. This will cover the exact set of modules to be built, the technologies to be used for each and their approximate size, the cross-module bridging implementation and a minimal sensor set (e.g. for temperature, voltage and current) to be implemented in the modules. The specification will also cover the NAM and GCE, with the RAS requirements applying to it as well.

Task 3.2: Evaluators and SDVs (Intel, JUELICH, ETH-Aurora, Megware; M3-M24)

To keep control of the overall progress, and to be able to react to internal or external risks materialising, it is imperative to introduce intermediate evaluation points for the component development and integration activities performed in the project. These take the form of system evaluators. An additional form of intermediate system artefact is the software development vehicle (SDV), which provides elements of the compute modules and the interconnect systems to system- and application-level SW developers at an early stage. Two levels of system evaluators are foreseen:

Node evaluators implement (early versions of) single nodes of modules developed in DEEP-EST (CM, ESB, DAM, and NAM). They serve to demonstrate CPU and memory functionality, and will be available first for each of the listed modules.

Module evaluators combine a number of nodes with the specific interconnect of the module and serve to demonstrate the complete intra-module functionality. These will likely trail the node evaluators; for mature modules (such as the Cluster), they could actually be merged with them.

The primary role of the evaluators is to test and validate functionality and performance compared to the accepted DEEP-EST design envelope; once this is achieved, they can also serve to provision SDVs to the software WPs. By necessity, these SDVs will be of a small scale, focus on a specific module and be paid from the development HW budget.

For the DEEP-EST prototype, it is foreseen to cover the ESB (node and module evaluator, SDV), the DAM (node and module evaluator, SDV), the NAM (node evaluator), and an evaluation setup for the NF plus any bridging components that might be necessary. Since the CM will very likely be a low-risk evolution of proven technology, only a module evaluator is planned. Later it will be fully incorporated into the final prototype. Evaluators of the novel EXTOLL interconnect system developed in Tk4.4 and the GCE will also be deployed.


Task 3.3: System assessment (Intel, JUELICH, BADW-LRZ, ETH-Aurora, Megware; M12-M27)

The fulfilment of the system specifications and the application requirements by the prototype constructed in WP4 will be assessed here. Functional and performance tests will be defined, together with the system developers, for verification. Once the system is delivered, these tests will be run and results reported in D3.4.

Task 3.4: Architecture evaluation and outlook (Intel, JUELICH, BADW-LRZ; M27-M36)

This Task will collect and report on the experience gathered through the co-design cycle, the system specification, its verification and final operation. Based on the experience of application and software developers, and operators, it will extract conclusions on the MSA, its usability, advantages and disadvantages with respect to standard architectures, and propose improvements for the future.

Deliverables:
D3.1 System architecture [JUELICH; Tk3.1] (M6): High level, initial design of the main system characteristics.
D3.2 High level system design [JUELICH; Tk3.1] (M12): More detailed high level specifications of the characteristics and requirements to be fulfilled by the system and each of its modules. Includes the criteria to later make the prototype assessment.
D3.3 Tests for prototype assessment [Intel; Tk3.3] (M18): Tests to be run on the prototype constructed in WP4 to make sure that it fulfils the specifications and criteria specified in D3.2.
D3.4 Prototype assessment [Intel; Tk3.3] (M27): Results of verification tests.
D3.5 Modular Supercomputer architecture assessment and outlook [Intel; Tk3.4] (M36): Summary of architecture experience, outlook, and recommendations for future improvements.

Relevant Milestones (Lead Partner, Date):
MS3 Initial co-design input collected (JUELICH, M4)
MS4 System hardware and software architecture (JUELICH, M6)
MS6 Hardware and software specifications (JUELICH, M12)


3.1.3.4 Work Package 4: Prototype development

Work package number: 4    Lead beneficiary: ETH-Aurora
Work package title: Prototype development
Participants (person-months): JUELICH (27), Intel (3), BADW-LRZ (7), ETH-Aurora (140), Megware (30), UHEI (31), EXTOLL (77)
Start Month: 6    End Month: 36

Objectives
- Design, construction and integration of the Cluster Module (CM), the Extreme Scale Booster (ESB) and the Data Analytics Module (DAM).
- Design, construction and integration of advanced EXTOLL interconnect solutions, including the Global Collective Engine (GCE).
- Design and construction of the Network Federation (NF) across the modules mentioned above, and of the Network Attached Memory (NAM).
- Installation, bring-up and operation of the complete DEEP-EST prototype in the JUELICH data centre, including support for the operations team.

Description of work
This WP will design, test, integrate, install, and support the DEEP-EST prototype. From the architecture specification of Tk3.1 a technically sound, full system design and specification will be created, taking into account technology availability as well as time and budget constraints. Part of WP4's activities is to maintain close relations with hardware vendors to ensure alignment of their release schedule with the DEEP-EST timeline. WP4 will also study the mechanical, power, and thermal constraints and provide feedback to Tk3.1 to shape the high-level system design.

WP4 is structured according to the DEEP-EST prototype modules and components. The HPC system integrators ETH-Aurora and Megware will develop and deploy the three main DEEP-EST modules: the CM and the ESB by ETH-Aurora, and the DAM by Megware. All relevant engineering aspects, including node block diagram and form factor, CPU and memory plus storage build outs per node, network integration and interfaces, integration of nodes into chassis and system, energy supply and cooling, and HW-supported sensors and actuators will also be taken into account. Completing the prototype, EXTOLL and UHEI will develop an advanced interconnect solution based on EXTOLL, the Network Federation (NF), the Network Attached Memory (NAM), and the Global Collective Engine (GCE). JUELICH will procure a suitable Scalable Storage Service Module (SSSM). BADW-LRZ contributes to the discussions on the required sensors, their integration, and support.

Based on D3.2, Tk4.1 to Tk4.5 will perform feasibility studies, undergoing multiple iterations of system design refinement using the evaluators and SDVs from Tk3.2. The resulting system specifications will be documented in D4.1, D4.2, D4.3 and D4.4. Then integration and tests of the corresponding DEEP-EST modules and their networks and attached devices will start, leading to the final design of all hardware components, including the final mechanical integration into one or more racks, the implementation of an efficient power delivery solution, an efficient cooling solution, and a thorough system instrumentation. The outcome of the second phase will be documented in deliverables D4.5, D4.6, D4.7, and D4.8.

Finally, Tk4.6 will integrate all modules to realise the DEEP-EST prototype as a working system. It will also install it at JUELICH, perform its bring-up and transition to a support and maintenance phase, making the prototype available to the application users. This phase will keep the system alive and in good working condition throughout the project.

Task 4.1: General purpose Cluster Module (CM) (ETH-Aurora, BADW-LRZ; M6-M20)

The CM will provide support for generic, parallel HPC applications, using general-purpose processors with high single-thread performance and good support for highly complex or dynamic control flows. It will be the home for all components of an application that do not fit well to the other, more specialised modules (ESB or DAM). At the same time this module can dispatch all applications or workflows that might then fork out to the other modules. The CM will also provide the connectivity to the SSSM, and will act as the gateway to the JUELICH campus network for storage, user and administration connectivity.

It is expected that the CM will largely leverage available technology, with extensions to make it fit into the MSA structure of the DEEP-EST prototype. The feasibility study (D4.1) will have a focus on the CM integration with the larger DEEP-EST prototype.

Task 4.2: Extreme Scale Booster (ESB) (ETH-Aurora, BADW-LRZ; M6-M24)

The work in this Task leverages the results of the DEEP/-ER projects. Working closely with WP3 and the DDG, this Task will define the ESB and report the results in D4.4 and D4.7.

The ESB will be tailored to the needs of highly scalable (parts of) applications and is expected to become the largest module in the DEEP-EST prototype. Because of that, special focus will be put into realising a highly scalable system from the hardware point of view, with density and energy efficiency as key aspects of the system integration.

Task 4.3: Data Analytics Module (DAM) (Megware, BADW-LRZ; M6-M24)

This Task will design and specify the DAM, based on the specifications provided in D3.1 and D3.2.

The DAM will be designed to best support workloads in the HPDA and machine learning space. Its exact architecture will be determined during the co-design phase, although it is likely that the nodes will be characterised by:

- Some high performance processing elements coupled with support for acceleration of often-used code paths and/or operations, possibly by using reconfigurable computing techniques.
- Large amounts of directly addressable memory to support very large datasets.
- High bandwidth, low latency I/O components, especially focussed on bandwidth and random access latencies (this might include non-volatile memory technology close to the processing elements).

Design and specification will be documented in D4.2, while D4.8 reports on the delivered DAM.

Task 4.4: EXTOLL interconnect and GCE (EXTOLL, UHEI; M6-M20)

The European interconnection technology EXTOLL is ideally positioned to serve as a highly scalable interconnect for compute modules such as the ESB. A new concept of tightly integrating the structured 3D network topology of the EXTOLL network into one box forming a cube fabric module will be targeted. Such a network module (called fabric cube: fabri3) will provide networking connectivity for a number of nodes ranging from 8 to a whole rack (64+ nodes). Furthermore, fabri3 modules can be connected to form arbitrarily large networks. The basis of fabri3 is the existing EXTOLL Tourmalet ASIC, which integrates a network interface controller and switching logic within a single high-performance chip. Each fabri3 will feature embedded controllers for management and RAS purposes. Fabri3 modules will also feature connections to special purpose extension modules (like accelerators (GCE) or bridges to other networking technologies). Use of the next-generation Tourmalet chip (developed outside of DEEP-EST) will be evaluated around the midpoint of the project.

To enhance the scalability of the EXTOLL fabric for MPI collective operations, a Global Collective Engine (GCE) integrated with the fabri3 will be developed. It will allow blocking and non-blocking collective operations to be executed without engaging the participating CPUs. Detailed architecture and features will evolve from the co-design with the other software and hardware activities. A low-level library interface for the GCE will be developed that enables MPI implementations to take advantage of the GCE for their advanced MPI collective operations. In Tk6.1 ParaStation MPI will be adapted to use the GCE.

Design and specification of the fabri3 and GCE will be documented in D4.3, while D4.5 reports on the implementations delivered.

Task 4.5: Network Federation (NF) and NAM (UHEI, EXTOLL; M6-M20)

The project plans to use EXTOLL as the basis for Network Federation (NF) to profit from the flexibility of its driver and configuration of the NICs, plus complete control over its firmware. Results from DEEP/-ER will be used as far as possible, and UHEI and EXTOLL will implement the adaptations that will be required for the NF.

To connect to other networks, gateway nodes are foreseen. These nodes will feature energy-efficient CPUs and use COTS hardware. The project will build upon the substantial experience from the DEEP project, where a very efficient and high-performance bridging solution between InfiniBand and EXTOLL was developed. Software required for bridging to other networks will be developed within Tk5.3.

The Network Attached Memory (NAM) concept first introduced in DEEP-ER will be significantly enhanced in DEEP-EST to support higher memory-capacity, and enable more flexible programmability of NAM processing resources. To provide high capacity (up to multiple TB per NAM) and bandwidth (enabling wire-speed access), different memory technologies will be explored, in particular Intel’s 3D XPoint (as NV-DIMM or NVMe-attached devices), and NAND Flash (NVMe attached devices).

Upcoming state-of-the-art components will be analysed in terms of their utility to implement the NAM functionality in a cost effective and capable way. High-throughput server CPUs with a fast in-package FPGA, combined with high-capacity non-volatile memory modules directly connected to the memory subsystem, are a promising candidate for the NAM. Both regular and very irregular access patterns, as required by HPDA applications, will be supported. Additional data manipulation operations will be available in the access path to the memory. Part of this Task is to analyse the possibilities of different data manipulation operations in the NAM.

Appropriate software libraries will be developed to support the use of these NAM modules. On the software side, integration into existing programming environments, in particular for HPDA, will be addressed in Tk6.3. The design and specification of both NF and the NAM will be reported in D4.3, while D4.5 will describe the delivered components.

Task 4.6: Prototype installation and maintenance (ETH-Aurora, JUELICH, Intel, BADW-LRZ, Megware, UHEI, EXTOLL; M20-M36)

This Task will first be responsible for the installation and bring-up of the DEEP-EST prototype at JUELICH. The system integrators (ETH-Aurora and Megware) will deliver and install the computer racks in the room, while EXTOLL and UHEI contribute to the bring-up and support of the network, NAM, and GCE devices, and BADW-LRZ to the RAS-plane tools. JUELICH will prepare the local infrastructure and procure and install the SSSM - a limited-size system based on off-the-shelf components providing local storage to the prototype. The SSSM will also serve as a front-end to JUELICH's large-scale GPFS storage pool, which will provide large storage capacity to the applications.

After the successful bring-up of the DEEP-EST hardware, the software stack developed in WP5 and WP6 will be installed and thoroughly tested. Software bring-up will be a joint effort between the partners providing the software and hardware components of the system. Deliverable D4.9 will report on all of these activities.

After this, the DEEP-EST prototype will be put into production mode, making it available to the application developers. This will require maintenance and support from the WP3 partners to keep the system in good working condition. This might include hardware interventions to correct potential system or component failures. For the system level software and programming environments, support (bug fixes, performance and functionality improvements) will be provided by WP5 and WP6, respectively.

Deliverables:
D4.1 Cluster Module design [ETH-Aurora; Tk4.1] (M12): CM design, including a detailed list of its main components and integration.
D4.2 Data Analytics Module design [Megware; Tk4.3] (M12): DAM design, including a detailed list of its main components and integration.
D4.3 Network Federation, fabri3, NAM and GCE designs [EXTOLL; Tk4.4 and Tk4.5] (M12): Design of the Network Federation architecture, fabri3, NAM and GCE.
D4.4 Extreme Scale Booster design [ETH-Aurora; Tk4.2] (M15): Design of the ESB, including details on its main components and integration.
D4.5 Network Federation, fabri3, NAM and GCE [EXTOLL; Tk4.4 and Tk4.5] (M20): Final implementation of the Network Federation, fabri3, NAM and GCE.
D4.6 Cluster Module [ETH-Aurora; Tk4.1] (M20): Completed system, delivered to Jülich. Short report summarising results of component tests and any final system modifications not foreseen in D4.1.
D4.7 Extreme Scale Booster [ETH-Aurora; Tk4.2] (M24): Completed system, delivered to Jülich. Short report summarising results of component tests and any final system modifications not foreseen in D4.4.
D4.8 Data Analytics Module [Megware; Tk4.3] (M24): Completed system, delivered to Jülich. Short report summarising results of component tests and any final system modifications not foreseen in D4.2.
D4.9 Final DEEP-EST prototype report [ETH-Aurora; all WP4-Tasks] (M30): Description of the final DEEP-EST prototype, including all its modules and components (CM, ESB, DAM, NF, fabri3, NAM and GCE), and a description of the infrastructure preparation and the prototype's installation process.

Relevant Milestones (Lead Partner, Date):
MS6 Hardware and software specifications (JUELICH, M12)
MS9 Prototype delivered to Jülich, SW and applications ready for deployment (ETH-Aurora, M24)
MS10 Prototype bring-up complete, SW and programming environment installed on the system (ETH-Aurora, M25)


3.1.3.5 Work Package 5: System software and management

Work package number: 5    Lead beneficiary: JUELICH
Work package title: System software and management
Participants (person-months): JUELICH (78), Intel (16), BADW-LRZ (53), BSC (42), UHEI (2), EXTOLL (41), UEDIN (6), FHG-ITWM (6)
Start Month: 1    End Month: 36

Objectives
- Design the overall software architecture to enable efficient utilisation of MSA systems.
- Improve the manageability and resiliency of the high speed interconnects.
- Federate different interconnect technologies to allow applications to run distributed across several modules.
- Implement an optimised resource management for the MSA maximising resource use.
- Extend the SLURM scheduler to the MSA and heterogeneous applications.
- Provide monitoring and data collection SW capabilities for continuous measurement of power consumption and other system attributes of the DEEP-EST prototype.

Description of work
This WP will define the system software architecture for the MSA and implement it for the DEEP-EST prototype. Building on the achievements and experience gained in the DEEP/-ER projects it takes into account the specific challenges posed by the new MSA.

The work here will be a catalyst for transforming hardware characteristics and application requirements into software and programming functionalities to be implemented in both WP5 and WP6. It will federate different interconnect technologies and improve the manageability and fault-tolerance of the high-speed interconnects used in the DEEP-EST prototype. The resource management will be extended to handle multiple heterogeneous modules of the MSA, and to support heterogeneous applications and workload mixes by implementing co-scheduling strategies for the global job scheduler. Frequent and detailed system monitoring will enable optimal resource utilisation, energy efficiency and reliable operation. This will be achieved by a highly-scalable monitoring and RAS (Reliability, Availability, Serviceability) infrastructure that enables the correlation of various sensor data.

This WP will be led by JUELICH’s linked third party ParTec.

Task 5.1: Software architecture (JUELICH, Intel, BADW-LRZ, BSC, UHEI, EXTOLL, FHG-ITWM; M1-M12)

This Task will collect the requirements for a system software architecture that enables an efficient utilisation and exploitation of a MSA-system. In a second step these requirements will serve to define the programming environment (WP6) and the system software layers enabling management, I/O, and resiliency of the DEEP-EST prototype.

Information will be collected from WP3 and WP4 (regarding the architecture of the individual compute modules and the overall system) and from WP1 and WP2 (requirements in terms of functionality and interfaces needed by application codes, benchmarks and modelling tools). The system software architecture matching this set of requirements will be defined in D5.1. In a next step, D5.2 will document all components and layers and the interfaces between them, propose SW packages for their implementation, and assign developer responsibilities.

Task 5.2: Interconnect management (EXTOLL; M6-M36)

This Task will develop advanced fault tolerance, resiliency and management functionality for the interconnection networks in a MSA-system. This includes an efficient and automatic handling of dynamic changes in topology (due to changes in module composition), and handling of transient and permanent failures of links and nodes. Different classes of failures demand suitably orchestrated actions from different levels of the stack. Therefore, strong co-design and collaboration with the other Tasks in WP5, such as the resource manager or job scheduler, will be pursued.

The MSA also poses new challenges for network routing, since different modules with different internal networking structures may lead to very complex topologies. Advanced software will be developed to federate the different network topologies across modules, resulting in a global topology. This Task will explore deterministic and adaptive routing strategies to maximise network availability, resiliency, usability and performance.
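As a purely illustrative example of deterministic routing over a federated topology that avoids failed links, consider the sketch below; node names, the topology and the failure set are invented, and the production routing operates inside the EXTOLL fabric and its drivers rather than in application-level code.

```python
# Hypothetical sketch: shortest-path routing across a federated module topology,
# recomputed when links fail. Node names and the topology are invented.
from collections import deque

def route(topology, src, dst, failed_links=frozenset()):
    """Breadth-first search for a path from src to dst, skipping failed links."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in topology.get(node, ()):
            link = frozenset((node, nxt))
            if nxt not in seen and link not in failed_links:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # destination unreachable with the current failures

# Toy federated topology: CM and ESB nodes joined through a gateway node.
topology = {
    "cm0": ["cm1", "gw0"], "cm1": ["cm0", "gw0"],
    "gw0": ["cm0", "cm1", "esb0"],
    "esb0": ["gw0", "esb1"], "esb1": ["esb0"],
}
print(route(topology, "cm1", "esb1"))
print(route(topology, "cm1", "esb1", failed_links={frozenset(("gw0", "esb0"))}))
```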

Results will be reported in Deliverables D5.3, D5.4 and D5.5.

Task 5.3: Inter-module network bridging (JUELICH, Intel, EXTOLL; M6-M36)

A crucial part of the MSA is communication between modules since each one may internally use a different interconnect technology. To federate these different networks, this Task will develop network bridges to provide transparent and efficient communication between modules. The main result will be a high-bandwidth, low-overhead communication solution enabling transfers between nodes on the different modules of the DEEP-EST prototype. Its implementation will follow the application’s demand on inter-module bandwidth and latency.

The network bridging will take place in gateway nodes developed by Tk4.5 that use standard server technology and provide the network interfaces that have to be bridged. The software implementation will use daemon processes running on these gateway nodes which mediate between the different protocols and APIs of the respective network technologies. The Cluster-Booster Protocol as developed in DEEP is an important input here.

Depending on the differences between the bridged networks in their protocols and functionality, these daemons may have to actively forward message chunks between the two interfaces, or they can rely on autonomous data transfer capabilities of the network adapters. In the latter case, setting up an appropriate mapping between address spaces would be sufficient. Nevertheless, even in the former case performance improvements are expected, e.g. by using shared memory message buffers and avoiding unnecessary CPU-driven copy operations. A key objective of this Task is to extend the number of combinations of interconnect technologies that are ready for use in MSA systems.
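The highly simplified sketch below illustrates only the chunk-forwarding role of such a gateway daemon, here between two plain sockets; the actual daemons, developed by ParTec, mediate between the EXTOLL and module-specific network protocols and APIs rather than TCP sockets.

```python
# Hypothetical gateway-daemon sketch: relay message chunks between two endpoints.
# The real bridging daemons translate between different fabric protocols/APIs.
import socket
import threading

CHUNK = 64 * 1024  # forwarding granularity in bytes (illustrative value)

def forward(src: socket.socket, dst: socket.socket) -> None:
    """Relay data from src to dst until src closes its side of the connection."""
    while True:
        chunk = src.recv(CHUNK)
        if not chunk:
            dst.shutdown(socket.SHUT_WR)
            return
        dst.sendall(chunk)

def bridge(conn_a: socket.socket, conn_b: socket.socket) -> None:
    """Full-duplex bridging between two already-connected endpoints."""
    threads = [threading.Thread(target=forward, args=(conn_a, conn_b)),
               threading.Thread(target=forward, args=(conn_b, conn_a))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```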

The Task is led by JUELICH’s linked third party ParTec, which will also develop the daemons and integrate them into the communication framework. Results will be reported in Deliverables D5.3, D5.4 and D5.5.

Task 5.4: Resource management (JUELICH; M6-M36)

Resource management in this Task includes monitoring and controlling of the node-local resources, reporting their status to and taking requests from the global job scheduler. A dedicated entity located on each node of the system creates local processes of a distributed MPI session and explicitly assigns computing and other resources that the local node offers to these processes. This entity is also responsible for supervising processes during their lifetime, accounting for resource usage, and ensuring a proper clean-up after termination. Part of the supervising activity is the forwarding of standard I/O channels and signals.

This resource management entity will be embodied by the ParaStation management daemon, which will in turn interact with SLURM as the batch system and job scheduler (see Tk5.5) via its psslurm plugin. To meet the MSA needs, the existing ParaStation process manager infrastructure has to be extended to deal with the heterogeneity and diversity of different system parts as well as the resulting process structure of application bundles running on them. Extensions for managing resources that are not represented by regular MPI processes (e.g. the NAM or HPDA segments) have to be developed. The local management daemons form a comprehensive network across the whole MSA system and will serve to consolidate information across modules to facilitate the orchestration of the whole system.

This Task will be led by JUELICH’s linked third party ParTec. Results will be reported in Deliverables D5.3, D5.4 and D5.5.

Task 5.5: Job scheduler (BSC, JUELICH, Intel; M6-M36)

On a MSA-system, applications will either use resources within a single module only, or run across different modules, either at the same time or successively in a workflow-like model. This requires scalable scheduling and co-allocation of resources for jobs within and across modules.

In this Task, the widely used Open Source scheduler SLURM will be extended with features for efficient and scalable scheduling on MSA-systems. The proposed SLURM scheduler will include a parallel scheduling scheme, where job scheduling for each module can be done independently of the other modules, and a communication mechanism between the parallel scheduling instances to enable co-allocation of resources. This Task will explore two implementation strategies and select the most suitable one for realisation in the project:

- Implement separate scheduler plugins for each module based on SLURM's internal plugin mechanism and enable them to run in parallel. A communication mechanism that connects and shares job/resource information between the different plugins will be developed to schedule jobs across modules. This approach fits in the standard mechanism of SLURM extensions.
- Modify the central scheduler to be multithreaded to enable parallel scheduling of jobs within a module. The parallel schedulers will coordinate through shared memory to schedule jobs that require resources across modules.

This approach creates multiple advantages, such as being able to specify different scheduling policies for each module, ensuring faster response times, and improving scheduling scalability. This Task will also investigate the need for and benefits of dynamic scheduling of workflows: since the types and quantities of resources needed often differ during runtime, dynamic resource scheduling potentially improves job turnaround time and overall throughput. Furthermore, it will be explored how to use the energy data provided by Tk5.6 and information on data location supplied by Tk6.4 in order to improve the module-local scheduling of resources.

This Task will have two main outcomes, reported in Deliverables D5.3, D5.4 and D5.5: a new module-aware version of SLURM capable of parallel scheduling with efficient communication between the parallel schedulers, and a novel highly parameterised policy that will consider module and job co-allocation to improve overall system performance.
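To illustrate the co-allocation idea in its simplest form, the sketch below checks that all parts of a multi-module request fit before reserving anything; module names, node counts and the data structure are made up, and the actual mechanism will live inside the extended SLURM scheduler (plugin-based or multithreaded, as described above).

```python
# Hypothetical sketch of a co-allocation decision across per-module schedulers.
# The real implementation extends SLURM; this only illustrates the concept.
free_nodes = {"CM": 12, "ESB": 400, "DAM": 6}   # current free nodes per module (invented)

def can_co_allocate(request):
    """A job requesting nodes on several modules may start only if *all* parts fit."""
    return all(free_nodes.get(mod, 0) >= n for mod, n in request.items())

def co_allocate(request):
    if not can_co_allocate(request):
        return False                              # job stays queued
    for mod, n in request.items():                # reserve all parts together
        free_nodes[mod] -= n
    return True

print(co_allocate({"CM": 4, "ESB": 256}))   # True: both partial allocations fit
print(co_allocate({"CM": 10, "DAM": 8}))    # False: the DAM part does not fit
```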

Task 5.6: System monitoring and RAS plane (BADW-LRZ, UEDIN; M6-M36)

Reliable and precise monitoring and emergency features will be essential for a successful hardware bring-up and reliable operation of MSA systems. The power measurements and analysis tools developed in this Task will provide the necessary insight to analyse and optimise the operational parameters of the system in terms of energy efficiency. BADW-LRZ and UEDIN will leverage their experience and knowledge in power and energy monitoring during the co-design phase in order to ensure that the required energy/power counters and sensors will be available in the prototype system, providing a unique opportunity to obtain data that would not be available in off-the-shelf hardware.

The highly-scalable monitoring solution developed in the DEEP project will be extended in four directions: detailed, high-frequency power measurements; hooks for emergency measures (e.g. in case of over-temperature); Web-based real-time visualisation of monitoring data; and sophisticated analysis tools for monitoring data.

Furthermore, the Task will conduct an analysis of different sensor data, including virtual sensors derived from the physical ones. A multitude of performance metrics can be computed on-the-fly. They will be fed back to the batch scheduler to provide per-job statistics.
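A minimal sketch of what a virtual sensor and a derived per-job statistic could look like follows; the sensor names, the fixed overhead and the sample values are invented, and the actual pipeline uses the DEEP-EST monitoring and RAS infrastructure.

```python
# Hypothetical sketch: derive a virtual "node power" sensor from physical sensors
# and aggregate it into a per-job energy statistic for the batch scheduler.
samples = [  # (timestamp_s, cpu_power_W, dram_power_W, job_id) -- invented values
    (0.0, 150.0, 30.0, 42), (1.0, 160.0, 32.0, 42),
    (2.0, 155.0, 31.0, 42), (3.0,  90.0, 20.0, 43),
]

def node_power(cpu_w, dram_w, fixed_overhead_w=45.0):
    """Virtual sensor: total node power estimated from component sensors."""
    return cpu_w + dram_w + fixed_overhead_w

def per_job_energy(samples, interval_s=1.0):
    energy = {}
    for _, cpu_w, dram_w, job in samples:
        energy[job] = energy.get(job, 0.0) + node_power(cpu_w, dram_w) * interval_s
    return energy  # joules per job, to be reported back to the scheduler

print(per_job_energy(samples))
```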

Deliverables:
D5.1 Collection of software requirements [JUELICH; Tk5.1] (M6): List of requirements that the WP5 and WP6 software must fulfil, and preliminary specification of the overall software architecture.
D5.2 Software specification [JUELICH; Tk5.1] (M12): High level specifications of the complete system software stack, including description of interfaces and interdependencies between the packages.
D5.3 Prototype software implementation [JUELICH; all WP5-Tasks] (M18): Initial implementation of all software components. Report on the status of their integration, functionality and capabilities.
D5.4 Complete system-SW implementation [JUELICH; all WP5-Tasks] (M24): Full software implementation with the full required functionality, ready for deployment on the DEEP-EST prototype.
D5.5 Software support report [JUELICH; all WP5-Tasks] (M36): Report on final software optimisations and summary of the maintenance experience (e.g. installation on the system, bug correction, user feedback).

Relevant Milestones (Lead Partner, Date):
MS3 Initial co-design input collected (JUELICH, M4)
MS4 System hardware and software architecture (JUELICH, M6)
MS6 Hardware and software specifications (JUELICH, M12)
MS7 Initial application analysis models and first full implementation of system-SW and programming environment available (BSC, M18)
MS9 Prototype delivered to Jülich, SW and applications ready for deployment (JUELICH, M24)
MS10 Prototype bring-up complete, SW and programming environment installed on the system (ETH-Aurora, M25)


3.1.3.6 Work Package 6: Programming environment

Work package number: 6    Lead beneficiary: BSC
Work package title: Programming environment
Participants (person-months): JUELICH (42), Intel (10), BSC (60), UEDIN (9), FHG-ITWM (30)
Start Month: 6    End Month: 36

Objectives
- Extend ParaStation MPI to meet the requirements of the MSA and DEEP-EST prototype.
- Enhance the OmpSs programming model to support the MSA.
- Support and enhance programming models for HPDA on the MSA.
- Extend BeeGFS and SIONlib to efficiently support the MSA and DEEP-EST prototype.
- Develop a common application-based checkpoint/restart interface based on pragmas.

Description of work
This WP will enhance the programming environment developed in the DEEP/-ER projects to support the heterogeneous MSA and its highly dynamic workloads. For this, ParaStation MPI, OmpSs and BeeGFS will be extended and linked to the resource manager (RM) and job scheduler. The same holds for the HPDA frameworks used by WP1's applications.

In addition, the programming models will be adapted to enable applications to fully leverage the DEEP-EST prototype. Progress with fast on-package memory and cache line addressable NVM will very likely impact at least some of the DEEP-EST prototype modules. This suggests extending the programming models for easy data placement and migration, which will also cover the NAM. Moreover, the need to enable straightforward use of accelerator technologies (like FPGAs tightly coupled to CPUs in the DAM) will require API extensions and enhancements of the HPDA frameworks used in the project.

Finally, a pragma-based interface for the DEEP-ER application-based checkpointing libraries will be developed, and new features will be investigated that fully support the dynamics of a MSA-system while matching the requirements of evolving and malleable applications.

Task 6.1: ParaStation MPI (JUELICH; M6-M36)

ParaStation MPI will be adapted to the specific requirements of the MSA and optimised for the DEEP-EST prototype. This includes features for modularity-aware message passing between process groups linked together in an evolved and multi-stage topology. For instance, the internal communication patterns of collective MPI operations will be optimised for modular topologies, and new hardware features like the GCE (Tk4.4) will be exploited for improved scalability and efficiency. The ParaStation MPI extensions will be transparent to applications.
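For illustration, the mpi4py sketch below shows the idea behind a module-aware collective: reduce within each module, combine the module leaders' results, and broadcast back. The module identifier (an environment variable) and the communicator layout are hypothetical; in ParaStation MPI this optimisation happens transparently below the standard MPI interface.

```python
# Hypothetical sketch of a modularity-aware allreduce built from standard MPI
# operations. ParaStation MPI will provide this transparently; the module id
# used here is an invented environment variable.
import os
from mpi4py import MPI

def module_aware_allreduce(value, comm=MPI.COMM_WORLD):
    module_id = int(os.environ.get("DEEP_MODULE_ID", "0"))    # hypothetical variable
    intra = comm.Split(color=module_id, key=comm.Get_rank())  # ranks within one module
    leaders = comm.Split(color=0 if intra.Get_rank() == 0 else MPI.UNDEFINED,
                         key=comm.Get_rank())                 # one leader per module
    partial = intra.reduce(value, op=MPI.SUM, root=0)         # step 1: within module
    if intra.Get_rank() == 0:
        total = leaders.allreduce(partial, op=MPI.SUM)        # step 2: across modules
    else:
        total = None
    return intra.bcast(total, root=0)                         # step 3: back to all ranks

if __name__ == "__main__":
    result = module_aware_allreduce(1)
    if MPI.COMM_WORLD.Get_rank() == 0:
        print("global sum:", result)
```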

In addition, extensions of MPI for hybrid programming models such as OmpSs (Tk6.2) and for MSA modules/components that cannot easily be represented by regular MPI processes will be developed. This includes memory resources such as the NAM, which will not run application code on its CPU, the GCE network acceleration engine, the Network Federation components developed in Tk4.5, and also links to HPDA frameworks typically not based on MPI. Care will be taken to stay close to the MPI standard; more experimental extensions (e.g. for supporting concepts like Active Messages) will also be considered if required.

This Task will be led by JUELICH’s linked third party ParTec.


Task 6.2: OmpSs programming model (BSC; M6-M36)

The OmpSs offloading features developed in DEEP/-ER will be integrated with the resource manager and job scheduler developed in WP5 to support efficient execution of evolving and malleable applications on MSA systems. Moreover, OmpSs will be extended to make the most of the specific hardware characteristics of the ESB and DAM modules.

Enabling application developers to easily use the large number of cores and the complex memory hierarchy expected for the ESB requires new programming model and runtime system features. The OmpSs tasking model will be refined to enable the runtime system to discover the high level of parallelism required to use the many-core ESB processor efficiently. In addition, the OmpSs runtime system will leverage on-package High Bandwidth Memory (HBM) to speed up scalar and array reductions. Finally, the runtime system will be extended to manage the HBM as a software cache to transparently accelerate OmpSs applications.

The DAM will likely contain high capacity cache-line addressable NVM and a tightly integrated FPGA that can be programmed with OpenCL. Both hardware features will be seamlessly integrated in the OmpSs tasking model, with the OmpSs runtime transparently managing the NVM resources and application developers being able to mix tasks and OpenCL kernels without using the OpenCL host API. These extensions will be used for some of the HPDA applications and the underlying frameworks identified in Tk6.3.

Task 6.3: Data Analytics programming model (Intel, JUELICH, BSC, UEDIN; M6-M36)

Data analytics and machine learning applications often rely on powerful frameworks that provide high-level data processing and machine learning functionality. Examples are Hadoop and Spark for the former, and TensorFlow, Caffe, or Theano for the latter. The initial co-design phase of the project will identify the frameworks and libraries used by the WP1 applications. These will be ported to and optimised for the MSA in general and the DAM of the DEEP-EST prototype in particular. A detailed technical discussion of the framework selection and the optimisation plans will be part of D6.1; the optimisations can include the judicious use of OmpSs and adaptation to enable distributed execution via ParaStation MPI.

Furthermore, it will be necessary to integrate the selected frameworks with the scheduling systems and user tools used on the prototype, and to enable pre-loading, curation, and management of data on the DAM in a way that supports workflows commonly used for machine learning or data analytics. Working closely with Tk5.4 and Tk5.5, the focus will be on ensuring that a range of frameworks can be active on the DAM at any one time, across a range of different nodes, as required by the jobs running on the DAM at a given point.

For D6.2, fully functional versions of the above programming model(s) will be produced on early DAM nodes/SDVs. For D6.3, a version which is optimised for the final node and interconnect architecture will be established, and during the rest of the project, bug fixes and further optimisations will be provided as required by the WP1 applications.

Task 6.4: I/O, file system, and storage (FHG-ITWM, JUELICH; M6-M36)

The DEEP-ER project introduced a concept for caching data in proximity to the compute nodes based on BeeOND ("BeeGFS on Demand"). BeeOND is a framework to create temporary parallel file systems, e.g. across all nodes used in a compute job, using their locally attached storage (e.g. flash or NVM). Such a "per-job file system" can serve as a cache layer for the applications and is ideally created by the resource manager. In this Task, BeeOND will be integrated with the resource manager of Tk5.4. An advanced implementation of the same concept will explore SLURM's capability to stage files to and from global storage into fast local "burst buffers" controlled by the resource manager. This enables a dedicated shared parallel file system per job to speed up I/O and minimise concurrent accesses to global storage.
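Purely as an illustration of the stage-in / compute / stage-out pattern around such a per-job cache layer, consider the sketch below; the paths are placeholders, and the cache mount itself would be created by the resource manager through BeeOND rather than by application code.

```python
# Hypothetical sketch of the stage-in / compute / stage-out pattern around a
# per-job cache file system. The cache mount would be created by the resource
# manager (BeeOND); all paths below are placeholders.
import shutil
from pathlib import Path

GLOBAL_STORAGE = Path("/gpfs/project/input")      # placeholder global storage path
JOB_CACHE = Path("/mnt/job_cache")                # placeholder per-job cache mount

def stage_in(files):
    for f in files:
        shutil.copy2(GLOBAL_STORAGE / f, JOB_CACHE / f)        # pull inputs close to the nodes

def stage_out(files, results_dir=Path("/gpfs/project/results")):
    for f in files:
        shutil.copy2(JOB_CACHE / f, results_dir / f)           # persist results globally

# Typical per-job flow: stage_in([...]); run the application against JOB_CACHE; stage_out([...])
```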


In a MSA, such a BeeOND instance might spread across different compute modules. BeeOND will be extended to support storage pools and to allow data placement on a specific pool, minimising communication across module boundaries.

Currently, BeeGFS storage servers utilise POSIX-compliant file systems. Storage hardware and interfaces are expected to change significantly in the next few years. It is important for the MSA to be ready for this and, therefore, a plugin architecture will be developed for the BeeGFS storage server. For the DEEP-EST prototype, byte-addressable memory devices (e.g. NVDIMMs) will be supported as storage backends. This will enable the use of the NAM as a fast BeeGFS storage target.

Finally, the BeeGFS monitoring facilities will be extended to use a time series database. This will allow gathering statistics (e.g. operations from specific clients) over time and can serve as an additional basis for application analysis and benchmarks performed in WP2.

A second pillar of the I/O concept is SIONlib40 as a means to efficiently map task-local I/O onto parallel file systems like BeeGFS. Building upon the DEEP-ER developments, SIONlib will be adapted to support additional requirements posed by the MSA.

Task 6.5: Resiliency (BSC, JUELICH; M6-M36)

Evolving, malleable and workflow applications pose a challenge for traditional application-based checkpointing libraries. Well-known libraries such as SCR or FTI only support one checkpoint location and lack the flexibility needed for complex applications. This Task will develop a flexible, portable and convenient interface based on pragmas, enabling the application developer to specify which data has to be saved/restored, with the compilation and runtime systems managing data serialisation/deserialisation and I/O activities via the FTI or SCR libraries. New checkpointing features enabled by the pragma interface will also be explored, such as support for multiple checkpoint locations, and for incremental and differential checkpoints, which promise to significantly reduce checkpoint footprint and overhead.
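To illustrate the intent of such an interface, the Python decorator sketch below mimics what the pragma annotations would express: the developer only declares which data must survive a restart, while serialisation and I/O are handled behind the scenes. The real DEEP-EST interface will be compiler-handled pragmas in the application codes delegating to FTI or SCR; everything below (names, file layout) is hypothetical.

```python
# Hypothetical sketch mimicking a pragma-based checkpoint interface.
import pickle
from pathlib import Path

CKPT_DIR = Path("checkpoints")            # placeholder location (could be NVM, NAM, ...)
CKPT_DIR.mkdir(exist_ok=True)

def checkpointed(*protected_fields):
    """Mark which attributes of a solver object must be saved/restored."""
    def decorate(cls):
        def save(self, step):
            state = {f: getattr(self, f) for f in protected_fields}
            (CKPT_DIR / f"step_{step}.pkl").write_bytes(pickle.dumps(state))
        def restore(self, step):
            state = pickle.loads((CKPT_DIR / f"step_{step}.pkl").read_bytes())
            for f, v in state.items():
                setattr(self, f, v)
        cls.save_checkpoint, cls.restore_checkpoint = save, restore
        return cls
    return decorate

@checkpointed("field", "iteration")       # analogous to a data-protection pragma
class Solver:
    def __init__(self):
        self.field, self.iteration = [0.0] * 8, 0
```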

Deliverables:
D6.1 Design and specification of programming environment [BSC; all Tasks] (M9): Detailed specification of the overall programming model.
D6.2 Prototype programming environment implementation [BSC; all Tasks] (M18): Initial programming environment setup ready to be tested, including all involved components and interfaces.
D6.3 Complete programming environment implementation [BSC; all Tasks] (M24): Fully functional and optimised programming environment available and ready to be deployed on the prototype.
D6.4 Programming environment support report [BSC; all WP6-Tasks] (M36): Report on final optimisations and summary of the maintenance experience (e.g. installation on the system, bug correction, user feedback).

Relevant Milestones (Lead Partner, Date):
MS6 Hardware and software specifications (JUELICH, M12)
MS7 Initial application analysis models and first full implementation of system software and programming environment (BSC, M18)
MS9 Prototype delivered to Jülich, SW and applications ready for deployment (ETH-Aurora, M24)
MS10 Prototype bring-up complete, SW and programming environment installed on the system (ETH-Aurora, M25)

40 For each Deliverable the lead partner and the tasks involved are given.

3.1.3.7 Work Package 7: Dissemination

Work package number: 7    Lead beneficiary: BADW-LRZ    Work package title: Dissemination
Participant number            | 1       | 3        | 9
Short name of participant     | JUELICH | BADW-LRZ | UEDIN
Person-months per participant | 3       | 24       | 9
Start Month: 1    End Month: 36

Objectives
- Communicate key features, objectives and outcomes of DEEP-EST to a wide range of stakeholders.
- Identify innovation opportunities and supporting partners in leveraging these, and liaise with HPC industry, lobbying and special interest groups, and standardisation bodies.
- Promote European partnerships for HPC development and attract industrial and academic HPC users to test the DEEP-EST prototype.
- Implement a targeted education and training programme for project-internal staff, and distribute the know-how gained in DEEP-EST to external communities.

Description of work
Above and beyond the Cluster-Booster architecture as demonstrated in the DEEP/-ER projects, DEEP-EST now introduces the MSA as a significant innovation that will bring further improvements in scalability, system performance, throughput and efficiency, and facilitate the convergence of HPC and HPDA. WP7 supports this effort with three main activities.

First, the team will proactively communicate the project objectives, approach and outcomes to raise awareness and establish strong relations with key stakeholders (especially with potential users in academia and industry via a dedicated early access programme). In this way, collaboration with major players in the European HPC arena will be facilitated, thus contributing to the implementation of the SRA and ultimately to positioning Europe as a technological leader in the race towards Exascale and HPC/HPDA convergence. Specific selected target groups are detailed in Section 2.2.1.

Second, WP7 will carry out actions for the dissemination and exploitation of project results and provide support for leveraging innovation opportunities. Close cooperation between the DDG lead, the PM and WP7 within the IC and the establishment of strong ties between WP7 and the project partners will ensure the success of such activities.

Third, WP7 will ensure that knowledge transfer and the necessary training within the project are arranged and that education of the external public takes place.

Task 7.1: Communication and Outreach (BADW-LRZ; M1-M36)

This Task will define and implement the main communication strategy for DEEP-EST as outlined in Section 2.2. It will ensure that key messages are communicated appropriately to the various stakeholders and target groups and that communication and outreach activities of all Tasks in WP7 are streamlined and form a coherent public appearance. An essential aspect of the outreach activities will be the collaboration with key players in the European HPC arena, which has been established already during the DEEP/-ER projects and will be continued and extended.


Specific activities include:
- Create a detailed communication plan, keep it up to date and include success metrics (e.g. website statistics, numbers of followers and likes on social media).
- Develop the branding strategy of the DEEP projects family, including an update of the corporate design reflected in all collateral and the website.
- Define and update specific key messages for DEEP-EST target groups.
- Create a constant flow of digital information via the project website, social media and regular e-newsletters.
- Present DEEP-EST at relevant events, conferences and trade shows like SC or ISC, both in the academic programme and on the show floor.
- Put special focus on media relations: maintain and expand existing close contacts to key media in IT, industry and science, and establish new contacts.

Task 7.2: Innovation, Dissemination and Exploitation (BADW-LRZ; M1-M36)

Innovation is a focal point for dissemination and exploitation. In close collaboration with the IC, this Task integrates with the overall communication strategy and will leverage synergies wherever possible. Specific activities will include:

- Manage the IC, gathering technical input from the DDG, coordinating with WP8 on IPR issues, and supporting individual partners.
- Implement the open access strategy as outlined in Section 2.2.
- Participate in industry-focused events (e.g. industry round-tables organised by partners or conferences with an industry focus).
- Liaise with lobby groups (like ETP4HPC) and relevant special interest groups (like EEHPC or openHPC) on a regular basis to inform them about the project progress.
- Support partners' communication and outreach activities as needed.
- Develop a web repository containing documentation, access to the software developed, early success stories, and best practices.
- Write policy briefings to inform political stakeholders and lobbying organisations of how project achievements advance Europe's technological HPC leadership.

Task 7.3: Early access programme for industrial and academic users (UEDIN; M1-M36)

Task 7.3 will promote European HPC development partnerships and attract industrial and academic HPC users to test the DEEP-EST prototype. Special attention will be given to SMEs. DEEP-EST application partners will drive the co-design effort that is essential for the project’s success. It is important to enlarge the user community for DEEP-EST to gain further insights and feedback from academic domains and in particular from industrial users.

Specific activities include:
- Establish an (industrial) user group with a focus on SMEs.
- Leverage existing networks through collaboration with European projects such as Fortissimo.
- Arrange Open Calls for participation among industrial users.
- Raise awareness about the web repository created by Tk7.2.
- Run awareness days to promote the DEEP-EST platform as widely as possible.
- Produce proof points of using the DEEP-EST prototype, demonstrating key features and benefits to academic and industrial users (case study booklet).

Task 7.4: Training and Education (JUELICH; M1-M36)

In close cooperation with Tk1.1, a systematic education and training programme will be defined to provide partners with the knowledge required to effectively implement their contribution. Furthermore, the results of the project will be shared outside of DEEP-EST.

Specific activities include:
- Arrange targeted internal trainings and workshops focusing on aspects relevant for the R&D efforts, e.g. application optimisation on Xeon Phi, OmpSs resiliency and fault tolerance, use of JUBE, etc.
- Make training material available via the website.
- Liaise with partners like PRACE (training programme, summer school) or Women in HPC to co-organise trainings.
- Apply for conference workshops at SC or ISC and at the European HPC Summit Week.

Deliverables:

D7.1 Communication plan, toolkit and owned channels [BADW-LRZ; Tk7.1]: details of the communication and brand strategy, including a toolkit with materials like logos and collateral templates, and the DEEP-EST communication channels. (M4)
D7.2 Repository for training material [JUELICH; Tk7.4]: design, access information and initial contents of the training repository. (M12)
D7.3 Early access programme for industrial and academic users [UEDIN; Tk7.3]: report on the established academic and industrial user group. (M18)
D7.4 Policy briefing [BADW-LRZ; Tk7.2]: briefing for political deciders and lobbying groups detailing conclusions from project results and their impact on HPC development in Europe. (M24)
D7.5 Case study booklet [UEDIN; Tk7.3]: success stories and best practices gained from the early access programme. (M34)
D7.6 Final outreach activity [BADW-LRZ; Tk7.2]: final brochure or project video, or an event, or a combination of these; it will be decided during the project which is deemed most appropriate. (M36)

Relevant Milestones:

MS2 Quality control plan, dissemination plan, website and social channels available (lead partner: BADW-LRZ, M4)
MS8 Midterm innovation review: IC review of innovation options that have arisen during the first half of the project term, and of IC processes (lead partner: BADW-LRZ, M18)


3.1.3.8 Work Package 8: Coordination

Work package number: 8    Lead beneficiary: JUELICH    Work package title: Coordination
Participant number            | 1
Short name of participant     | JUELICH
Person-months per participant | 64
Start Month: 1    End Month: 36

Objectives
- Target- and result-oriented management of the project, including risk, conflict and IP management.
- Effective project communication, both internal and with the EC: deliverables and periodic reports handed in on time to the EC.
- Quality control of results and deliverables following the quality control definitions.
- Transparent financial management and control: periodic reports handed in yearly to the EC.

Description of work
The partner JUELICH carries the largest management responsibilities and is responsible for collecting the information and preparing the periodic progress reports. The other partners are committed by the Consortium Agreement to support the Project Manager in WP8. The partners' support effort is implicitly included in their person months in the respective WPs.

Task 8.1: Management (JUELICH; M1-M36)

The management structure of the project is organised on levels of different responsibilities (see details in Section 3.2). The top level authority of the project is the Board of Partners (BoP) headed by the Project Coordinator. The day to day operation and the implementation of the goals defined by the BoP will be done through a Project Management Team (PMT) headed by a designated Project Manager (PM) located at the Coordinator’s site. The PM will be supported by the administrative and financial departments of the Coordinator.

The PMT will ensure that the work assigned to the different WP teams progresses according to schedule and within the allocated budget. The progress will be monitored regularly on a quarterly basis. Exceptional conditions or problems will be handled according to the Risk Management (see Section 3.2.6) or, in case of conflicts between members, according to the Conflict Management (see Section 3.2.7). Intellectual Property is regulated by the Consortium Agreement and managed by the PM. The PMT will be supported by the Team of Work Package leaders (ToW). The ToW is a structural instrument used by the PMT for administration, coordination, monitoring and reporting. A technically oriented group, the Design and Development Group (DDG), is responsible for the coherent design of the hardware and software produced in DEEP-EST. In case of changes in design and development, or if hardware costs threaten to overrun budget limits, the PM together with the ToW will define new priorities and update the project plan. All management bodies will be established during the kick-off meeting.

Task 8.2: Communication (JUELICH; M1-M36)

The PMT is the interface between the consortium and the EC, and is also responsible for setting up the tools needed for project-internal communication. Annual periodic reports will be prepared and submitted to the Commission, including certificates of the financial statements when required. Midterm reports will be delivered after months 6, 18, and 30 of the project. The PM will be responsible for the submission of the formal deliverables as defined by the contract. The PM will prepare and organise the annual project reviews.

The PMT will set up and maintain a Collaborative Workspace system for the secure exchange of project information, project results, deliverables, meeting minutes, etc., to facilitate easy and timely intra-project communication, in addition to mailing lists. The underlying software will be BSCW, which is already used in DEEP/-ER. A video and teleconferencing infrastructure is in place for the project. The PM will organise project meetings twice a year, including the BoP, and support meetings of the WP teams as required. Face-to-face meetings will typically be held at a partner site. For the intra-project communication, JUELICH will be supported by its linked third party ParTec.

Task 8.3: Quality control (JUELICH; M1-M36)

Project results and formal deliverables will undergo a quality control process of project internal verification. Internal reviewers will be assigned to review documents and deliverables prior to approval by the PMT and submission to the EC (see Section 3.2.2). The process and the required review templates will be made available to the consortium via the BSCW.

JUELICH will be supported in this Task by its linked third party ParTec.

Task 8.4: Financial management (JUELICH; M1-M36)

The financial status, reflecting actual vs. planned effort and actual vs. planned expenditures, will be maintained within the project on a quarterly basis, and audited financial reports will be submitted to the Commission as required. A special focus is on managing the budget for the provision of the hardware development platforms and the DEEP-EST prototype, which is initially stocked at the Coordinator for later distribution among the responsible partners (see Section 3.4.2).

Deliverables:

D8.1 Quality control plan [JUELICH; Tk8.3]: definition of the quality control processes and templates for internal verification and document review for all project results and deliverables. (M4)
D8.2 Midterm management report at M6 [JUELICH; all Tasks]: report on the progress of the project without cost statements. (M7)
D8.3 Periodic progress report at M12 [JUELICH; all Tasks]: report on the progress of the project without cost statements. (M13)
D8.4 Midterm management report at M18 [JUELICH; all Tasks]: report on the progress of the project without cost statements. (M19)
D8.5 Periodic progress report at M24 [JUELICH; all Tasks]: report on the progress of the project without cost statements. (M25)
D8.6 Midterm management report at M30 [JUELICH; all Tasks]: report on the progress of the project without cost statements. (M31)
D8.7 Periodic progress report at M36 [JUELICH; all Tasks]: report on the progress of the project without cost statements. (M36)
D8.8 Final report [JUELICH; all Tasks]: description of the technical and scientific results of the project. (M36)

Relevant Milestones:

MS1 Management structure and bodies established (lead partner: JUELICH, M1)
MS2 Quality control plan, dissemination plan, website and social channels available (lead partner: BADW-LRZ, M4)

3.1.3.9 List of Work Packages

WP No. | Work Package Title | Lead Participant No. | Lead Participant Short Name | Person Months | Start Month | End Month
1 | Applications | 1 | JUELICH | 287 | 1 | 36
2 | Benchmarking and modelling | 4 | BSC | 108 | 1 | 36
3 | System architecture | 2 | Intel | 78 | 1 | 36
4 | Prototype development | 5 | ETH-Aurora | 315 | 6 | 36
5 | System software and management | 1 | JUELICH | 244 | 1 | 36
6 | Programming environment | 4 | BSC | 151 | 6 | 36
7 | Dissemination | 3 | BADW-LRZ | 36 | 1 | 36
8 | Coordination | 1 | JUELICH | 64 | 1 | 36
TOTAL | | | | 1283 | |

Table 4: List of Work Packages

3.1.3.10 List of Deliverables

Deliv. No. | Deliverable name | WP No. | Lead participant | Type | Dissemination level | Delivery date (month)
D1.1 | Application co-design input | 1 | JUELICH | R | PU | 4
D1.2 | Application use cases and traces | 1 | JUELICH | OTHER | PU | 9
D1.3 | Application distribution strategy | 1 | JUELICH | R | PU | 12
D1.4 | Initial application ports | 1 | JUELICH | R | CO | 24
D1.5 | Final report on applications experience | 1 | JUELICH | R | PU | 36
D2.1 | DEEP-EST benchmark suites | 2 | JUELICH | R | PU | 9
D2.2 | Initial application analysis and models | 2 | BSC | R | PU | 18
D2.3 | Benchmarking, evaluation and prediction report | 2 | BSC | R | PU | 36
D3.1 | System architecture | 3 | JUELICH | R | PU | 6
D3.2 | High level system design | 3 | JUELICH | R | PU | 12
D3.3 | Tests for prototype assessment | 3 | Intel | R | PU | 18
D3.4 | Prototype assessment | 3 | Intel | R | PU | 27
D3.5 | Modular Supercomputer architecture assessment and outlook | 3 | Intel | R | PU | 36
D4.1 | Cluster Module design | 4 | ETH-Aurora | R | CO | 12
D4.2 | Data Analytics Module design | 4 | Megware | R | CO | 12
D4.3 | Network Federation, fabri3, NAM and GCE designs | 4 | EXTOLL | R | CO | 12
D4.4 | Extreme Scale Booster design | 4 | ETH-Aurora | R | CO | 15
D4.5 | Network Federation, fabri3, NAM and GCE | 4 | EXTOLL | DEM | CO | 20
D4.6 | Cluster Module | 4 | ETH-Aurora | DEM | CO | 20
D4.7 | Extreme Scale Booster | 4 | ETH-Aurora | DEM | CO | 24
D4.8 | Data Analytics Module | 4 | Megware | DEM | CO | 24
D4.9 | Final DEEP-EST prototype report | 4 | ETH-Aurora | R | PU | 30
D5.1 | Collection of software requirements | 5 | JUELICH | R | PU | 6
D5.2 | Software specification | 5 | JUELICH | R | PU | 12
D5.3 | Prototype software implementation | 5 | JUELICH | OTHER | PU | 18
D5.4 | Complete system-SW implementation | 5 | JUELICH | OTHER | PU | 24
D5.5 | Software support report | 5 | JUELICH | R | PU | 36
D6.1 | Design and specification of programming environment | 6 | BSC | R | PU | 9
D6.2 | Prototype programming environment implementation | 6 | BSC | OTHER | PU | 18
D6.3 | Complete programming environment implementation | 6 | BSC | OTHER | PU | 24
D6.4 | Programming environment support report | 6 | BSC | R | PU | 36
D7.1 | Communication plan, toolkit and owned channels | 7 | BADW-LRZ | R | PU | 4
D7.2 | Repository for training material | 7 | JUELICH | OTHER | PU | 12
D7.3 | Early access programme for industrial and academic users | 7 | UEDIN | R | CO | 18
D7.4 | Policy briefing | 7 | BADW-LRZ | R | PU | 24
D7.5 | Case study booklet | 7 | UEDIN | R | PU | 34
D7.6 | Final outreach activity | 7 | BADW-LRZ | DEC | PU | 36
D8.1 | Quality control plan | 8 | JUELICH | R | PU | 4
D8.2 | Midterm management report at M6 | 8 | JUELICH | R | CO | 7
D8.3 | Periodic progress report at M12 | 8 | JUELICH | R | CO | 13
D8.4 | Midterm management report at M18 | 8 | JUELICH | R | CO | 19
D8.5 | Periodic progress report at M24 | 8 | JUELICH | R | CO | 25
D8.6 | Midterm management report at M30 | 8 | JUELICH | R | CO | 31
D8.7 | Periodic progress report at M36 | 8 | JUELICH | R | CO | 36
D8.8 | Final report | 8 | JUELICH | R | PU | 36

Type: R = Document, report; DEM = Demonstrator, pilot, prototype, plan designs; DEC = Websites, patent filings, press and media actions, videos, etc.; OTHER = Software, technical diagram, etc.
Dissemination level: PU = Public, fully open (e.g. web); CO = Confidential, restricted under conditions set out in the Model Grant Agreement; CI = Classified, information as referred to in Commission Decision 2001/844/EC.

Table 5: List of deliverables

3.1.4 Graphical representation of interdependencies
Figure 4 describes the main interdependencies between Work Packages:

Figure 4: Pert diagram of WP-interdependencies

Many tasks in the project are related to or depend on each other. The diagram above shows, at the WP level, the interdependencies that are most important for achieving the goals of the project. These interdependencies are represented by arrows and can stand for intensive collaboration from one WP to another, or for the flow of information between WPs.

To assure a coordinated design of all hardware and software layers, most of the interactions between WP1, WP2, WP3, WP4, WP5 and WP6 are done within the Design and Development Group (DDG), constituted by technical experts from all of those WPs.

Further interdependencies between the WPs:

- WP1 formulates the application requirements with respect to the hardware and software capabilities and provides them through the DDG to the rest of the technical WPs (WP3 to WP6). In the later stages of the project, it evaluates the usability and performance of the SW and HW developments and gives feedback on bugs and issues.

- WP2 evaluates the project developments with benchmarks, models the system and predicts its performance at large scale. Initial results of this evaluation are fed to WP3 as co-design feedback. A close relation is established with WP1, which provides WP2 with the application benchmarks and traces for modelling and predictions.

- WP3 collects requirements from WP1, WP2, WP5 and WP6 and combines them with the technical constraints seen by the WP4 partners. With all this information, WP3 designs the high-level architecture and the main characteristics of each module, passing them to WP4 for construction, and verifying later that the resulting prototype fulfils the previously defined requirements. In parallel, WP3 deploys early technology evaluators and Software Development Vehicles used by the rest of the WPs.

- WP4 is responsible for the construction, test, installation and maintenance of the DEEP-EST prototype. It takes the high-level specifications from WP3 and elaborates the precise hardware design, including board development, cooling, integration, power monitoring, etc. After installation, WP4 instructs the software Work Packages (WP5 and WP6) on how to operate the system and gathers their feedback on usability and behaviour.

- WP5 and WP6 work closely together developing a consistent software stack, with the high-level design settled in Tk5.1. WP6 implements the programming environment on top of the system and management software developed in WP5. Interfaces between both, and requirements from one to the other, will be clearly identified. Representatives of both WPs belong to the support team in WP1, where they collect the applications' requirements and their feedback on software usability.

- WP7 gathers information from WPs 1 to 6 on the status of the developments as well as on results and lessons learned, for communication and dissemination activities. In close cooperation with WP1, it also organises the internal training programme. Last but not least, WP7 supports the rest of the WPs on open access publication issues.

- WP8 is responsible for the management and coordination of the project and therefore influences all other WPs, defining the organisational structure, the work plan, and the quality criteria that all must follow when preparing deliverables. WP8 receives input from all WPs to report on the project status in the midterm and periodic reports.

3.2 Management structure, milestones and procedures
The DEEP-EST project will be managed by the Jülich Supercomputing Centre (JSC) of Forschungszentrum Jülich GmbH (JUELICH). JSC has extensive experience in managing and coordinating European projects, such as DEEP, DEEP-ER, and PRACE 1IP, 2IP and 3IP.

The management structure is designed to ensure that the project objectives are achieved according to the Description of Work (DoW) and, most importantly, that all results are geared towards enabling Exascale computing in the upcoming decade. In DEEP-EST, a management structure very similar to the one defined in the predecessor DEEP/-ER projects will be applied. Only one new body has been introduced: the Innovation Council (IC).

Figure 5 presents the overall management structure. The guiding principle is to maintain short communication lines between Work Packages (WPs), to enable a coherent design and development of the DEEP-EST concept, and to react flexibly to upcoming risks and changes.

3.2.1 Management bodies
The actors, their different roles and their responsibilities are:

Coordinator is the contractual partner of the EC for the project. The DEEP-EST Coordinator is JUELICH, which is represented by the Principal Investigator (PI). JUELICH provides an experienced Project Manager (PM) responsible for the day-to-day project management, and an administrative project office, including financial and legal services.

The Coordinator will:
- process beneficiaries' cost statements and distribute payments according to the Consortium Agreement (CA),
- monitor contracts and commitments to ensure that the consortium meets its obligations according to the Grant Agreement (GA) and the project's CA – based on the DESCA – and negotiate any changes that may be necessary.

Board of Partners (BoP) is the top-level authority of the project. It is an assembly of the representatives of the partners, who are authorised to make commitments on behalf of their institutions. The main roles of the Board are:

- taking decisions in case of changes in the consortium (e.g. a partner entering or leaving),
- deciding upon major changes in the work distribution or schedule and approving requests for DoW amendments, when needed,
- approving budget shifts and re-allocation and/or re-distribution of resources affecting the project at large,
- approving deliverables and periodic, midterm and final reports on request of the PM,
- approving actions to be taken when external opportunities for innovation (a new or improved product, service or process) arise, taking IP aspects into account.

If risks or deviations from the work plan arise, the BoP will be informed by the PM and asked to take the necessary decisions. The BoP will meet in a plenary meeting twice a year and will convene tele-/videoconferences when necessary or upon request of one of its members.

Figure 5: Management structure

Principal Investigator (PI) represents the project as a whole and is the chairperson of the BoP. The PI has the overall responsibility for the progress of the project.

Project Manager (PM) is the person responsible for the execution of the project plan. The PM heads the project office, located at the Coordinator and supported by the administrative and financial functions of the Coordinator. The PM of DEEP-EST is responsible for ensuring:

- the internal communication,
- the quality control of the reports and deliverables,
- the financial management,
- the communication with the EC,
- the organisation of BoP meetings and project review meetings as required by the CA,
- the management of Intellectual Property according to the CA.

The PM is also the leader of WP8. The PM will report to the BoP and implement the Board’s decisions and she/he will act as the primary contact person with the European Commission.


Project Management Team (PMT) is headed by the PM and handles the day-to-day work of the project. The PMT will ensure that the work assigned to the different WP teams progresses according to schedule and within the allocated budget. The progress is monitored and periodically reported to the BoP. Exceptional conditions or problems will be analysed and suggestions for mitigation will be handed over to the BoP.

Team of the Work Package Leaders (ToW) is an instrument used by the PMT to monitor the status of the project and to establish a short information and controlling channel to the project members. It is chaired by the PM. Using the ToW, the delegation of administrative work to the WP-leaders is done coherently. The ToW ensures that deadlines are met and that dependencies are respected by the collaborators. It has the responsibility to identify potential problems or necessary changes in the work plan and alert the PMT accordingly.

Each WP-leader has the following responsibilities inside her/his corresponding WP:

- ensure that the agreed work plan is carried out and that the dates of the deliverables and milestones are adhered to,
- coordinate the participants in the WP,
- provide quarterly reports and input to the half-yearly progress reports,
- report deviations from the work plan to the PMT,
- identify WP-level risks, track them and propose alternative actions if needed,
- make recommendations for changes to the schedule of deliverables or execution of the WP, through the Project Manager, to the BoP,
- communicate PMT/BoP decisions to the members of his/her WP.

The meetings of the ToW will take place monthly (face to face or by tele/videoconferences).

Design and Development Group (DDG) is the co-design heart of the project. In this forum, cross-WP technical discussions take place and design decisions are made. The DDG is constituted by technical experts from all WPs. This close interaction between hardware, software and application developers leads to an intensive co-design of the system: design changes can be discussed up front, and feedback from developers of the operating software or of the programming environment can be used to influence the design process. The meetings of the DDG take place biweekly, generally via tele/videoconferences.

The DDG is responsible for:

- assuring a coherent design of the MSA, taking into account the interplay and dependencies between the different hardware components of the system, as well as the requirements from the software, modelling and applications teams,
- enabling and fostering an iterative co-design process, issuing recommendations to the individual WPs responsible for their execution,
- informing the ToW about design changes with dependencies to the WPs,
- making technical recommendations to address deviations from the work plan,
- monitoring technical developments outside DEEP-EST and requirements within the HPC community, and considering them in the evolution of the overall design,
- based on their technical expertise, identifying external/internal opportunities for innovation and bringing them to discussion in the Innovation Council.

The DDG-leader is a full PMT-member. Technical conflicts between WPs or within the DDG will be handed first to the PMT, and ultimately to the BoP (see Section 3.2.7).

Innovation Council (IC) is a body that monitors, fosters, supports and publicises the innovations performed in DEEP-EST. The IC is led by the leader of Tk7.2 (“Innovation, Dissemination and Exploitation”). Further members of the IC are the DDG leader (who reports on technical innovations), the PM (who addresses IP questions following the CA, conflicts, and other management-related topics), and nominated representatives from industrial and academic partners who are likely to generate or bring innovations to the project. The IC will meet monthly via teleconference and in person twice a year.


The IC is responsible for:

- identifying opportunities for innovation arising within or around DEEP-EST,
- supporting the partners in publicising their innovations through the project communication channels and other dissemination and communication materials,
- addressing, within the spirit of the CA, IP issues that might complicate or hinder the exploitation of project innovations by the partners,
- reporting on innovations and their status in progress and midterm reports.

Further details on the project’s innovation management are given in Section 3.2.3.

3.2.2 Quality management
Procedures for quality management will be defined at the beginning of the project and documented in D8.1. This quality control plan includes formal structures and procedures for all kinds of internal and external documents. Scientific or technical documents must pass an internal review process before they may be submitted for any kind of external publication. This will guarantee the fulfilment of the requirements of the project and the EC in terms of branding, style of layout, mandatory information, acceptable use of English, regulation of Intellectual Property, etc. All deliverables follow an internal review process before submission to the EC. Each deliverable will first be reviewed by an internal reviewer (a participant of the DEEP-EST project who is not involved in the WP from which the document originates), and then by a member of the PMT. Comments are sent back to the author, and the final version of the document is sent to the project's EC Officer by the PM. The assignment of reviewers (both internal and PMT) will be done at least six months in advance. The list of all deliverables with their corresponding authors and reviewers will be available to all project members on the project-internal document repository (BSCW). A template for the preparation of the deliverables will be stored in the same location and is to be used by all authors, to guarantee a uniform format across all project deliverables.

3.2.3 Innovation management
In all architecture, hardware, software and application fields, the DEEP-EST project goes well beyond the state of the art. The ultimate goal of DEEP-EST is to demonstrate the advantages and potential of its innovations, and to maximise their impact for the benefit of the HPC community and society as a whole. All partners, and especially the industrial partners, strive for exploitation, either commercially or academically, of the innovations achieved individually or jointly within DEEP-EST (see Section 2.2.1.2). The IP regulation required for facilitating and fostering innovation and exploitation will be settled within the CA.

Innovation in DEEP-EST is an overall project effort in which all partners and bodies are involved. All the individual initiatives will be brought together and managed by the Innovation Council (IC), described in Section 3.2.1. This body will ensure that opportunities for innovation are not wasted due to lack of information, communication or support, and that the innovative nature of the project developments is well described in dissemination material.

Technical innovations will be identified within the DDG, which will also monitor developments outside of the project with potential impact (positive or negative) for DEEP-EST. The IC will then approach the involved partners and offer them support for increasing the visibility of their results and resolving any legal or formal issues that might hinder their exploitation. The IC will also support the dissemination and lobbying activities performed by the individual partners with the goal of bringing the project innovations to the market. Conflicts that might arise and hinder innovation will be handed over to the PMT (see Section 3.2.7).

3.2.4 Internal communication
A large part of the internal communication in the project will be done via e-mail. For this purpose, a set of e-mail lists for different groups of participants will be set up. An infrastructure for tele/videoconferencing is already in place, as well as a proven Collaborative Workspace System (BSCW), used to share reports, minutes of meetings and other documents. In addition to the remote communication techniques, face-to-face meetings will be organised regularly. Consortium meetings will take place at least twice per year, to strengthen the ties between the participants and discuss the status of all project activities. The BoP also meets face-to-face (F2F) at least twice per year, usually coupled with the consortium meetings. Written invitations to meetings are sent at least two weeks prior to the meeting, including the agenda and proposals for decisions. Minutes from BoP, ToW, DDG and consortium F2F meetings are accessible to all project members on the BSCW.

3.2.5 Milestones
The PMT will monitor the progress of activities leading to Milestones through the ToW meetings. Every six months, the midterm and progress reports will describe the status of the Milestones expected to be reached at the corresponding project time. If delays or deviations from the work plan occur, the underlying reasons will be properly explained, and the date when the missing Milestones are expected to be reached will be reported.

Milestone No. | Milestone name | Related WP(s) | Due date (month) | Means of verification
MS1 | Management structure and bodies established | 8 | 1 | Minutes of kick-off meeting
MS2 | Quality control plan, dissemination plan, website and social channels available | 7, 8 | 4 | D4.1, D4.2, D8.1
MS3 | Initial co-design input collected | 1, 2, 3, 5 | 4 | D1.1
MS4 | System HW and SW architecture | 3, 5 | 6 | D3.1, D5.1
MS5 | DEEP-EST benchmark suites defined | 1, 2 | 9 | D1.2, D2.1
MS6 | Hardware and software specifications | 3, 4, 5, 6 | 12 | D3.2, D4.1, D4.2, D4.3, D5.2, D6.1
MS7 | Initial application analysis models and first full implementation of system-SW and programming environment available | 2, 5, 6 | 18 | D2.2, D5.3, D6.2
MS8 | Midterm innovation review | 7 | 18 | Minutes of IC review meeting
MS9 | Prototype delivered to Jülich, SW and applications ready for deployment | 1, 4, 5, 6 | 24 | D1.4, D4.5, D4.6, D4.7, D5.4
MS10 | Prototype bring-up complete, SW and programming environment installed on the system | 4, 5, 6 | 25 | System available to application developers

Table 6: List of Milestones

3.2.6 Risk management
Risks foreseen at the outset of the project, especially those related to the development of the evaluation platforms and the DEEP-EST prototype, have been carefully taken into account in the proposed timeline and in the budget calculation.

The staged approach in the evolution path of DEEP-EST, utilising evaluation platforms and test vehicles, provides at each step verifiable results before proceeding with the next development. Additionally, the deployment of Software Development Vehicles (SDVs) reduces the dependency of the application and software parts of the project on the prototype development, minimising the impact on the overall project of possible delays in the availability of individual hardware modules of the DEEP-EST prototype.

Risks will be also addressed in the design phase of the overall software stack, by carefully defining the interfaces between the various components and their functionality. First implementations of the software components will be created as quasi-independent elements, bringing them together after their basic functionality has been individually verified.

The main risks identified at proposal stage are described in the Risk Management List (RML) (Table 7), which details for each item its level of likelihood, the WPs involved, and the proposed risk mitigation measures. The RML is also a tool to monitor the status of risks during the course of the project, and will be regularly updated in the progress reports. If a risk becomes manifest, an individual strategy will be developed by the DDG and implemented by the relevant WPs to circumvent or mitigate the problems. Changes in the work plan will be reported and explained in the periodic reports. If necessary, the Coordinator will propose an amendment of the DoW to the EC Officer.

# | Description of risk (likelihood level Low/Medium/High) | WPs involved | Proposed risk mitigation measures

1 | Node boards delayed due to delayed component availability (medium) | WP3, WP4 | Identify alternative technologies during the initial design phase of the project, and be ready to use them. Extend software evaluation time on other platforms and deploy SDVs with functionality as close as possible to the DEEP-EST prototype.
2 | Delays in board implementation due to unexpected thermal, mechanical and electrical challenges (medium) | WP4 | Integrate standard boards or keep as close as possible to the reference design. Several evaluation board stages are implemented and validated before the DEEP-EST prototype implementation.
3 | Performance of components or architecture does not meet the expected goals for latency, bandwidth and overall performance (low) | WP1 to WP6 | Intensive co-design phase in the first months to identify the required functionality. Staged prototype development allows fine-tuned balancing in the early stages of the project.
4 | Job scheduler and resource manager not able to efficiently support the modularity of the system (low) | WP5 | Prototypes and mockups allow for an early assessment of the provided performance. Re-iterate the design cycle to achieve the required efficiency.
5 | Stability or quality of system software not sufficient to start application evaluations in time (low) | WP5, WP6, WP1, WP2 | Intensive co-design phase in the first months to identify the functionality required. Benchmarks and tests with early versions of software components to promptly identify issues.
6 | Application not portable as a whole to the DEEP-EST prototype with reasonable effort (low) | WP1 | Select a suitable part of the application that mimics the important performance characteristics, or develop a mockup.
7 | Costs exceed the planned budget due to component cost uncertainty or unexpected design changes due to new technologies (low) | WP4, WP8 | All partners will investigate means to activate additional funding for compensation. If this is not successful, scale the size of the prototype to the available budget.
8 | Deadlines on the verge of being missed or original planning turns out to be overly optimistic (medium) | All | Regular tracking of technical progress through WPs, ToW and DDG will ensure that schedule problems are identified early. The DDG will discuss solutions or work-arounds for problems, if necessary asking the BoP for a decision.

Table 7: Risk Management List (RML), describing the critical risks for implementation (Table 3.2b)

As in any large project, some additional risks (such as changes of key persons at a partner, withdrawal of a partner from the project, or problems recruiting personnel) could affect the DEEP-EST project. The PMT will continuously watch for upcoming risk situations that may become manifest. Using all means of the Conflict Management described in Section 3.2.7, the Project Manager, in agreement with the BoP and the EC, will propose the reallocation of resources and/or the redistribution of work load among the remaining partners.

3.2.7 Conflict management
The primary mechanism for decision-making within the project will be consensus. Disagreements between members of a WP are resolved by the WP leader. If this fails, or the disagreement involves several WPs, the case will be escalated to the level of the DDG for technical matters, and to the ToW for organisational matters. The Project Manager will act as a mediator in these two bodies. The last level of escalation is reached when there is a need to involve the BoP. Conflicts between the contractual partners are referred directly to the BoP.

3.3 Consortium as a whole
The DEEP-EST consortium brings together commercial and academic institutions covering the full value chain (system developers, software developers, e-infrastructure providers, and end-users in a variety of HPC areas). Seven universities or research institutions, five e-infrastructure providers (including three PRACE Hosting Members), and five industrial companies (three of which are SMEs: Megware, EXTOLL and JUELICH's linked third party ParTec) constitute the DEEP-EST team. Together they cover a wide range of engineering and scientific disciplines.

3.3.1 Complementarity and completeness
DEEP-EST is a holistic co-design project and as such it addresses all three HPC pillars: hardware, system software, and applications. The DEEP-EST consortium reflects this fact, with each partner having a distinct area of responsibility within the project, and the partners' complementary roles matching overall. Table 8 displays the subjects addressed by DEEP-EST and names the partners tackling the specific tasks.

Area/domain | Partners of the DEEP-EST consortium

Architectural challenges
System architecture | JUELICH introduces the MSA concept, and leads and coordinates the overall co-design efforts in the project.
Processor architecture | Intel provides general-purpose Xeon, many-core Xeon Phi, and Xeon+FPGA processor architectures.
Memory hierarchy | UHEI and EXTOLL GmbH will develop the new memory concept of Network Attached Memory (NAM). Intel will provide information about and access to HBM and NVM technologies, as incorporated in their platforms.
Interconnect | EXTOLL GmbH provides the highest-performance European HPC network EXTOLL, improved in DEEP-EST with better management and resilient software support, and extends it by a Global Collective Engine (GCE). JUELICH's linked third party ParTec will develop software layers for efficiently bridging between different network technologies and enable the utilisation of the GCE in ParaStation MPI. Intel will provide information about and access to the Omni-Path interconnect.

System integration and installation
System design | ETH-Aurora will build two of the three DEEP-EST modules: the Cluster Module (CM) and the Extreme Scale Booster (ESB). The latter will profit from the Aurora expertise in deploying high-density, direct water cooled HPC systems. Megware will integrate the Data Analytics Module (DAM), building upon its expertise in delivering mid-scale compute systems to clients across Europe. EXTOLL GmbH will provide the fabri3 concept, which improves the integration of the EXTOLL interconnect into the system. Intel provides technical support on the integration of their processor, memory and interconnect technology. JUELICH brings its expertise in co-designing mid- and large-scale HPC systems (e.g. DEEP/-ER, JURECA, JUROPA, QPACE).
Cooling | BADW-LRZ, ETH-Aurora and Megware provide concepts for energy-aware systems and direct water cooling.
Power monitoring | BADW-LRZ and UEDIN supply advanced monitoring tools, while ETH-Aurora and Megware will integrate sensors into the system design.
Infrastructure | JUELICH provides the local infrastructure for the prototype installation and brings its expertise in the integration of Tier-0 systems and hardware prototypes.

Computing Centres
PRACE hosts and e-Infrastructure providers | BADW-LRZ and JUELICH as part of the Gauss Centre for Supercomputing (Germany), and BSC (Spain). UEDIN (UK) and NCSA (Bulgaria) operate HPC infrastructures in their countries.

Software
Cluster management | JUELICH, together with its linked third party ParTec, develops the cluster operation system ParaStation. Extensions will be developed to support the DAM. JUELICH's linked third party ParTec will also extend the resource management software psslurm to efficiently operate an MSA system.
Scheduler | BSC and JUELICH will jointly extend SLURM to deal with the complex scheduling of jobs with dynamic demands in an MSA system.
Programming models | Adaptation of ParaStation MPI by JUELICH's linked third party ParTec and improvements in OmpSs by BSC. Intel and UEDIN bring their expertise in HPDA programming models.
Resiliency and I/O software | BSC will further develop the resiliency mechanisms first introduced in DEEP-ER. FHG-ITWM will optimise the BeeGFS file system and JUELICH the I/O library SIONlib.
Performance models | BSC brings Extrae/Paraver as performance analysis tools, as well as Dimemas and their efficiency model to predict the performance of applications.
Energy modelling | BADW-LRZ and UEDIN bring expertise in analysing the power consumption of running applications and modelling the influence of e.g. processor frequency on energy efficiency.

Applications and co-design
Applications have been chosen to cover a wide range of different scientific domains. Special attention has been given to selecting codes with extreme-scale processing requirements.
Support team | JUELICH, BSC, Intel, and JUELICH's linked third party ParTec will support the application developers in analysing and adapting their codes to the DEEP-EST environment. The support team will trigger and foster co-design. JUELICH will also coordinate the applications work, ensuring coherent and timely progress of the work.
Key research fields | Neuroscience: NMBU; Molecular dynamics: NCSA; Space weather simulation: KULeuven; Radio astronomy: ASTRON; Earth sciences and data analytics: UoI; High energy physics and big data: CERN.

Table 8: Areas covered in DEEP-EST and distribution of partner roles.

The relatively high number of partners is required to cover all important areas a co-design project of this kind must address. The complementarities are reflected by the assignment of the partners to the WPs and Tasks, which matches their different knowledge and skills. The amount of personnel resources assigned to each of them (see Section 3.4) ensures the ability of the partners to fulfil their assigned tasks.

Project management becomes crucial in a consortium of this scale. JUELICH, which has largely demonstrated its capabilities in the management of other large research projects and collaborative initiatives, has been chosen as the Coordinator of the DEEP-EST project.

It shall be stressed that the DEEP-EST partners have been chosen not only for the scientific discipline that they represent or the technical background they bring to the project, but also for their ability to engage fully with the other members of the team. The partners have proven their professional skills in various collaborations in the past, showing their strong commitment to HPC at its cutting edge. Exemplary cases of such joint endeavours are the DEEP and DEEP-ER projects. Both consortia are led by JUELICH and comprise the partners Intel, BADW-LRZ, UHEI, ETH-Aurora's parent company Eurotech, BSC, FHG-ITWM, JUELICH's linked third party ParTec, KULeuven, and ASTRON. In DEEP/-ER, JUELICH, ETH-Aurora, Intel, UHEI and BADW-LRZ designed and built the hardware prototypes; ParTec, JUELICH, BSC and FHG-ITWM developed the software stack and programming environment; and KULeuven and ASTRON gained experience with the Cluster-Booster concept.

Further relevant collaborative projects in which the DEEP-EST partners have been involved are given in the partner descriptions of Section 4.

3.3.2 Industrial and commercial involvement
The DEEP-EST project counts amongst its members five industrial or commercial institutions: ETH-Aurora, Megware, EXTOLL, Intel, and JUELICH's linked third party ParTec. All of them will directly profit from the project's outcome, which is expected to bring them an advantage with respect to their competitors. In particular, ETH-Aurora and Megware will exploit the design of the modules for which they are responsible, including them in their product portfolios and bringing them to the market. They will also profit from a close collaboration with the HPC technology providers Intel and EXTOLL, which may constitute the foundation of future joint commercial relationships. The latter will benefit from being included in the spectrum of components that these integrators will offer to their clients. Full solutions may be provided, including the software and management components developed in the DEEP-EST collaboration. JUELICH's linked third party ParTec will be one of the main beneficiaries of such future system solutions, as it will provide commercial support for the cluster management and software part. The same applies to ThinkParQ, a spin-off company of FHG-ITWM, which will commercially support BeeGFS and its DEEP-EST extensions. Further areas in which IP will be generated in the project, and how industrial and other partners will exploit it, are summarised in Table 2 in Section 2.2.1.2.


3.4 Resources to be committed
DEEP-EST mobilises 1283 person months (about 35 full-time equivalent staff) for a project duration of 36 months and a total EC grant request of slightly below 15 M€.

3.4.1 Person months distribution
It is estimated that 111 of the 300 PMs from JUELICH will be contributed by its linked third party ParTec (see Section 4.2.1). In a similar manner, 8 PMs of the 75 PMs corresponding to partner Intel will be contributed by its linked third party Intel Iberia (Section 4.2.2).

Particip. No. | Participant Short Name | WP1 | WP2 | WP3 | WP4 | WP5 | WP6 | WP7 | WP8 | Total PMs per Participant
1  | JUELICH    | 45 | 18 | 23 | 27  | 78 | 42 | 3  | 64 | 300
2  | Intel      | 8  | 12 | 26 | 3   | 16 | 10 | 0  | 0  | 75
3  | BADW-LRZ   | 0  | 18 | 3  | 7   | 53 | 0  | 24 | 0  | 105
4  | BSC        | 18 | 48 | 0  | 0   | 42 | 60 | 0  | 0  | 168
5  | ETH-Aurora | 0  | 0  | 8  | 140 | 0  | 0  | 0  | 0  | 148
6  | Megware    | 0  | 0  | 6  | 30  | 0  | 0  | 0  | 0  | 36
7  | UHEI       | 0  | 0  | 6  | 31  | 2  | 0  | 0  | 0  | 39
8  | EXTOLL     | 0  | 0  | 6  | 77  | 41 | 0  | 0  | 0  | 124
9  | UEDIN      | 0  | 12 | 0  | 0   | 6  | 9  | 9  | 0  | 36
10 | FHG-ITWM   | 0  | 0  | 0  | 0   | 6  | 30 | 0  | 0  | 36
11 | KULeuven   | 36 | 0  | 0  | 0   | 0  | 0  | 0  | 0  | 36
12 | ASTRON     | 36 | 0  | 0  | 0   | 0  | 0  | 0  | 0  | 36
13 | NCSA       | 36 | 0  | 0  | 0   | 0  | 0  | 0  | 0  | 36
14 | NMBU       | 36 | 0  | 0  | 0   | 0  | 0  | 0  | 0  | 36
15 | UoI        | 36 | 0  | 0  | 0   | 0  | 0  | 0  | 0  | 36
16 | CERN       | 36 | 0  | 0  | 0   | 0  | 0  | 0  | 0  | 36
Total PMs     | | 287 | 108 | 78 | 315 | 244 | 151 | 36 | 64 | 1283

Table 9: Summary of staff effort

Figure 6 (left) shows graphically the distribution of resources over WPs. The colour coding corresponds to the different work-topics. A balanced distribution of resources exists between the R&D activities on hardware, software, and application development and modelling, each of which receives about 30% of the available personnel funding. Dissemination and management together take 8% of the total, sufficient to assure the visibility and success of the project without consuming the resources needed for R&D.
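For illustration, these shares can be cross-checked directly from the per-WP totals of Table 9. The following minimal Python sketch does this; the grouping of work packages into work-topics is an assumption made only for this illustration and is not prescribed by the table itself.

# Minimal sketch: cross-check the Table 9 totals and the approximate shares
# quoted above. The mapping of WPs to work-topics is assumed for illustration.
pm_per_wp = {"WP1": 287, "WP2": 108, "WP3": 78, "WP4": 315,
             "WP5": 244, "WP6": 151, "WP7": 36, "WP8": 64}

total_pm = sum(pm_per_wp.values())   # 1283 person months
fte = total_pm / 36                  # ~35.6 full-time equivalents over 36 months

topics = {                           # assumed grouping of WPs into work-topics
    "applications and modelling":   ["WP1", "WP2"],
    "hardware":                     ["WP3", "WP4"],
    "software":                     ["WP5", "WP6"],
    "dissemination and management": ["WP7", "WP8"],
}

for name, wps in topics.items():
    share = sum(pm_per_wp[wp] for wp in wps) / total_pm
    print(f"{name:30s} {share:5.1%}")   # ~30% for each R&D topic, ~8% for the rest
print(f"total: {total_pm} PM, about {fte:.0f} FTE")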

Figure 6: Left: PM distribution over Work Packages. Right: Grant-request for the different cost categories


3.4.2 Budget overview

Table 10 summarises the costs of the project for each of the partners and their corresponding funding request. JUELICH’s linked third party ParTec is detailed in a separate row just below the corresponding main partner. Other partners bringing linked third parties have chosen to display the total sum (partner plus linked third party).

No | Partner | Country | PM | Direct personnel costs (€) | Travel expenses (€) | Audit expenses (€) | Dissemination or hardware, ODC (€) | Dissemination or hardware, subcontracting (€) | Total direct costs (€) | Total indirect costs (€) | Requested EU contribution (€)
1  | JUELICH   | DE | 189 | 1,325,680 | 45,000 | 5,000 | 3,500,000 | 500,000 | 5,375,680 | 343,920 | 5,719,600
   | ParTec    | DE | 111 |   950,489 | 18,000 | 5,000 |         0 |       0 |   973,489 | 243,372 | 1,216,862
2  | Intel     | DE |  75 |   666,352 | 21,800 | 5,000 |         0 |       0 |   693,152 | 173,288 |   866,440
3  | BADW-LRZ  | DE | 105 |   708,723 | 30,000 | 2,000 |    58,500 |  33,500 |   832,723 | 199,806 | 1,032,529
4  | BSC       | ES | 168 |   840,000 | 18,000 | 3,000 |         0 |       0 |   861,000 | 215,250 | 1,076,250
5  | Aurora    | IT | 148 |   635,703 | 18,000 | 5,000 |         0 |       0 |   658,703 | 164,676 |   823,379
6  | Megware   | DE |  36 |   160,374 | 18,000 | 5,000 |         0 |       0 |   183,374 |  45,843 |   229,217
7  | UHEI      | DE |  39 |   221,361 | 18,000 |     0 |         0 |       0 |   239,361 |  59,840 |   299,201
8  | EXTOLL    | DE | 124 |   783,072 | 18,000 | 5,000 |         0 |       0 |   806,072 | 201,518 | 1,007,590
9  | UEDIN     | UK |  36 |   242,991 | 18,000 |     0 |         0 |       0 |   260,991 |  65,248 |   326,238
10 | FHG-ITWM  | DE |  36 |   279,807 | 18,000 |     0 |         0 |       0 |   297,807 |  74,452 |   372,259
11 | KULeuven  | BE |  36 |   240,966 | 18,000 |     0 |         0 |       0 |   258,966 |  64,741 |   323,707
12 | ASTRON    | NL |  36 |   305,598 | 18,000 | 5,000 |         0 |       0 |   328,598 |  82,149 |   410,747
13 | NCSA      | BG |  36 |   154,630 | 18,000 |     0 |         0 |       0 |   172,630 |  43,158 |   215,788
14 | NMBU      | NO |  36 |   230,473 | 18,000 |     0 |         0 |       0 |   248,473 |  62,118 |   310,591
15 | UoI       | IS |  36 |   292,288 | 18,000 |     0 |         0 |       0 |   310,288 |  77,572 |   387,860
16 | CERN      | CH |  36 |   286,066 | 18,000 |     0 |         0 |       0 |   304,066 |  76,017 |   380,083
Total |        |    | 1,283 | 8,324,573 | 348,800 | 40,000 | 3,558,500 | 533,500 | 12,805,373 | 2,192,968 | 14,998,342

All values are given over the project period of 3 years.

Table 10: Budget table

Budget is distributed among the different cost categories as displayed on the right side of Figure 6. About 70% of the total EC-funding is dedicated to personnel (direct costs), with the remaining 30% distributed among travel (‘Other Direct Costs’, ODC), audit (ODC), and dissemination material and hardware (both combining ODC and ‘subcontracting’) – see also Section 3.4.3.

Partners with hardware construction or procurement responsibilities – including SDVs, evaluators, and the DEEP-EST prototype itself – are ETH-Aurora, Megware, UHEI, EXTOLL, and JUELICH. The exact break-down of costs for evaluators and prototype components is not yet known, since it largely depends on the exact system configuration (e.g. relative size of the compute modules) and on the costs of the components that will constitute them. The system configuration will be decided in a stringent co-design approach during the first months of the project. Within the same period, the pricing of prototype components developed outside the project (e.g. third-generation Intel Xeon Phi) will be investigated. Based on these two factors, the consortium will present at the first project review (expected at M12 at the latest) a proposal for the prototype configuration and the corresponding budget distribution amongst the responsible partners. Until then, the budget reserved for hardware procurement is temporarily held by the Coordinator (JUELICH).

The DEEP-EST prototype will be installed at Jülich. Regarding installation costs, partner JUELICH requests EC-funding only for the material (e.g. cooling pipes, pumps) and external services (e.g. plumbers) needed to adapt the infrastructure of the computer room where the system will be located. JUELICH brings the operational costs – which for a system of this size are estimated at about 280,000 €/year for electricity alone – as an in-kind contribution.

Regarding dissemination, the budget assigned to WP7-leader BADW-LRZ is allocated to: (a) open access publications; (b) information material such as flyers, brochures and posters; (c) website programming; (d) final outreach activities; (e) potentially a video project; and finally the lion’s share (~55%), (f) presence at trade shows and conferences or self-organised events such as the awareness days. Roughly two thirds of the dissemination budget are other direct costs; about one third is reserved for subcontracting (e.g. video production).

3.4.3 Other Direct Costs and subcontracting

The only DEEP-EST partner requesting more than 15% of its budget as Other Direct Costs is the Coordinator JUELICH. Such costs are requested in two categories: ‘travel’ and ‘other goods and services’, with the largest amount in the latter category dedicated to hardware procurement. As mentioned in Section 3.4.2, the assignment of the hardware costs to JUELICH is only temporary; a re-distribution amongst the partners with hardware responsibilities will be decided at the first project review (expected at M12 at the latest) and formalised right afterwards through a DoW amendment.

1/JUELICH | Cost (€) | Justification
Travel | 45,000 | Coordinator responsibilities require more travelling than the average partner, to represent the project in conferences and EC-organised events.
Equipment | 0 |
Other goods and services | 3,505,000 | 3,500,000 € for hardware procurement, including evaluators, software development vehicles, and the final DEEP-EST prototype. This hardware budget is assigned to JUELICH only temporarily. Once the exact system configuration is decided via co-design and first estimations of the component costs are available, the hardware budget will be re-distributed between the responsible partners. The formal budget re-distribution will be presented to the reviewers and fixed in a DoW amendment. The remaining 5,000 € correspond to audit costs.
Total | 3,550,000 |

Table 11: ‘Other direct costs’ items

The total hardware budget is estimated at 4 M€, most of which (3.5 M€) will be accounted as ODC. It is expected that some of the responsible partners will make use of subcontracting (e.g. ETH-Aurora, see the third party table in Section 4.1.5) to execute minor construction activities (e.g. component mounting on a PCB, or mechanical assembly of some parts). Therefore, a part of the hardware budget (500 k€) is requested under the category ‘subcontracting’.

To the benefit of building the largest possible prototype, the consortium waives the 25% indirect costs associated with the hardware under the cost category ‘ODC’. This ensures that the total EC-funding reserved for hardware procurement (4 M€) will indeed be fully dedicated to the purchase of the system components. Therefore, in Table 10 the overhead for the ODC-hardware has been excluded from the calculation of the indirect costs and the total Grant Request. In Part A of the proposal, since the indirect costs cannot be adjusted there, only the Grant Request position has been modified to subtract the respective indirect costs.
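As an illustration of this rule, the minimal Python sketch below reproduces the JUELICH row of Table 10 under the assumptions stated above: the standard 25% flat rate is applied to direct costs excluding subcontracting, and the overhead on the hardware ODC is additionally waived.

# Minimal sketch of the budget rule described above, applied to the JUELICH row
# of Table 10. Assumptions: the H2020 25% flat rate applies to direct costs
# excluding subcontracting, and the overhead on the hardware ODC is waived.
personnel      = 1_325_680
travel         = 45_000
audit          = 5_000
hardware_odc   = 3_500_000   # ODC reserved for hardware procurement (no overhead claimed)
subcontracting = 500_000     # hardware-related subcontracting (never carries overhead)

total_direct  = personnel + travel + audit + hardware_odc + subcontracting  # 5,375,680
overhead_base = total_direct - subcontracting - hardware_odc                # 1,375,680
indirect      = 0.25 * overhead_base                                        # 343,920
grant_request = total_direct + indirect                                     # 5,719,600
print(total_direct, int(indirect), int(grant_request))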


4 Members of the consortium

4.1 Participants

4.1.1 Forschungszentrum Jülich GmbH (JUELICH)

The Forschungszentrum Jülich (JUELICH) – a member of the Helmholtz Association – is one of the largest research centres in Europe. It pursues cutting-edge interdisciplinary research addressing the challenges facing society in the fields of health, energy and the environment, and information technologies. Within the Forschungszentrum, the Jülich Supercomputing Centre (JSC) is one of the three national supercomputing centres in Germany as part of the Gauss Centre for Supercomputing (GCS). Presently, JSC operates some of the highest performance computing systems in Europe.

JSC has over 30 years of expertise in providing supercomputer services to national and international user communities. It undertakes research and development in HPC architectures, performance analysis, HPC software and tools, and grid computing. The success of the computers DEEP, QPACE and JUROPA demonstrates the competence of JSC in the field of computational sciences and system architecture technologies. JSC has established three jointly operated laboratories to develop and test new promising HPC architectures and prototypes to enter the Exascale era, as well as to foster co-design strategies: the Exascale Innovation Centre (a cooperation with IBM Böblingen), the ExaCluster Laboratory (a cooperation with Intel GmbH and ParTec GmbH), and the NVIDIA Application Lab (a cooperation with NVIDIA). JUELICH is also part of the ParaStation Consortium, responsible for the development of the ParaStation cluster management software.

JUELICH role in the DEEP-EST project

JUELICH is the Coordinator of the DEEP-EST project and responsible for the management activities in WP8. On the technical side JUELICH drives the overall concept and technical designs as leader of the Design and Developers Group (DDG), and takes a strong co-design approach reinforced through its leadership of WP1 and the application support team (Tk1.1). JUELICH will contribute to the development of scheduler and resource management in WP5, and to the adaptation of the SIONlib and SCR libraries in WP6. Finally, JUELICH will help in benchmarking activities with JUBE (Tk2.1) and co-organise training initiatives in Tk7.4. JUELICH leads WP1 and WP8, while WP5 is led by its linked third party ParTec.

JUELICH Key people:

Prof. Dr. Dr. Thomas Lippert (Male) is the Director of the Institute for Advanced Simulation (IAS) at Forschungszentrum Jülich, Germany. He is the head of the Jülich Supercomputing Centre (JSC), a division of the IAS, acts as managing director of the John von Neumann-Institut for Computing and as director of the Jülich-Aachen Research Alliance, section JARA-HPC. He holds the chair for Computational Theoretical Physics at the Bergische Universität Wuppertal, Germany. He is a member of the Gauss Centre for Supercomputing e.V. His research interests cover the field of computational particle physics, parallel and numerical algorithms and cluster computing. He has published more than 200 scientific articles.

Dr. Estela Suarez (Female) joined JSC in 2010. She studied Astrophysics at the University Complutense of Madrid, from which she obtained her Diploma in 2004. In 2010 she obtained her PhD in Physics from the University of Geneva. At JSC she belongs to the Technology Department and is a member of the ExaCluster Laboratory. She has been the Project Manager of the DEEP and DEEP-ER projects.


Prof. Dr. Norbert Eicker (Male) has been working at JSC since 2004. In 2001 he received his PhD in Theoretical Physics from the University of Wuppertal. In 1999 he had the technical lead for the 128-node ALiCE-Cluster in Wuppertal. From 2001 to 2004, he was with ParTec working on the Cluster Middleware ParaStation. He was lead architect of the 3000-node JUROPA/HPC-FF system that went into operation in 2009. Since 2013 he has been professor for Parallel Hardware and Software Systems at the Bergische Universität Wuppertal, Germany. At JSC, he is responsible for Cluster Computing and for exploring new technologies for the DEEP and DEEP-ER projects, where he leads the Design and Development Group (DDG).

Relevant publications:

Eicker, N.; Lippert, T.; Moschny, T.; Suarez, E: “The DEEP Project - An alternative approach to heterogeneous cluster-computing in the many-core era”. Concurrency and computation 28 (8) 2016, 2394-2411. DOI: 10.1002/cpe.3562

Eicker, N.; Lippert, T.; Moschny, T.; Suarez, E: “The DEEP project: Pursuing cluster-computing in the many-core era”. Proc. of the 42nd International Conference on Parallel Processing Workshops (ICPPW) 2013, Workshop on Heterogeneous and Unconventional Cluster Architectures and Applications (HUCAA), Lyon, France, October 2013, pp. 885-892. DOI: 10.1109/ICPP.2013.105

Alvarez Mallon, A.; Eicker, N.; Innocenti, M.E.; Lapenta, G.; Lippert, T.; Suarez, E.: “On the scalability of the clusters-booster concept: a critical assessment of the DEEP architecture” FutureHPC '12, Proceedings of the Future HPC Systems: the Challenges of Power-Constrained Performance, ACM New York, NY, USA, 2012, 978-1-4503-1453-4, Article No. 3. DOI: 10.1145/2322156.2322159

Relevant projects:

JSC has successfully managed and contributed to numerous national and European projects. Some examples:

DEEP (“Dynamical Exascale Entry Platform”, FP7-ICT-287530): JUELICH was the Coordinator and has driven the hardware and software designs in the project, as well as the application support.

DEEP-ER (“DEEP – Extended Reach”, FP7-ICT-610476): JUELICH is the Coordinator and drives the hardware and software designs in the project, as well as the application support and the I/O software development.

PRACE (“Partnership for Advanced Computing in Europe”): JUELICH has been responsible for the coordination of the PRACE project and its three implementation phases, contributing to the evolution of the Research Infrastructure, dissemination, technical operations, petascaling, and future technologies.

Significant infrastructure:

JUELICH hosts several HPC systems and prototypes which are available to the DEEP-EST members as software development vehicles. Most of the software can be developed on standard clusters. Intel Xeon Phi processors are also available and will be used to prepare applications to run on the Extreme Scale Booster. Some examples:

JURECA: production system with 1872 compute nodes, each with two Intel Xeon processors (Haswell generation), of which 75 are additionally populated with two NVIDIA K80 GPUs. The interconnect is InfiniBand EDR. This system can be used for large-scale tests with applications.

DEEP System: Prototype system built within the DEEP project. It is constituted by a Cluster and a Booster. The Cluster part contains 128 Xeon (Sandy Bridge) nodes in an InfiniBand QDR network. The Booster part contains 384 Intel Xeon Phi (KNC) co-processors in an EXTOLL network.

DEEP-ER Prototype: JUELICH hosts the prototype built within the DEEP-ER project, a Cluster-Booster system with Xeon and Xeon Phi (KNL) processors. The Cluster part is already installed and contains 16 Intel Xeon (Haswell) nodes and the installation of the Booster side is foreseen in Q4-2016. The chosen network technology is EXTOLL Tourmalet on both sides of the system. The DEEP-ER Prototype is a good platform for application preparation and software development.

Third Parties:

Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

Y

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

ParTec: see section 4.2.1

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions


4.1.2 Intel Deutschland GmbH (Intel)

Intel currently has approx. 100,000 employees in more than 60 countries and serves customers in more than 120 countries. The company designs and manufactures a variety of essential technologies, including microprocessors and chipsets and the additional hardware, software and related services that together serve as the foundation for many of the world’s computing devices.

Over the last decade, Intel has evolved from a company that largely serves the PC industry, to a company that increasingly provides the vital intelligence inside all things computing. In fact, one third of Intel’s revenue is associated with products beyond the PC. Hardware and software products by Intel, and subsidiaries such as McAfee, power the majority of the world’s data centres, connect hundreds of millions of cellular handsets and help secure and protect computers, mobile devices and corporate and government IT systems. Intel technologies are also inside intelligent systems, such as in automobiles, automated factories and medical devices.

Through these products and services, Intel aims to deliver engaging, consistent and secure experiences across all Internet-connected devices, whether in the home, car, office or pocket. To this end, the company applies its unique technology and manufacturing strengths, global reach and brand, along with its substantial on-going investments in R&D and its extensive worldwide business ecosystem.

Established in 1976, Intel Germany GmbH supports markets in Europe, Middle East and Africa (EMEA). In 2015, it was merged with the Intel Mobile Communications GmbH, creating Intel Deutschland GmbH. In addition to Intel’s core semiconductor business, Munich Intel employees work in research and development, engineering, and sales and marketing, and provide core organisation support.

The Intel Deutschland GmbH R&D organisations are part of Intel Labs Europe, a network of more than 20 R&D centres spanning the region as well as a variety of Intel business units. Intel GmbH manages the technical co-operation with HPC developers in Europe, and performs collaborative research towards Exascale at the ExaCluster lab in Jülich.

Intel role in the DEEP-EST project

In the DEEP-EST project Intel Deutschland GmbH will participate through resources of the Jülich ExaCluster Lab. It will lead the system architecture and design (WP3), and significantly contribute to the design and development of the prototype system, working closely with the Original Equipment Manufacturers (OEMs) in the project. In addition, Intel will contribute substantially to the benchmarking and modelling activities in WP2. Finally, Intel will provide advice and support for the application analysis, porting and optimisation activities on Intel platforms in WP1, and participate in a similar function in WP5.

Intel Key people:

Hans-Christian Hoppe (Male) is an Intel Principal Engineer and the director of the ExaCluster Lab at Research Center Jülich. He has a long track record in HPC R&D, with an emphasis on programming models and tools, application analysis and characterisation, and pathfinding for future HPC platforms. His achievements include significant impact on the MPI message-passing standard, the first seamless and high performance Grid infrastructure Unicore, pioneering use of virtualisation in Grid/Cloud systems, and the Intel Cluster Tools line of SW products. He has ample experience from about a dozen European Union funded R&D projects spanning Framework Programmes 3 through 7.

Dr. Marcelo Cintra (Male) joined Intel in 2011, after 11 years on the academic staff of the University of Edinburgh, where he still holds a position as Honorary Professor. He has a long track record of publications and funded grants in HPC. His recent R&D work focuses on non-volatile memories for HPC and he holds a US Patent on the topic. He has been recognised with five HiPEAC paper awards, one best paper nomination at PACT’11, and three Intel Division Recognition Awards. He has served on the Program Committees of key conferences in the field, such as SC, ICS, IPDPS and ISCA. He is a Senior Member of both the ACM and the IEEE. He is also a certified Project Management Professional of the PMI. Marcelo is contributing significantly to both the DEEP-ER and NEXTGenIO projects.

Suraj Prabhakaran (Male) is a Software Engineer who has been working at Intel since June 2016. Before joining Intel, he worked as a Research Scientist at TU Darmstadt and the German Research School for Simulation Sciences (GRS). He will receive his Doctoral degree from TU Darmstadt by the end of 2016. He holds a Master’s degree in Software Systems Engineering from RWTH Aachen University and a Bachelor’s degree in Computer Science from Anna University, India. Mr. Prabhakaran specialises in High Performance Computing and focused particularly on adaptive job scheduling and resource management during his doctoral studies. He is also experienced in the research and development of middleware such as adaptive programming systems and parallel and communication libraries. He contributed to the DEEP project as a researcher from GRS, which was a partner institute of the project. With Intel, he participates in the NEXTGenIO Horizon 2020 project.

Relevant publications:

Joshi, A.; Nagarajan, V.; Cintra, M: “Efficient Persist Barriers for Multicores”, Proceedings of the 48th International Symposium on Microarchitecture, MICRO 2015, 660-671. DOI: 10.1145/2830772.2830805.

McPherson, A. J.; Nagarajan, V.; Sarkar, S.; Cintra, M.: “Fence Placement for Legacy Data-Race-Free Programs via Synchronization Read Detection”, ACM Transactions on Architecture and Code Optimization (TACO), volume 12, issue 4, article 46, 2015. DOI: 10.1145/2835179.

McPherson, A. J.; Nagarajan, V.; Cintra, M.: “Static Approximation of MPI Communication Graphs for Optimized Process Placement”, Intl. Workshop on Languages and Compilers for Parallel Computing (LCPC), Lecture Notes in Computer Science, volume 8967, 2014, 268—283. DOI: 10.1007/978-3-319-17473-0_18.

Prabhakaran, S.; Neumann, M.; Rinke, S.; Wolf, F.; Gupta, A.; Kalé, L.: “A Batch System with Efficient Scheduling for Malleable and Evolving Applications”, Proceedings of the 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS), Hyderabad, India, pages 429-438, IEEE Computer Society, May 2015. DOI: 10.1109/IPDPS.2015.34.

Prabhakaran, S.; Iqbal, M.; Rinke, S.; Wolf, F.: “A Dynamic Resource Management System for Network-Attached Accelerator Clusters”, Proceedings of the 42nd International Conference on Parallel Processing (ICPP), Workshop on Scheduling and Resource Management for Parallel and Distributed Systems (SRMPDS), Lyon, France, pages 773-782, October 2013. DOI: 10.1109/ICPP.2013.91.

Relevant projects:

FP7 and Horizon 2020 projects in the HPC area in which Intel Deutschland GmbH (and its precursor legal organisations) and Hans-Christian Hoppe or Marcelo Cintra participated in a relevant technical role.

PEPPHER (“Performance Portability and Programmability for Heterogeneous Many-core Architectures”): EU FP7-ICT Grant Agreement no. 248481, 1st October 2010 – 30th September 2013. The PEPPHER project devises a unified framework for programming and optimising applications for architecturally diverse, heterogeneous many-core processors to ensure performance portability.


DEEP (“Dynamical Exascale Entry Platform” - FP7-ICT-287530, 1st December 2011 – 31st May 2015): DEEP is one of the EC’s Exascale projects and is developing a novel, Exascale-enabling supercomputer architecture based on the concept of compute acceleration in conjunction with a software stack focused on meeting Exascale requirements.

DEEP-ER (“Dynamical Exascale Entry Platform – Extended Research” – FP7-ICT-610476, 1st October 2013 – 30th September 2016): The goal of the DEEP-ER project is to update the Cluster-Booster architecture introduced by the DEEP project and extend it with additional parallel I/O and resiliency capabilities.

NEXTGenIO (“Next-Generation I/O for Exascale” - H2020-FETHPC-671591, 1st October 2015 – 30th September 2018): The goal of the NEXTGenIO project is to prototype a highly efficient and scalable storage architecture for HPC and high-performance data analytics based on advanced and potentially disruptive non-volatile memory technology by Intel.

Third Parties:

Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

Y

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Intel Iberia: see Section 4.2.2

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions


4.1.3 Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities (BADW-LRZ)

The Leibniz Supercomputing Centre (Leibniz-Rechenzentrum, LRZ) is part of the Bavarian Academy of Sciences and Humanities (Bayerische Akademie der Wissenschaften, BAdW) and functions as the IT service provider for Munich’s universities and a growing number of scientific institutions in the greater Munich area and in the state of Bavaria. The Centre is a front-runner in HPC, both nationally and internationally. It supports outstanding research and education in various scientific domains by providing stable, professional, reliable, secure, and energy efficient IT services based on the latest IT technologies.

As one of six hosting members of the European supercomputers in the PRACE consortium, BADW-LRZ is a major player on the European level. The centre operates SuperMUC, a top-level supercomputer with 250,000 x86 cores and a peak performance of over 6 PFlop/s, as well as a number of general purpose and specialised clusters, including an Intel Xeon Phi Cluster (SuperMIC). In addition, it is a member of the “Munich Data Science Centre”, providing the scientific community with large-scale data archiving resources and Big Data technologies. Furthermore, it operates a powerful communication infrastructure called Munich Scientific Network (MWN) and is a competence centre for high-speed data communication networks.

BADW-LRZ has a long-standing tradition in research on energy-efficient operation of data centres. This not only includes the development of energy efficient computer architectures, but also improvements to the building infrastructure or the efficient use of waste heat. It was one of the first HPC centres to deploy direct-liquid cooled compute resources and remains a strong advocate of warm-water cooling. An additional research focus is on scalable, high-frequency monitoring solutions that facilitate energy-efficient HPC operations. LRZ is an active member in the Energy Efficient HPC Working Group and leads the Energy Efficiency Working Group of the European Technology Platform for High Performance Computing (ETP4HPC).

BADW-LRZ role in the DEEP-EST project

BADW-LRZ will contribute to WP2 in Tk2.4 and further develop its energy modelling tools to accommodate the MSA concept. In WPs 3 and 4 the centre will leverage its profound experience in energy efficient HPC operations to contribute towards an energy efficient system design. It will also work in WP5 as leader of Tk5.6 (“system monitoring and RAS plane”) where it will extend the scalable monitoring framework, R&D activities that had been started in the DEEP project. Additionally, BADW-LRZ will lead WP7 on dissemination and training – an activity continued from both the DEEP and DEEP-ER projects.

BADW-LRZ Key people:

Prof. Dr. Dr. Arndt Bode (Male) obtained a PhD in informatics from the University of Karlsruhe in 1975. He joined the Friedrich-Alexander-University of Erlangen as a researcher in the field of parallel and distributed computer architectures in 1976 and obtained his habilitation (“Dr.-Ing. habil.”) in 1984. Since 1987 Prof. Bode has been a full Professor for Informatics at the Technical University of Munich (TUM). His main research areas are Computer Architecture, Distributed and Parallel Computing Systems, High Performance Computing, Parallel Tools and Environments for Parallel Systems, Digital University IT Infrastructure and eLearning. From 1999 till 2008 he was Vice President and CIO of the TUM. He joined BADW-LRZ as Chairman of the Board of Directors in 2008.

Dr. Herbert Huber (Male) obtained a PhD in physics from the Ludwig-Maximilians-University of Munich (LMU) in 1998. He joined BADW-LRZ in 1997 and has been head of the “High-Performance Systems” department at BADW-LRZ since 2012. The main focus of his work and his research interests are energy-efficient high-performance computer infrastructures (processor and system architecture, interconnection network, file systems) and supercomputing centres.

Dr. Michael Ott (Male) received his PhD in computer science from the Technical University of Munich (TUM) in 2010 for his work in high performance bioinformatics. Before he joined BADW-LRZ in 2012 he was a postdoc with the Commonwealth Scientific and Industrial Research Organisation (CSIRO) in Canberra, Australia. He is now a senior researcher in the “High-Performance Systems” division of BADW-LRZ and mainly involved in EU projects such as DEEP, Mont-Blanc, and PRACE and also leads the Energy Efficiency Working Group in the ETP4HPC. His research focuses on energy-efficiency and scalable monitoring, but he still keeps up an interest in bioinformatics, computer architecture, and parallel programming.

Ms. Sabrina Eisenreich (Female) is PR Manager for European Research Projects at BADW- LRZ. Currently she is the WP-leader for communication in the FP7-project DEEP-ER, a role she continued from the previous project DEEP. Additionally, she supports the dissemination in Mont-Blanc 2 and ComPat. Last but not least, she leads the cooperation between the European Exascale Projects, a loose cooperation of FP7 Exascale projects that collaborate for conferences, events, workshops, and the like. Sabrina Eisenreich holds a Master of Arts in communication science from University of Hamburg and University of Aarhus (double-degree) having specialised on PR by European Union institutions. Before joining BADW-LRZ in 2014, she worked for an international PR agency for several years where she consulted on, developed, and implemented PR campaigns for major clients in the ICT sector dealing with topics like cloud computing, big data, data analytics, or IT security.

Relevant publications:

Shoukourian, H.; Wilde, T.; Auweter, A.; Bode, A.; Piochacz, P.: “Towards a unified energy efficiency evaluation toolset: an approach and its implementation at Leibniz Supercomputing Centre (LRZ)”; Hilty, L. M.; Aebicher, B.; Anderson, G.; Lohmann, W. (eds); ICT4S 2013: Proceedings of the First International Conference on Information and Communication Technologies for Sustainability, pp. 276 – 281, Zürich, February 2013

Shoukourian, H.; Wilde, T.; Auweter, A.; Bode, A.; Tafani, D.: “Predicting Energy Consumption Relevant Indicators of Strong Scaling HPC Applications for Different Compute Resource Configurations”. Spring Simulation Multi-Conference 2015, At Alexandria, Virginia, USA, Simulation Series Vol 47, No. 4; ISBN: 9781510801011.

Wilde, T.; Auweter, A.; Shoukourian, H.; Bode, A.: “Taking advantage of node power variation in homogenous HPC systems to save energy”. In Proceedings of High Performance Computing 30th International Conference, Frankfurt, Germany, 2015, pp. 376-393

Shoukourian, H.; Wilde, T.; Auweter, A.; Bode, A.: “Monitoring Power Data: A first step towards a unified energy efficiency evaluation toolset for HPC data centers”. Journal of Environmental Modelling & Software (Elsevier), 56, 2013

Relevant projects:

BADW-LRZ has successfully contributed to numerous national and European projects. Some examples:

DEEP: BADW-LRZ was the lead of WP7 on energy efficiency.
Mont-Blanc: BADW-LRZ is contributing to the Tasks on system monitoring and energy-aware scheduling, and leading the efforts on energy accounting.
PRACE: BADW-LRZ is/has been leading the Tasks on HPC prototyping.
ETP4HPC: BADW-LRZ is the designated lead of the Energy Efficiency Working Group.


Significant infrastructure:

BADW-LRZ hosts several HPC systems and test systems which are available for BADW-LRZ developments and can be made available to other DEEP-EST members as software development vehicles. Examples include:

DEEP Energy Efficiency Evaluator: This is a smaller replica of the DEEP System based at Jülich Supercomputing Centre. It was used for the development of the DEEP energy monitoring system.

ARM test system: a two-node AppliedMicro X-Gene evaluation system.
SuperMIC: an Intel Xeon Phi (1st generation, KNC) cluster.
Intel Xeon Phi 2nd generation (KNL) cluster (to be procured by the end of 2016).

Third Parties:

Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions


4.1.4 Barcelona Supercomputing Center (BSC)

The Barcelona Supercomputing Center (BSC) was established in 2005 and serves as the Spanish national supercomputing facility. The centre hosts MareNostrum, one of the most powerful supercomputers in Europe, and its mission is to research, develop and manage information technologies in order to facilitate scientific progress. BSC is recognised as a first-class research centre in supercomputing and in scientific fields that demand it, such as Life and Earth Sciences and Engineering. BSC has over 350 staff from 41 countries engaged in multidisciplinary scientific collaboration and innovation.

BSC is a hosting member of the PRACE distributed supercomputing infrastructure and an active participant in HiPEAC, the ETP4HPC and other international forums such as BDEC. The centre develops technologies for Exascale within the BSC-led Mont-Blanc project, in the DEEP and DEEP-ER projects and the Human Brain Flagship project. BSC has also established joint research centres on Exascale with Intel and IBM. BSC also coordinated the RISC project to create a network of HPC research centres in Latin America and the EU. In 2011, BSC was one of only 8 Spanish research centres to be recognised by the national government as a “Severo Ochoa Centre of Excellence” for its track record and research roadmap on computing and applications. BSC has collaborated with industry since its creation, and has participated in projects with companies such as ARM, Bull and Airbus as well as numerous SMEs. BSC has also established joint research centres with Microsoft, NVIDIA and the Spanish oil company Repsol. The centre has participated in over eighty EC Framework Programme research projects.

The Computer Sciences Department of the BSC focuses on building upon currently available hardware and software technologies and adapting them to make efficient use of supercomputing infrastructures. The department proposes novel architectures for processors and memory hierarchy and develops programming models and innovative implementation approaches for these models as well as tools for performance analysis and prediction. In addition, the department is working on resource management at various component levels (processor, memory, storage) and for different execution environments, including Grid and e-Business platforms and application optimisation.

BSC role in the DEEP-EST project:

BSC will lead both the Benchmarking and Modelling (WP2) and Programming Environment (WP6) work packages. Additionally, BSC will significantly contribute to the Applications (WP1) and System Software (WP5) work packages. The main technical activities that will be carried out by BSC are the extension of the SLURM scheduler (Tk5.5), the performance prediction and extrapolation of applications (Tk2.3) and scheduling policies (Tk2.2), the enhancement of OmpSs to support the Highly Scalable Booster and Data Analytics modules (Tk6.2), and the development of a new resiliency interface based on pragmas (Tk6.5). Finally, BSC will be part of the support team providing advice and support for the application analysis, porting and optimisation activities (Tk1.1).

BSC Key people:

Prof. Jesús Labarta (Male) has been a full professor at the Computer Architecture department at UPC since 1990. Since 1981 he has been lecturing on computer architecture, operating systems, computer networks and performance evaluation. His research interests have been centred on parallel computing, covering areas from multiprocessor architecture, memory hierarchy, parallelising compilers and programming models, operating systems, parallelisation of numerical kernels, metacomputing tools and performance analysis and prediction tools. He has led the technical work of UPC and BSC in more than 20 R&D projects and the research collaboration with IBM in the framework of the MareIncognito project. Since 1995 he has been director of CEPBA, and since 2005 he has been Director of the Computer Sciences department at BSC.


Prof. Eduard Ayguadé (Male) is a full professor at the Technical University of Catalonia (UPC) and the associate director of the Computer Science Department within BSC. His research interests cover the areas of multicore architectures, and programming models and compilers for high-performance architectures. He has published around 200 papers on these topics and participated in several research projects with other universities and industry, within the framework of the European Union programmes or in direct collaboration with technology-leading companies. Currently, Prof. Ayguadé is Associate Director for Research of the Computer Science Department at BSC.

Dr. Vicenç Beltran (Male) is a senior researcher at BSC. He is currently working on distributed programming models for HPC. He received his engineering degree (2004) and Ph.D. (2009) in Computer Science from the Technical University of Catalonia (UPC). His research interests include programming models and domain specific languages for HPC, operating systems and performance analysis and tools. He has worked in several EU and industrial projects.

Judit Gimenez (Female), researcher at UPC/BSC, has been working in the area of parallel computing since 1989, when she obtained her computer science degree. Her first work was in the development and support of parallel systems based on transputers. She has been involved in technology transfer activities, participating in different initiatives to promote the usage of parallel computing by European SMEs. She has been responsible for the development and distribution of performance tools for the last 14 years, being the leader of the Performance Tools team in the Computer Sciences department at BSC since its creation.

Relevant publications:

Bellens, P.; Pérez, J.M.; Badia, R.M.; and Labarta, J.: “A Study of Speculative Distributed Scheduling on the Cell/B.E.”, Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium (2011), pp. 140-151.

Bellens, P.; Pérez, J.M.; Badia, R.M.; and Labarta, J.: “Making the Best of Temporal Locality: Just-in-Time Renaming and Lazy Write-Back on the Cell/B.E.”, International Journal of High Performance Computing Applications, Vol. 25, No. 2, pp. 137-1747, 2011.

Beltran, V. and Ayguade, E.: “Optimizing Resource Utilization with software-based Temporal Multi-Threading (sTMT)”, Proceedings of the 19th International Conference on High Performance Computing (HiPC 2012), pp. 1-10.

Planas, J.; Badia, R. M.; Ayguadé, E.; and Labarta, J.; “Hierarchical task based programming with StarSs”, International Journal of High Performance Computing Applications, Vol. 23, No. 3, pp. 284- 299, 2009.

Sainz, F.; Bellón, J.; Beltran, V. and Labarta, J.: “Collective Offload for Heterogeneous Clusters”, 2015 IEEE 22nd International Conference on High Performance Computing (HiPC), Bangalore, 2015, pp. 376-385. doi: 10.1109/HiPC.2015.20

Relevant projects:

BSC has taken a major role in the following relevant projects:

DEEP – Dynamical Exascale Entry Platform, December 2011 – September 2016.
DEEP-ER – DEEP Extended Reach, October 2013 – March 2017.
INTERTWinE – Programming Model INTERoperability ToWards Exascale, September 2015 – August 2018.
Mont-Blanc and Mont-Blanc 2 projects – European scalable and power-efficient HPC platform based on low-power embedded technology, October 2011 – September 2016.
IBM-BSC Technology Center for Supercomputing, IBM Research, January 2013 – December 2015.
BSC/Intel Exascale Laboratory, October 2011 – October 2017.


Significant infrastructure:

BSC hosts and will provide access for testing purposes to the following infrastructures:

The Tier-0 PRACE MareNostrum III supercomputer with a peak performance of 1.1 Petaflops, with 48,896 Intel Sandy Bridge cores in 3,056 nodes and 84 Xeon Phi 5110P cards in 42 nodes, more than 100.8 TB of main memory and 2 PB of GPFS disk storage.

MinoTauro – a cluster with 128 Bull B505 blades, with 1,536 processor cores, 256 NVIDIA M2090 GPU cards, and 3.07 TB of main memory.

Third Parties:

Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

Y

If yes, please describe the third party and their contributions: CSIC and UPC, see Section 4.2.4


4.1.5 Aurora S.r.l. (ETH-Aurora)

Aurora S.r.l. (ETH-Aurora) is the subsidiary of the Eurotech Group dedicated to the High Performance Computing (HPC) business. It inherits from its parent company the activities, resources, customers and know-how of the Eurotech HPC division, and boasts over 15 years of experience in designing, manufacturing, delivering and supporting supercomputers. ETH-Aurora has a relentless focus on energy efficient computing, and a commitment to continuous innovation through R&D, nurtured in house and in collaboration with the most important research centres in Europe.

The Eurotech group has been participating in and supporting many research initiatives that involve networks of scientists, universities and research centres, and has been strongly involved in the DEEP (Dynamical Exascale Entry Platform) and its successor DEEP-ER (DEEP Extended Reach) FP7 projects. Such activities are now under the responsibility of its ETH-Aurora subsidiary.

In the HPC segment ETH-Aurora designs and manufactures the Aurora family of supercomputers, able to scale to several Petaflops today, but conceived with Exascale scalability in mind. Energy efficiency and industry leading density are among the key features and development guidelines of Aurora supercomputers.

ETH-Aurora attaches great importance to the study and development of advanced leading edge technologies. This allows maintaining a competitive advantage over a long period and anticipating the evolution of future scenarios and reference markets. In this context, the DEEP-EST project represents a very important opportunity for ETH-Aurora to widen, enforce and deepen its know-how and expertise on the high performance computing platforms.

ETH-Aurora role in the DEEP-EST project

ETH-Aurora is the provider of two (out of three) hardware modules for the prototype of the DEEP-EST system. In the framework of this activity ETH-Aurora will contribute to the definition of the overall architecture of the DEEP-EST prototype and is committed to being a strong contributor to the architecture of the Cluster Module and the Extreme Scale Booster. This co-design activity will allow ETH-Aurora to design the hardware to be supplied to the DEEP-EST project to demonstrate the key architectural concepts introduced with DEEP, expanded in DEEP-ER, and underlying the DEEP-EST project as well. ETH-Aurora will also lead the installation of the systems and participate in their maintenance until the end of the project.

ETH-Aurora Key people:

Igor Zacharov (Male) is Senior HPC Solution Architect for ETH-Aurora and is the Aurora contact point for the DEEP-EST project. Within WP4 Igor will coordinate the design, production and installation of all the modules for the DEEP-EST prototype. Igor has served in similar engineering roles at SGI and T-Platforms, joining Eurotech in January 2014. Igor holds a PhD in Physics for scientific work done at NIKHEF (Amsterdam) and CERN (Switzerland).

Fabio Gallo (Male) has been Vice President and General Manager of the Eurotech HPC strategic business unit since October 2014 and is leading Eurotech’s ETH-Aurora subsidiary. Before joining the Eurotech group, Fabio served in several roles in top HPC companies, such as IBM, SGI, Linux Networx and Scali. More recently he was Vice President and Director of HPC Solutions at Bull in France, where he had a leading role in the international expansion of the business. Prior to joining Eurotech, Fabio was CEO of iOpener Media, a German company active in the games and motorsports industries. Fabio holds an M.Sc. in Electronic Engineering from the University of Rome (“La Sapienza”).


Relevant projects:

DEEP (“Dynamical Exascale Entry Platform”, FP7-ICT-287530):

In the framework of this project Eurotech supplied the Cluster and the Booster machines. The Booster machine was designed and manufactured completely within the scope of DEEP and presents state-of-the-art technology as a fully liquid-cooled machine for energy efficiency.

DEEP-ER (“DEEP – Extended Reach”, FP7-ICT-610476): In the framework of the DEEP-ER project Eurotech designed and manufactured the Booster prototype (installation Q4/2016). In this project Eurotech used its 2nd generation hot-water cooling technology to reduce the weight and increase the density of the prototype machine.

Third Parties:

Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

Y

If yes, please describe and justify the tasks to be subcontracted

ETH-Aurora plans to subcontract the manufacturing of the PCB boards that may be designed in the framework of DEEP-EST to a contract manufacturer specialised in this task. Also, certain steps in the PCB design that require the usage of specialised tools (e.g. component placement and routing on the PCB) may be subcontracted to a specialised party. The manufacturing of mechanical parts may be outsourced to a specialised third party, as well as the assembly tasks when these are scheduled for mass production.

Does the participant envisage that part of its work is performed by linked third parties

Y

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Eurotech S.p.A., ETH-Aurora’s parent company. Eurotech S.p.A., in addition to being ETH-Aurora’s parent company, can provide specialised personnel and skills complementing ETH-Aurora’s own, to carry out some of the activities assigned to ETH-Aurora within DEEP-EST.

Eurotech S.p.A., see Section 4.2.3

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions


4.1.6 Megware Computer Vertrieb und Service GmbH (Megware)

Megware is a Chemnitz-based provider for HPC solutions. Founded in 1990, it established itself first in the growing market for personal computers. For more than 26 years, inventive genius, customer reach and flexibility have been Megware’s guiding principles.

In 2000, Megware received their first order in the field of High Performance Computing. The Megware “CLiC” system for the Chemnitz University of Technology was acknowledged as having the best price performance ratio worldwide and rose to position 126 of the renowned TOP500 list.

As of today, two systems in the TOP500 are Megware clusters rendering Megware the number 1 German supercomputing manufacturer. Megware TOP500 clusters are located at the universities in Tuebingen and Heidelberg/Mannheim.

Beyond the Top500, Megware’s HPC systems are used for enterprise computing purposes in a wide range of industries as well as in universities and research institutions within Germany and Europe.

As a full-service provider in the field of supercomputing, Megware provides assembly and testing of complete systems up to the handing-over on a turn-key basis including all required services and system support. To stay at the forefront of technology, Megware’s R&D team develops innovative in-house HPC solutions including new compute server designs, direct liquid cooling, power monitoring and management as well as Megware’s own ClustWare system management software.

Megware role in the DEEP-EST project

Within the DEEP-EST project, Megware will assume development leadership of the Data Analytics Module (DAM).

Megware Key people:

Jörg Heydemüller (Male) is HPC representative at Megware. He has been with Megware since June 2008. In his role, Jörg supports researchers, scientists and engineers from the public sector and industry all over Europe in the implementation of High Performance Cluster solutions. He is responsible for establishing relationships with key leaders in HPC and is in charge of worldwide relations, including building strategic relations with international partners. Prior to Megware, Jörg worked for various IT companies as sales leader and purchaser for 10 years.

Axel Auweter (Male) joined Megware as HPC development manager in June 2016. Prior to joining Megware, he was a research associate at Leibniz Supercomputing Centre (LRZ) where he was responsible for LRZ's research activities on energy-efficient HPC system design and operation. In this role, he also acted as the WP-leader on energy efficiency in the DEEP project. His academic background is in computer architecture, system-level programming and operating systems.

Thomas Blum (Male) joined Megware in 2004 as a student to write his thesis on “parallel file systems in production HPC Linux clusters”. He became a full-time employee as an HPC engineer in 2005. Since entering the company, he has been involved in several system installations for a number of TOP500 projects. His responsibilities include the upfront system design and running benchmarks. He works tightly integrated with technical sales and the project team, which allows him to put his experience from various projects and versatile technologies into new HPC systems. Through Megware’s close partnering with all leading suppliers within the HPC market, he is also connected to the technical teams of the major system component vendors and their R&D to optimise Megware cluster systems for production-grade reliability and performance.


André Singer (Male) has been employed at Megware since 1993. He has 20 years of experience as a project coordinator and was responsible for Megware’s most valuable cluster projects in recent years. In this role, André coordinates production, shipment, deployment and testing of (sub-)systems to be delivered. He will be the main point of contact at Megware for system bring-up and will devote people and resources to solve any potential problems during this phase.

Relevant projects: Megware maintains successful relations with German and European universities and has participated in many joint research projects. Examples:

FAST (“Find a Suitable Topology for Exascale Applications”): Megware provides tools for fine-grained power measurements and tools for seamless integration of batch scheduling systems into the system management software.

Energie-POKER: This project developed a framework for automated benchmarking with respect to performance and power consumption. To achieve this, the framework interacts with the SLURM resource management software, energy control frameworks like Intel Node Manager and a highly scalable NoSQL backend for storing the relevant benchmark metrics.
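For illustration, the sketch below shows how per-job energy metrics could be read from SLURM's accounting database and collected as documents, in the spirit of the Energie-POKER framework. The `sacct` fields are taken from the public SLURM documentation; the job ID and the in-memory store are hypothetical placeholders, and the actual Energie-POKER interfaces are not reproduced here.

```python
# Illustrative only: query SLURM's accounting database for per-job energy data
# via the standard `sacct` tool and collect the metrics as JSON documents,
# standing in for a NoSQL backend. This is not the Energie-POKER framework.
import json
import subprocess

def fetch_job_energy(job_id: str) -> dict:
    """Return elapsed time and consumed energy (Joules) for a finished SLURM job."""
    out = subprocess.run(
        ["sacct", "-j", job_id, "--noheader", "--parsable2",
         "--format=JobID,Elapsed,ConsumedEnergyRaw"],
        capture_output=True, text=True, check=True,
    ).stdout
    records = []
    for line in out.strip().splitlines():
        jobid, elapsed, energy = line.split("|")
        records.append({"jobid": jobid, "elapsed": elapsed,
                        "energy_joules": int(energy or 0)})
    return {"job": job_id, "steps": records}

if __name__ == "__main__":
    # Store the benchmark metric as a JSON document, as a NoSQL backend would.
    metrics = fetch_job_energy("123456")   # hypothetical job ID
    print(json.dumps(metrics, indent=2))
```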

Third Parties:Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions

4.1.7 Ruprecht-Karls Universitaet Heidelberg (UHEI)

The Ruprecht Karl University of Heidelberg is one of the leading German universities. As a comprehensive university, it grants Bachelor, Master and Ph.D. degrees in almost all fields of the sciences and humanities. The University of Heidelberg is widely recognised for its high-quality research in all these areas.

The Institute for Computer Engineering (ZITI) dedicates its research and teaching activities to the understanding and implementation of complex systems in information technology. One research aspect is to analyse how new results in mathematics and fundamental physics may lead to innovative and intelligent computer systems. Another research focus is the application of new technologies and methods in computer engineering to sensing and instrumentation in physics, astronomy, biology, medicine and other natural and life sciences.

The Computer Architecture Group at the University of Heidelberg, as system architects, holds profound expert knowledge in the areas of design-space analysis, hardware design of processors and devices, interconnection networks, and software driver development, especially for the construction of large computing clusters. All levels of system design are covered, starting at the application programming interface (e.g. MPI), through the efficient design of device drivers, down to custom-built hardware devices based on standard cells (ASICs) and FPGAs. The group mainly focuses on the design of parallel architectures that achieve high performance by improving communication between computational devices/units. Scaling such systems is a great challenge for the architecture of the interconnection network (IN) and the network interface controller (NIC). Special attention is paid to the interface between software and hardware used to set up communication operations. With EXTOLL, a new interconnection network aimed specifically at the needs of HPC has been designed and implemented in 65 nm ASIC technology. In the EU projects DEEP and DEEP-ER, the Computer Architecture Group has participated in research, design and development of the interconnection network hardware, the attachment of accelerators and memory to the network, and the development of supporting software.

UHEI role in the DEEP-EST project: UHEI is conducting research in the field of network-centric functions, where special functions attached to the network can be accessed from anywhere in the network and thus be accelerated by dedicated hardware. Such functions are evaluated and tested in FPGAs and programmable processors. A first result is the NAM; next steps will lead to further functions of this kind, such as support for global collective operations.
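As a minimal illustration of the semantics that such a network-centric function would accelerate, the sketch below expresses a global sum-allreduce with mpi4py. It only shows the collective operation itself and makes no assumptions about the EXTOLL or NAM programming interfaces.

```python
# Illustrative only: a global collective operation (sum-allreduce) expressed
# with mpi4py. In DEEP-EST the goal is to support such operations in the
# network itself; this sketch shows the operation the hardware would offload,
# not the EXTOLL/NAM API.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

local_value = rank + 1                                 # each process contributes one value
global_sum = comm.allreduce(local_value, op=MPI.SUM)   # every rank receives the sum

print(f"rank {rank}: global sum = {global_sum}")
```

Run, for example, with `mpirun -n 4 python allreduce_sketch.py` (script name hypothetical).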

UHEI key people: Prof. Dr. Ulrich Brüning (Male) is a professor in the Department of Mathematics and Computer Science at the University of Heidelberg and at the Institute for Computer Engineering (ZITI). He leads the Computer Architecture Group described above. Prof. Brüning's research interests include hardware design, parallel computing and interconnection networks.

Juri Schmidt (Male) is currently working as a researcher with the Computer Architecture Group of the University of Heidelberg. His research interests include parallel architectures, hardware and firmware architectures for HPC network devices. Juri Schmidt is the architect of the Network Attached Memory (NAM) extension for EXTOLL.

Relevant publications: Schmidt, J.; Fröning, H.; Brüning, U.: “Exploring Time and Energy for Complex Accesses to a Hybrid Memory Cube”, The International Symposium on Memory Systems (MEMSYS 2016), Oct. 3-6, 2016, Washington, D.C., USA. [Accepted]

Neuwirth, S.; Frey, D.; Bruening, U.: “Communication Models for Distributed Intel Xeon Phi Coprocessors”, 21st IEEE International Conference on Parallel and Distributed Systems (ICPADS 2015), Dec. 14-17, 2015, Melbourne, Australia.

Schmidt, J.; Brüning, U.: “openHMC - A Configurable Open-Source Hybrid Memory Cube Controller”, 10th IEEE International Conference on ReConFigurable Computing and FPGAs, Dec. 7-9, 2015, Mayan Riviera, Mexico.

Neuwirth, S.; Frey, D.; Nüssle, M.; Brüning, U.: “Scalable Communication Architecture for Network-Attached Accelerators”, 21st IEEE International Symposium on High Performance Computer Architecture (HPCA 2015), Feb. 7-11, 2015, Bay Area, California, USA.

Relevant projects:UHEI has contributed to the following European projects:

DEEP (“Dynamical Exascale Entry Platform”, FP7-ICT-287530): UHEI has provided the interconnection technology for the booster part of the system.

DEEP-ER (“DEEP – Extended Reach”, FP7-ICT-610476): UHEI has provided the interconnect with the NAM and accelerator access, and worked on the bring-up and debugging of the booster.

Research infrastructure: UHEI has established a modern design environment for hardware and firmware, which uses the most advanced EDA tools from Cadence, Keysight, Xilinx and Altera. Small test systems for hardware verification are available.

Third Parties:Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions

4.1.8 EXTOLL GmbH (EXTOLL)

EXTOLL GmbH was established in 2011 as a spin-off of the University of Heidelberg. The EXTOLL research project had already been pursued for more than 5 years by the Computer Architecture Group of the University of Heidelberg prior to the founding of EXTOLL GmbH, which fully licensed all research results and continues to develop the technology in cooperation with the university, notably the groups of Prof. Brüning and Prof. Fröning. The company develops interconnect technology (IP, hardware, firmware, management software and middleware) for high-performance computing applications. Thanks to close relationships with Intel and other companies, a highly elaborate and efficient technology could be developed that links well to current and upcoming computer architectures. EXTOLL GmbH is venture-capital backed and has so far raised funding in two investment rounds, solely from investors from Germany. In 2013 EXTOLL GmbH started production of its Tourmalet ASIC, a truly high-performance HPC interconnection network (see also below). The research and development team at EXTOLL has vast experience and knowledge in developing the complete stack for communication in HPC, from low-level ASIC and hardware design, PCB design and manufacturing to system architecture, verification and the necessary software development.

The EXTOLL interconnection technology has already been used in its FPGA-based implementations in different projects, including the FP7 EU project DEEP. With the Tourmalet ASIC, a very high-performance implementation is available as a product; one of its first use cases was in the FP7 project DEEP-ER. EXTOLL Tourmalet is also a unique interconnection solution for HPC because it was completely developed in Europe by a European SME.

EXTOLL role in the DEEP-EST project: EXTOLL will be the provider of the fast, low-latency interconnect ASIC Tourmalet for the Booster and some other parts of the heterogeneous system. Furthermore, EXTOLL will specify and develop the fabri3 interconnect and the firmware/software required to operate the network.

EXTOLL key people: Dr. Mondrian Nüssle (Male) serves as one of the two Managing Directors of EXTOLL GmbH and as CTO. He holds a doctorate from the University of Mannheim, with numerous related publications as well as one patent. He brings deep knowledge from more than 10 years of experience in computer architecture, software for HPC systems, low-level software components, RTL hardware design and system-level design. Through his previous projects at the Computer Architecture Group of the University of Heidelberg, he substantially contributed to the EXTOLL concept. He has participated in several industrial R&D projects in various roles, starting with ATOLL networks and including co-operations with AMD and SUN Microsystems. As technical managing director of EXTOLL, he coordinates operative and strategic technical issues: architecture, RTL design, development of the FPGA-based EXTOLL technology, and software development.

Dr. Sven Kapferer (Male) is a senior engineer with EXTOLL GmbH. He holds a doctorate degree from University of Heidelberg and a diploma in Computer Engineering from University of Mannheim. He has vast experience in architecting, designing and implementing complex VLSI ASIC systems. He is also an expert in PCB and RTL design. He previously worked for the Computer Architecture Group of the University of Heidelberg and was involved in numerous industrial R&D projects. Dr. Kapferer will be contributing to the hardware aspects within the DEEP-EST project.

Tobias Groschup (Male) is a software engineer with EXTOLL GmbH. He holds a master's degree in computer science and has worked on the EXTOLL software stack for several years. His experience is mainly in the areas of software engineering, software architecture and development. He is the architect and primary developer of the EXTOLL Management Software and is also responsible for several other packages. As a software developer, he will be working on the software aspects related to the EXTOLL interconnection network within DEEP-EST.

Relevant Publications: Nüssle, M.; Fröning, H.; Kapferer, S.; Brüning, U.: “Accelerate Communication, not Computation!”, High Performance Computing Using FPGAs, p. 507-542, Vanderbauwhede, Wim; Benkrid, Khaled (Eds.), Springer, 2013.

Fröning, H.; Nüssle, M.; Litz, H.; Leber, C.; and Brüning, U.: “On Achieving High Message Rates”, 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), May 13-16, 2013, Delft, The Netherlands.

Neuwirth, S.; Frey, D.; Nüssle, M.; Brüning, U.: “Scalable Communication Architecture for Network-Attached Accelerators”; 21st IEEE International Symposium on High Performance Computer Architecture (HPCA 2015), Feb. 7-11, 2015, Bay Area, California, USA.

Relevant Products: EXTOLL Tourmalet: Tourmalet implements an HPC network on a single chip. It is the first ASIC implementation of the EXTOLL technology and is specifically designed for the High Performance Computing (HPC) market. As a single-chip solution, no external switches are required: EXTOLL's six links (each providing a bandwidth of 100 Gb/s) and the internal programmable crossbar allow clusters with arbitrary topologies. Its ultra-low latency of less than 600 ns and its extremely high sustained message rate of more than 100 million MPI messages per second, combined with a direct topology, are unique features in the HPC market. A PCI Express x16 Gen3 host interface provides maximum bandwidth to the host processor. The PCIe core not only runs as an endpoint but also offers root-port functionality, which allows the direct attachment of accelerators via PCI Express without the need for a CPU. EXTOLL GmbH, a European SME, completely designed the 65 nm ASIC implementing Tourmalet, including the complete back-end design. The die of the ASIC is about 11.4 x 14 mm in size and features approximately 270 million transistors, making it one of the largest and most complex ASICs developed completely by a European SME.

Relevant Projects: DEEP (“Dynamical Exascale Entry Platform”, FP7-ICT-287530) is one of the Exascale projects funded by the EU 7th Framework Programme. A new cluster architecture was defined for the DEEP project: it combines a standard cluster connected through InfiniBand with an innovative, highly scalable Booster built from Intel Xeon Phi co-processors (Booster Nodes). EXTOLL has been selected as the interconnect for the Booster by the DEEP project.

DEEP-ER (“DEEP – Extended Reach”, FP7-ICT-610476) is the successor project of DEEP. EXTOLL was selected as the interconnect for the DEEP-ER system, including the Xeon Phi Knights Landing based Booster part and the Xeon-based cluster. A hybrid topology consisting of different k-ary n-cubes (a hypercube for the cluster, a torus for the booster, a ring for the storage servers) interconnected with each other was realised, spearheading the hierarchical topology envisioned for the MSA in DEEP-EST.

Infrastructure: EXTOLL GmbH has access to the complete infrastructure needed for deep-sub-micron ASIC design, PCB design and the organisation of manufacturing. A lab environment for PCB/HW bring-up and test is available and will be used in the course of the project. In addition, a small-scale test installation of an EXTOLL fabric is available for software development purposes.

Third Parties:Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions

4.1.9 The University of Edinburgh (UEDIN)

The University of Edinburgh (UEDIN) is one of Europe’s leading research universities and is represented in this project by its supercomputing centre, EPCC (http://www.epcc.ed.ac.uk). Established in 1990 with the aim of accelerating the effective exploitation of high-performance computing systems throughout academia, industry and commerce, the organisation is at the forefront of HPC service provision and research in Europe. EPCC has a full-time staff of 90 and hosts a large array of HPC systems, including the 118,080-core, 2.5 Petaflops Cray XC30-based UK National HPC Service, ARCHER. EPCC works with a wide variety of scientific and industrial partners.

The centre has a long history of working with HPC vendors and hardware manufacturers to design leading-edge, novel HPC systems, including the QCDOC system, the Maxwell FPGA system and, most recently, the IBM BlueGene/Q. EPCC works closely with world-leading computational scientists to tackle problems that can be solved only by using capability computing. UEDIN has an international reputation in the application support of computational science and works closely with researchers in the areas of Chemistry, Biological Sciences, Physics, Mathematics and Engineering. In addition to its role as an HPC service provider to academia, EPCC provides a wide variety of services to industry, including: HPC application design, development and re-engineering; HPC application performance optimisation; distributed computing consultancy and solutions; HPC facilities access; project management for software development; and data integration and data analysis consultancy.

UEDIN role in the DEEP-EST project: UEDIN will participate in Benchmarking and Modelling (WP2), providing expertise in both HPC system and application benchmarking, and in power and energy simulation and monitoring. In WP5 (System software and management), UEDIN will contribute energy and power monitoring expertise to Tk5.6. Additionally, UEDIN will collaborate with Intel on the data analytics programming environment (Tk6.3) in WP6, bringing its experience of working with a range of different HPDA communities (bio-informatics, banks, etc.) and a range of different HPDA platforms to ensure a successful integration of data analytics into the MSA platform that DEEP-EST is developing. Finally, UEDIN will be heavily involved in the dissemination and outreach activities of the project, building on its experience with the many European, international and national projects and events UEDIN has run. In particular, UEDIN is heavily involved in diversity and dissemination activities, such as the Women in HPC and Diversity in HPC networks, and in outreach events such as the Big Bang Fair in the UK.

UEDIN key people: Prof. Mark Parsons (Male) holds a personal chair in High Performance Computing at the University and is the Director of EPCC. He is also the Associate Dean for e-Research at Edinburgh and an acknowledged international expert in high-performance and data-intensive computing. Mark is the Project Coordinator for the Fortissimo and Fortissimo 2 projects, and for NEXTGenIO.

Adrian Jackson (Male) is a Research Architect at UEDIN, where he has worked for the past 15 years. He works with a wide range of academic and industry partners to provide HPC expertise and effort, and currently leads UEDIN’s Intel Parallel Computing Centre (IPCC), in which UEDIN works on porting and optimising large academic software packages for Intel Xeon and Xeon Phi processors. Previously he led UEDIN’s involvement in the NAIS (Numerical Algorithms and Intelligent Software) and Nu-FuSE (the G8-funded Nuclear Fusion Simulation at Exascale) projects. His research interests are in the field of parallel programming languages and application optimisation, where he works with a wide range of application scientists, including in fields such as Computational Fluid Dynamics and Nuclear Fusion simulation, to prepare their codes for Peta- and Exascale systems. He is also involved with novel parallel programming languages/techniques, such as MDMP (Managed Data Message Passing), an approach he is developing to investigate Exascale communication issues, and OpenACC (where he represents UEDIN). He has also developed and teaches a number of courses as part of UEDIN's Masters Programme and for external organisations and academic parties.

Relevant publications:Sawyer, M. and Parsons, M.: “Challenges facing HPC and the associated R&D priorities: a roadmap for HPC research in Europe”. PlanetHPC report. 2013. http://www.planethpc.eu/images/stories/planethpc_roadmap_2013.pdf

Jackson, A.; Hein, J. and Roach, C.: “Optimising performance through unbalanced decompositions”, IEEE Transactions on Parallel and Distributed Systems 99, 2014

Weiland, M. and Johnson, N.: “Benchmarking for power consumption monitoring”. Springer Computer Science – Research and Development, 2014. DOI: 10.1007/s00450-014-0260-1.

Relevant projects:UEDIN has successfully managed and contributed to numerous national and European projects. Some examples:

PRACE (“Partnership for Advanced Computing in Europe”): UEDIN has led the UK’s technical work in all the PRACE projects to date.

Adept (“Addressing Energy in Parallel Technologies”, FP7 project): UEDIN leads this project that is addressing the challenge of energy and power consumption of parallel software and hardware; including developing benchmarks, power monitoring hardware and techniques as well as a power/performance prediction tool.

NEXTGenIO (“Next Generation I/O for Exascale”, Horizon 2020 Research and Innovation programme under Grant Agreement no. 671951): UEDIN is the project co-ordinator for this project that is investigating hardware and software solutions for I/O on very large scale (Exascale level) systems.

IPCC (Intel Parallel Computing Centre): UEDIN has been an Intel parallel computing centre since April 2014, working on exploiting Xeon Phi hardware for computational simulation.

Third Parties:Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions

4.1.10 Fraunhofer ITWM (FHG-ITWM)

The Fraunhofer Gesellschaft (FhG) is Europe's largest organisation for application-oriented research. Founded in 1949, the organisation undertakes applied research that drives economic development and serves the wider benefit of society. Its services are solicited by customers and contractual partners in industry, the service sector and public administration. The majority of the more than 20,000 staff are qualified scientists and engineers.

The Fraunhofer Institute for Industrial Mathematics (ITWM) in Kaiserslautern, Germany, focuses on mathematical approaches to practical challenges such as optimisation and visualisation. With computer simulations being an indispensable tool in the design and optimisation of products and production processes, real models are increasingly replaced by virtual models, and mathematics plays a fundamental role in the creation of this virtual world. Core competences of the ITWM include the processing of large data sets, the drafting of mathematical models, problem-solving with numerical algorithms, the summarisation of data sets, the interactive optimisation of solutions, and the visualisation of simulation and sensor data.

As part of Fraunhofer ITWM, the Competence Center for High Performance Computing (CC-HPC) develops innovative HPC solutions for industry and participates in national and international research programmes. Since its foundation in 2002, the CC-HPC has focused primarily on parallel application development and the development of HPC tools. This includes the communication middleware GPI (Global Address Space Programming Interface), the GPI-Space programming environment for parallel and big data applications, and the BeeGFS file system, formerly known as FhGFS.

FHG-ITWM role in the DEEP-EST project: FHG-ITWM will provide support for the BeeGFS file system(s) used. Furthermore, Fraunhofer ITWM will extend BeeGFS to create added value with regard to the concept of modularity and to support the achievement of the project’s objectives.

FHG-ITWM Key people:Dr. Franz-Josef Pfreundt (Male) studied Mathematics, Physics and Computer Science, receiving a Diploma in Mathematics and a PhD in Mathematical Physics (1986). From 1986 to 1995, he was Head of the Research Group for Industrial Mathematics at the University of Kaiserslautern. In 1995, he became Department Head of the Fraunhofer Institute for Industrial Mathematics (ITWM). His research topics are: Fluid dynamics, porous media, image analysis and parallel computing. At the ITWM, he founded the departments "Flow in complex structures" and "Models and algorithms in image analysis". Since 1999, he has been Division Director at Fraunhofer ITWM and Head of the "Competence Center for HPC and Visualisation".

Dr. Valeria Bartsch (Female) holds a Ph.D. in Physics obtained at the University of Karlsruhe in 2003 and worked on hardware and computing techniques for particle physics until 2013. She has long-standing experience of working in international scientific collaborations and has spent a substantial part of her career in European countries other than her home country Germany (namely the UK, France and Switzerland). Between 2010 and 2013 she coordinated a group of scientists working on the monitoring of the fast event selection (trigger) within the ATLAS collaboration at CERN, Geneva.

Christian Mohrbacher (Male) received a diploma in Computer Science from the Baden-Wuerttemberg Cooperative State University in Karlsruhe in 2008. Following his diploma thesis, he joined the CC-HPC at Fraunhofer ITWM and came into contact with different projects in the field of high-performance computing. Currently, he is coordinating the BeeGFS development group at Fraunhofer ITWM.

Relevant products and services:BeeGFS file system: high performance parallel file system, free-of-charge product with commercial support

Relevant projects: DEEP-ER (“DEEP – Extended Reach”, FP7-ICT-610476): Fraunhofer ITWM contributed to extending the BeeGFS file system technology, addressing the growing gap between I/O bandwidth and compute speed.

ExaNest (H2020-FETHPC-2014): Fraunhofer ITWM extends BeeGFS to make use of the architectures developed in the ongoing ExaNest project, mainly a new unified Communication and Storage Interconnect.

Significant infrastructure:Fraunhofer ITWM hosts internal cluster systems for software development and tests, including traditional HPC clusters (~300 nodes), as well as a large SoC-based cluster (Intel Avoton, >1000 nodes), especially for scalability tests.

Third Parties:Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions

4.1.11 Katholieke Universiteit Leuven (KULeuven)

The team at KU Leuven (KULeuven) is the Centre for mathematical Plasma Astrophysics (CmPA), founded 24 years ago as a division of the Mathematics Department of KULeuven and associated with the Leuven Centre for Aero and Space Science, Technology and Applications (LASA). The CmPA includes four permanent staff members, one part-time affiliated staff member and 26 research fellows, postdocs and PhD students.

The KULeuven team has expertise in the areas of space and plasma physics, with strong experience in supporting observational and experimental research; an example is its support for space weather modelling and forecasting. The team employs applied mathematics, high-performance computing and fundamental plasma physics models. KULeuven is also involved in computer science and scientific computing efforts in a wide range of applications: industrial processes, astrophysics and nuclear research. The KULeuven team has extensive experience in EU networks from previous Framework Programme and Marie Curie RTN projects, and in international projects funded in Europe by ESA and in the USA by DOE, NASA and NSF.

KULeuven role in the DEEP-EST project: KULeuven will be handling the space weather application in WP1. Using the expertise developed with iPic3D in the DEEP and DEEP-ER projects, KULeuven will deploy a general learning algorithm (GLA) based on neural networks (NN) to forecast the occurrence of flares and to predict the initial characteristics of coronal mass ejections (CME).
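As a minimal, purely illustrative sketch of this kind of neural-network forecasting, the example below trains a small classifier on synthetic data with scikit-learn. The features, labels and network size are hypothetical placeholders and do not represent the GLA that will actually be developed in DEEP-EST.

```python
# Illustrative only: a minimal neural-network classifier for flare occurrence,
# using scikit-learn. Features, data and network size are hypothetical
# placeholders; the actual GLA developed in DEEP-EST is not reproduced here.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical per-active-region features (e.g. magnetic flux, shear, past flare rate)
X = rng.normal(size=(2000, 3))
# Hypothetical binary label: 1 = flare observed within the forecast window
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000) > 1.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0)
model.fit(X_train, y_train)

# Probability of flare occurrence for unseen regions
p_flare = model.predict_proba(X_test)[:, 1]
print("test accuracy:", model.score(X_test, y_test))
```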

KULeuven Key people:Prof. Dr. Ing. Giovanni Lapenta (Male) - Full professor of Applied Mathematics at KULeuven and consultant at UCLA and University of Colorado Boulder. Lapenta’s career includes:

Recipient of the RD100 prize in 2005 for his role in the development of the HPC software CartaBlanca and Parsek (a predecessor of the application software iPic3D to be used in the DEEP-EST project).

Scientist at Los Alamos National Laboratory (LANL) from 1992 to 2007. Tenured Research Professor, Politecnico di Torino, 1996-2000. Visiting Scientist at the Massachusetts Institute of Technology in 1992. Ph.D. in Plasma Physics (1993) and Master in Nuclear Engineering (1990) at the Politecnico di Torino, Italy.

Lapenta has been the coordinator of the FP7 projects SOTERIA, SWIFF and eHeroes (on space weather) and node leader for KULeuven for the FP7 projects CASSIS, DEEP and DEEP-ER. Lapenta was the leader of space weather applications for the Intel Exascience Lab and is involved in NASA and DOE projects in USA.

Dr. Jorge Amaya (Male) - Postdoctoral researcher at KULeuven. Amaya is a software and aerospace engineer, and an expert on space weather and computer science having previously worked on DEEP and DEEP-ER. Amaya will lead Tk1.5: Space Weather.

Relevant publications: Lapenta, G. et al.: “Secondary reconnection sites in reconnection-generated flux ropes and reconnection fronts”, Nature Physics, 11, 690-695, 2015.

Lapenta, G.: “Particle simulations of space weather”. Journal of Computational Physics 231.3, 795-821, 2012.

Lapenta, G. et al.: “Space weather prediction and exascale computing”. Computing in Science & Engineering 15.5, 68-76, 2013.

Markidis, S.; Lapenta, G.; Rizwan-Uddin: “Multi-scale simulations of plasma with iPIC3D”. Mathematics and Computers in Simulation, 80(7), 1509-1519, 2010.

Relevant projects:KULeuven has successfully managed and contributed to numerous international projects. Some examples:

DEEP (“Dynamical Exascale Entry Platform”, FP7-ICT-287530): KULeuven brought the iPIC3D application for Space Weather.

DEEP-ER (“DEEP – Extended Reach”, FP7-ICT-610476): KULeuven brought the iPIC3D application for Space Weather.

SOTERIA, www.soteria-space.eu, coordinated by G. Lapenta (KULeuven), was the first space weather project ever funded by the FP7 programme. SOTERIA focused on the development of databases and models to study space weather in all phases, from the photosphere to the geospace. It involved 16 centres in 11 countries, started in November 2007 and concluded successfully after 3 years. Its work is now continued in the spin-off project eHeroes (see below).

SWIFF, www.swiff.eu, coordinated by G. Lapenta (KULeuven), was a project to create a forecasting framework based on an integrated approach coupling methods and codes for physics-based, high-performance-computing forecasts of space weather. SWIFF involved 7 groups in 5 countries and was completed in January 2014 after 3 years of successful work.

eHeroes, www.eheroes.eu, coordinated by G. Lapenta (KULeuven), was a project to assess and create forecast products for the effect of space weather on space exploration. eHeroes involved 15 groups in 12 countries and ended in March 2015 after 3 years of work.

Significant infrastructure:KULeuven is the lead developer of the space weather application used in DEEP-EST and keeps the reference public version of the space weather codes iPic3D (supported by Belgian funds and Open Source) and Slurm (supported by the US Air Force but Open Source).

Third Parties:Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions

4.1.12 Stichting ASTRON, Netherlands Institute for Radio Astronomy (ASTRON)

ASTRON (Netherlands Institute for Radio Astronomy) is one of the leading radio-astronomical institutes in the world. Its main mission is to make discoveries in radio astronomy happen, via the development of new and innovative technologies, the operation of world-class radio astronomy facilities, and the pursuit of fundamental astronomical research. Researchers and engineers at ASTRON have an outstanding international reputation for novel technology development and fundamental research in galactic and extra-galactic astronomy.

ASTRON designed, built, and operates the LOFAR (LOw Frequency ARray) telescope. This telescope opens up a new window on the universe by observing at very low radio frequencies (30-220 MHz), enabling groundbreaking research in astronomy and particle physics. LOFAR is the first of a new generation of radio telescopes, and is essentially a distributed sensor network that combines the signals from tens of thousands of simple antennas, departing from the traditional concept of using dishes. Additionally, LOFAR pioneered the idea of processing telescope data in software on a supercomputer (as opposed to custom-built hardware), yielding a much more flexible system that supports a variety of observation modes. LOFAR is an important scientific and technological pathfinder for the next generation of radio telescopes - the Square Kilometre Array (SKA) - a global project in which ASTRON plays a leading role. ASTRON also operates the Westerbork Synthesis Radio Telescope (WSRT), one of the most sensitive telescopes in the world. ASTRON hosts JIVE (the Joint Institute for VLBI in Europe) and the NOVA Optical/IR group.

AstroTec Holding B.V (ATH) is ASTRON's holding company, set up to facilitate the transfer of the technology developed by ASTRON for astronomy, to the market place. The valorisation of ASTRON technology and expertise is made in collaboration with regional partners and other major industrial players.

ASTRON role in the DEEP-EST project:

ASTRON will develop the radio-astronomical applications, will take part in the co-design efforts, will use the developed middleware where appropriate, and will assess the performance and energy efficiency of the modules that will be developed within DEEP-EST.

ASTRON key people: Dr. John W. Romein (Male) is a senior system researcher in high-performance computing. He joined ASTRON in 2004. He designed and implemented the LOFAR correlator, which processes the large amounts of telescope data on an IBM Blue Gene/P in real time. He now performs fundamental research on the use of accelerator hardware for (radio-astronomical) signal-processing algorithms, with applications for LOFAR and the Square Kilometre Array. His research interests include high-performance computing, parallel algorithms, networks, programming languages, and compiler construction.
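For illustration, the sketch below shows the core arithmetic of a software FX correlator (channelisation followed by cross-multiplication and integration) in NumPy. The array sizes and random input are placeholders; the real-time, many-antenna LOFAR implementation is far more involved and is not reproduced here.

```python
# Illustrative only: the core arithmetic of a software FX correlator of the
# kind used for LOFAR, written with NumPy. Real correlators channelise and
# cross-multiply streaming antenna voltages in real time on HPC hardware;
# the array sizes and random input here are placeholders.
import numpy as np

n_antennas, n_samples, n_channels = 4, 4096, 64

# Complex voltage samples per antenna (placeholder data)
rng = np.random.default_rng(1)
voltages = (rng.normal(size=(n_antennas, n_samples))
            + 1j * rng.normal(size=(n_antennas, n_samples)))

# F step: split each antenna stream into spectral channels via an FFT per block
blocks = voltages.reshape(n_antennas, -1, n_channels)   # (antenna, block, channel)
spectra = np.fft.fft(blocks, axis=2)

# X step: cross-multiply every antenna pair per channel and integrate over blocks
# visibilities[i, j, c] = < S_i(c) * conj(S_j(c)) >
visibilities = np.einsum('ibc,jbc->ijc', spectra, np.conj(spectra)) / spectra.shape[1]

print(visibilities.shape)   # (4, 4, 64): all baselines, per frequency channel
```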

Publications: Romein, J.W.: “A Comparison of Accelerator Architectures for Radio-Astronomical Signal-Processing Algorithms”, International Conference on Parallel Processing (ICPP'16), pp. 484-489, Philadelphia, PA, August 2016.

Bal, H.; Epema, D.; de Laat, C.; van Nieuwpoort, R.; Romein, J.; Seinstra, F.; Snoek, C.; and Wijshoff, H.: “A Medium-Scale Distributed System for Computer Science Research: Infrastructure for the Long Term”, IEEE Computer, 49(5):54-63, May, 2016.

Romein, J.W.: “An Efficient Work-Distribution Strategy for Gridding Radio-Telescope Data on GPUs”, ACM International Conference on Supercomputing (ICS'12), pp. 321-330, Venice, Italy, June, 2012.

van Nieuwpoort R.V. and Romein, J.W.: “Correlating Radio Astronomy Signals with Many-Core Hardware”, International Journal of Parallel Programming, Volume 39, Number 1, pp. 88-114, February, 2011, DOI: 10.1007/s10766-010-0144-3.

van Nieuwpoort, R.V. and Romein, J.W.: “Building Correlators with Many-Core Hardware”, IEEE Signal Processing Magazine (special issue on "Signal Processing on Platforms with Multiple Cores: Part 2 -- Design and Applications"), Volume 27, Number 2, pp. 108-117, March, 2010.

Relevant projects: DEEP-ER (“DEEP – Extended Reach”, FP7-ICT-610476): ASTRON's main contribution was the development of a new imager that not only works on the DEEP-ER system, but is also used to image LOFAR observations.

Dome (Dutch Ministry of Economic Affairs and province of Drenthe): a collaboration with IBM in which ASTRON investigated various challenges of the SKA (algorithms and machines, storage, optics, microservers, accelerators, novel algorithms, and networking).

Triple-A (NWO Open Competition): with this grant, ASTRON compares various accelerator architectures for radio-astronomical signal-processing algorithms.

DAS-5 (NWO-M) ASTRON participates in DAS-5, a distributed infrastructure across six Dutch universities and institutes. DAS-5 supports computer science related research.

Significant infrastructure:

ASTRON designed, built, and operates the LOFAR radio telescope, the largest low-frequency telescope worldwide and an SKA precursor. ASTRON will use LOFAR data as input for its applications.

Third Parties:Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions

4.1.13 Association National Centre for Supercomputing Applications (NCSA)

NCSA was founded on 23 April 2009 as a non-profit organisation. Since then, NCSA has been committing all its available knowledge, experience and resources to strengthening the national High Performance Computing (HPC) capacity and capability and to integrating Bulgaria into the European HPC ecosystem. These activities mainly comprise research and scientific work on new supercomputer systems and parallel architectures, as well as algorithm and software enabling in a range of scientific domains. A key pillar in NCSA's portfolio of activities is the organisation of a range of HPC training courses for graduates, doctoral students and research teams.

The success of the high-performance heterogeneous computer Avitohol demonstrates NCSA's competence in the field of computational sciences and system architecture technologies. The detailed configuration is available at http://www.top500.org/system/178609; the system consists of 300 Intel Xeon Phi 7120P co-processors and 300 Intel Xeon E5-2650 v2 processors installed in 150 cluster nodes.

NCSA, in partnership with Sofia University (SU), the Institute of Information and Communication Technologies (IICT-BAS) and the Institute of Mathematics and Informatics of the Bulgarian Academy of Sciences (IMI-BAS), Technical University – Sofia (TU), Technical University – Plovdiv (TU – Plovdiv), Medical University – Sofia (MU) and the software company Rila Solutions, has been working to meet the increased expectations and needs of the national users for computational capacity and application enabling.

NCSA role in the DEEP-EST project: The NCSA team will contribute by bringing GROMACS to the DEEP-EST project. First tests will be done on KNL-based HPC machines, before focusing the effort on enabling the GROMACS molecular dynamics simulation package for the DEEP-EST prototype, in particular the Extreme Scale Booster.

NCSA key people: Prof. Dr. Sc. Stoyan Markov (Male) has been working at the Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, since 1989. Since 2008 he has been the Head of the National Centre for Supercomputing Applications. From 2013 to 2016 he led the hardware and system architecture team that created the 150-node HP cluster Avitohol. His research interests cover the fields of massively parallel computer architectures, molecular dynamics and molecular mechanics simulations, and massively parallel numerical algorithms. He has published more than 71 scientific papers.

Dr. Peicho Petkov (Male) obtained a PhD in High Energy and Particle Physics in 2009 and has been working in the field of molecular dynamics simulations since 2006. He joined the CMS collaboration at CERN in 2001, and since 2010 he has been involved in the PRACE initiative. His main interests encompass high-energy physics, molecular simulations, parallel computing and numerical algorithms. Within the PRACE project, his main activities have been running molecular dynamics simulations with GROMACS, NAMD and LAMMPS on BlueGene/P and BlueGene/Q, as well as on machines with a hybrid Intel Xeon – Intel Xeon Phi architecture.

Relevant Publications:Litov, L.; Ivanov, I.; Petkov, P.; Petkov, P.St.; Lilkova, E.; Markov, S.; Ilieva, N.; Nacheva, G.; Petrov, S.: “A new approach to cope with autoimmune diseases: computer simulations and laboratory tests”, Radiotherapy and Oncology, Volume 102, Supplement 1, March 2012, Pages S134-S135

Nacheva, G.; Lilkova, E.; Petkov, P.; Petkov, P.St., Ilieva, N.; Ivanov, I.; Litov, L.: “In silico studies on the stability of human interferon-gamma mutant”, Biotechnology & Biotechnological Equipment, 2012, Vol 26 (1), pp 200-204, DOI: 10.5504/50YRTIMB.2011.0036.

Doytchinova, I.; Petkov, P.; Dimitrov, I.; Atanasova, M. and Flower, D. R.: “HLA-DP2 binding prediction by molecular dynamics simulations”. Protein Science. (2011) doi: 10.1002/pro.732

Petkov, P.; Lilkova, E.; Petkov, P.; Ilieva, N.; and Litov, L.: “Application of metadynamics for investigation of human interferon gamma mutated forms”. Proc. 6th International Scientific Conference Computer Science 2011, 2011, pp. 347-352.

Lilkova, E.; Litov, L.; Petkov, P.; Petkov, P.; Markov, S. and Ilieva, N.: “Computer simulations of human interferon gamma mutated forms”. AIP Conf. Proc., 2010, 1203, pp 914-919.

Relevant projects: PRACE-1IP (PRACE First Implementation Phase): NCSA contributed by porting, optimising and scaling several software packages on Intel Xeon Phi: the astrophysics software Gadget, a massively parallel multiple sequence alignment method based on an artificial bee colony, the multiple sequence alignment software ClustalW, a massively parallel 3D Poisson equation solver, the ExaFMM library, and the analytical generalised Born plus nonpolar implicit solvent (AGBNP2) library.

PRACE-2IP (PRACE Second Implementation Phase): amongst others, NCSA contributed the implementation of the analytical generalised Born plus nonpolar implicit solvent (AGBNP2) library and an iterative Poisson solver in DL_POLY.

Significant infrastructure: Avitohol High Performance Heterogeneous Supercomputer: an HP Cluster Platform SL250S GEN8 (150 servers) with Intel Xeon E5-2650 v2 8C 2.60 GHz CPUs (300 CPUs), a non-blocking InfiniBand FDR interconnect, and 300 Intel Xeon Phi 7120P co-processors. Storage is provided by a storage system with 96 TB of raw disk capacity. The system will be used for tests and optimisation of GROMACS on heterogeneous Xeon Phi based HPC systems.

Third Parties:Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions

4.1.14 Norges miljø- og Biovitenskapelige Universitet - Norwegian University of Life Sciences (NMBU)

Founded in 1859, the Norwegian University of Life Sciences (NMBU) today hosts over 5000 students and 1000 staff on its campus outside of Oslo. Over the past two decades, it has built a strong programme in engineering at the Department of Mathematical Sciences and Technology (IMT) with close to 1000 students. A new programme in Data Science is currently under development at IMT. The Computational Neuroscience Group at IMT/NMBU has accumulated comprehensive experience in large-scale simulations of networks of spiking neurons since 2001, co-developing the NEST simulation tool, with a special focus on generating connectivity in large network models. The group also has broad experience in multi-scale modelling of the signal-processing properties of neurons and networks in the early visual and somatosensory systems and has carried out extensive work on the modelling of extracellular potentials (local field potential, multi-unit activity), and has developed new methods and neuroinformatics tools to model and analyse multielectrode data (iCSD, LFPy, LPA).

NMBU role in the DEEP-EST project: NMBU will help DEEP-EST to understand the requirements of the computational neuroscience community for future hardware for large-scale simulations and data analysis, contribute 20 years of experience in optimising the runtime and memory requirements of such simulations across architectures, and adapt the NEST simulation code to the DEEP-EST hardware.
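As a minimal illustration, the sketch below builds and simulates a small spiking network through the public PyNEST interface; model and parameter names follow the NEST documentation (NEST 3 naming), while the network size and parameters are placeholders far below the large-scale models targeted in DEEP-EST.

```python
# Illustrative only: a minimal PyNEST script building and simulating a small
# network of spiking neurons. Model names follow the public NEST documentation
# (NEST 3.x uses "spike_recorder"); network size and parameters are placeholders.
import nest

nest.ResetKernel()

neurons = nest.Create("iaf_psc_alpha", 1000)           # integrate-and-fire neurons
noise = nest.Create("poisson_generator", params={"rate": 8000.0})
recorder = nest.Create("spike_recorder")

# Random connectivity: every neuron receives input from 100 randomly chosen neurons
nest.Connect(neurons, neurons,
             conn_spec={"rule": "fixed_indegree", "indegree": 100},
             syn_spec={"weight": 20.0, "delay": 1.5})
nest.Connect(noise, neurons, syn_spec={"weight": 10.0})
nest.Connect(neurons, recorder)

nest.Simulate(200.0)                                    # simulate 200 ms
print("spikes recorded:", recorder.get("n_events"))
```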

NMBU key people: Dr. Hans Ekkehard Plesser (Male) is an associate professor of informatics at the Norwegian University of Life Sciences. As a core NEST developer since 2001, he has made key contributions to hybrid parallelisation, support for advanced and spatially structured connectivity, and quality assurance of the NEST code. He leads several NEST-related tasks in the Human Brain Project, including “Simulator NEST as a Service” and “Massively parallel methods for network construction from rules and data”. Plesser is a founding member and current president of the NEST Initiative, chairman of the board of the Norwegian Research School for Neuroscience, the Norwegian representative to the Stakeholder Board of the Human Brain Project, and a member of the Scientific Board of the Geilo Winter School in eScience. Together with Eilen Nordlie and Marc-Oliver Gewaltig, he devised a widely adopted table format for the summary presentation of network models in publications. Plesser has considerable academic management experience, having served as head of the Basic Science Section of his Department from 2010 to 2016. He is a guest researcher at the Institute of Neuroscience and Medicine (INM-6), Forschungszentrum Jülich.

Relevant publications:Kunkel, S.; Schmidt, M.; Eppler, J. M.; Plesser, H.E.; Masumoto, G.; Igarashi, J.; Ishii, S.; Fukai, T; Morrison, A.; Diesmann, M. and Helias, M.: “Spiking network simulation code for petascale computers”. Front Neuroinform, 8:78, 2014.

Nordlie, E. and Plesser, H.E.: “Visualizing neuronal network connectivity with connectivity pattern tables”. Front. Neuroinform., 3:39, 2010.

Nordlie, E.; Gewaltig, M.-O. and Plesser, H. E.: “Towards reproducible descriptions of neuronal network models”. PLoS Comput Biol, 5(8):e1000456, 2009.

Plesser H.E. and Diesmann, M.: “Simplicity and efficiency of integrate-and-fire neuron models”. Neural Comput, 21:353-359, 2009.

Plesser, H. E.; Eppler, J.M.; Morrison, A.; Diesmann, M.; and Gewaltig, M.-O.: “Efficient parallel simulation of large-scale neuronal networks on clusters of multiprocessor computers”. In Kermarrec, A.-M.; Bougé, L.; and Priol, T., editors, Euro-Par 2007: Parallel Processing, volume 4641 of Lecture Notes in Computer Science, pages 672-681, Berlin, 2007. Springer-Verlag.

Relevant projects: HBP (“Human Brain Project”): The computational neuroscience group at NMBU is a core partner of the Human Brain Project, contributing to Subprojects 4 (Theoretical Neuroscience), 6 (Brain Simulation Platform) and 7 (High Performance Analytics and Computing Platform). It is also a partner of the Norwegian DigitalBrain project and of the multidisciplinary Centre for Integrative Neuroplasticity (CINPLA) at the University of Oslo.

Significant infrastructure: NEST source code (available under GPL v2 or later) and full access to its development history.

Third Parties:Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions

4.1.15 Háskóli Íslands - University of Iceland (UoI)

The University of Iceland, founded in 1911, is a progressive educational and scientific institution with a total revenue of 88.2 M€ per year. It is a research-led state university situated in the heart of Reykjavík, the capital of Iceland. A modern, diversified and rapidly developing institution, the University of Iceland offers opportunities for study and research in almost 400 programmes spanning most fields of science and scholarship, organised in Schools for Social Sciences, Health Sciences, Humanities, Education, Natural Sciences and Engineering. The University employs over 1,300 people and has over 14,000 students, of which more than 500 are PhD students. In 2016, the University of Iceland was ranked in the 201-250 band of the best universities in the world according to the Times Higher Education Supplement. The University of Iceland has participated in numerous projects under the 6th Framework Programme, the 7th Framework Programme and Horizon 2020, both as a coordinator and as a project partner.

Around 300 highly qualified employees at the School of Engineering and Natural Sciences conduct cutting-edge research and teach in programmes that offer diverse and ambitious courses in the fields of Engineering and Natural Sciences. The School's research institutes are highly sought-after affiliates of international universities and play a significant role in the scientific community. The Faculty of Industrial Engineering, Mechanical Engineering, and Computer Science is part of the School of Engineering and Natural Sciences. Its team of professors and PhD students contributes to the proposal in the fields of parallel and scalable machine learning, statistical data mining, distributed systems, and high-performance computing.

UoI role in the DEEP-EST project: UoI will be working on earth science data sets with data analytics applications in WP1. The parallel clustering algorithm HPDBSCAN, the parallel classification algorithm piSVM, and a ported deep learning neural network (DNN) will be deployed and modified in order to study the benefits of the DEEP-EST prototype. Lower times-to-solution are expected for the data analysis of natural and man-made land cover and of large point clouds.
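By way of illustration only, the following minimal Python sketch shows the kind of density-based clustering step that HPDBSCAN parallelises over many nodes, here run single-node on a synthetic point cloud; the data and parameters are hypothetical stand-ins, not project code or settings:

    # Illustration only: single-node DBSCAN on a synthetic 3D point cloud.
    # HPDBSCAN distributes this kind of workload across MPI processes; the
    # parameters below are hypothetical, not project settings.
    import numpy as np
    from sklearn.cluster import DBSCAN

    rng = np.random.default_rng(42)
    points = rng.uniform(0.0, 100.0, size=(10_000, 3))  # stand-in for a LiDAR point cloud

    labels = DBSCAN(eps=1.5, min_samples=10).fit_predict(points)
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    print(f"{n_clusters} clusters found, {int(np.sum(labels == -1))} points flagged as noise")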

UoI Key people: Prof. Dr. Helmut Neukirchen (Male) is a full professor for computer science at the Faculty of Industrial Engineering, Mechanical Engineering and Computer Science at the University of Iceland, Reykjavík, Iceland. He holds a doctoral degree in computer science on “Languages, Tools and Patterns for the Specification of Distributed Real-Time Tests” from the University of Göttingen, Germany, and a Master's degree (Dipl.-Inform.) in computer science from the University of Aachen, Germany. He has been at the University of Iceland since 2008; his main research fields are Software Engineering and Distributed Systems, including Distributed Computing, Big Data and eScience.

Prof. Dr.-Ing. Morris Riedel (Male) is an adjunct associate professor at the University of Iceland. He received his PhD from the Karlsruhe Institute of Technology (KIT) and has held various positions at the Jülich Supercomputing Centre in Germany. There, he retains his position as head of the research group on “High Productivity Data Processing” within the Federated Systems and Data Division, with PhD students at the University of Iceland. Morris Riedel has extensive experience with European and international research projects. He acts as one of the co-chairs of the Research Data Alliance (RDA) Big Data (Analytics) Group, in which research activities on point cloud data structures and remote sensing are discussed with the international community. His current research focuses on high-productivity processing of big data in the context of scientific computing applications as well as on innovative parallel and scalable (hierarchical) data structures. He regularly contributes to EGU and AGU sessions in the general earth science community, as well as to the IGARSS remote sensing conference series.


Relevant publications: Goetz, M.; Bodenstein, C.; Riedel, M.: “HPDBSCAN – Highly Parallel DBSCAN”, Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC2015), Machine Learning in HPC Environments (MLHPC) Workshop, Austin, 2015

Cavallaro, G.; Riedel, M.; Richerzhagen, M.; Benediktsson, J.A.; Plaza, A.: “On Understanding Big Data Impacts in Remotely Sensed Image Classification Using Support Vector Machine Methods”, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Issue 99, pp. 1-13, 2015

Lippert, Th.; Mallmann, D.; Riedel, M.: “Scientific Big Data Analytics by HPC”, Publication Series of the John von Neumann Institute for Computing (NIC) NIC Series 48, 417, ISBN 978-3-95806-109-5, pp. 1-10, 2016

Memon, Shahbaz; Riedel, M.; Memon, Shiraz; Koeritz, Ch.; Grimshaw, A.; Neukirchen, H.: “Enabling scalable data processing and management through standards-based job execution and the global federated file system.” Scalable Computing: Practice and Experience, Vol 17, No 2, 2016

Glaser, F.; Neukirchen, H.; Rings, Th.; Grabowski, J.: “Using MapReduce for High Energy Physics Data Analysis.” Proceedings of the 2013 International Symposium on MapReduce and Big Data Infrastructure (MR.BDI 2013), IEEE 2013

Relevant projects: eSTICC (NordForsk-funded Nordic Center of Excellence): eScience Tools for Investigating Climate Change at High Northern Latitudes.

Significant infrastructure: UoI hosts two main HPC systems that can be used for testing the contributed HPC codes used in the DEEP-EST project:

GARÐAR: the Nordic High Performance Computing system, formerly owned by the Nordic countries, now owned and operated by UoI. A production system with 3456 cores provided by Intel Xeon E5649 CPUs. The interconnect is 40 Gb/s InfiniBand.

GARPUR: owned by the Icelandic research organisations, hosted and operated by UoI. A production system with 1120 cores provided by Intel Xeon E5-2680 v3 CPUs. The interconnect is 56 Gb/s InfiniBand in addition to 10 Gb/s Ethernet.

Third Parties: Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N

If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions


4.1.16 European Organisation for Nuclear Research (CERN)

CERN is the world’s largest particle physics laboratory and home of the Large Hadron Collider (LHC), the world’s most powerful accelerator, providing research facilities for High Energy Physics researchers across the globe. In the first run of the accelerator, which resulted in the Nobel prize-winning discovery of the Higgs boson, the LHC experiments generated up to 27 PB of data per year; this rate has increased since the LHC’s second run, which started in 2015. To process and analyse this data, the experiments run more than 2 million jobs per day on a global distributed computing infrastructure federating more than 200 computer centres worldwide: the Worldwide LHC Computing Grid (WLCG) project, which is led by CERN and provides the resources to store, distribute, analyse and access LHC data in close to real time for a global community of more than 10,000 physicists.

The CERN IT department currently has 230 staff, predominantly engineers, who operate one of Europe’s largest research computer centres, distributed across two sites in Switzerland and Hungary and supporting about 17,000 users. The department has developed leading expertise in large-scale data centres, maintains longstanding collaborations with industrial and academic partners in the fields of high performance computing, advanced networking and large-scale collaborative tools, and is a driver of open access to scientific publications and data. It is a major player in the design, development and deployment of grid, cloud and volunteer computing infrastructures, with contributions in all functional areas: computing, data storage and management, security, identity management and more.

CERN role in the DEEP-EST project: CERN will provide an extreme big-data application drawn from the High Energy Physics domain that will be deployed on the DEEP-EST system. It will demonstrate the feasibility of an integrated data refresh and reduction center for deploying dynamic improvements in code and calibration for eventual analysis use at the CMS experiment. Instead of running a central processing pass over all data and simulation whenever new code or calibrations become available, a dynamic trigger will be developed that reprocesses data objects only when changes in the target code or calibrations make it necessary.
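For illustration, a minimal Python sketch of such trigger logic under simplified, assumed interfaces (the record layout and the reprocess() hook are hypothetical, not CMS software): only objects whose stored code and calibration tags differ from the current targets are reprocessed.

    # Illustration only: a dynamic reprocessing trigger. Data objects are reprocessed
    # solely when their recorded code/calibration tags differ from the current targets.
    # The record layout and the reprocess() callback are hypothetical.
    TARGET_TAGS = {"code": "v12.3", "calib": "2017-03"}

    def needs_refresh(record):
        return any(record.get(key) != value for key, value in TARGET_TAGS.items())

    def refresh(dataset, reprocess):
        for record in dataset:
            if needs_refresh(record):
                reprocess(record)           # recompute only the stale objects
                record.update(TARGET_TAGS)  # mark the object as up to date

    dataset = [{"id": 1, "code": "v12.3", "calib": "2016-11"},
               {"id": 2, "code": "v12.3", "calib": "2017-03"}]
    refresh(dataset, reprocess=lambda rec: print("reprocessing object", rec["id"]))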

CERN Key people: Robert Jones (Male) is the HNSciCloud (Helix Nebula Science Cloud, http://www.hnscicloud.eu/) project Director and former Head of the CERN openlab collaboration (http://openlab.web.cern.ch/). Bob Jones coordinated the FP7 PICSE

(Procurement Innovation of Cloud Services in Europe, http://www.picse.eu/) and Helix Nebula Support Actions, pioneering partnerships between big science and big business that charted the course towards the sustainable provision of cloud computing in Europe. Robert’s past experience in EC projects also includes the positions of Technical Director and Project Director of the FP6 and FP7 EGEE projects (www.eu-egee.org, 2004-2010), which established and operated a production grid facility for e-Science spanning 300 sites across 48 countries for more than 12,000 researchers.

Dr. Maria Girone (Female) is the CERN openlab Chief Technology Officer. Former Software and Computing Coordinator of the CMS experiment (http://cms.web.cern.ch) at the LHC, she was responsible for 70 computing centres worldwide that archive, simulate, process and serve petabytes of data to more than 3,000 researchers. Maria Girone initiated and led the WLCG Operations Coordination team, responsible for the core operations and commissioning of new services in the WLCG; previously she was deputy group leader of the CERN IT Experiment Support group and task leader within the EGI-InSPIRE project.



Relevant publications: CERN openlab: “Whitepaper on Future IT Challenges in Scientific Research”, May 2014, https://indico.cern.ch/event/452144/contributions/1960799/attachments/1166587/1682252/CERNopenlabWhitepaperonFutureICTChallengesinScientificResearchV1.3.pdf

Cabrillo, I.; Cabello, L.; Marco, J.; Fernandez, J.; Gonzalez, I.: “Direct exploitation of a top 500 Supercomputer for Analysis of CMS Data”, 2014 J. Phys.: Conf. Ser. 513 032014, http://iopscience.iop.org/1742-6596/513/3/032014

Relevant projects: Openlab (openlab.cern.ch): CERN openlab is a collaboration between CERN and industrial partners to develop new knowledge in Information and Communication Technologies through the evaluation of advanced solutions and joint research, to be used by the worldwide community of scientists working at the Large Hadron Collider.

ICE-DIP: the Intel-CERN European Doctorate Industrial Program, is a Marie Skłodowska-Curie co-funded European Industrial Doctorate scheme hosted by CERN and Intel Labs Europe. ICE-DIP offers research training to 5 Early Stage Researchers (ESRs) in advanced Information and Communication Technologies (ICT). The technical goal of this programme is to research and develop, through a public-private partnership, unparalleled capabilities in the domain of high throughput, low latency, online data acquisition. The interdisciplinary and interconnected research themes are the usage of many-core processors for data acquisition, future optical interconnect technologies, reconfigurable logic and data acquisition networks.

Significant infrastructure: The Large Hadron Collider (LHC): Approximately 600 million times per second, particles collide within the LHC. Each collision generates particles that often decay in complex ways into even more particles. Electronic circuits record the passage of each particle through the CMS detector as a series of electronic signals, and send the data to the CERN Data Centre (DC) for digital reconstruction. The digitised summary is recorded as a "collision event". Physicists must sift through the 30 petabytes or so of data produced annually to determine if the collisions have thrown up any interesting physics.

The Worldwide LHC Computing Grid (WLCG): CERN does not have the computing or financial resources to crunch all of the data on site, so in 2002 it turned to grid computing to share the burden with computer centres around the world. The Worldwide LHC Computing Grid (WLCG) gives a community of over 10,000 physicists near real-time access to LHC data. The CERN data centre manages a 100 PB scientific data archive, including disk caches of over 50 PB to support the LHC experiments. The CERN data centre is split between 2 physical sites, one in Geneva, and one in Budapest, Hungary, connected through 2x100 Gbps dedicated network connections.

Third Parties: Does the participant plan to subcontract certain tasks (please note that core tasks of the project should not be sub-contracted)

N

If yes, please describe and justify the tasks to be subcontracted

Does the participant envisage that part of its work is performed by linked third parties

N


If yes, please describe the third party, the link of the participant to the third party, and describe and justify the foreseen tasks to be performed by the third party:

Does the participant envisage the use of contributions in kind provided by third parties (Articles 11 and 12 of the General Model Grant Agreement)

N

If yes, please describe the third party and their contributions


4.2 Third parties involved in the project

Four partners of the DEEP-EST consortium involve Third Parties in the project:

JUELICH has the linked third party ParTec.
Intel has the linked third party Intel Iberia.
ETH-Aurora has the linked third party Eurotech.
BSC has the in-kind third parties CSIC and UPC.

4.2.1 Linked third party of JUELICH: ParTec Cluster Competence Center (ParTec)

The Forschungszentrum Jülich GmbH (JUELICH) has a long-standing relationship with the ParTec Cluster Competence Center GmbH (ParTec), formalised by the signature of four long-term agreements:

The “ParaStation Consortium”, with the goal to develop the Open-Source cluster operating and management system “ParaStation”. The corresponding contract was signed in 2005 and is still active. No end-of-contract date has been defined.

JURECA: A Collaboration contract is under preparation (expected duration is 5 years, until 2021)

The “Exa-Cluster Laboratory” (ECL), where JUELICH, ParTec and Intel GmbH together operate a joint laboratory with common ownership and the common goal of developing Exascale hardware-software technology. The cooperation agreement was signed in May 2010 and will be active until May 2020.

As stated above, the formal relationship between JUELICH and ParTec is based on both MoUs and Cooperation Agreements, with durations extending beyond the DEEP-EST project, predating and outlasting the EC-GA.

Forschungszentrum Jülich decided in 2010 to involve ParTec as a third party to co-develop the ParaStation software. Since then, whenever both institutions work together in R&D projects (e.g. the EU-funded projects DEEP and DEEP-ER), ParTec has held the position of linked third party to JUELICH. Due to the history and positive experience of this partnership, ParTec keeps its position as JUELICH’s linked third party also for the DEEP-EST project.

Description of ParTec: For more than 15 years, ParTec Cluster Competence Center GmbH has been a strong general-purpose Cluster specialist. ParTec develops and supports a comprehensive suite of Cluster management tools together with a runtime environment tuned for the largest distributed-memory supercomputers in existence today and beyond. Also providing professional services, consultancy and support, ParTec has been chosen as the partner of choice by some of the leading HPC sites across Europe. ParTec is privately owned and has its headquarters in Munich, Germany, with branch offices in Karlsruhe, Mannheim and Jülich. As a member of major European and worldwide research consortia (e.g. ETP4HPC, PROSPECT e.V., EOFS, and exascale10), ParTec contributes to the development of the next Petaflop architecture of supercomputers and further, towards Exaflop computing paradigms.

ParTec’s ParaStation ClusterSuite is extensively used in production environments, e.g. on the general purpose JURECA Cluster system with 2.2 PFlop/s peak performance run by the Jülich Supercomputing Centre (JSC) and ranked no. 51 in the June 2016 TOP500 list, as well as on experimental systems. ParTec is the chosen partner for the co-design and the support of the JUROPA 3 and JuAMS cluster systems, likewise located in the Jülich Supercomputing Centre.

ParTec participates in nationally funded research projects like FAST and (as a partner of JUELICH) in EU-funded projects like DEEP and DEEP-ER. There, ParTec contributes to the


runtime and middleware of the DEEP Cluster-Booster system, and recently to modules providing multi-level, application-based fault tolerance. In addition, ParTec supports JUELICH internally at the project management level.

Together with Intel GmbH and Forschungszentrum Jülich, ParTec forms the ExaCluster Laboratory (ECL), developing promising new HPC architectures and prototypes to enter the Exascale era.

ParTec role in the DEEP-EST project: Similar to the DEEP and DEEP-ER projects, ParTec will contribute its ParaStation Cluster Suite to the DEEP-EST project and adapt it to the requirements of the envisioned MSA. In particular, ParaStation MPI, which implements the Message Passing Interface (MPI) standard and includes a versatile process management subsystem, will be enhanced with features for modularity-aware message passing across constellations of process groups linked together in an evolved, multi-staged topology. ParTec will also support JUELICH in the project management, specifically regarding internal communication and quality management.
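As a purely illustrative sketch of the idea behind the modularity-aware message passing mentioned above (not ParaStation MPI code), the following mpi4py example splits the processes into two hypothetical module groups and bridges them with an inter-communicator; it assumes at least two MPI ranks:

    # Illustration only: module-aware process groups with mpi4py (not ParaStation MPI).
    # Ranks are split into a "cluster" and a "booster" group and bridged by an
    # inter-communicator; the half/half module assignment is hypothetical.
    from mpi4py import MPI

    world = MPI.COMM_WORLD
    module = 0 if world.rank < world.size // 2 else 1      # 0 = cluster, 1 = booster
    local = world.Split(color=module, key=world.rank)      # intra-module communicator

    # Bridge the modules: the remote leader is rank 0 of the other module in COMM_WORLD.
    remote_leader = world.size // 2 if module == 0 else 0
    inter = local.Create_intercomm(0, world, remote_leader, tag=99)

    if module == 0 and local.rank == 0:
        inter.send({"halo": [1, 2, 3]}, dest=0, tag=7)      # message crossing the modules
    elif module == 1 and local.rank == 0:
        print("booster side received:", inter.recv(source=0, tag=7))

Such a sketch would be launched with a generic MPI runner, e.g. mpirun -n 4 python sketch.py.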

ParTec key people: Hugo Falter (Male) is Co-founder and Chief Operating Officer of the ParTec Cluster Competence Center GmbH, a 1999 spin-off from the Computer Science Department of the University of Karlsruhe. Mr. Falter studied law in Regensburg and Munich. Together with a Munich law firm he specialised in bringing innovative technology companies to market. Within the Munich law firm Frohwitter, Hugo Falter is responsible for the firm's subsidiary ParTec Cluster Competence Center GmbH.

Thomas Moschny (Male) has been working at ParTec since 2008, since 2013 as Chief Technology Officer (CTO). He holds a Diploma in Theoretical Particle Physics from the University of Wuppertal. From 2000 to 2008 he was a member of the group of Prof. Tichy at the Computer Science Department of the University of Karlsruhe, working on high performance communication software and parallel programming environments. In addition to his CTO responsibilities, his main focus is now on designing and developing monitoring tools for HPC clusters. At ParTec he is the co-leader of the resiliency software work package in the DEEP-ER project.

Dr.-Ing. Carsten Clauss (Male) has been working at ParTec since 2013. He holds a diploma and a doctoral degree in Electrical Engineering (with a focus on Computer Engineering) as well as a Master's degree in Computer Science. His research interests are parallel, cluster, grid and cloud computing. From 2004 to 2013 he was a research associate at the Chair for Operating Systems at RWTH Aachen University, Germany. Since the end of 2013, he has been a software developer and research engineer at ParTec. He has published one book, contributed to several book chapters, and authored over 20 workshop and conference papers.

Ferdinand Geier (Male) has been working at ParTec since 2002. He holds two Diplomas in Electrical Engineering, from the University of Applied Sciences in Munich and the Technical University of Munich. He works in the support department of ParTec, focusing on system administration and software support. He is involved in the support part of the ongoing DEEP-ER project and also in the support of the existing DEEP Cluster-Booster system.

Ina Schmitz (Female) has been working as a project manager at ParTec since 2010. She holds a Diploma in Natural Science from TU Bergakademie Freiberg. Today, her activities include the internal and external organisation of large projects. At ParTec she is part of the ExaCluster Laboratory and a member of the Project Management Team (PMT) in the already finished DEEP project and in the follow-up project DEEP-ER, which will end in March 2017.


Relevant publications: Eicker, N.; Lippert, T.; Moschny, T.; Suarez, E.: “The DEEP Project - An alternative approach to heterogeneous cluster-computing in the many-core era”, In Concurrency and Computation: Practice and Experience, Vol. 28 (8), 2016, pp. 2394-2411, DOI: 10.1002/cpe.3562

Moschny, T.; Labarta, J.; Gimenez, J.; Knobloch, M.: “The DEEP Programming model and analysis tools”. First joint DEEP/CRESTA/Mont-Blanc Workshop, Barcelona, June 10-11, 2013

Eicker, N.; Galonska, A.; Hauke, J.; Nuessle, M.: “Bridging The DEEP Gap - Implementation of an Efficient Forwarding Protocol”, Technical Report FZJ-2014-05536, Intel European Exascale Labs, 2014, http://juser.fz-juelich.de/record/1719822013

Clauss, C.; Moschny, T.; Eicker, N.: “Dynamic Process Management with Allocation-internal Co-Scheduling towards Interactive Supercomputing”, In 1st HiPEAC Workshop on Co-Scheduling of HPC Applications (COSH), European Network on High Performance and Embedded Architecture and Compilation (HiPEAC), January 2016

Pickartz, S.; Clauss, C.; Lankes, S.; Krempel, S.; Moschny, T.; Monti, A.: “Non-Intrusive Migration of MPI Processes in OS-bypass Networks”, In Proceedings of the 1st Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware (IPRDM), IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), May 2016

Relevant projects: ParTec has successfully managed and contributed to numerous national and European projects. Some examples:

DEEP (“Dynamical Exascale Entry Platform”, FP7-ICT-287530): ParTec participated in the DEEP project as JUELICH's linked third party. The main contributions of ParTec were on the middleware of the DEEP system, with a focus on MPI. ParTec also provided installation and administration support for the system. In addition, ParTec was a member of the Project Management Team (PMT) and supported JUELICH in various management tasks.

DEEP-ER (“DEEP – Extended Reach”, FP7-ICT-610476): ParTec participates in the DEEP-ER project as JUELICH's linked third party. In continuation of the efforts within DEEP, ParTec focuses on the middleware of the system, with a special emphasis on resiliency. ParTec also provides installation and administration support for the system. In addition, ParTec is a member of the Project Management Team (PMT) and supports JUELICH in various management tasks.

FAST (“Find a Suitable Topology for Exascale Applications”, 01IH13004G): ParTec participates in the FAST project funded by the German Federal Ministry of Education and Research (BMBF). The project aims at optimising the assignment between parallel applications with different needs and the available resources by developing new scheduling algorithms that rely on virtualisation and process migration techniques. ParTec's main contribution thus focuses on extending its MPI implementation to support process migration in virtualised HPC environments.

4.2.2 Linked third party of Intel: Intel Iberia S.A. (Intel Iberia)

Like the project partner Intel Deutschland GmbH, Intel Corporation Iberia S.A. with headquarters in Madrid is a fully owned subsidiary of Intel Corporation in the United States. It conducts all Intel sales and engineering operations in Spain and Portugal, including the collaboration with the Barcelona Supercomputing Center in Barcelona (Intel-BSC Exascale Lab) and the “Intelligent Dialogue Systems” research laboratory in Seville. All Intel personnel in Spain and Portugal are employed by Intel Iberia S.A.


The Intel-BSC Exascale Lab focuses on dynamic, task-based programming models (OmpSs) and scalable performance analysis and prediction tools. Both areas are highly relevant for enabling applications to improve scalability and (energy) efficiency to make them suitable for Exascale computing.

Intel GmbH has been working closely with Intel Corporation Iberia S.A. for several years within the Intel Labs Europe network:

Since 2010: joint work on enabling applications for Intel Xeon Phi in the Developer Relations division of Intel’s Software and Solutions Group. The HPC team managed by Christopher Dahnken in Munich included Intel Iberia S.A. employees located in Barcelona, who have since transferred to the HPC pathfinding team managed by Hans-Christian Hoppe.

Since 2012, Intel Iberia S.A. has been active as a third party to Intel GmbH in the FP7 project DEEP, focusing on the analysis, porting and optimisation of pilot applications to the novel DEEP architecture. This collaboration was extended to the DEEP-ER FP7 project in October 2013.

Since early 2014, Intel Iberia S.A. has worked closely with Intel GmbH in both the DEEP and DEEP-ER projects on architecture, system implementation and system SW, taking the place of the personnel of the former Intel Germany Design Center at Braunschweig.

Project tasks performed by Intel Iberia as linked third party of Intel:

In the DEEP-EST project Intel Iberia S.A. will participate as Intel Deutschland GmbH’s linked third party through resources of the Intel-BSC Exascale Lab working from Barcelona, Spain. Contributions will focus on the benchmarking and modelling activities in WP2, and on providing advice and support for application analysis, porting and optimisation on Intel platforms in WP1.

Intel Iberia Key people: Harald Servat (Male) is an HPC expert with strong knowledge in monitoring systems, parallel programming models, compilers and computer architecture. He currently works at Intel Corp. on optimising applications for the next generation of HPC systems (including Xeon Phi and FPGA-based systems). Before that, he was the maintainer of the instrumentation library for the BSC performance tools suite (Extrae), adapting it to new technologies and pursuing large scalability. In 2015, he received his Ph.D. for work on providing instantaneous performance metrics by combining coarse-grain instrumentation and sampling techniques. His thesis resulted in a tool named Folding, which easily points out the nature of performance bottlenecks and their location in the application code.

Alex Duran (Male) worked as an assistant professor at Universitat Politècnica de Catalunya and as one of the project leads at Barcelona Supercomputing Center for the OmpSs programming model. His research focus was on runtime and compiler techniques for parallel computation, in particular OpenMP and its extension to OmpSs. Alex joined Intel Iberia as a senior software engineer in 2012, where he is responsible for enabling system and application level software for the Intel® Xeon Phi™ line of CPUs, and for exploring enhancements to the OpenMP standard. Alex provided valuable support to the application activities in both the DEEP and DEEP-ER projects.

Relevant publications: Servat, H.; Llort, G.; Giménez, J.; Labarta, J.: “Detailed and simultaneous power and performance analysis”. Concurrency and Computation 28 (2), 2016, 252-273. DOI: 10.1002/cpe.3188


Servat, H.; González, J.; Llort, G.; Giménez, J.; Labarta, J.: “Low-Overhead Detection of Memory Access Patterns and Their Time Evolution”. Euro-Par 2015: Parallel Processing, Lecture Notes in Computer Science 9233, 2015, 57-69. DOI: 10.1007/978-3-662-48096-0_5.

Léger, R.; Mallón, D. A.; Duran, A.; Lanteri, S.: “Adapting a Finite-Element Type Solver for Bioelectromagnetics to the DEEP-ER Platform”. Proceedings of the International Conference on Parallel Computing, ParCo 2015, 349-359. DOI: 10.3233/978-1-61499-621-7-349.

Caballero, D.; Royuela, S.; Ferrer, R.; Duran, S.; Martorell, X.:”Optimizing Overlapped Memory Accesses in User-directed Vectorization”. Proceedings of the 29th International Conference on Supercomputing, 2015, 393-404. DOI: 10.1145/2751205.2751224.

Relevant Projects: DEEP (“Dynamical Exascale Entry Platform”, FP7-ICT-287530, 1st December 2011 – 31st May 2015): DEEP is one of the EC’s Exascale projects and is developing a novel, Exascale-enabling supercomputer architecture based on the concept of compute acceleration in conjunction with a software stack focused on meeting Exascale requirements.

DEEP-ER (“Dynamical Exascale Entry Platform – Extended Research”, FP7-ICT-610476, 1st October 2013 – 30th September 2016): The goal of the DEEP-ER project is to update the Cluster-Booster architecture introduced by the DEEP project and extend it with additional parallel I/O and resiliency capabilities.

4.2.3 Linked third party of ETH-Aurora: Eurotech S.p.A. (Eurotech)

Eurotech S.p.A. is the parent company of ETH-Aurora, its wholly owned subsidiary dedicated to HPC activities. Eurotech’s long-term focus on HPC technology through its HPC division is the foundation of ETH-Aurora’s activities. The Eurotech group portfolio includes segments such as Embedded Systems and the Internet of Things (IoT), serving a wide range of industries including transportation, healthcare and defence.

Eurotech description: Eurotech is a global company with a strong international focus: founded and still headquartered in Italy, it has operating locations in Europe, North America and Japan. It thrives on the vision that the pervasiveness and ubiquity of miniaturised and interconnected computers, and their seamless integration into everyday environments, make objects smarter and infrastructure greener, and ultimately improve human life, making it simpler, safer and more comfortable.

Eurotech role in DEEP-EST: Eurotech’s know-how in the design and implementation of sophisticated computing devices, including complex printed circuit boards (PCBs) capable of carrying high-speed signals, has been an asset for the group’s activities in HPC. ETH-Aurora, which inherits Eurotech’s HPC activities, intends to continue using Eurotech’s competencies where applicable. Likewise, Eurotech is keen on supporting its ETH-Aurora subsidiary in the implementation of leading-edge projects such as DEEP-EST.

Eurotech Key People: Marco Carrer (Male) is CTO of Eurotech, where he is leading the company towards the enablement of Internet of Things solutions and applications. Before that, Marco had a successful 15-year career in the software industry. He started as an engineer at Sony, where he contributed to a pioneering video streaming solution. Later, he joined Oracle in the Server Technologies group where, over the span of a 13-year career, he became Senior Director and was directly responsible for the design and development of several innovative Oracle


products in the areas of Web Services, Enterprise Collaboration, and CRM Service. Throughout his career, Marco has loved collaborating on Open Source projects and is currently project lead for Eclipse Kura, a Java framework for IoT gateways. Marco Carrer holds a Laurea in Electronic Engineering from the University of Padova and a Master in Computer Science from Cornell University, New York. He is also a certified SCRUM Master and has been awarded ten US patents.

4.2.4 In-kind Third Parties of BSC: Spanish Council for Scientific Research (CSIC) and Universitat Politècnica de Catalunya (UPC)

Some of the work carried out at the Barcelona Supercomputing Center–Centro Nacional de Supercomputación will be contributed free of charge by BSC Third Parties (Article 12 Grant Agreement): Spanish Council for Scientific Research (CSIC) and Universitat Politècnica de Catalunya (UPC).

The BSC is a consortium that is composed of the following member institutions: the Universitat Politècnica de Catalunya and the Spanish and the Catalan governments. Both UPC and the Spanish government (through CSIC) contribute in kind by making human resources available to work on projects. The relationship between BSC and CSIC and UPC (respectively) is defined in an agreement with each institution that was established prior to the start of this project.

4.2.4.1 Consejo Superior de Investigaciones Científicas (CSIC)

Some CSIC researchers carry out their work at universities and research centers based in Spain, institutions with which CSIC actively collaborates. This collaboration takes place within the framework of long-term agreements, ensuring that CSIC researchers are fully integrated into teams and research projects. CSIC has signed collaboration agreements with several entities, including the BSC.

The relationship between BSC and CSIC is defined in an agreement established prior to the start of this project and is thus not limited to it. BSC is free to use the resources provided by CSIC at will; they are therefore assimilated as "own resources" and will be charged to the project without being considered a receipt. The cost will be declared by the beneficiary and recorded in the accounts of the third party. These accounts will be available for auditing if required.

CSIC Key people: Dr. Rosa Maria Badia (Female) is a CSIC researcher of the Instituto de Investigación en Inteligencia Artificial (IIIA) affiliated with the BSC. She carries out her research in association with the Barcelona Supercomputing Center - Centro Nacional de Supercomputación on the BSC premises.

4.2.4.2 Universitat Politècnica de Catalunya (UPC)

The High Performance Computing research group of the Computer Architecture Department at the Universitat Politècnica de Catalunya (UPC) is the leading research group in Europe in topics related to high performance processor architectures, runtime support for parallel programming models, performance tuning applications for supercomputing and Cloud Computing.

The High Performance Computing research group at the UPC shares many key resources with the BSC, including several key personnel that will be dedicated to this project. There is a signed Collaboration Agreement between the UPC and the BSC establishing the framework of the relationship between these two entities. According to this agreement, several professors of the UPC are made available to the BSC to work on projects.


5 Ethics and Security

5.1 Ethics

The research carried out in DEEP-EST does not raise any of the ethics issues listed in the ethical issue table in the administrative part of the proposal forms.

5.2 Security

The DEEP-EST project will not involve any:

activities or results raising security issues: NO
‘EU-classified information’ as background or results: NO


6 Glossary

A

API: Application Programming Interface

ASIC: Application Specific Integrated Circuit, Integrated circuit customised for a particular use

ASTRON: Netherlands Institute for Radio Astronomy, Netherlands

Aurora: Name of the subsidiary of the Eurotech Group dedicated to the High Performance Computing (HPC) business. Aurora also refers to Eurotech’s line of cluster systems.

B

BADW-LRZ: Leibniz-Rechenzentrum der Bayerischen Akademie der Wissenschaften. Computing Centre, Garching, Germany

BDA: Big Data Analytics

BDEC: Big Data and Extreme-Scale Computing

BeeGFS: The Fraunhofer Parallel Cluster File System (previously acronym FhGFS). A high-performance parallel file system.

BeeOND: BeeGFS-on-demand, parallel storage based on BeeGFS

BN: Booster Node (functional entity)

BoP: Board of Partners for the DEEP-EST project

BSC: Barcelona Supercomputing Centre, Spain

BSCW: Repository used in the DEEP-EST project to share all project documentation.

C

CA: Consortium Agreement

CERN: European Organisation for Nuclear Research / Organisation Européenne pour la Recherche Nucléaire, International organisation

CM: Cluster Module: with its Cluster Nodes (CN) containing high-end general-purpose processors and a relatively large amount of memory per core

CME: Coronal Mass Ejections

CMS: Compact Muon Solenoid experiment at CERN’s LHC

CN: Cluster Node (functional entity)

CNN: Convolutional Neural Networks

COTS: Commercial off-the-shelf

CPU: Central Processing Unit


CSIC: Spanish Council for Scientific Research

D

DAM: Data Analytics Module: with nodes (DN) based on general-purpose processors, a huge amount of (non-volatile) memory per core, and support for the specific requirements of data-intensive applications

DDG: Design and Developer Group of the DEEP-EST project

DEEP: Dynamical Exascale Entry Platform (project FP7-ICT-287530)

DEEP-ER: DEEP - Extended Reach (project FP7-ICT-610476)

DEEP/-ER: Term used to refer jointly to the DEEP and DEEP-ER projects

DEEP-EST: DEEP - Extreme Scale Technologies

Dimemas: Performance analysis tool developed by BSC

DN: Nodes of the DAM

DNN: Deep neural network

DoW: Description of Work

DRAM: Dynamic Random Access Memory. Typically describes any form of high capacity volatile memory attached to a CPU

E

EC: European Commission

EEHPC: Energy Efficient High Performance Computing

EEP: European Exascale Projects

ETP4HPC: European Technology Platform for High Performance Computing

ESB: Extreme Scale Booster: with highly energy-efficient many-core processors as Booster Nodes (BN), but a reduced amount of memory per core at high bandwidth

ETH-Aurora: Acronym used to refer to Aurora, the Eurotech Group subsidiary dedicated to the High Performance Computing (HPC) business

EU: European Union

Eurotech: Eurotech S.p.A., Amaro, Italy

Exascale: Computer systems or applications which are able to run with a performance above 10^18 floating point operations per second

EXDCI: European Extreme Data & Computing Initiative

EXTOLL: High speed interconnect technology for HPC developed by UHEI

Extrae: Performance analysis tool developed by BSC

F

FFT: Fast Fourier Transform


FHG-ITWM: Fraunhofer Gesellschaft zur Foerderung der Angewandten Forschungs e.V., Germany

Flop/s: Floating point Operation per second

FP7: European Commission 7th Framework Programme

FPGA: Field-Programmable Gate Array, Integrated circuit to be configured by the customer or designer after manufacturing

FTI: Fault Tolerant Interface, a checkpoint/restart library

G

GCE: Global Collective Engine, a computing device for collective operations

GFlop/s: Gigaflop, 10^9 floating point operations per second

GLA: General Learning Algorithms

GPU: Graphics Processing Unit

GROMACS: A toolbox for molecular dynamics calculations providing a rich set of calculation types, preparation and analysis tools

H

H2020: Horizon 2020

HBM: High Bandwidth Memory

HPC: High Performance Computing

HPDA: High Performance Data Analytics

HPDBSCAN: A clustering code used by UoI in the field of Earth Science

HW: Hardware

I

IC: Innovative Council

IDC: International Data Corporation

Intel: Intel Germany GmbH, Feldkirchen, Germany

I/O: Input/Output. May describe the respective logical function of a computer system or a certain physical instantiation

IP: Intellectual Property

iPic3D: Programming code developed by the KULeuven to simulate space weather

ISO: International Organisation for Standardisation

J

JLESC: Joint Laboratory for Extreme Scale Computing


JUBE: Jülich Benchmarking Environment

JUELICH: Forschungszentrum Jülich GmbH, Jülich, Germany

JURECA: Jülich Research on Exascale Cluster Architectures

K

KNL: Knights Landing, second generation of Intel® Xeon Phi™

KNH: Knights Hill, next generation of Intel® Xeon Phi™

KULeuven: Katholieke Universiteit Leuven, Belgium

L

LHC: Large Hadron Collider, the world’s most powerful accelerator providing research facilities for High Energy Physics researchers across the globe

LLNL: Lawrence Livermore National Laboratory

LOFAR: Low-Frequency Array, an instrument for performing radio astronomy built by ASTRON

M

Megware: Megware Computer Vertrieb und Service GmbH, Chemnitz, Germany

MHD: Magneto-hydrodynamics

Mont-Blanc: European scalable and power efficient HPC platform based on low-power embedded technology

MoU: Memorandum of Understanding

MPI: Message Passing Interface, API specification typically used in parallel programs that allows processes to communicate with one another by sending and receiving messages

MSA: Modular Supercomputer Architecture

N

NAM: Network Attached Memory

NCSA: National Centre for Supercomputing Applications, Bulgaria

NEST: Widely-used, publicly available simulation software for spiking neural network models developed by NMBU.

NF: Network Federation within the DEEP-EST prototype

NMBU: Norwegian University of Life Sciences, Norway

NN: Neural Network

NUMA: Non-Uniform Memory Access

NV-DIMM: Non-Volatile Dual In-line Memory Module


NVM: Non-Volatile Memory. Used to describe a physical technology or the use of such technology in a non-block-oriented way in a computer system

NVRAM: Non-Volatile Random-Access Memory

O

OA: Open Access

ODC: Other direct costs

OGC: Open Geospatial Consortium

OmpSs: BSC’s Superscalar (Ss) for OpenMP

OpenCL: Open Computing Language, framework for writing programs that execute across heterogeneous platforms

openHPC: A community effort to aggregate the common ingredients required to deploy and manage HPC Linux clusters

OpenMP: Open Multi-Processing, an application programming interface that supports multi-platform shared-memory multiprocessing

P

ParaStation: Software for cluster management and control developed by JUELICH and its linked third party ParTec

Paraver: Performance analysis tool developed by BSC

ParTec: ParTec Cluster Competence Center GmbH, Munich, Germany. Linked third Party of JUELICH in DEEP-EST

PFlop/s: Petaflop, 10^15 floating point operations per second

PI: Principal Investigator

piSVM: Parallel classification algorithm

PME: Particle mesh Ewald

PMT: Project Management Team of the DEEP-EST project

PRACE: Partnership for Advanced Computing in Europe (EU project, European HPC infrastructure)

Q

R

R&D: Research and Development

RAM: Random-Access Memory

RAS: Reliability, Availability, Serviceability

RDA: Research Data Alliance


RM: Resource Manager

RML: Risk management list used in the DEEP-EST project

S

SCR: Scalable Checkpoint/Restart. A library from LLNL

SDV: Software Development Vehicle: HW systems to develop software in the time frame in which the DEEP-EST prototype is not yet available.

SIMD: Single Instruction Multiple Data

SIONlib: Parallel I/O library developed by Forschungszentrum Jülich

SKA: Square Kilometer Array

SLURM: Job scheduler that will be used and extended in the DEEP-EST prototype

Slurm: MHD code developed by KULeuven.

SME: Small and Medium Enterprises

SRA: Strategic Research Agenda prepared by ETP4HPC

SSSM: Scalable Storage Service Module

STEM: Science, technology, engineering and mathematics

STS: Satellite time series

SW: Software

T

TFlop/s: Teraflop, 10^12 floating point operations per second

ThinkParQ: Spin-off company of FHG-ITWM

Tk: Task, Followed by a number, term to designate a Task inside a Work Package of the DEEP-EST project

ToW: Team of Work Package leaders of the DEEP-EST project

TRL: Technology Readiness Levels

UUEDIN: University of Edinburgh, UK

UHEI: Ruprecht-Karls-Universitaet Heidelberg, Germany

UoI: Háskóli Íslands – University of Iceland, Iceland

UPC: Universitat Politècnica de Catalunya. Barcelona, Spain

V

W

WLCG: Worldwide LHC Computing Grid

WP: Work package

X

x86: Family of instruction set architectures based on the Intel 8086 CPU

Y

Z
