

Embedded Systems: Projects

Davide Zoni, PhD Student

email: [email protected]

webpage: home.dei.polimi.it/zoni

Friday, October, 2013


Contacts & Places

Prof. William Fornaciari (Professor in charge)

email: [email protected]

webpage: home.dei.polimi.it/fornacia

Davide Zoni (PhD Student)

email: [email protected]

webpage: home.dei.polimi.it/zoni


Research Activities

Network-on-Chip (NoC)

• Simulation (component design, evaluation)

• Power/performance optimization

• Thermal/performance optimization

• Application oriented design

• Heterogeneous NoC

Reliability

• Fault Tolerance

• Thermal issues involved in reliability

Caches

• Cache hierarchy in multi-cores

• NoC-caches design space exploration

Project Topics

1. Simulation

2. NoCs

3. Caches

4. Thermal Management

5. Reliability


Types of projects

Bibliographic research (2 points max)

• state of the art on a specific topic

• material organization and presentation

• comparing different approaches (if possible)

Development project (4 points max)

• In-depth understanding of the tools you are working with

• Basic theoretical background for the problem

• Coding work on a specific tool


Projects' Rules

• Both a written report and an oral presentation are required

– An extra point can be earned if the presentation is given during course hours

• The project grade is a weighted sum of the following aspects:

– Written report: contents, style (i.e., formalism), bibliography

– Oral presentation: style (i.e., formalism), answers to questions

– Level of independence in doing the work

– The higher the starting written grade, the higher the expectations for the project

• Each project requires some basic knowledge and involves new learning

• Please avoid copying existing works (e.g., articles, technical reports, etc.)

– You will be encouraged to change the project advisor!

• Once a project has been assigned

– you have 2 months to complete the project (soft deadline)

– you are not allowed to change it for any reason


The focus of the projects


Projects PR2013-01 (Area: Simulation/NoC)

Title: (REVIEW) Snooping coherence protocol in NoC

Requirements: Computer architecture

Description:

Snooping coherence is a class of cache coherence protocols widely employed in bus-based architectures. The switch to NoC architectures calls for a review of these protocols, which rely on the snooping capability offered by a bus and on a total order of requests. However, many works deal with snooping on NoCs by exploiting virtual bus-based architectures, while others start from a snooping protocol and redesign it. The goal of this project is a bibliographic survey of the current state of the art on snooping protocols used in NoC-based architectures.
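To make the bus assumptions concrete, here is a minimal sketch of a snooping MSI protocol for a single cache line, written in Python. The names (SnoopyBus, read, write) are illustrative assumptions and are not taken from the referenced work. The bus is modeled as a log that records requests in one total order, which is the property a NoC does not provide for free.

```python
from enum import Enum

class State(Enum):
    INVALID = "I"
    SHARED = "S"
    MODIFIED = "M"

class SnoopyBus:
    """Toy shared bus: every request is seen by all caches in one global order."""

    def __init__(self, n_caches):
        # One MSI state per cache, for a single cache line (toy model).
        self.states = [State.INVALID] * n_caches
        self.log = []  # total order of bus requests

    def read(self, cid):
        if self.states[cid] is State.INVALID:
            self.log.append(("BusRd", cid))
            # Snoopers: a Modified owner supplies the data and downgrades.
            for other, st in enumerate(self.states):
                if other != cid and st is State.MODIFIED:
                    self.states[other] = State.SHARED
            self.states[cid] = State.SHARED

    def write(self, cid):
        if self.states[cid] is not State.MODIFIED:
            # Simplification: Shared -> Modified upgrades also use BusRdX.
            self.log.append(("BusRdX", cid))
            # Snoopers: every other copy is invalidated.
            for other in range(len(self.states)):
                if other != cid:
                    self.states[other] = State.INVALID
            self.states[cid] = State.MODIFIED

bus = SnoopyBus(3)
bus.read(0)
bus.read(1)
bus.write(2)   # invalidates the copies in caches 0 and 1
bus.read(0)    # cache 2 downgrades from Modified to Shared
```

On a NoC there is no single medium on which all caches observe the same request order, so the log above has to be replaced by an explicit ordering mechanism.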

Starting references:

1. Stefanos Kaxiras and Alberto Ros, “Efficient, Snoopless, SoC Coherence”, SoCC 2012


Projects PR2013-02 (Area: Simulation/NoC)

Title: (REVIEW) Cache partitioning in NoC based architectures

Requirements: Computer architecture

Description:

The possibility to integrate more and more cores on the same chip must be supported by an appropriate cache hierarchy. In particular, in tile-based NoC architectures the cache banks are physically split among different tiles, resulting in NUCA-based architectures. The ability to access a huge amount of memory is traded off against the increased latency of accessing far banks, which calls for a joint NoC/cache analysis. The student is asked to review the state of the art on cache hierarchies in NoC architectures where the two components are considered together during the design stages.
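The latency trade-off can be illustrated with a small Python sketch that estimates the round-trip latency from one core to each cache bank of a 4x4 tiled mesh. The cycle counts (bank_cycles, hop_cycles) and the row-major tile numbering are assumptions chosen for illustration, not values from the referenced paper.

```python
def hops(src, dst, cols):
    """Manhattan distance between two tiles of a 2D mesh (row-major tile ids)."""
    sx, sy = src % cols, src // cols
    dx, dy = dst % cols, dst // cols
    return abs(sx - dx) + abs(sy - dy)

def bank_latency(core_tile, bank_tile, cols, bank_cycles=10, hop_cycles=2):
    """Round trip: request hops + bank access + reply hops.
    bank_cycles and hop_cycles are illustrative, not measured, values."""
    return 2 * hops(core_tile, bank_tile, cols) * hop_cycles + bank_cycles

COLS = 4
# Latency seen by the core in tile 0 for each of the 16 banks.
latencies = [bank_latency(0, bank, COLS) for bank in range(COLS * COLS)]
print(min(latencies), max(latencies), sum(latencies) / len(latencies))
```

Under these assumptions the farthest bank is more than three times slower than the local one, which is why bank placement and NoC design cannot be evaluated separately.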

Starting references:

1. Stavros Volos, Ciprian Seiculescu, Boris Grot, Naser Khosro Pour, Babak Falsafi, Giovanni De Micheli, "CCNoC: Specializing On-Chip Interconnects for Energy Efficiency in Cache-Coherent Servers", 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, pp. 67-74, 2012


Projects PR2013-03 (Area: Simulation/NoC)

Title: (REVIEW) Private vs Shared LLC in multi-core architectures

Requirements: Computer architecture

Description:

The possibility to integrate more and more cores on the same chip must be supported by an appropriate cache hierarchy. In particular, the LLC (last level cache) can be shared or private to each core. Moreover, the choice between a single LLC bank and multiple LLC banks should be considered during the architecture design stage. In this context, either a NUCA or a UCA cache hierarchy can be provided in the final architecture. The student is asked to review the state of the art on cache hierarchies with a specific focus on the design properties of the LLC, i.e. shared/private, UCA/NUCA, single/multiple physical blocks.

Starting references:

1. Changkyu Kim, Doug Burger, and Stephen W. Keckler. 2002. “An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches”. SIGARCH Comput. Archit. News 30, 5 (October 2002)


Projects PR2013-04 (Area: Simulation/NoC)

Title: (IMPL.) Multiprogrammed Workload in Cache partitioning in NoC based architectures

Requirements: Computer architecture

Description:

The possibility to integrate more and more cores on the same chip must be supported by an appropriate cache hierarchy. In particular, in tile-based NoC architectures the cache banks are physically split among different tiles, resulting in NUCA-based architectures. The ability to access a huge amount of memory is traded off against the increased latency of accessing far banks, which calls for a joint NoC/cache analysis. The project requires designing a partitioned L2 cache framework for multiprogrammed workloads using GEM5.

Starting references:

1. Stavros Volos, Ciprian Seiculescu, Boris Grot, Naser Khosro Pour, Babak Falsafi, Giovanni De Micheli, "CCNoC: Specializing On-Chip Interconnects for Energy Efficiency in Cache-Coherent Servers", 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, pp. 67-74, 2012


Projects PR2013-05 (Area: Simulation/NoC)

Title: (IMPL.) Reorganize the LLC allocation in multicores

Requirements: Computer architecture

Description:

The LLC plays an important role in multicore architectures, since it filters on-chip requests before they turn into expensive off-chip requests. However, the LLC accounts for a large share of the chip's power and area. Exploiting the high percentage of private blocks, the L2 can be reorganized so that space is actually allocated for shared blocks only. The project requires implementing the baseline strategy of the referenced paper in the GEM5 simulator and comparing its power-performance measures against standard cache hierarchy implementations.

Starting references:

1. Mario Lodde, Jose Flich, Manuel E. Acacio: Dynamic Last-Level Cache Allocation to Reduce Area and Power Overhead in Directory Coherence Protocols. Euro-Par 2012: 206-218


Projects PR2013-06 (Area: NoC)

Title: (IMPL.) NoC XY routing scheme: Buffer and Link utilization with different tile organizations

Requirements: Basic multi-core architecture knowledge. GEM5 simulator.

Description:

Since NoCs resemble MAN or WAN networks in some aspects, routing algorithms are needed to steer packets from sources to destinations. However, on-chip networks are quite different from their off-chip counterparts in micro-architecture, goals/requirements and available resources, so specific routing algorithms have to be designed. This project aims to explore the power-performance trade-off as well as the traffic distribution over links and buffers using an XY routing scheme in GEM5 combined with different workload mixes. Moreover, the use of different tile structures must be considered.
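As a reference point for the routing scheme named above, here is a minimal standalone Python sketch of dimension-ordered XY routing on a 2D mesh, together with a per-link flow count. It is independent of GEM5; the function names and the three example flows are assumptions made for illustration.

```python
from collections import Counter

def xy_route(src, dst):
    """Routers visited from src to dst with XY routing:
    travel along X until the column matches, then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

def link_utilization(flows):
    """Count how many flows cross each directed link (router -> router)."""
    usage = Counter()
    for src, dst in flows:
        path = xy_route(src, dst)
        for a, b in zip(path, path[1:]):
            usage[(a, b)] += 1
    return usage

# Hypothetical flows given as ((src_x, src_y), (dst_x, dst_y)).
flows = [((0, 0), (2, 1)), ((0, 0), (2, 2)), ((1, 0), (2, 2))]
usage = link_utilization(flows)
for link, count in usage.most_common():
    print(link, count)
```

Because XY routing is deterministic, all three flows share the link from (1, 0) to (2, 0); this kind of uneven link load is what the project is meant to measure.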

Starting references:

1. GEM5 website

2. Book chapters from Principles and Practices of Interconnection Networks, Dally and Towles, Morgan Kaufmann, 2004


Projects PR2013-07 (Area: Network-on-Chip)

Title: (REVIEW) Network-on-Chip simulation tools

Requirements: C/C++, Python

Description:

The Network-on-Chip interconnect is a critical resource shared among concurrent applications and, even worse, only a few implemented designs provide such a communication subsystem. For this reason, most of the research in this field relies on accurate simulation. The goal of this project is to review a set of simulation tools and provide a critical comparison among them. An interactive tutorial during the presentation is also required.

Starting references:

1. BookSim2.0 https://nocs.stanford.edu/cgi-bin/trac.cgi/wiki/Resources/BookSim

2. Garnet http://www.princeton.edu/~niketa/garnet

3. Noxim http://noxim.sourceforge.net/

4. Others ...


Projects PR2013-08 (Area: Thermal Management)

Title: (REVIEW) Temperature aware routing algorithms in NoC scenario

Requirements: Basics of computer architecture

Description:

On-chip interconnects provide a flexible and fault-tolerant communication subsystem, while at the same time accounting for up to 30-40% of the entire chip power consumption, in addition to the power consumed by the computational logic. Thermal-aware routing algorithms can therefore provide a way to reduce the thermal impact in multi-core architectures. The student has to review state-of-the-art thermal-aware routing algorithms suitable for NoC environments, both adaptive and deterministic. Since the research field is very broad, the review is restricted to planar 2D-mesh and 3D-mesh topologies.
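The idea of an adaptive thermal-aware choice can be sketched as follows: among the neighbors that bring the packet closer to its destination, pick the coolest one. This is a generic greedy illustration written for this text, not the algorithm of the referenced paper; the temperature map is hypothetical and deadlock avoidance is not addressed.

```python
def productive_neighbors(cur, dst):
    """Neighbors of cur that are one hop closer to dst in a 2D mesh."""
    x, y = cur
    out = []
    if dst[0] != x:
        out.append((x + (1 if dst[0] > x else -1), y))
    if dst[1] != y:
        out.append((x, y + (1 if dst[1] > y else -1)))
    return out

def coolest_minimal_path(src, dst, temp):
    """Greedy minimal adaptive routing: at each hop take the cooler
    productive neighbor. temp maps (x, y) -> temperature; on a tie the
    X direction wins because it is listed first."""
    path = [src]
    cur = src
    while cur != dst:
        cur = min(productive_neighbors(cur, dst), key=lambda n: temp[n])
        path.append(cur)
    return path

# Hypothetical 3x3 thermal map with one hotspot.
temp = {(x, y): 50.0 for x in range(3) for y in range(3)}
temp[(1, 0)] = 80.0
path = coolest_minimal_path((0, 0), (2, 2), temp)
print(path)
```

The path stays minimal (four hops) but avoids the hot router that plain XY routing would cross first.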

Starting references:

1. Feiyang Liu, Huaxi Gu and Yintang Yang, “DTBR: A dynamic thermal-balance routing algorithm for Network-on-Chip”


Projects PR2013-09 (Area: NoC Fault Tolerance)

Title: (REVIEW) Fault tolerant aware routing algorithms in NoC scenario

Requirements: Basics of computer architecture

Description:

On-chip interconnects provide a flexible and fault-tolerant communication subsystem, but technology scaling poses severe constraints on chip reliability. The study of techniques to cope with hard faults in the NoC, avoiding NoC partitioning or worse, is therefore mandatory. Moreover, some routers can be switched off to preserve their functionality as well as to save power. The project requires a review of state-of-the-art fault-tolerant routing algorithms suitable for NoC environments, both adaptive and deterministic. Since the research field is very broad, the review is restricted to planar 2D-mesh and 3D-mesh topologies.
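What routing around faulty or switched-off routers means, and when the NoC becomes partitioned, can be shown with a small Python sketch. It uses a global breadth-first search purely as an illustration; real NoC routers need distributed, deadlock-free algorithms, and the function name and mesh size are assumptions.

```python
from collections import deque

def route_around_faults(src, dst, size, faulty):
    """Shortest path in a size x size mesh that avoids the routers in
    `faulty` (BFS). Returns None when the faults cut src off from dst."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            path = []
            while cur is not None:
                path.append(cur)
                cur = prev[cur]
            return path[::-1]
        x, y = cur
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in faulty and nxt not in prev:
                prev[nxt] = cur
                queue.append(nxt)
    return None

detour = route_around_faults((0, 0), (2, 0), 3, {(1, 0)})
cut_off = route_around_faults((0, 0), (2, 0), 3, {(1, 0), (0, 1)})
print(detour, cut_off)
```

One faulty router doubles the hop count of this flow (four hops instead of two); two faulty routers isolate the source tile entirely.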

Starting references:

1. Samih, A.; Ren Wang; Krishna, A.; Maciocco, C.; Tai, C.; Solihin, Y., "Energy-efficient interconnect via Router Parking", 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA2013), pp. 508-519, 23-27 Feb. 2013


Projects PR2013-10 (Area: NoC)

Title: (REVIEW) NoC power models

Requirements: Computer architectures

Description:

The Network-on-Chip interconnect is a critical resource from both the performance and the power point of view. While many simulation tools have been proposed, there is a single reasonably accurate and generally accepted tool for power analysis, i.e. Orion2.0. However, this tool relies on a parametric model of the on-chip network, so it loses accuracy when the NoC microarchitecture is modified. The project requires reviewing the state of the art in power modelling proposals for NoCs.

Starting references:

1. Kwangok Jeong; Kahng, A.B.; Lin, B.; Samadi, K., "Accurate Machine-Learning-Based On-Chip Router Modeling", IEEE Embedded Systems Letters, vol. 2, no. 3, pp. 62-66, Sept. 2010


Projects PR2013-11 (Area: NoC)

Title: (IMPL.) NoC synthetic traffic generators

Requirements: C/C++, python

Description:

The possibility to mimic a multi-core architecture underpinned by a NoC and running real benchmarks is appealing but very demanding in terms of simulation time and computational resources. For this reason, many analyses are conducted with synthetic traffic patterns that can stress the NoC with a focus on specific subparts. The student must reimplement in GEM5 one of the traffic generator models that is present in BookSim but not in GEM5.
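As background, the Python sketch below shows destination functions for four synthetic patterns that are standard in the interconnection-network literature. It is a standalone illustration, not code from BookSim or GEM5; the node numbering (id = y * k + x) is an assumption, and which patterns are actually missing from GEM5 has to be checked against the simulator sources.

```python
import random

def uniform_random(src, n, rng):
    """Destination drawn uniformly among all n nodes except the source."""
    dst = rng.randrange(n - 1)
    return dst if dst < src else dst + 1

def bit_complement(src, n):
    """Destination is the bitwise complement of the source id
    (n must be a power of two)."""
    return (n - 1) ^ src

def transpose(src, k):
    """k x k mesh with node id = y * k + x: node (x, y) sends to (y, x)."""
    x, y = src % k, src // k
    return x * k + y

def tornado(src, k):
    """k-node ring (one mesh dimension): each node sends almost
    halfway around, to (src + ceil(k / 2) - 1) mod k."""
    return (src + (k + 1) // 2 - 1) % k

rng = random.Random(0)  # fixed seed for reproducible runs
print(uniform_random(3, 16, rng), bit_complement(5, 16),
      transpose(1, 4), tornado(0, 8))
```

A generator then injects packets toward these destinations at a configurable rate; the injection process itself is left out of the sketch.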

Starting references:

1. BookSim2.0 https://nocs.stanford.edu/cgi-bin/trac.cgi/wiki/Resources/BookSim

2. Garnet http://www.princeton.edu/~niketa/garnet


Projects PR2013-12 (Area: NoC)

Title: (IMPL.) FPGA framework for NoC router on FPGA

Requirements: Verilog, python

Description:

The on-chip network is a suitable way to increase bandwidth and decrease contention with a relatively small increase in latency. For this reason, many simulation tools exist to study the impact of different NoC solutions. However, software simulation does not provide accurate timing and area estimates, while a lower-level approach, i.e. FPGA, does. The project asks for a complete FPGA framework for NoC evaluation through simulation, starting from two well-known tools, NetMaker and TrafficGen.

Starting references:

1. NetMaker http://www-dyn.cl.cam.ac.uk/~rdm34/wiki/index.php?title=Main_Page

2. TrafficGen http://www.tkt.cs.tut.fi/research/nocbench/download.html


Projects PR2013-13 (Area: NoC)

Title: (IMPL.) Baseline NoC router on FPGA

Requirements: Verilog

Description:

The on-chip network is a suitable way to increase bandwidth and decrease contention with a relatively small increase in latency. For this reason, many simulation tools exist to study the impact of different NoC solutions. However, software simulation does not provide accurate timing and area estimates, while a lower-level approach, i.e. FPGA, does. The project asks for the design of a baseline NoC router in Verilog, without virtual channels.

Starting references:

1. NetMaker http://www-dyn.cl.cam.ac.uk/~rdm34/wiki/index.php?title=Main_Page

2. TrafficGen http://www.tkt.cs.tut.fi/research/nocbench/download.html


Projects PR2013-14 (Area: NoC)

Title: (IMPL.) Baseline NoC traffic generator on FPGA

Requirements: Verilog, python

Description:

The on-chip network is a suitable way to increase bandwidth and decrease contention with a relatively small increase in latency. For this reason, many simulation tools exist to study the impact of different NoC solutions. However, software simulation does not provide accurate timing and area estimates, while a lower-level approach, i.e. FPGA, does. Starting from the works in the references, the project asks for a simple Verilog-based traffic generator that can be controlled from a laptop.

Starting references:

1. NetMaker http://www-dyn.cl.cam.ac.uk/~rdm34/wiki/index.php?title=Main_Page

2. TrafficGen http://www.tkt.cs.tut.fi/research/nocbench/download.html


Projects PR2013-15 (Area: NoC)

Title: (IMPL.) FPGA Dynamic Frequency Scaling Module

Requirements: Verilog, Xilinx ISE

Description:

Dynamically scaling the frequency of the components inside a chip is a suitable way to reduce dynamic power consumption. However, such a mechanism introduces both power and timing overhead and requires dedicated modules, i.e. PLLs. The project requires exploring the possibility of using the PLL modules integrated in Xilinx FPGA devices to provide the design with different operating frequencies.
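The link between frequency and dynamic power can be made explicit with the classic CMOS estimate P = alpha * C * Vdd^2 * f. The Python sketch below applies it to two operating frequencies; the activity factor, capacitance and voltage are made-up illustrative numbers, not figures for any Xilinx device.

```python
def dynamic_power(alpha, cap_farads, vdd_volts, freq_hz):
    """Classic CMOS dynamic power estimate: P = alpha * C * Vdd^2 * f."""
    return alpha * cap_farads * vdd_volts ** 2 * freq_hz

def energy_for_cycles(alpha, cap_farads, vdd_volts, freq_hz, cycles):
    """Dynamic energy spent to execute a fixed number of clock cycles."""
    run_time = cycles / freq_hz
    return dynamic_power(alpha, cap_farads, vdd_volts, freq_hz) * run_time

# Illustrative values only: activity factor, switched capacitance, supply.
ALPHA, CAP, VDD = 0.2, 1e-9, 1.0
p_full = dynamic_power(ALPHA, CAP, VDD, 200e6)
p_half = dynamic_power(ALPHA, CAP, VDD, 100e6)
e_full = energy_for_cycles(ALPHA, CAP, VDD, 200e6, 1_000_000)
e_half = energy_for_cycles(ALPHA, CAP, VDD, 100e6, 1_000_000)
print(p_full, p_half, e_full, e_half)
```

Under this model, halving the frequency at a fixed voltage halves the dynamic power but leaves the dynamic energy of a fixed amount of work unchanged, because the work takes twice as long. The power and timing overheads of the PLL come on top of this.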

Starting references:

1. Xilinx website and documentation


Projects PR2013-16 (Area: NoC)

Title: (REVIEW) One-to-many routing algorithms

Requirements: Computer architecture

Description:

On-chip interconnects provide a flexible and fault-tolerant communication subsystem, but they are focused on one-to-one communication. However, cache coherence protocols can exploit broadcast or multicast communication to achieve better performance. The goal of the project is a review of the state of the art in routing algorithms that enable multicast and broadcast communication in NoC-based interconnection subsystems.
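The benefit of network-level multicast can be quantified with a small Python sketch that compares two ways of reaching the same destinations on a 2D mesh: one unicast packet per destination, or a single packet replicated only where the XY paths diverge. The tree construction (the union of XY paths) is a simplifying assumption made for illustration, not a specific published algorithm.

```python
def xy_links(src, dst):
    """Directed links crossed by an XY-routed packet from src to dst."""
    x, y = src
    links = []
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links

def unicast_cost(src, dsts):
    """Link traversals when the message is sent once per destination."""
    return sum(len(xy_links(src, d)) for d in dsts)

def tree_cost(src, dsts):
    """Link traversals when shared path prefixes are crossed only once
    and the packet is replicated where the XY paths diverge."""
    return len({link for d in dsts for link in xy_links(src, d)})

src = (0, 0)
dsts = [(2, 0), (2, 1), (2, 2)]
print(unicast_cost(src, dsts), tree_cost(src, dsts))
```

For this example the unicast approach uses 9 link traversals against 4 for the tree, which is the kind of saving a coherence protocol could exploit when invalidating many sharers.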

Starting references:

1. Kirman, N., et al. Leveraging Optical Technology in Future Bus-Based Chip Multiprocessors. In Proceedings of the International Symposium on Microarchitecture, 2006

2. Shacham, A., Bergman, K., and Carloni, L. On the Design of a Photonic Network-on-Chip. In Proceedings of the International Symposium on Networks-on-Chip, 2007.


Projects PR2013-17 (Area: NoC)

Title: (REVIEW) NoC and traffic compression

Requirements: Computer architectures

Description:

The Network-on-Chip interconnect is a critical resource shared among concurrent applications, and its scarce shared bandwidth should be managed with care. Compressing the traffic is an appealing way to reduce NoC utilization. However, compression does not come for free, so an accurate study of its pros and cons should be conducted. The goal of the project is a survey of the state of the art in compression techniques and other possibilities to reduce NoC utilization.
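To give a feel for the trade-off, here is a minimal Python sketch of one very simple scheme, zero-word elimination: a bitmask marks the non-zero words of a cache line and only those words are transmitted. The scheme, the 32-bit word size and the 128-bit flit size are assumptions chosen for illustration and are not taken from the referenced paper.

```python
def compress(words):
    """Zero-word elimination: bit i of the mask is set when word i is
    non-zero; only non-zero words go into the payload."""
    mask = 0
    payload = []
    for i, w in enumerate(words):
        if w != 0:
            mask |= 1 << i
            payload.append(w)
    return mask, payload

def decompress(mask, payload, n_words):
    """Rebuild the original line from the mask and the non-zero words."""
    it = iter(payload)
    return [next(it) if (mask >> i) & 1 else 0 for i in range(n_words)]

def flits_needed(n_bits, flit_bits=128):
    """Number of flits required to carry n_bits (ceiling division)."""
    return -(-n_bits // flit_bits)

# Hypothetical 64-byte cache line: 16 words of 32 bits, mostly zero.
line = [0] * 16
line[2], line[7], line[11] = 0xDEAD, 0xBEEF, 42
mask, payload = compress(line)
raw_flits = flits_needed(16 * 32)
compressed_flits = flits_needed(16 + 32 * len(payload))
print(raw_flits, compressed_flits)
```

The sparse line shrinks from 4 flits to 1, while a line with no zero words would grow by the 16-bit mask; the (de)compression latency and logic are the costs a survey has to weigh against such savings.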

Starting references:

1. Jiangjiang Liu, Jianyong Zhang, Nihar R. Mahapatra, “Interconnect system compression analysis for multi-core architectures”. SoCC 2010: 317-320


What about a thesis?


Proposals for Thesis

Computer architecture

• HeteroNoC: optimize heterogeneous components for the same NoC

• Multi-core reliability and thermal: thermal map optimization using NoC and core knobs

• NoC-cache optimization: optimize power/performance with a joint NoC-cache exploration

• NoC-reliability applied to FPGA: NBTI issues related to buffers in NoC architectures

• NoC compression: dynamically use compression for power/performance trade-off

Operating Systems

(Dynamic Resource management at OS level (Dr. Bellasi))

Wireless Sensor Networks (Prof. Brandolese)


End

That's all... Questions?

Thank You