System Benchmarking

What is Benchmarking?

Expressing performance in numeric form

How is it Implemented?

Why Benchmarking?

Fact: IP blocks come from different providers. We integrate these IP blocks to create one SoC.

It is important to prove that our new SoC delivers the same or better performance than existing competitor SoCs.

iPhone and Android Hardware

Does the difference come from:

the operating system (Windows, Linux, ...; 32/64 bit),

the compiler (GCC, Intel, PathScale, ...) and its options,

optimized libraries (libc, ...)?

Validating the hardware configuration

Benchmarking Goals

Comparing two systems

Checking for regressions

Capacity planning

Reproducing bad behaviour to solve it

Stress-testing to find bottlenecks

Types of Benchmarking

Application -> real-world software

Synthetic -> imposes a workload on a single component such as the processor, memory, or network devices

Parallel -> for multicore processors and servers

Input/Output -> for peripherals

Power -> for low-power systems

What is Performance?

Two metrics:

Response time (time per task) -> user experience

Throughput (tasks per time) -> benchmarking
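
As a minimal sketch of how the two metrics relate, the Python snippet below times a made-up workload and reports both numbers. The names `workload` and `measure` and the run count are illustrative assumptions, not part of the original slides.

```python
import time

def workload():
    # Hypothetical task: sum the first 10,000 integers.
    return sum(range(10_000))

def measure(task, runs=100):
    """Run `task` `runs` times; return (mean response time in s, throughput in tasks/s)."""
    start = time.perf_counter()
    for _ in range(runs):
        task()
    elapsed = time.perf_counter() - start
    return elapsed / runs, runs / elapsed

response_time, throughput = measure(workload)
print(f"response time: {response_time:.6f} s/task")
print(f"throughput:    {throughput:.1f} tasks/s")
```

In this serial loop, throughput is simply the reciprocal of the mean response time. The two diverge once tasks run concurrently.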

For example, consider a program that converts QVGA images from the RGB colour space to YIQ.

An ST231 running at 300 MHz can process 207 images per second.

A MIPS 24K running at 550 MHz can process 168 images per second.

MHz alone is not a good indicator of performance.
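
The point can be checked with a few lines of arithmetic. The sketch below divides the quoted image rates by the clock frequency. The "images per second per MHz" score is an illustrative normalisation chosen here, not a metric defined in the slides.

```python
# Figures quoted above: (clock in MHz, images processed per second).
cores = {
    "ST231": (300, 207),
    "MIPS24K": (550, 168),
}

def images_per_mhz(clock_mhz, images_per_second):
    # Work done per MHz of clock: a crude clock-normalised score.
    return images_per_second / clock_mhz

scores = {name: images_per_mhz(*figures) for name, figures in cores.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f} images/s per MHz")
```

On this task the core with the lower clock does more than twice as much work per MHz.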

How do we benchmark Core Performance?

Performance (tasks/second) = (average number of operations per cycle) * (clock rate in Hz) / (number of operations needed to complete the task)

Why is this? Do we need to consider other factors?

The number of operations required to complete the task: this varies. For example, it may be necessary to replace a single floating-point operation with shift, round and normalise operations to run on an integer core.

The average number of operations per cycle: this can be improved by pipelining, parallelism, etc.
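
To see how these two factors interact with clock rate, the relationship can be written as a small function. This is a sketch only: the two cores and all of their figures are hypothetical values picked for illustration.

```python
def tasks_per_second(ops_per_cycle, clock_hz, ops_per_task):
    # Performance = (operations per cycle * cycles per second) / operations per task
    return ops_per_cycle * clock_hz / ops_per_task

# Two hypothetical cores running the same 2,000,000-operation task.
fast_clock = tasks_per_second(ops_per_cycle=1.0, clock_hz=500e6, ops_per_task=2_000_000)
wide_issue = tasks_per_second(ops_per_cycle=2.5, clock_hz=300e6, ops_per_task=2_000_000)
print(fast_clock, wide_issue)  # 250.0 375.0
```

Here the core with the slower clock finishes more tasks per second because it completes more operations in each cycle.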

How can we improve performance?

Software implementation: compiler, operating system implementation

Hardware design: cache design, pipelining and parallelism

Compiler Optimizations

Optimize the common case -> using fast path

Avoid redundancy -> reuse results

Less code -> remove unnecessary computations

Parallelize -> reorder operations

Fewer jumps -> branch-free code

Loop optimizations -> operate on loops
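
As one illustration of the "avoid redundancy" item, the sketch below applies the transformation by hand in Python. It shows the kind of rewrite an optimizing compiler may perform automatically (hoisting a loop-invariant computation). The function names are hypothetical, and the Python interpreter itself is not assumed to do this for you.

```python
import math

def scale_naive(values, angle):
    # Recomputes the same cosine on every iteration.
    return [v * math.cos(angle) for v in values]

def scale_hoisted(values, angle):
    # "Avoid redundancy": compute the loop-invariant factor once and reuse it.
    factor = math.cos(angle)
    return [v * factor for v in values]

data = [1.0, 2.0, 3.0]
print(scale_naive(data, 0.5) == scale_hoisted(data, 0.5))  # True
```

Both versions return the same results. The second simply performs fewer operations per task.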

Operating System -> Symmetric Multiprocessing

Operating System -> Simultaneous Multithreading

Hardware -> CPU Cache Design

Hardware -> Pipelining and Parallelism Design

Unpipelined

Pipelined
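
The difference between the two cases can be estimated with a simple cycle count. The sketch below assumes an idealised pipeline in which every stage takes one cycle and there are no stalls or hazards. The function names and the example sizes are illustrative.

```python
def unpipelined_cycles(instructions, stages):
    # Each instruction passes through every stage before the next one starts.
    return instructions * stages

def pipelined_cycles(instructions, stages):
    # The first instruction fills the pipeline; each later one
    # completes one cycle after the previous instruction.
    if instructions == 0:
        return 0
    return stages + instructions - 1

n, k = 100, 5
print(unpipelined_cycles(n, k), pipelined_cycles(n, k))  # 500 104
```

For long instruction streams the ideal speed-up approaches the number of stages. Real pipelines fall short of this because of stalls and branch mispredictions.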

Parallelism:

Single Instruction, Multiple Data (SIMD)

Multiple Instruction, Multiple Data (MIMD)

Interconnect/System Bus

Communication pathway connecting two or more devices

Throughput capacity (bits per second) = (bus clock speed in Hz) * (bus width in bits)
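
A quick worked example of the formula, assuming one transfer per bus clock cycle. The 100 MHz, 32-bit bus is a hypothetical configuration chosen for round numbers.

```python
def bus_throughput_bits_per_second(clock_hz, width_bits):
    # Peak capacity: one transfer of `width_bits` per bus clock cycle.
    return clock_hz * width_bits

peak = bus_throughput_bits_per_second(clock_hz=100_000_000, width_bits=32)
print(peak, "bits/s =", peak / 8 / 1_000_000, "MB/s")  # 3200000000 bits/s = 400.0 MB/s
```

This is a peak figure. Sustained throughput on a real bus is lower once arbitration and protocol overhead are included.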

Newman Performance Analysis

Summary

Benchmarks are for comparing different hardware architectures.

Do not rely solely on microbenchmark results. Also:

Sanity-check the results

Use a profiler

Test your code in real-life scenarios under realistic load (macro-benchmark)

Questions?
