
Software & Services Group, Developer Products Division. Copyright © 2011, Intel Corporation. All rights reserved.

*Other brands and names are the property of their respective owners.

Essential Performance

Advanced Performance

Distributed Performance

Efficient Performance

Building Parallel Applications Using Guided Auto-Parallelization

Om P Sachan, Intel Compiler and Languages


Optimization Notice

Intel compilers, associated libraries and associated development tools may include or utilize options that optimize for instruction sets that are available in both Intel and non-Intel microprocessors (for example SIMD instruction sets), but do not optimize equally for non-Intel microprocessors. In addition, certain compiler options for Intel compilers, including some that are not specific to Intel micro-architecture, are reserved for Intel microprocessors. For a detailed description of Intel compiler options, including the instruction sets and specific microprocessors they implicate, please refer to the “Intel Compiler User and Reference Guides” under “Compiler Options." Many library routines that are part of Intel compiler products are more highly optimized for Intel microprocessors than for other microprocessors. While the compilers and libraries in Intel compiler products offer optimizations for both Intel and Intel-compatible microprocessors, depending on the options you select, your code and other factors, you likely will get extra performance on Intel microprocessors.

Intel compilers, associated libraries and associated development tools may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include Intel® Streaming SIMD Extensions 2 (Intel® SSE2), Intel® Streaming SIMD Extensions 3 (Intel® SSE3), and Supplemental Streaming SIMD Extensions 3 (Intel SSSE3) instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors.

While Intel believes our compilers and libraries are excellent choices to assist in obtaining the best performance on Intel and non-Intel microprocessors, Intel recommends that you evaluate other compilers and libraries to determine which best meet your requirements. We hope to win your business by striving to offer the best performance of any compiler or library; please let us know if you find we do not.

Notice revision #20110307


Agenda

• Introduction to Guided Auto-parallelization.
• Run Guided Auto-parallelization.
• Analyze Guided Auto-parallelization reports.
• Implement Guided Auto-parallelization recommendations.

Intel Confidential


4/6/2010

Parallelization in Mainstream

• Performance gains coming from more cores per die
  – Increasing clock frequencies play a smaller role
• Exposes parallelism to the programmer
• Every computer is a parallel computer
  – Implies most programs must execute in parallel
• Parallelism successful in HPC, servers, graphics, ...
  – Not widespread in the client domain
• Client apps focused on
  – Quality user experience
  – Scalability
  – Programmer productivity (critical for time-to-market)

Development of multi-threaded apps is hard

Need for a low-cost and effective way of threading apps


Parallelization in Mainstream

• Requires multi-pronged approach:
  – Simpler parallel programming models and abstractions
  – Domain-specific parallel libraries
  – Compiler auto-parallelization, auto-vectorization, and data-transformation
  – Advise user on how to parallelize
  – Good debugging tools
  – Easy-to-use tools for performance analysis
• Tradeoffs between scalability and productivity

Compiler can play an important role in enabling parallelism


Workflow with Compiler as a Tool

[Figure: workflow diagram. Application source (C/C++/Fortran) → Compiler → Application binary + opt reports → Performance tools (identify hotspots, problems) → Application source + hotspots → Compiler in advice-mode → Advice messages → Modified application source → Compiler (extra options) → Improved application binary]

Simplifies programmer effort in application tuning


Compiler as a Tool

• Use compiler as a tool to give selective advice
• Initially targets:
  – Automatic parallelization of loop-nests
  – Automatic vectorization of inner-loops
  – Data transformation suggestions
• Programmer writes serial code, then follows the compiler advice to assert new properties
  – Does not require a lot of extra time and effort from user
• Code remains performance-portable
• Programmer reasons about application properties
• Tool based on expertise of “common pitfalls”
  – Conservative disambiguation assumptions
  – Compiler assumes upper-bound is changing inside loop
  – ...
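
The "conservative disambiguation" pitfall can be illustrated with a minimal C sketch. It is not taken from the slides, and the function names are hypothetical. When two pointer parameters might overlap, the compiler has to assume that they do unless it can prove otherwise, which may block or complicate vectorization and parallelization. The C99 restrict qualifier is one way for the programmer to assert that they do not overlap. The assert is only a debug-build check of that promise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the two n-element float ranges share any byte, else 0. */
static int ranges_overlap(const float *a, const float *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    size_t bytes = n * sizeof(float);
    return pa < pb + bytes && pb < pa + bytes;
}

/* Conservative version: as far as the compiler can tell,
   dst and src might refer to overlapping memory. */
void scale_may_alias(float *dst, const float *src, size_t n, float k)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = k * src[i];
}

/* Asserted version: restrict is the programmer's promise that
   dst and src never overlap, so iterations are independent. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    size_t n, float k)
{
    assert(!ranges_overlap(dst, src, n)); /* debug check of the promise */
    for (size_t i = 0; i < n; ++i)
        dst[i] = k * src[i];
}
```

Both functions compute the same result for non-overlapping arrays. The second one only adds information for the optimizer, and it becomes incorrect if a caller ever passes overlapping ranges.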


How it Works

• Targeted for Mainstream and HPC Users
• Advice may involve
  – suggestions for source-change
  – adding pragmas
  – adding new options
• Simple source changes that assert new properties
  – Add a new pragma for loop if semantics are satisfied
  – Use a local-variable for the upper-bound of a loop
  – Initialize scalar variable unconditionally at top of loop
  – Reorder fields of a structure (or split into two)
• Desired behavior
  – Each advice is specific using source-level variable names
  – User does semantic analysis, then applies or rejects each advice
  – Advice should be as localized as possible
  – Following the advice should result in better optimizations
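
Two of the simple source changes listed above, a local variable for the loop upper bound and unconditional initialization of a scalar, can be sketched in C as follows. This is an illustration under stated assumptions, not code from the slides or actual GAP output. The IntVec type and all function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical container type, used only for this illustration. */
typedef struct {
    int  n;     /* number of valid elements in data */
    int *data;
} IntVec;

/* Before: the bound v->n is re-read through a pointer on every iteration.
   As far as the compiler knows, a store to out[i] could modify v->n. */
void add_one_before(const IntVec *v, int *out)
{
    for (int i = 0; i < v->n; ++i)
        out[i] = v->data[i] + 1;
}

/* After: copying the bound into a local asserts that it is loop-invariant. */
void add_one_after(const IntVec *v, int *out)
{
    const int n = v->n;
    for (int i = 0; i < n; ++i)
        out[i] = v->data[i] + 1;
    assert(v->n == n); /* debug check: the bound really did not change */
}

/* Before: b is assigned on only one path, so it looks as if its value
   could be carried from one iteration to the next. */
void recip_before(double *a, int n)
{
    double b = 0.0;
    for (int i = 0; i < n; ++i) {
        if (a[i] > 0.0) { b = a[i]; a[i] = 1.0 / a[i]; }
        if (a[i] > 1.0) { a[i] += b; }
    }
}

/* After: b is initialized unconditionally at the top of each iteration.
   VERIFY: the second branch can only fire when the first one did, so b
   was never actually read from an earlier iteration. */
void recip_after(double *a, int n)
{
    for (int i = 0; i < n; ++i) {
        double b = a[i];
        if (a[i] > 0.0) { a[i] = 1.0 / a[i]; }
        if (a[i] > 1.0) { a[i] += b; }
    }
}
```

In both pairs the "after" version produces the same results as the "before" version, provided the stated property holds. Checking that property is the semantic analysis the user is expected to do before applying the advice.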


Activity 1

Prepare and run Sample code

Use lab document


Usage Model

• Two main usage models:
  – Users compiling with auto-parallelization enabled
  – Users compiling with no auto-parallelization, who can still gain from improved vectorization
• User can specify regions of a file or routine that are considered “hot”
  – Advice will be restricted to the hot region
  – Default is to provide advice on entire compilation-unit
• Under guide-mode, no executable-code generated
  – Only output is a set of advice messages
• User not required to use advanced options (IPO, PGO), but advice may change based on options
• User may apply all (or a subset) of the advice
  – Recompile in normal-mode enables better optimizations


Usage Model (contd.)

• Advice targeted only for improving application perf
  – Use tool during the perf-tuning part of the software development cycle
• Each advice has a “VERIFY” part
  – User is responsible for checking whether it is “safe” to apply each suggestion
• User not required to use adv options (IPO, PGO)
  – When IPO is ON in guide-mode, advice will get emitted as part of link-step
• There may be multiple msgs targeting same loop
  – User has to apply ALL to get desired optimization
• Default debug mode generates no GAP messages
  – /Zi implies /Od, override by adding /O2 explicitly


Limitations

• User may have to deal with lots of messages
  – Duplicate messages
  – If no hot region is specified
• User is responsible for semantic verification, so there is a possibility of bugs
  – Adding an ivdep pragma in a loop is an assertion by the user
  – May lead to errors if user is not diligent with the verification
  – Good documentation with examples can help mitigate this
• More vector/par-loops does not always guarantee perf gains
• Tool does not guide the user on how to write parallel code
• Not a general purpose mechanism to achieve maximum perf
  – Turning on GAP will not vectorize EVERY loop
  – Only a subset where compiler can do an intelligent workaround

Not a panacea for all problems related to parallelization
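
The point that an ivdep pragma is an assertion by the user can be made concrete with a small C sketch. The function is hypothetical and not from the slides. The compiler cannot know the sign of k, so it has to assume that a[i + k] may refer to an element written by an earlier iteration. With the pragma, the programmer vouches that this does not happen, which holds only for k >= 0. The pragma is guarded so that compilers that do not define __INTEL_COMPILER simply skip it.

```c
#include <assert.h>

/* Computes a[i] = a[i + k] + 1 for i in [0, m).
   For k >= 0 every read is of a not-yet-overwritten element, so the
   iterations can be vectorized. For k < 0 the loop has a genuine
   loop-carried dependence, and asserting otherwise is a user error. */
void shift_add(int *a, int k, int m)
{
    assert(k >= 0); /* the property the programmer vouches for */
#if defined(__INTEL_COMPILER)
#pragma ivdep
#endif
    for (int i = 0; i < m; ++i)
        a[i] = a[i + k] + 1;
}
```

If a caller later passes a negative k in a release build, the assert is compiled out and nothing protects the assertion made by the pragma. This is why the verification step rests with the user.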


How to Use GAP

• Targeting Windows and Linux (IA32 & Intel64)
• With normal options for the app (-O2 and above), add:
  – -Qguide:3 (Mainstream)
  – -Qguide:4 (HPC)
  – These are the Windows spellings; on Linux the equivalents are -guide=3 and -guide=4
• No code generation in gap-mode (no executable generated)
• Can be used with and without the -Qparallel option (-parallel on Linux)


Activity 2

Implementing Guided Auto-parallelization Recommendations, use sample code

Use lab document


Summary

• Learned about Guided Auto-parallelization.

• Analyzed Guided Auto-parallelization reports.

• Implemented Guided Auto-parallelization recommendations.


Legal Disclaimer


INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT.  INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference www.intel.com/software/products.

BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino Atom, Centrino Atom Inside, Centrino Inside, Centrino logo, Cilk, Core Inside, FlashFile, i960, InstantIP, Intel, the Intel logo, Intel386, Intel486, IntelDX2, IntelDX4, IntelSX2, Intel Atom, Intel Atom Inside, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium, Pentium Inside, skoool, Sound Mark, The Journey Inside, Viiv Inside, vPro Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries.*Other names and brands may be claimed as the property of others.

Copyright © 2011.  Intel Corporation.

http://intel.com/software/products
