TRANSCRIPT
Parallel Computing Initiative
Lynne Hill, General Manager, Parallel Computing Platform, Visual Studio
Microsoft® Parallel Computing Initiative
Microsoft’s Parallel Computing Initiative encompasses the vision, strategy, and innovative technologies for delivering transformative, natural, and immersive personal computing experiences that harness the computing power of manycore architectures.
Opportunities for the Future
Improved Productivity
Immersive Experience
Breakthrough Innovation
Design / Code / Optimize / Validate
Actionable performance guidance across multiple programming models with data and task parallelism
Applications for Parallelism / Correctness / Simplifying Parallelism
PDC Parallelism Symposium
Addressing the Hard Problems of Concurrency … Speakers: Lynne Hill, David Callahan. Thursday, Oct. 30, 8:30AM – 10:00AM
Parallel Computing Application Architectures and Opportunities … Speakers: John Feo, Jerry Bautista (Intel). Thursday, Oct. 30, 10:15AM – 11:45AM
Future of Parallel Computing (Panel) … Speakers: Dave Detlefs, Niklas Gustafsson, Sean Nordberg, James Reinders (Intel); Moderator: Selena Wilson. Thursday, Oct. 30, 12:00PM – 1:30PM
Addressing the Hard Problems of Concurrency
David Callahan, Distinguished Engineer, Parallel Computing Platform Team, Visual Studio
Return of the free lunch = scalable parallel programs
This Talk
What & why parallel computing / How we carve the problem up / What we’ve got so far / Where we might go next
We need your passionate feedback – make our next steps the right ones
Parallel Programming
Sequential → Parallel
• Identify computations that are currently, or may grow to be, performance concerns
• Over-decompose for scaling
• Structured multi-threading with a data focus
• Relax sequential order to gain more parallelism
• Ensure atomicity of unordered interactions
• Consider data as well as control flow
• Careful data structure & locking choices to manage contention
Broad Adoption
Complex Systems
Diverse Targets
Taking Parallel Computing Mainstream
Reaching More Developers
Enable Experts / Increase Safety & Automation / Reduce Concepts
Unravelling The Knot
Efficient Execution
System Services
Constructing Parallel
Applications
How do we cheaply build parallel applications that can be efficiently executed and share system resources?
Chart not to scale
Constructing Parallel Applications
Integrating concurrency & coordination into mainstream programming languages
Developing tools to ease development
Encapsulating parallelism in reusable components
Raising the semantic level: new approaches
Integrate/Tool/Encapsulate/Raise
Implicit / Explicit, but safe / Explicit, unsafe
Explicit Tasking Support
.NET 4.0 Task Parallel Library: Task, Task<T>, Parallel.For, Parallel.ForEach, Parallel.Invoke, concurrent data structures
Visual Studio 2010 C++ Parallel Pattern Library: task, task_group, parallel_for, parallel_for_each, parallel_invoke, concurrent data structures, primitives for message passing, user-mode locks
Integrate/Tool/Encapsulate/Raise
Parallel Programming for C++ Developers … Rick Molloy, Oct. 27 3:30PM – 4:45 PM
Parallel Programming for Managed Developers … Daniel Moth, Oct. 29 10:30AM – 11:45AM
Structured Multi-threading In .NET
public void dfsearch(Node n) {
Parallel.ForEach(
n.adjacent_nodes(),
delegate (Node a) {
if(first(a)) dfsearch(a);
});
}
• Emphasize recursive decomposition
• Preserves function interfaces
• “Fork-join”
• Structured control constructs
• Parallel loops, co-begin
Each iteration is a task
All tasks finish before function returns
And In C++
void connected_components(Graph * g) {
…
parallel_for_each(g->nodes.begin(),
g->nodes.end(),
[=] (Node *n) {
n->component = NULL;
});
parallel_for_each(g->nodes.begin(),
g->nodes.end(),
[=] (Node *n) {
… parallel searches …
});
… rest of the algorithm …
}
Every iteration is a “task”
New C++ Lambda Syntax
Parallel Developer Tools: defining the developer experience for constructing parallel applications
• Design: design and modeling tools to enable developers to start with zero parallelism debt
• Debug: debug across multiple programming models, with data- and task-focused visualizations
• Optimize: actionable performance guidance for understanding and optimizing parallel applications
• Validate: tools for developers and testers to validate correctness and cope with inherently non-deterministic execution
New Tools In Visual Studio 2010
Debugger views: understand program state in source terms Parallel Tasks Parallel Stacks Parallel Locals
Optimize: ETW trace analysis CPU Utilization Thread blocking
Examples in the talks by Moth and Molloy mentioned earlier
Microsoft Visual Studio: Bringing out the Best in Multicore Systems … Hazim Shafi, Oct. 27 1:45 PM – 3:00 PM
MSR: Concurrency Analysis Platform and Tools for Finding Concurrency Bugs … Madan Musuvathi and Tom Ball, Oct. 29 10:30AM
Encapsulating Parallelism
• Best not to know: parallelism inside of libraries without interface change
• OK to be warned: frameworks with callbacks must document/specify/enforce restrictions
• At least get to reuse: new patterns for data structure traversal
New Approaches: Parallel LINQ
var q = from p in points.AsParallel()
        let center = nearestCenter(p, clusters)
        group p by center into g
        select new {
            index = g.Key,
            count = g.Count(),
            position = g.Aggregate(new Vector(0, 0, 0),
                                   (a, e) => a + e,
                                   (a1, a2) => a1 + a2,
                                   (a) => a) / g.Count()
        };
var newclusters = q.ToList();
In domain-specific ways, express work without explicit sequencing
The Problem of Locking
Sequential:
  if (!m.visited) { m.visited = true; recurse(m); }
Coarse (one lock for the whole graph):
  lock(graph); var v = m.visited; if (!v) m.visited = true; unlock(graph); if (!v) recurse(m);
Fine (one lock per node):
  lock(m); var v = m.visited; if (!v) m.visited = true; unlock(m); if (!v) recurse(m);
Lock-free:
  var v = compare_and_swap(&m.visited, false, true); if (!v) recurse(m);
Arbitrate parallel traversal of a graph: “first to visit”
New Approaches: Transactional Memory
Specify intent:
1. Run isolated from the effects of other tasks
2. Do nothing if there is an error
Sequential:
  if (!m.visited) { m.visited = true; recurse(m); }
Transactional:
  var v; atomic { v = m.visited; if (!v) m.visited = true; } if (!v) recurse(m);
Looks “coarse”, runs “fine”, composes cleanly
Unravelling The Knot
Efficient Execution
System Services
Constructing Parallel
Applications
How do we cheaply build parallel applications that can be efficiently executed and share system resources?
Some Efficiency Factors
• Minimize costs: per task, resource expansion, synchronization
• Balance the load: enable dynamic resource assignment
• Respect the memory hierarchy
• Allow multiple strategies: multiple paradigms, specialized schedulers
Recursive divide and conquer quickly generates a lot of tasks. We do not want them all to run as threads: that wastes too many resources, especially memory. Traditional OS notions of “threads” with “fairness” through time-sharing are inappropriate.
Work As A Dynamic Tree
Sequential execution is very efficient: only O(lg n) storage in use, with a like number of nodes “ready”. A ready node represents a potentially larger amount of work not yet elaborated.
Managing the Work: Sequential
Managing the Work: Parallel
Assuming 4 worker threads
Publish opportunities to be stolen by idle workers
Visual Studio 2010: Tools / Programming Models / Runtimes
• Tools: Parallel Debugger Toolwindows; Profiler Concurrency Analysis
• Programming models, managed libraries: PLINQ; Task Parallel Library
• Programming models, native libraries: Parallel Pattern Library; Agents Library
• Runtimes, managed: ThreadPool (Task Scheduler, Resource Manager)
• Runtimes, native: Concurrency Runtime (Task Scheduler, Resource Manager)
• Concurrent data structures on both the managed and native sides
• Operating system: threads
• Exposed APIs for partners
Concurrency Runtime Deep Dive: How to Harvest Multicore Computing ResourcesNiklas Gustafsson, Oct. 29 1:15 PM – 2:30 PM
Architecting Interoperability
Layers, top to bottom:
• Rich, connected, scalable applications
• Domain-specific frameworks / general-purpose frameworks / legacy threads + locks
• Domain-specific abstractions / general-purpose abstractions: messages + tasks + isolation
• Domain-specific scheduler / general-purpose scheduler (background) / real-time scheduler
• Common resource management
Special Frameworks: Parallel LINQ
Layers, top to bottom:
• Rich, connected, scalable applications
• LINQ .NET bindings
• Standard “SQL” query operators
• Online query optimizers
• General-purpose scheduler
• Common resource management
Unravelling The Knot
Efficient Execution
System Services
Constructing Parallel
Applications
How do we cheaply build parallel applications that can be efficiently executed and share system resources?
User/Kernel Architecture
Processes over kernel-extended threads
Cooperative physical resource management
More System Concerns
Removing concurrency blockers from system layers
Encouraging design for responsiveness
Resource and power management policies
Scaling kernel services to asymmetric and heterogeneous systems
[Diagram: a single chip with a grid of primary processor cores (C) and cache (D) alongside an array of special-purpose cores (SP), connected by a network (N) with memory and I/O controllers (IO).]
On a single chip: two kinds of processors, cache, network, and memory and I/O controllers
Hard Problems Across the Stack
Constructing Parallel Applications
Efficient Execution
System Services
Applications
Libraries
Languages, Compilers and Tools
Concurrency Runtime
Kernel/Hypervisor
Hardware
Reaching More Developers
Enable Experts / Increase Safety & Automation / Reduce Concepts
Now:
• Basic & structured tasks; core data structure support
• Tools for debugging and tuning; remove platform bottlenecks
• Work-stealing scheduling; cooperative resource management
• Eliminate “processors”; reduce sequencing (PLINQ)
Future:
• Transactional memory; type system for effects; DSLs for isolated components; data-parallel support
• Extensive parallel libraries; reduce “locking”; implicitly parallel frameworks; safe languages
• Tools for validation; memory hierarchy abstractions
Learn more about Parallel Computing at MSDN.com/concurrency, and download the Parallel Extensions to the .NET Framework!
Evals & Recordings
Please fill out your evaluation for this session at:
This session will be available as a recording at: www.microsoftpdc.com
Please use the microphones provided
Q&A
© 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.