Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company.
Phase II PI Meeting
Lockheed Martin Advanced Technology Laboratories
April 11-13, 2006
DARPA:ARMS
Team Overview

| Phase II Activity | Extended Team |
|---|---|
| Project Leadership | Tom Damiano, Patrick Lardieri, Gautam Thaker |
| Resource Allocation and Control Engine (RACE) | Tom Damiano (ATL) & Ed Mulholland, Jaiganesh Balasubramanian, Will Otte, Nilabja Roy, Nishanth Shankaran (Vanderbilt) |
| Technology Transition | Patrick Lardieri & Tom Damiano (ATL), Doug Schmidt (Vanderbilt) |
| CIAO DDS Integration | Ming Xiao (Vanderbilt) |
| Company Resource Management | Don Krecker (ATL), Blake Ross (LM), Rose Daley & I-Jeng Weng (APL), Yiaming Je (BBN) |
| Certification Technologies | Gautam Thaker (ATL), Chenyang Lu & Yuanfang Zhang, Chris Gill (Washington University St. Louis) |
| Gate Test 2 | Gautam Thaker (ATL), Raj Rajkumar & Gaurav Bhatia (CMU), Joe Cross (DARPA) |
| Gate Test 1 | Michael Price, Ed Mulholland, & Tom Damiano (ATL), Matt Gillen (BBN), Doug Stuart (Boeing), John Cosgrove (Raytheon), Will Otte (Vanderbilt) |
Technical Accomplishments/Progress
Phase II - Gate Test I: Experimental Results
Technical Accomplishments/Progress
Phase II - Gate Test 1: CONOPS - Do No Harm

Gate Test 1 was conducted using two scenarios, GT-1A and GT-1B, involving two pools, three nodes per pool, and two application strings.

GT-1A
Pre-Condition: The TSCE is operating normally.
Scenario: A fault occurs and is detected by the MLRM. The MLRM begins dynamic reconfiguration, during which an artificial fault is induced within the MLRM. The MLRM detects the failure to dynamically reconfigure and deploys a feasible static configuration.
Post-Condition: The TSCE is operating with the static configuration.

GT-1B
Pre-Condition: The TSCE is in an MLRM-determined configuration following one or more failures.
Scenario: A human operator signals the system to ‘fallback’ to a feasible static configuration.
Post-Condition: The TSCE is operating with the static configuration.

[Figure: test configuration]
Pool-1: Node-Checkmate, Node-Mako, Node-Champion
Pool-2: Node-Chaparal, Node-Javelin, Node-Hogfish
Application strings: GM3-string1.1, GM3-string 2.2
Components shown: ed-1, ed-2, plan-3, plan-1, cfgop-1, eff-1, eff-7, eff-8, eff-12, eff-13; smm-1, plan-3, plan-4, plan-1
Technical Accomplishments/Progress
Phase II - Gate Test 1A: Final Experimental Results

Code Base: CVS Branch PHASE2_GM1
Environment: Emulab build phase2-gm1-emulholl

[Figure: four panels (Run Sequence Plot, Lag Plot, Histogram Plot, Normal Probability Plot). Axis labels: Test Case, Time (ms), Elapsed Time (ms), Total Test Case Time, Time t (ms), Time t-1 (ms), Ordered Response, Normal Order Statistic Medians. Annotation: Outliers are due to Non-RT OS.]

Scenario Time Line
[Figure: time line, Time (ms). Events labeled on the time line:]
• Pool 1.B Fails (X)
• Pool Failure Detected
• Pool Mgr Receives New Deployment
• Resource Allocator Executes … Induced Error Occurs
• PM Detects RA Error
• IA Notified of Redeploy Failure
• PM Receives Static Fallback
• WLGs Started, IA Declares Redeployment Complete
• app performs useful work
• Data Collection Period

| Location Measures | Value | Dispersion Measures | Value |
|---|---|---|---|
| Mid-Range | 101.41 | Range | 101.43 |
| Mean | 68.59 | (unlabeled) | 17.22 |
| Median | 66.79 | Minimum | 50.69 |
| Lower ¼ | 61.56 | Upper ¼ | 70.31 |
| Observations | 30 | Maximum | 152.12 |

Results on ARMS wiki
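The Range and Mid-Range entries in the GT-1A summary follow directly from the Minimum and Maximum entries. A minimal sketch of that arithmetic is shown below; the function name `range_and_midrange` is hypothetical and not part of the ARMS tooling.

```python
def range_and_midrange(minimum, maximum):
    """Return (range, mid-range) for a sample given its extremes."""
    value_range = maximum - minimum          # spread between the extremes
    mid_range = (maximum + minimum) / 2.0    # midpoint of the extremes
    return value_range, mid_range


# GT-1A extremes as printed on the slide (milliseconds).
gt1a_range, gt1a_mid = range_and_midrange(50.69, 152.12)
print(f"range={gt1a_range:.2f} ms, mid-range={gt1a_mid:.3f} ms")
```

The mid-range sits well above the mean and median, which is consistent with the slide's remark that a few outliers stretch the upper end of the distribution.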
Technical Accomplishments/Progress
Phase II - Gate Test 1B: Final Experimental Results

Code Base: CVS Branch PHASE2_GM1
Environment: Emulab build phase2-gm1-emulholl

Scenario Time Line
[Figure: time line, Time (ms). Events labeled on the time line:]
• System in MLRM Determined State
• Operator Initiated Fallback
• IA Notified of Static Deployment Request
• PM Receives Static Fallback
• ASM Suspends Execution of Affected Apps
• NP Kills Affected Apps (X)
• ASM Starts/Resumes new Apps
• WLGs Started, IA Declares Redeployment Complete
• app performs useful work
• Data Collection Period

| Location Measures | Value | Dispersion Measures | Value |
|---|---|---|---|
| Mid-Range | 315.54 | Range | 2.94 |
| Mean | 315.19 | (unlabeled) | 1.22 |
| Median | 314.39 | Minimum | 314.07 |
| Lower ¼ | 314.22 | Upper ¼ | 316.31 |
| Observations | 5 | Maximum | 317.01 |

Note: Timeline includes startup of WLGs
Results on ARMS wiki
Technical Accomplishments/Progress
Phase II - Gate Test 1: Gate Test Completed

GT-1A Metrics:
• Does the MLRM deploy a feasible static configuration? YES
• Time between the occurrence of the fault and restored operation using the statically defined configuration. Mean = 68ms.

GT-1B Metrics:
• Does the MLRM deploy a feasible static configuration? YES
• Time between the issuance of a command and restored operation using the statically defined configuration. Current Mean = 315.2s.

Gate Test Passed!
Phase II - Gate Test II: Experimental Results
Technical Accomplishments/Progress
Phase II - Gate Test 2: Objectives

• Provide efficient algorithms for finding a feasible allocation solution, when one exists, for Bob(X)-scale problems and beyond
• Exploit special features of the practical problem in a provable way
– Presence of ‘slack’ in the packing
– Discrete object sizes
– Expected number of bins and/or objects
• Employ an Ensemble approach: run multiple heuristics in sequence (or in parallel)
– If one heuristic does better in one particular part of the problem space, a solution will be found by one of these heuristics with a very high probability
– Framework uses multiple heuristics in sequence until one succeeds or all fail. Sequence ordered based on properties of the problem set, e.g. non-zero slack, zero slack, size_ratio, etc.
• No time limit specified
• Assumption: feasible allocation to be found within 1 second

Acceptable failure probability:

| Duration | 1 sec | 1 yr | 5 yrs | 10 yrs |
|---|---|---|---|---|
| Probability of meteor strike within duration | 1.0E-15 | 3.1536E-08 | 1.5768E-07 | 3.1536E-07 |
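The ensemble idea described above (try heuristics in order until one succeeds or all fail) can be sketched as follows. This is an illustrative sketch only, not the ARMS framework: the names `run_ensemble`, `first_fit`, and `first_fit_decreasing` are hypothetical, and the two stand-in heuristics are deliberately simple.

```python
def first_fit(sizes, num_bins, capacity):
    """Place each object, in the given order, into the first bin with room."""
    remaining = [capacity] * num_bins
    placement = []
    for size in sizes:
        for b in range(num_bins):
            if remaining[b] >= size:
                remaining[b] -= size
                placement.append((size, b))
                break
        else:
            return None  # this object fits nowhere: the heuristic fails
    return placement


def first_fit_decreasing(sizes, num_bins, capacity):
    """Same as first_fit, but largest objects are placed first."""
    return first_fit(sorted(sizes, reverse=True), num_bins, capacity)


def run_ensemble(heuristics, sizes, num_bins, capacity):
    """Run heuristics in sequence until one succeeds or all fail."""
    for name, heuristic in heuristics:
        placement = heuristic(sizes, num_bins, capacity)
        if placement is not None:
            return name, placement
    return None, None


# Zero-slack example: plain first-fit fails, first-fit-decreasing succeeds.
ensemble = [("FF", first_fit), ("FFD", first_fit_decreasing)]
print(run_ensemble(ensemble, [2, 2, 3, 3], num_bins=2, capacity=5))
```

Ordering the list from cheapest to most expensive heuristic means the costly searches run only when the cheap ones fail.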
Technical Accomplishments/Progress
Phase II - Gate Test 2: Ensemble & Results

• 100,000 tests each
• ~0.5% quantization (bin size of 210, object size: multiple of 1)
• Problem size(x): x2 bins and x3 objects

Ensemble Heuristics:
• WFD (Worst-Fit-Decreasing): spreads objects across bins (load-balancing heuristic)
• FFD (First-Fit-Decreasing)
• BFD (Best-Fit-Decreasing)
• Efficient SubsetSums enumeration
• Base SubsetSums with preference for low homogeneity subset sums
• Base SubsetSums with preference for high homogeneity subset sums
• LSUBS (developed by Gautam Thaker)
• Java Kimchee (developed by Dr. Joe Cross)

[Figure: Runtime (s) vs. Problem Size, with the 1 second limit marked]

Only a small number of size 3.3 cases fail the strict G2 test.
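The three classic heuristics in the list (WFD, FFD, BFD) differ only in which feasible bin they choose for the next-largest object. The sketch below illustrates that difference using textbook definitions; it is not the ARMS code, and the names `pack_decreasing`, `first_fit`, `best_fit`, and `worst_fit` are hypothetical.

```python
def pack_decreasing(sizes, num_bins, capacity, choose):
    """Place objects largest-first; `choose` picks among bins that still fit."""
    remaining = [capacity] * num_bins
    assignment = {}
    for index in sorted(range(len(sizes)), key=lambda i: -sizes[i]):
        candidates = [b for b in range(num_bins) if remaining[b] >= sizes[index]]
        if not candidates:
            return None  # no bin can hold this object
        chosen = choose(candidates, remaining)
        remaining[chosen] -= sizes[index]
        assignment[index] = chosen
    return assignment


def first_fit(candidates, remaining):
    return candidates[0]                                  # lowest-numbered bin

def best_fit(candidates, remaining):
    return min(candidates, key=lambda b: remaining[b])    # tightest bin

def worst_fit(candidates, remaining):
    return max(candidates, key=lambda b: remaining[b])    # emptiest bin


# Zero-slack instance: five objects totalling 20 into two bins of size 10.
sizes = [6, 4, 4, 3, 3]
for name, rule in [("WFD", worst_fit), ("FFD", first_fit), ("BFD", best_fit)]:
    print(name, pack_decreasing(sizes, 2, 10, rule))
```

On this instance worst-fit spreads the two size-4 objects away from the size-6 object and then cannot place the last object, while first-fit and best-fit both find the exact packing. This matches the slide's description of WFD as a load-balancing heuristic, which tends to suit zero-slack cases poorly.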
Technical Accomplishments/Progress
Phase II - Gate Test 2: Ensemble Runs for size=3.3

Randomly generated 100,000 zero-slack tasksets for the most difficult size_3.3 case.

| Heuristic | # Successes | Success rate | Failure rate |
|---|---|---|---|
| WFD | 25 | 0.00025 | 0.99975 |
| FFD | 1997 | 0.01997 | 0.8003 |
| BFD | 2349 | 0.02349 | 0.91451 |
| Efficient Subset Sums | 91451 | 0.91451 | 0.08549 |
| Subset Sums with Lo Homogeneity | 97120 | 0.9712 | 0.0288 |
| Subset Sums with Hi Homogeneity | 97811 | 0.97811 | 0.02189 |
| LSubs with a 1-second timeout | 99332 | 0.99332 | 0.00668 |
| Kimchee with a 60-second timeout | 99992 | 0.99992 | 0.00008 |

Complete Failure Probability if the heuristics were independent: 2.10743E-11
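Under the independence assumption stated on the slide, the complete failure probability is the product of the individual failure rates. The sketch below uses the failure rates exactly as printed in the table. The printed failure rates for BFD and FFD are not 1 minus their printed success rates, but the printed values are the ones that reproduce the slide's figure, so they are used here. The variable names are hypothetical.

```python
import math

# Failure rates as printed on the slide for the size_3.3 zero-slack runs.
failure_rate = {
    "WFD": 0.99975,
    "FFD": 0.8003,
    "BFD": 0.91451,
    "Efficient Subset Sums": 0.08549,
    "Subset Sums with Lo Homogeneity": 0.0288,
    "Subset Sums with Hi Homogeneity": 0.02189,
    "LSubs with a 1-second timeout": 0.00668,
    "Kimchee with a 60-second timeout": 0.00008,
}

# If failures were independent, all heuristics fail together with
# probability equal to the product of the individual failure rates.
complete_failure = math.prod(failure_rate.values())
print(f"{complete_failure:.5E}")
```

In practice the heuristics are unlikely to be fully independent, since hard instances tend to be hard for several heuristics at once, so this product is best read as an optimistic estimate.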
Technical Accomplishments/Progress
Phase II - Gate Test 2: Ensemble Observations

• The Ensemble approach is an excellent scheme to adopt.
– A collection of heuristics (each of which has a < 100% success rate) can yield a 100% success rate
– Runtimes decrease significantly since the most complex schemes are invoked only when the efficient ones fail.
• However, as used, it does NOT meet the strict 1-second time limit we assumed in GT-2
– Can take 20 seconds or longer in the worst case

A Practical Assumption:
• Accepting that there are quantization levels
Technical Accomplishments/Progress
Phase II - Gate Test 2: The Quantization Effect

Bin-Packing Ensemble Failure Probability (from 10^9 cases)

| Quantization Level | ~0.5% | 1% | 2.5% | 5% | Acceptance Threshold |
|---|---|---|---|---|---|
| # of failures with a 1s timeout | 8042 | 237 | 0 | 0 | - |
| Probability of allocation failure with 1s timeout (p) | 8.04E-06 | 2.37E-07 | 0* | 0* | 1.0E-15 |
| If an allocation occurs every hour for 1 year, probability of at least one failure with 1s timeout = 1-(1-p)^(365*24) | 6.93E-02 | 2.07E-03 | 0 | 0 | 3.1536E-08 |
| If an allocation occurs every hour for 5 years, probability of at least 1 failure = 1-(1-p)^(5*365*24) | 3.02E-01 | 1.03E-02 | 0 | 0 | 1.5768E-07 |
| If an allocation occurs every hour for 10 years, probability of at least 1 failure = 1-(1-p)^(10*365*24) | 5.12E-01 | 2.05E-02 | 0 | 0 | 3.1536E-07 |

Note: Failures occur only for size 3.3
* A lot more samples are needed to observe this extremely improbable event.
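The cumulative rows of the table apply the formula 1-(1-p)^n, where n is the number of hourly allocations in the period. The sketch below evaluates it for the 1% quantization column (p = 2.37E-07). The function name is hypothetical, and the `expm1`/`log1p` form is a numerical-stability choice made for this sketch rather than something taken from the slides.

```python
import math

def prob_at_least_one_failure(p, num_allocations):
    """1 - (1 - p)**n, computed stably for very small p."""
    return -math.expm1(num_allocations * math.log1p(-p))


HOURS_PER_YEAR = 365 * 24
p_one_percent = 2.37e-7  # per-allocation failure probability, 1% quantization

for years in (1, 5, 10):
    prob = prob_at_least_one_failure(p_one_percent, years * HOURS_PER_YEAR)
    print(f"{years:>2} yr: {prob:.2E}")
```

Because p is tiny, the result is close to n*p for short periods and only bends away from that linear estimate as n*p approaches 1.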
Technical Accomplishments/Progress
Phase II - Gate Test 2: Related Observations

• Other failure thresholds considered in practice:
– Air Traffic Control availability requirement is 99.99999%, i.e. the failure probability at any given instant is 10^-7.
– Hardware / software failure probability is of the order of 10^-7 to 10^-8 even in reliable systems
• In the (very unlikely) event of an allocation failure
– critical tasks can be allocated very efficiently first (with non-zero slack >= 20%, even the basic heuristics succeed)
• As the 1-second time limit is relaxed, the failure probability decreases exponentially even at low quantization levels
– With a 10-second timeout and 0.5% granularity, probability of allocation failure over 1 year drops to 3.36E-04 (from 6.93E-02)
– With a 50-second timeout and 0.5% granularity, probability of allocation failure over 1 year drops to 1.24E-07 (from 6.93E-02)
Technical Accomplishments/Progress
Phase II - Gate Test 2: Gate Test Completed

Gate Test II requirements have been satisfied:
• Feasible allocation found over independent, large-sample problem sets.
• Feasible allocation found in all cases in less than 1 second except size_3_3, where there were a small number of outliers.
• Solution was demonstrated for no-slack stress cases and more realistic slack cases.
• A careful study of the impact of the distribution of item sizes, item size quantization, and overall problem size was completed.
• Parallel ensemble execution shows a collection of heuristics (each of which has a < 100% success rate) can yield 100% success overall.
• With allowance for quantization, even the most demanding cases can meet the “Meteorite-bound”.

Related additional research completed beyond strict requirements:
• Extend to multi-dimensional bin-packing
– Constraints along each dimension must be satisfied
– In the Bob(X) context, the dimensions of processor utilization, network utilization, and memory needs are typical.

Gate Test Passed!
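The multi-dimensional extension requires that an object fit along every dimension of a bin at once. A minimal sketch follows, assuming a first-fit rule and a "largest normalized demand first" ordering. Both are choices made for this illustration, and the name `vector_first_fit` and the example numbers are hypothetical.

```python
def vector_first_fit(demands, num_bins, capacity):
    """First-fit over vector demands, e.g. (CPU, network, memory).

    Objects are ordered by their largest demand relative to capacity.
    An object is placed only if it fits along every dimension.
    """
    remaining = [list(capacity) for _ in range(num_bins)]
    order = sorted(
        range(len(demands)),
        key=lambda i: max(d / c for d, c in zip(demands[i], capacity)),
        reverse=True,
    )
    assignment = {}
    for i in order:
        for b in range(num_bins):
            if all(r >= d for r, d in zip(remaining[b], demands[i])):
                remaining[b] = [r - d for r, d in zip(remaining[b], demands[i])]
                assignment[i] = b
                break
        else:
            return None  # some dimension is exhausted in every bin
    return assignment


# (CPU %, network %, memory %) per application; two identical nodes.
apps = [(60, 10, 10), (50, 10, 10), (10, 80, 10), (10, 30, 10)]
print(vector_first_fit(apps, num_bins=2, capacity=(100, 100, 100)))
```

In the example, the network-heavy application blocks the fourth application from the first node even though CPU and memory there are nearly free, which is the per-dimension constraint the slide refers to.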
Node Alive Research and Results
Technical Accomplishments/Progress
Node Failure Detection: UDP Push Model

UDP-PUSH Approach to Node-Failure-Detection
• All clients “push” node-alive messages to a monitor at 100 Hz
– Inter-arrival of messages at the monitor should be 10 msec, confirmed in data that is collected (see graphic).
• Node-Alive Monitor “sweeps” over received messages at 50 Hz
• Monitor declares client node failure after 2 sweeps without receiving a beat from a client
• Current testing is at Emulab using up to 20 real nodes and 380 virtual nodes.
• Failures are simulated by the clients by suppressing 10 messages at every 60 second mark
• Fastest detection is 40 msec, slowest 60 msec, confirmed in current testing (see graphic).
• A RT Linux kernel was used to obtain accurate 100 Hz and 50 Hz loops (Ingo Molnar kernel with real-time patches, version 2.6.15-rt15-smp).
• With 380 nodes the monitor receives 38,000 messages/sec
– Monitor load has been observed to be about 8%
– It is estimated that a hierarchical solution (not yet implemented) will cut this down to < 2% at the cost of an increase in maximum detection time.
• In current tests no UDP packets are lost, so there are no false alarms
• Further testing and hierarchical implementation underway

[Figure annotations: 10 msec mean interarrivals for …; 2.2% of samples exceed theoretical max of 60 msec]
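The 40 to 60 msec detection window follows from the sweep timing: a 50 Hz sweep runs every 20 msec, and failure is declared on the second consecutive sweep that sees no beat. The sketch below models only that timing. It assumes sweeps occur at exact multiples of the sweep period and measures latency from the last beat received, and the function names are hypothetical.

```python
import math

def detection_latency_ms(last_beat_ms, sweep_period_ms=20.0, missed_sweeps=2):
    """Time from the last received beat until the monitor declares failure."""
    # The first sweep at or after the last beat still sees that beat.
    sweep_seeing_beat = math.ceil(last_beat_ms / sweep_period_ms) * sweep_period_ms
    # Failure is declared after `missed_sweeps` further sweeps with no beat.
    declared_at = sweep_seeing_beat + missed_sweeps * sweep_period_ms
    return declared_at - last_beat_ms


def messages_per_second(num_clients, beat_rate_hz):
    """Aggregate message rate arriving at a single monitor."""
    return num_clients * beat_rate_hz


print(detection_latency_ms(0.0))      # beat lands exactly on a sweep
print(detection_latency_ms(10.0))     # beat lands midway between sweeps
print(messages_per_second(380, 100))  # monitor load in messages/sec
```

Under these assumptions the latency always lies between 40 and 60 msec, which is why the slide treats 60 msec as the theoretical maximum and attributes samples beyond it to timing effects on the host.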
Technical Accomplishments/Progress
Node Failure Detection: Performance

Acceptable Real-time Performance w/ Up to 1000 Clients
• Observed a 5x increase in CPU load when using Linux w/ complete preemption patches
• Initiated technical exchanges with RT Linux group (Ted Tso, Ingo Molnar, others)
Technical Accomplishments/Progress
Node Failure Detection: Performance (continued)

SMP Kernel w/ Preemption Patches has 3x Larger Latency
Certification of DRM Systems Technologies and Methods
Technical Accomplishments/Progress
Certification: Problem

Problem Space
• Certification is a process to verify that system behavior remains within safety and effectiveness parameters
• DRE system effectiveness typically requires performing a subset of tasks within temporal bounds or deadlines
– In many cases the deadlines apply to an end-to-end string
• Dynamic Resource Management generates new system configurations and thereby moves part of the certification process into the system runtime

Solution Space
• Use schedulability analysis techniques (periodic and strict aperiodic, and transient periodic tasks triggered by aperiodic events) to predict whether a particular allocation will meet deadlines while bounding pessimism.
• Automate the process of determining feasible and appropriate deployment placements by providing algorithms for release, development, and integration that determine the appropriate allocation, based on the QoS requirements and constraints of the applications and operational strings.
• Approach Certification with Simple DRM Capabilities and Full DRM Capabilities
Technical Accomplishments/Progress
Certification: Approach

Simple DRM Capabilities
• Add constraint capabilities to current allocation methodologies (Phase II)
– Mutual Placement Constraints (e.g. replicas)
– Attribute Matching Constraints (e.g. OS type)
• Introduce multi-dimensional bin packing algorithms (Phase II)
• Engineering Support Tools (Phase II)
– Provide capabilities to Bob(X) to build a pedigree of cases, while providing from static generation tools (Phase III)
– ARMS (Phase III)
• Provide RACE Capability for Online Use (Phase III)

Full DRM Capabilities
• Schedulability Method for simple QoS Allocation (Phase I)
• Schedulability Method for ARMS (Phase II)
• Constraint Method (Phase III)
• Online Capabilities in RACE (Phase III)
– Constraint Capable Bin-Packer Planner
– Attribute Matching Constraints (e.g. OS type)
• Full QoS Allocation (Phase III)
• Verification (Phase III)
– Delta from static plans w/small perturbations
Technical Accomplishments/Progress
Certification: QoS Driven Allocation Tools Capabilities

– Possible use Offline and Online
  • Offline tool-suite with integrated algorithm support for generation of static deployment plans and research into new algorithms.
  • Pluggable algorithms: usable “as-is” both online and offline, i.e. the same components run online (within RACE) and offline (within the tool-suite).
– Statistical history capture
– Output adaptation to accommodate varying needs for deployment configuration file generation
– Support for ensemble algorithm runs
– Flexible test input distribution generation for validating algorithms and Extensions for Schedulability Analysis
– Variations of simple bin packing and heuristics-based algorithms for more challenging (e.g. zero slack) problems.
– Multi-Dimensional variations on allocation algorithms, including 3-D bin-packing along CPU, memory, and network bandwidth dimensions.
– Constraint-Based allocation
– Incorporation of Schedulability

[Figure labels: Runtime, Offline, Deployment Configuration]
Technical Accomplishments/Progress
Towards Certification: Aperiodic Tasks - Overview

Schedulability analyses for end-to-end aperiodic tasks with hard deadlines
• 1st Approach: Aperiodic Utilization Bound (AUB) - Online
• 2nd Approach: Deferrable Server (DS) - Offline

Accomplishments
• Implemented AUB and DS schedulability analyses
• Developed heuristics for tuning Deferrable Server
• Compared two approaches via numerical studies
• Implementation on TAO federated event channel
– The first DS implementation in middleware
– Online admission control based on AUB
• Empirical results on a Linux cluster
– Validation of schedulability analysis
– Run-time overhead

On-going
• Developing deferrable server mechanisms in TAO’s federated event channel
• Validating schedulability analyses via empirical studies on TAO
Technical Accomplishments/Progress
Towards Certification: Deferrable Server

Overview
• Incorporate aperiodic tasks in periodic scheduling
• Server: a periodic task responsible for processing aperiodic requests.
• Budget: maximum time the server can run in a period
• Algorithm
– Server is suspended when its budget runs out (this bounds aperiodic tasks’ impact on periodic tasks)
– Budget is replenished at the beginning of each period

Implementation
• Challenge: Implement bandwidth preserving servers on top of priority-based operating systems.
• Solution
– Server thread processes aperiodic events (2nd highest priority)
– Budget thread manages the budget and controls the execution of server threads (highest priority)
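The budget rules above (suspend when the budget runs out, replenish at the start of each period) can be illustrated with a small discrete-time simulation. This is a sketch of the textbook deferrable-server rules only, not the TAO implementation: time advances in unit ticks, the server is assumed to run whenever it has work and budget, and the name `simulate_deferrable_server` is hypothetical.

```python
def simulate_deferrable_server(period, budget, arrivals, horizon):
    """Return the ticks in which the server processed aperiodic work.

    arrivals maps an arrival tick to the amount of work (in ticks) arriving.
    """
    pending = 0        # aperiodic work waiting to be served
    remaining = 0      # budget left in the current period
    served_ticks = []
    for t in range(horizon):
        if t % period == 0:
            remaining = budget          # replenish at the start of each period
        pending += arrivals.get(t, 0)
        if pending > 0 and remaining > 0:
            pending -= 1
            remaining -= 1              # running consumes budget
            served_ticks.append(t)
        # otherwise the server is idle (no work) or suspended (no budget);
        # unused budget is kept until the period ends, then discarded
    return served_ticks


# Period 5, budget 2; four ticks of aperiodic work arrive at t=3.
print(simulate_deferrable_server(period=5, budget=2, arrivals={3: 4}, horizon=15))
```

In the example the server spends its preserved budget at the end of one period and its fresh budget at the start of the next, running for four consecutive ticks. This back-to-back execution is the behavior that deferrable-server schedulability analysis has to account for when bounding the impact on periodic tasks.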
![Page 26: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/26.jpg)
Technical Accomplishments/Progress
Towards Certification: Deferrable Server Validation
4 processors; 4 aperiodic tasks + 8 periodic tasks
• Correctness: No schedulable task sets had deadline misses.
• Pessimism: Some of the unschedulable task sets also met deadlines.
![Page 27: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/27.jpg)
Technical Accomplishments/Progress
Towards Certification: Deferrable Server Overhead
• Budget manager: < 89 µs per server period
• Server thread: < 159 µs per aperiodic subtask
![Page 28: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/28.jpg)
Technical Accomplishments/Progress
Towards Certification: Admission Control (AC)
• Central admission controller for end-to-end tasks.
• Admission test
  – If the system remains within the feasible region
    • admit the new task into the system
    • increase the synthetic utilization
  – Decrement synthetic utilization
    • at the deadlines of aperiodic tasks
    • [resetting rule] when CPU idles
AC Policies
• Soft Tasks
  – Send an event to notify the central admission controller
  – Hold the task in a waiting queue and wait for the reply
• Hard Tasks
  – Release immediately, then notify AC
  – AC may eject soft periodic tasks when it receives the notification
• Aperiodic Tasks
  – Admission test for every job
  – When the CPU idles, the idle thread reports the departed aperiodic tasks to AC
• Periodic Tasks
  – Admit once and maintain the reservation for the task
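The admission test and the two decrement rules can be sketched as follows. The slides do not state the feasible-region formula, so this sketch assumes the form published for the aperiodic utilization bound on multi-stage tasks, where each stage j visited by a task contributes U_j(1 - U_j/2)/(1 - U_j) and the sum must not exceed 1. Treat that formula, and the names `AdmissionController`, `try_admit`, `on_deadline`, and `on_idle`, as assumptions for illustration.

```python
import math


def stage_term(u):
    """Per-stage term of the assumed feasible-region test (infinite at u >= 1)."""
    return math.inf if u >= 1.0 else u * (1.0 - u / 2.0) / (1.0 - u)


class AdmissionController:
    """Hypothetical central admission controller tracking synthetic utilization."""

    def __init__(self, num_processors):
        self.synthetic_util = [0.0] * num_processors
        self.admitted = {}  # task id -> {processor index: contribution C/D}

    def _path_ok(self, util, processors):
        return sum(stage_term(util[p]) for p in processors) <= 1.0

    def try_admit(self, task_id, deadline, exec_times):
        """exec_times maps processor index -> execution time of that subtask."""
        contrib = {p: c / deadline for p, c in exec_times.items()}
        trial = list(self.synthetic_util)
        for p, u in contrib.items():
            trial[p] += u
        # Every task, new or already admitted, must stay inside the region.
        paths = [contrib] + list(self.admitted.values())
        if not all(self._path_ok(trial, path) for path in paths):
            return False
        self.synthetic_util = trial  # admit: increase the synthetic utilization
        self.admitted[task_id] = contrib
        return True

    def _release(self, task_id, processor):
        contrib = self.admitted.get(task_id, {})
        u = contrib.pop(processor, 0.0)
        self.synthetic_util[processor] = max(0.0, self.synthetic_util[processor] - u)
        if task_id in self.admitted and not contrib:
            del self.admitted[task_id]

    def on_deadline(self, task_id):
        """Decrement synthetic utilization when the task's deadline passes."""
        for p in list(self.admitted.get(task_id, {})):
            self._release(task_id, p)

    def on_idle(self, processor, departed_task_ids):
        """Resetting rule: an idle processor drops contributions of departed tasks."""
        for task_id in departed_task_ids:
            self._release(task_id, processor)
```

In this sketch the resetting rule frees capacity before deadlines expire, which is one plausible reading of why resetting raises the number of admitted tasks.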
![Page 29: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/29.jpg)
Technical Accomplishments/Progress
Towards Certification: AC Latency for Soft Tasks
Round-trip latency for admitting a soft task
Hard tasks are admitted immediately
![Page 30: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/30.jpg)
Technical Accomplishments/Progress
Towards Certification: AC – Admission Ratio
3 processors + 1 AC processor
4 soft aperiodic tasks and 5 soft periodic tasks
• Online admission control significantly outperformed offline analysis.
  – All task sets are unschedulable under offline analysis
  – Resetting significantly increased the number of admitted tasks.
![Page 31: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/31.jpg)
RACE Workshop and Demonstration
![Page 32: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/32.jpg)
RACE Demo and Workshop
Tool Chain: Demonstration Highlights
The RACE demonstration is composed of three scenarios. These scenarios involve RACE (control and allocation), DAnCE, PICML, CUTS, CoWorkEr and the BMW elements.
Scenario 1 - Demonstrates RACE Control by reacting to deadline misses in a critical path modeled into the RT1H operation string. The critical path exceeds its EED threshold due to the introduction of a competing operation string that consumes excessive CPU.
Scenario 2 - Demonstrates the ability of the tool chain to handle Shared Components. Two operation strings are deployed with shared components between them. After deployment a string is torn down to show the other (involving the shared component) is still operational.
Scenario 3 - Demonstrates FT extensions to PICML to capture fault tolerant requirements. The concepts of SRG and FOU are shown and an integrated interpreter is used to run an offline constraint-based algorithm for replica placement.
[Figure: tool chain diagram with elements PICML, Flat Deployment Plan, RACE, Flat Deployment Plan (modified), DAnCE, and Hierarchical Plan]
Many of the initial capabilities being shown will support GT-4, or are extendable to do so.
![Page 33: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/33.jpg)
RACE Demo and Workshop
Tool Chain: New Capability Highlights
The RACE demonstration highlights many of the new capabilities developed for the RACE framework and related tool chain, many of which are intended to support GT-4.
* Importance Attr. (supports GT-4)
* Static and Dynamic Plans (supports GT-4)
* Component Dynamic Placeability Attr. (supports GT-4)
* Shared Components (supports GT-4)
* Hierarchical Descriptors (supports GT-4)
* PICML Modifications (supports GT-4)
  o FT Elements
  o Shared Components
  o QoS Attributes
* DAnCE Modifications (supports GT-4)
  o ReDAC
  o Priority Control
  o Component-Process Mapping
  o Shared Component Support
* Web and Interactive Input Adapters
* RACE Control (supports GT-4)
  o EED Monitoring
  o Reactive control of OS priority based on importance
* WLG-2 Capabilities
  o Code Generation
  o BMW Integration
  o BDC Integration
* Ensemble Planner
* Target Manager
* Fault Model Elements
  o Failover Unit
  o Replication Group
  o CCM IOGRs
  o Shared Risk Group
  o Constraint-Based Allocation
    # metrics (e.g. distance, co-failure)
    # integrated in interpreter - offline analysis
    # motivates constraint-based allocation
![Page 34: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/34.jpg)
RACE Demo and Workshop
Tool Chain: Future Capabilities
The RACE framework and tool chain will require additional capabilities to support the current GT-4 CONOPS.
* RACE Follow-on work
  o Plan State (supports GT-4)
    + ReDAC Integration
    + (Re)plan on Importance
    + Include FT simplex deployments
    + Integration of Node Alive Solution
  o Events on plan progress and status (supports GT-4)
  o Warfighter Value/Importance Constraints on Placement (supports GT-4)
  o Submission of Multiple Plans Simultaneously (supports GT-4)
* Multi-D Planner
  o Multiple Heuristics: FFD, WFD, BFD, Efficient Subset Sums
  o Modeled 3 dimensions: CPU Utilization, Memory, Network Bandwidth
  Algorithms available and initial development done.
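To make the Multi-D Planner item concrete, here is a minimal sketch of first-fit decreasing (FFD) over the three modeled dimensions. The function name, the data layout, and the choice to sort components by their largest single-dimension demand are assumptions made for illustration; the slides do not describe how the planner orders components or represents capacity.

```python
def first_fit_decreasing(components, nodes):
    """Place components on nodes using first-fit decreasing.

    components: {name: (cpu, memory, bandwidth)} demands as fractions of a node
    nodes:      {name: (cpu, memory, bandwidth)} capacities
    Returns {component: node}, or None if some component does not fit anywhere.
    """
    free = {node: list(capacity) for node, capacity in nodes.items()}
    placement = {}
    # "Decreasing": consider the most demanding components first
    # (assumed ordering key: the largest demand in any one dimension).
    ordered = sorted(components.items(), key=lambda kv: max(kv[1]), reverse=True)
    for name, demand in ordered:
        for node, available in free.items():
            # "First fit": take the first node with room in every dimension.
            if all(d <= a for d, a in zip(demand, available)):
                for i, d in enumerate(demand):
                    available[i] -= d
                placement[name] = node
                break
        else:
            return None  # no node can host this component
    return placement


if __name__ == "__main__":
    components = {"a": (0.6, 0.2, 0.1), "b": (0.5, 0.3, 0.1), "c": (0.3, 0.3, 0.2)}
    nodes = {"n1": (1.0, 1.0, 1.0), "n2": (1.0, 1.0, 1.0)}
    print(first_fit_decreasing(components, nodes))
```

Worst-fit and best-fit variants differ only in which feasible node is chosen: the one with the most or the least remaining capacity instead of the first.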
![Page 35: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/35.jpg)
Controller Research
RACE Control Research
![Page 36: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/36.jpg)
Technical Accomplishments/Progress
RACE Control: Flexible Maximum Urgency First (FMUF)
• Task model
  – Soft, end-to-end deadlines
  – Two types of tasks: critical tasks and non-critical tasks
• Goals
  – Performance isolation: protect critical tasks against disturbance from non-critical ones
  – Minimize deadline misses: improve overall performance
• Handle uncertainties and dynamics
  – Task arrival/departure
  – Fluctuation in execution times
• Practical, application-transparent adaptation
  – Actuator: Priority adjustment
  – Sensor: CPU utilization, deadline miss
  – Planned for future RACE implementation
![Page 37: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/37.jpg)
Technical Accomplishments/Progress
RACE Control: The MUF Approach
• Two priority classes
  – Each class is scheduled by a real-time policy (RMS, EDF)
  – Critical tasks → high-priority class
• Feedback control
  – Dynamically change the priority class of non-critical tasks based on deadline misses in the high-priority class
    • No miss: non-critical tasks → high-priority class
    • Miss: non-critical tasks → low-priority class
  – Avoid oscillation based on measured CPU utilization
  – Maximize the number of tasks in the high-priority class without causing deadline misses in that class
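The feedback loop can be sketched as a small controller that is called once per sampling period. The slides do not specify how CPU utilization is used to avoid oscillation; this sketch assumes a simple guard that promotes a non-critical task only when the measured high-class utilization plus the task's estimated utilization stays under a threshold. The class name, the threshold, and the per-task utilization estimates are all hypothetical.

```python
class PriorityClassController:
    """Hypothetical feedback controller for the class of non-critical tasks."""

    def __init__(self, non_critical_util, util_threshold):
        self.estimate = dict(non_critical_util)  # task -> estimated utilization
        self.threshold = util_threshold
        self.low = list(non_critical_util)       # non-critical tasks start low
        self.high = []                           # non-critical tasks promoted so far

    def step(self, deadline_missed, high_class_util):
        """Run one control period using the two sensors named on the slide."""
        if deadline_missed:
            # Miss in the high-priority class: demote the most recent promotion.
            if self.high:
                self.low.insert(0, self.high.pop())
            return
        # No miss: promote at most one task, and only if the utilization
        # guard suggests it will not immediately cause a miss.
        for task in self.low:
            if high_class_util + self.estimate[task] <= self.threshold:
                self.low.remove(task)
                self.high.append(task)
                break
```

Promoting one task per period and checking utilization first keeps the controller from repeatedly promoting and demoting the same task, which is the oscillation the slide warns about.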
![Page 38: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/38.jpg)
Phase III Future Work
![Page 39: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/39.jpg)
Phase III Ideas
Possible Phase III Research Areas
• Schedulability Analysis (integrated with the allocation/placement problem)
• Multi-Dimensional Allocation
• Constraint-Based Allocation/Placement
• Certifiability of these approaches
  – Including a framework for testing and researching new algorithms
  – Verifying allocations meet certification constraints (e.g. differ from a static plan in a specified manner or according to specified rules)
• Offline and Online capability for this analysis and planning
  – Offline tool-suite with integrated algorithm support for generation of static deployment plans and research into new algorithms
  – Pluggable algorithms usable "as-is" both online and offline, i.e. the same components run online (within RACE) and offline (within the tool-suite)
![Page 40: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/40.jpg)
![Page 41: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/41.jpg)
Backup Slides
Main Presentation Support Slides
![Page 42: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/42.jpg)
Technical Accomplishments/Progress
Phase II - Gate Test 1A: Test Scenario
• TSCE is operating normally – as configured by MLRM
• A fault occurs and is detected by MLRM
• MLRM attempts dynamic re-allocation
• An artificial error causes MLRM dynamic allocation to fail
• MLRM deploys a feasible static allocation
[Figure: Gate Test 1A scenario diagram. Labels: Pool-1.A, Pool-1.B; nodes Mako, Champion, Checkmate, Javelin, Chaparal, Hogfish; MLRM, Fault Detected, Dynamic Allocation (marked X); elements smm.1, ed.1, ed.2, eff.1, eff.7, eff.8, eff.12, eff.13, plan.1, plan.3, plan.4, cfgop.1, shared. Legend: Primary WLG, Redeployed WLG.]
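The Gate Test 1A flow, a failed dynamic re-allocation followed by deployment of a feasible static allocation, amounts to a fallback pattern. The sketch below is illustrative only: `AllocationError`, `allocate_with_fallback`, and the plan layout are hypothetical and do not represent the MLRM interfaces.

```python
class AllocationError(Exception):
    """Raised by a (hypothetical) dynamic allocator when no allocation is found."""


def allocate_with_fallback(strings, dynamic_allocator, static_plan):
    """Try dynamic allocation first; fall back to the pre-computed static plan.

    strings:           names of the operation strings to (re)deploy
    dynamic_allocator: callable taking the names and returning {string: node}
    static_plan:       {string: node} known to be feasible
    Returns (mode, placement) where mode is "dynamic" or "static".
    """
    try:
        return "dynamic", dynamic_allocator(strings)
    except AllocationError:
        # Dynamic allocation failed: deploy the feasible static allocation.
        return "static", {name: static_plan[name] for name in strings}


def failing_allocator(strings):
    """Stand-in for the artificial error injected in the gate test."""
    raise AllocationError("injected failure")


if __name__ == "__main__":
    plan = {"plan.3": "Javelin", "eff.7": "Hogfish"}
    print(allocate_with_fallback(["plan.3", "eff.7"], failing_allocator, plan))
```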
![Page 43: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/43.jpg)
Technical Accomplishments/Progress
Phase II - Gate Test 1B: Test Scenario
• TSCE is operating normally – as configured by MLRM
• Operator elects to fall back to a feasible static allocation
• MLRM tears down existing dynamically allocated strings
• MLRM deploys a feasible static allocation
[Figure: Gate Test 1B scenario diagram. Labels: Pool-1.A, Pool-1.B; nodes Mako, Champion, Checkmate, Javelin, Hogfish, Chaparal; MLRM, Static Allocation Request; elements smm.1, ed.1, ed.2, eff.1, eff.7, eff.8, eff.12, plan.3, plan.4, plan.12, cfgop.1, shared. Legend: Primary WLG, Redeployed WLG.]
![Page 44: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/44.jpg)
Technical Accomplishments/Progress
Certification Model: Constrained Perturbation
[Figure: DRM certification gauntlet diagram. Recoverable labels: Template (analogous to a static plan); Dynamic Plans (DRM generated plans); constraints, parameters; class of plan; traditionally certifiable; from plan class; transformation domain; class transformation relation-pair; inverse verification; certifiably constraint-isomorphic; legal dynamic domain for class of plan; feasibly schedulable; feasibly allocatable; isomorphic transformation; boolean; certification metrics. Symbols shown: φ, Ψ, Φ, П, Γ, R, R⁻¹.]
![Page 45: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/45.jpg)
RACE Demonstration and Workshop
Lockheed Martin Advanced Technology Laboratories
and
Vanderbilt University
DARPA:ARMS
![Page 46: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/46.jpg)
RACE Demo and Workshop
Tool Chain Demonstration
The RACE demonstration is composed of three scenarios. These scenarios involve RACE (control and allocation), DAnCE, PICML, CUTS, CoWorkEr and the BMW elements.
Scenario 1 - Demonstrates RACE Control by reacting to deadline misses in a critical path modeled into the RT1H operation string. The critical path exceeds its EED threshold due to the introduction of a competing operation string that consumes excessive CPU.
Scenario 2 - Demonstrates the ability of the tool chain to handle Shared Components. Two operation strings are deployed with shared components between them. After deployment a string is torn down to show the other (involving the shared component) is still operational.
Scenario 3 - Demonstrates FT extensions to PICML to capture fault tolerant requirements. The concepts of SRG and FOU are shown and an integrated interpreter is used to run an offline constraint-based algorithm for replica placement.
[Figure: tool chain diagram with elements PICML, Flat Deployment Plan, RACE, Flat Deployment Plan (modified), DAnCE, and Hierarchical Plan]
Many of the initial capabilities being shown will support GT-4, or are extendable to do so.
![Page 47: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/47.jpg)
RACE Demo and Workshop
Physical Demo Setup: ISIS Lab
ISIS Lab: wiki.isis.vanderbilt.edu/support/isislab.htm
[Figure: physical demo setup. Labels: Local Demo Laptop, RACE Demo GUI, Internet, ISIS Lab, RACE, DAnCE, PICML; RACE Controller; Target Manager; Resource Monitor (x3), each reporting Resource Utilization (system and per application); CUTS BDC; Opp-string 1, 2, ... n QoS Monitors reporting QoS Information (Opp string QoS); RACE Control Agent, CKRM Control Agent, FCS Control Agent, CPU Broker Control Agent, OS Priority Agent, Opp-string 1 Control Agent.]
![Page 48: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/48.jpg)
RACE Control Critical Path
End-to-End Deadline Monitoring and Reactive Control
![Page 49: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/49.jpg)
RACE Demo and Workshop
RACE Controller: RACE Components
The RACE Controller receives plans from the RACE allocation planners.
Key Elements: Target Manager, RACE Controller, CUTS BDC, and DAnCE
[Figure: RACE Controller components. Labels: Hierarchical Packages & Deployment Plans; RACE Controller; Target Manager; Resource Monitor (x3), each reporting Resource Utilization (system and per application); CUTS BDC; Opp-string 1, 2, ... n QoS Monitors reporting QoS Information (Opp string QoS); RACE Control Agent, CKRM Control Agent, FCS Control Agent, CPU Broker Control Agent, OS Priority Agent, Opp-string 1 Control Agent.]
![Page 50: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/50.jpg)
RACE Demo and Workshop
RACE Controller: Scenario One
First Demo Scenario:
1. Deploy RT1H Operational String, which has an EED requirement specified. View post RACE deployment.
2. Monitor EED
3. Deploy Competing (CPU Hog) Hog_String. View post RACE deployment
4. Monitor EED Miss
5. Observe RACE Reactive Control
All Deployments occur through DAnCE
[Figure: RACE Demo GUI, Deployment After RACE Processing]
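The reactive step in this scenario (an EED miss triggers a change of OS priority based on importance) can be sketched as a pure function. The function name, the convention that a larger number means a higher priority, and the importance values are assumptions for illustration; the slides do not give the controller's actual policy.

```python
def react_to_eed(eed_ms, threshold_ms, importance, base_priority=10):
    """Return new OS priorities when the end-to-end deadline is missed.

    eed_ms:       measured end-to-end delay of the critical path
    threshold_ms: EED threshold for that path
    importance:   {string name: importance value}, higher means more important
    Returns None when the EED is within its threshold, otherwise a mapping
    {string name: priority} in which more important strings rank higher.
    """
    if eed_ms <= threshold_ms:
        return None  # no miss, nothing to actuate
    ranked = sorted(importance, key=lambda name: importance[name])
    return {name: base_priority + rank for rank, name in enumerate(ranked)}


if __name__ == "__main__":
    strings = {"RT1H": 10, "Hog_String": 1}
    print(react_to_eed(80, 100, strings))   # within threshold
    print(react_to_eed(120, 100, strings))  # miss: RT1H gets the higher priority
```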
![Page 51: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/51.jpg)
Shared CCM Components
RACE and DAnCE handle deployment of shared WLGs
![Page 52: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/52.jpg)
RACE Demo and Workshop
Shared Components: Scenario Two
Second Demo Scenario:
1. Deploy RT1H_Shared_A Operational String, which has shared components. View post RACE deployment that was dynamically planned.
2. Deploy RT1H_Shared_B Operational String, which shares components with RT1H_Shared_A. RACE uses a planner to dynamically place the string. View deployment post RACE processing.
3. Tear down an Op-String and observe that the remaining string stays operational.
All Deployments occur through DAnCE
[Figure: RACE Demo GUI, Deployment After RACE Processing]
![Page 53: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/53.jpg)
CCM Fault Tolerance
Modeling Concepts and Demonstration
![Page 54: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/54.jpg)
RACE Demo and Workshop
Introduction: FT Modeling Concepts and Demonstration
Shared Risk Groups (SRG)
SRGs are FT modeling elements added to PICML that allow a modeler to capture associations related to risk. This risk association is then used by the interpreter to constrain replica placement decisions in an attempt to minimize the risk of failures affecting both the primary and its replica(s).
Failover Units (FOU)
FOUs are used to model FT requirements on a component or string. The FOU specifies the number of replicas (among other things) and is used by the interpreter to inject replica components into the deployment and perform the correct connection establishment.
Constraint-Based Node Assignment
• Offline analysis and planning
• Metrics
  – Composite Distance
  – Distance to primary
  – Comparing two placements
  – Penalties
  – Uniformity
  – Replica Pair-wise Distance (future)
  – Co-Failure Probability (another formulation)
FT Interpreter
• Injection
  – Components
  – Connections - CCM IOGR
• Placement
![Page 55: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/55.jpg)
RACE Demo and Workshop
FT Modeling Demonstration: Scenario Three
Example of an offline constraint placement approach within the interpreter
Third Demo Scenario:
1. Model Components and Strings in PICML
2. Create Deployment Plan
3. Model FOU
4. Model SRG
5. Interpreter Automatically Injects Replicas and Associated CCM IOGRs
6. Replicas placed according to distance-based constraint algorithm using SRG information.
[Figure: scenario three diagram. Labels: GME/PICML model; Model Information (Domain, Deployment, SRG, and FOU); FT Interpreter (injection, Replica Placement Algorithm); Deployment Plans; Plan Viewer.]
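A distance-based placement using SRG information could look like the sketch below. The slides name the metrics but do not define them, so the distance used here (the number of shared-risk groups that two nodes do not have in common) is an assumption, as are the function names and the SRG labels. Only the "distance to primary" metric is modeled; pair-wise replica distance, penalties, and uniformity are left out.

```python
def distance(node_a, node_b, srg):
    """Assumed metric: count of shared-risk groups the two nodes do not share."""
    return len(srg[node_a] ^ srg[node_b])


def place_replicas(primary_node, replica_count, srg):
    """Pick replica nodes as far as possible (by SRG distance) from the primary.

    srg maps each node name to the set of shared-risk groups it belongs to.
    Ties are broken by node name so the result is deterministic.
    """
    candidates = [node for node in srg if node != primary_node]
    candidates.sort(key=lambda node: (-distance(primary_node, node, srg), node))
    return candidates[:replica_count]


if __name__ == "__main__":
    # Hypothetical SRG membership: nodes sharing a rack or room share a risk.
    srg = {
        "n1": {"rackA", "room1"},
        "n2": {"rackA", "room1"},
        "n3": {"rackB", "room1"},
        "n4": {"rackC", "room2"},
    }
    print(place_replicas("n1", 2, srg))
```

With the FOU supplying the replica count and the SRG model supplying the group membership, the interpreter would then inject the replicas on the chosen nodes.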
![Page 56: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/56.jpg)
RACE Workshop Slides
Panel Support Material
![Page 57: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/57.jpg)
RACE Demonstration and Workshop
Model Driven DRM: Tool Suite
RACE is an extensible CCM framework that integrates multiple resource management algorithms for dynamically (re)deploying and (re)configuring application components.
RACE decouples resource allocation and system adaptation logic from the underlying middleware deployment, configuration, and control mechanisms.
[Figure: tool suite diagram with elements PICML, Flat Deployment Plan, RACE, Flat Deployment Plan (modified), DAnCE, and Hierarchical Plan]
![Page 58: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/58.jpg)
RACE Demonstration and Workshop
RACE: A Primer
• Pluggable Input Adapters are responsible for translating input provided to RACE into IDL data structures.
• The Plan Analyzer is responsible for examining metadata in the plan and selecting pluggable planners to be run on the plan.
• The Plan Manager executes the planners selected by the Plan Analyzer.
• Pluggable Output Adapters are responsible for translating the provisioned deployment plans into a native format for deployment.
• The Controller is responsible for reacting to events presented by the Monitors and actuating any required changes to the configuration and deployment through deployed agents.
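The flow through the Input Adapter, Plan Analyzer, Plan Manager, and Output Adapter can be sketched roughly as below. This is an illustration only, not RACE code: the `Plan` class, the `PLANNERS` registry, the two toy planners, the `planners` and `nodes` metadata keys, and the `component@node` output format are all hypothetical names invented for this sketch. The Controller and Monitors are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Plan:
    """Hypothetical stand-in for a deployment plan and its metadata."""
    components: List[str]
    metadata: Dict[str, str] = field(default_factory=dict)
    placement: Dict[str, str] = field(default_factory=dict)


Planner = Callable[[Plan], Plan]


def round_robin_planner(plan: Plan) -> Plan:
    # Toy allocation planner: spread components over the listed nodes in order.
    nodes = plan.metadata.get("nodes", "node1").split(",")
    for i, comp in enumerate(plan.components):
        plan.placement[comp] = nodes[i % len(nodes)]
    return plan


def importance_tagger(plan: Plan) -> Plan:
    # Toy planner that only records that it ran.
    plan.metadata["importance_checked"] = "yes"
    return plan


# Hypothetical registry of pluggable planners, keyed by name.
PLANNERS: Dict[str, Planner] = {
    "allocate": round_robin_planner,
    "importance": importance_tagger,
}


def input_adapter(raw: dict) -> Plan:
    # Translate external input into the internal plan structure.
    return Plan(components=list(raw["components"]),
                metadata=dict(raw.get("metadata", {})))


def plan_analyzer(plan: Plan) -> List[Planner]:
    # Examine plan metadata and select the planners to run.
    names = plan.metadata.get("planners", "").split(",")
    return [PLANNERS[n] for n in names if n in PLANNERS]


def plan_manager(plan: Plan, planners: List[Planner]) -> Plan:
    # Execute the selected planners in order.
    for planner in planners:
        plan = planner(plan)
    return plan


def output_adapter(plan: Plan) -> str:
    # Translate the provisioned plan into a (made-up) native text format.
    return ";".join(f"{c}@{n}" for c, n in sorted(plan.placement.items()))


def run_pipeline(raw: dict) -> str:
    plan = input_adapter(raw)
    return output_adapter(plan_manager(plan, plan_analyzer(plan)))
```

Keeping planner selection (analyzer) apart from planner execution (manager) means a new planner only needs a registry entry, which is the kind of pluggability the bullets describe.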
![Page 59: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/59.jpg)
RACE Demonstration and Workshop
RACE: A Primer (continued)
The RACE infrastructure and tool chain provides…
A model-driven process that allows a complete description of the information required for managing, deploying, and configuring RTE applications.
A Platform-Independent Component Modeling Language (PICML) is used to capture all pertinent model elements (e.g., AIM and DIM).
Interpreters capture the information in an OMG-compliant DnC deployment specification.
Output from the model drives a flexible and extensible CCM-based Resource Allocation and Control Engine (RACE).
RACE analyzes and constructs deployment plans (deployable through DAnCE, for example) based on a plug-in framework in which planning steps such as allocation and schedulability analysis contribute to a final configuration.
RACE monitors and adjusts deployments based on prevailing conditions within its domain of control.
Figure labels: PICML model; RACE; CIAO/DAnCE
![Page 60: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/60.jpg)
RACE Demonstration and Workshop
Tool Chain: New Capability Highlights
The RACE demonstration highlights many of the new capabilities developed for the RACE framework and related tool chain, many of which are intended to support GT-4.
* Importance Attr. (supports GT-4)
* Static and Dynamic Plans (supports GT-4)
* Component Dynamic Placeability Attr. (supports GT-4)
* Shared Components (supports GT-4)
* Hierarchical Descriptors (supports GT-4)
* PICML Modifications (supports GT-4)
  o FT Elements
  o Shared Components
  o QoS Attributes
* DAnCE Modifications (supports GT-4)
  o ReDAC
  o Priority Control
  o Component-Process Mapping
  o Shared Component Support
* Web and Interactive Input Adapters
* RACE Control (supports GT-4)
  o EED Monitoring
  o Reactive control of OS priority based on importance
* WLG-2 Capabilities
  o Code Generation
  o BMW Integration
  o BDC Integration
* Ensemble Planner
* Target Manager
* Fault Model Elements
  o Failover Unit
  o Replication Group
  o CCM IOGRs
  o Shared Risk Group
  o Constraint-Based Allocation
    # metrics (e.g. distance, co-failure)
    # integrated in interpreter - offline analysis
    # motivates constraint-based allocation
![Page 61: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/61.jpg)
RACE Demonstration and Workshop
Tool Chain: Future Capabilities
The RACE framework and tool chain will require additional capabilities to support the current GT-4 CONOPS.
* RACE Follow-on work
  o Plan State (supports GT-4)
    + ReDAC Integration
    + (Re)plan on Importance
    + Include FT simplex deployments
    + Integration of Node Alive Solution
  o Events on plan progress and status (supports GT-4)
  o Warfighter Value/Importance Constraints on Placement (supports GT-4)
  o Submission of Multiple Plans Simultaneously (supports GT-4)
* Multi-D Planner
  o Multiple Heuristics: FFD, WFD, BFD, Efficient Subset Sums
  o Modeled 3 dimensions: CPU Utilization, Memory, Network Bandwidth
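As a rough illustration of the Multi-D Planner idea, the sketch below assumes that FFD refers to the standard first-fit decreasing bin-packing heuristic and applies it across the three modeled dimensions (CPU utilization, memory, network bandwidth). The function names, the data layout, and the choice to sort components by their largest single-dimension demand are assumptions made for this example; the slides do not describe how the actual planner orders or places components.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# (cpu utilization, memory, network bandwidth), in whatever units the nodes use.
Demand = Tuple[float, float, float]


def fits(demand: Sequence[float], free: Sequence[float]) -> bool:
    # A component fits only if every dimension fits.
    return all(d <= f for d, f in zip(demand, free))


def first_fit_decreasing(components: Dict[str, Demand],
                         nodes: Dict[str, Demand]) -> Optional[Dict[str, str]]:
    """Return a component-to-node placement, or None if some component fits nowhere."""
    free: Dict[str, List[float]] = {name: list(cap) for name, cap in nodes.items()}
    # "Decreasing": handle the most demanding components first.
    # Sorting by the largest single dimension is one possible ordering (assumed here).
    order = sorted(components, key=lambda c: max(components[c]), reverse=True)
    placement: Dict[str, str] = {}
    for comp in order:
        demand = components[comp]
        for node in nodes:  # "First fit": take the first node with enough room.
            if fits(demand, free[node]):
                for i, d in enumerate(demand):
                    free[node][i] -= d
                placement[comp] = node
                break
        else:
            return None  # No node can host this component.
    return placement
```

Under the same assumption about the acronyms, a worst-fit or best-fit variant would differ only in the inner loop, picking the feasible node with the most or the least remaining capacity instead of the first one.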
![Page 62: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/62.jpg)
RACE Demo and Workshop
Shared Risk Group (SRG): Example
Figure labels: Ship_SRG; DataCenter1_SRG; DataCenter2_SRG; Rack1_SRG; Rack2_SRG; Shelf1_SRG (appears twice); Shelf2_SRG; Node1 (blade31); Node2 (blade32); Blade29; Blade30; Blade33; Blade34; Blade36
![Page 63: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/63.jpg)
RACE Demo and Workshop
Shared Risk Group (SRG): Example (continued)
Choose a feasible replica placement based on Composite Distance constraints.
Figure labels: Ship_SRG; DataCenter1_SRG; DataCenter2_SRG; Rack1_SRG; Rack2_SRG; Shelf1_SRG (appears twice); Shelf2_SRG; Node1 (blade31); Node2 (blade32); Blade29; Blade30; Blade33; Blade34; Blade36; Primary (P); Replica1 (R1); Replica2 (R2); Replica3 (R3); Composite Distance
![Page 64: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/64.jpg)
RACE Demo and Workshop
Shared Risk Group (SRG): Distance Metric Calculation
Choose a feasible replica placement based on Composite Distance constraints.
Figure labels: P; R1; R2; R3
Formulation of Replica Distance from Primary
Define N orthogonal vectors, one for each of the distance values computed for the N components (with respect to a primary), and vector-sum these to obtain a resultant. Compute the magnitude of the resultant as a representation of the composite distance captured by the placement.
1. Compute the distance from each of the replicas to the primary for a placement.
2. Record each distance as a vector, where all vectors are orthogonal.
3. Add the vectors to obtain a resultant.
4. Compute the magnitude of the resultant.
5. Use the resultant in all comparisons (either among placements or against a threshold).
6. Apply a penalty function to the composite distance (e.g. pair-wise replica distance or uniformity).
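Because the per-replica distance vectors are orthogonal, the magnitude of their sum is the square root of the sum of the squared distances. The sketch below works through steps 1 to 5 under stated assumptions. The SRG nesting in `SRG_PATH` is illustrative only and is not taken from the figure. The per-node distance (number of SRG levels climbed to reach the innermost shared group) is one possible metric that the slides do not specify. The penalty function of step 6 is left out.

```python
import math
from itertools import combinations
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical SRG ancestry per node, outermost group first (assumed nesting).
SRG_PATH: Dict[str, Tuple[str, ...]] = {
    "blade29": ("Ship", "DataCenter1", "Rack1", "Shelf1"),
    "blade30": ("Ship", "DataCenter1", "Rack1", "Shelf1"),
    "blade33": ("Ship", "DataCenter1", "Rack1", "Shelf2"),
    "blade34": ("Ship", "DataCenter1", "Rack2", "Shelf1"),
    "blade36": ("Ship", "DataCenter2", "Rack1", "Shelf1"),
}


def node_distance(a: str, b: str) -> int:
    """Assumed metric: 0 for the same node, otherwise levels up to the shared SRG."""
    if a == b:
        return 0
    path_a, path_b = SRG_PATH[a], SRG_PATH[b]
    shared = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        shared += 1
    return len(path_a) - shared + 1


def composite_distance(primary: str, replicas: Sequence[str]) -> float:
    # Steps 1-4: one orthogonal vector per replica distance, then the
    # magnitude of their sum, which is sqrt(d1^2 + d2^2 + ... + dN^2).
    return math.sqrt(sum(node_distance(primary, r) ** 2 for r in replicas))


def best_placement(primary: str, candidates: Sequence[str], n_replicas: int,
                   threshold: float = 0.0) -> Optional[Tuple[str, ...]]:
    # Step 5: compare placements by composite distance and against a threshold.
    options = [c for c in combinations(candidates, n_replicas) if primary not in c]
    if not options:
        return None
    best = max(options, key=lambda c: composite_distance(primary, c))
    return best if composite_distance(primary, best) >= threshold else None
```

With the primary on `blade29` in this made-up hierarchy, replicas on `blade33`, `blade34`, and `blade36` sit at distances 2, 3, and 4, so the composite distance is sqrt(4 + 9 + 16) = sqrt(29).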
![Page 65: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/65.jpg)
RACE Demo and Workshop
Failover Unit (FOU): Component FOU Example
Figure labels: six container/component servers, each with an FPC and an HB, hosting A, B, C, A’, B’, and C’ respectively; “client”; IOGR (repeated); primary IOR; secondary IOR; periodic FPC heartbeat
![Page 66: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/66.jpg)
RACE Demo and Workshop
Failover Unit (FOU): OpString FOU Example
Figure labels: “client”; IOGR; primary IOR; secondary IOR; primary string (A, B, C); replica string (A’, B’, C’); container/component server (six); FPC; HB; periodic FPC heartbeat; intra-FOU heartbeat
![Page 67: Use or disclosure of this data outside the ARMS Program or Government is restricted without the express written permission of the Lockheed Martin Company](https://reader033.vdocuments.net/reader033/viewer/2022051621/5697bf861a28abf838c87cbf/html5/thumbnails/67.jpg)