Evaluation of Xen: Performance and
Use in Parallel Applications
EECE 496 Project Report
Prepared by Caleb Ho (38957023)
Supervisor: Matei Ripeanu
Date: April 12, 2007
ABSTRACT
Xen is an open-source virtual machine monitor under heavy development. In this project,
the performance of Xen and its use in parallel applications are investigated. It is found
that Xen performs close to native performance in the area of computation, but lags behind
in other areas. Furthermore, to increase the fault tolerance of parallel applications, the
naïve checkpoint technique is analyzed and determined to be feasible using Xen’s
save/restore functionalities.
TABLE OF CONTENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF ILLUSTRATIONS
GLOSSARY
LIST OF ABBREVIATIONS
1.0 INTRODUCTION
2.0 METHODOLOGY
2.1 Performance Evaluation of Xen
2.2 Checkpoint techniques for parallel applications
3.0 EXPERIMENTS
3.1 Performance Evaluation of Xen
3.1.1 UnixBench v4.0.1
3.1.2 Intel MPI Benchmark Suite (IMB) v3.0
3.2 Checkpoint techniques for parallel applications
4.0 RESULTS
4.1 Performance Evaluation of Xen
4.1.1 UnixBench Results
4.1.2 Intel MPI Benchmark (IMB) results
4.2 Checkpoint techniques for parallel applications
4.2.1 save/restore space/disk results
4.2.2 Naïve checkpoint results
4.3 Difficulties and Challenges
4.4 Future work
5.0 CONCLUSIONS
6.0 REFERENCES
APPENDICES
Appendix A: UnixBenchResultsParse.py
Appendix B: genShellScript.py
Appendix C: MPIResultsParse.py
Appendix D: save.py
Appendix E: restore.py
LIST OF ILLUSTRATIONS
Figure 1. Raw Output of the Benchmark Suite
Figure 2. PingPong Operation [9]
Figure 3. UnixBench Results
Figure 4. UnixBench Results Test Legend
Figure 5. MPI PingPong results
Table 1. Naïve checkpoint results
GLOSSARY
Checkpoint – save the state of an operation to be restored later in the case of a failure
Cluster – a collection of nodes that work on a computation problem together by dividing
the problem into smaller tasks
Guest – a virtual machine created in Xen
Initrd – a temporary file system used by the Linux kernel during boot
Kernel – a piece of software responsible for providing secure access to the machine's
hardware to various computer programs
Native machine – the machine running the operating system without virtualization
Node - a computational processor or machine in parallel computing
Open-source – a program whose source code is made available for use or modification
Para-virtualization – a software interface that runs on top of the virtual machine
monitor to mimic the underlying hardware
Full-virtualization – a complete simulation of the underlying hardware by the virtual
machine monitor that requires special hardware support
Parallel application – a program that uses cooperative nodes to perform parallel
computing
Parallel computing - the simultaneous execution of the same task on multiple processors
or machines in order to obtain results faster [1].
PingPong – message passing between two nodes, where the nodes take turns sending a
message to each other
Virtual Machine – also called “hardware virtual machine”, is a self-contained operating
environment that behaves as if it is a separate computer. In Xen, virtual machines that are
created are called guests.
Xen – an open-source virtual machine monitor
LIST OF ABBREVIATIONS
CPU – Central processing unit
I/O – input output
IMB – Intel MPI Benchmark Suite
MPI - Message Passing Interface
MPICH2 – an MPI implementation version 2
1.0 INTRODUCTION
Virtual machines are often used in software development, testing, and analysis as they
provide benefits such as isolation, standardization, consolidation, ease of testing, and
mobility [2]. There are currently several virtual machine monitors available in the market,
of which one of the most popular is Xen. Xen is an open-source virtual machine monitor
under heavy development that has shown exceptional level of performance [3].
Furthermore, Xen has a built-in save/restore functionality that allows a user to save and
restore the state of a virtual machine.
In this project, the performance of Xen and its use in parallel applications are
investigated. Specifically, one of the main objectives of this project is to execute a
quantitative performance comparison between the native machine and Xen. Because
there are currently only a handful of characterizations of Xen, as it is still a developing
product, its performance results should be of interest to the virtual machine
community.
Another objective of this project is to analyze the feasibility of using the save/restore
functionalities in Xen for parallel applications. During parallel computing, the failure of a
node is normal due to factors such as hardware failures, power outages, or software
problems, which would lead to failure of the entire computation. In order to retain the
computational efforts before a particular node fails, common techniques such as
duplication, logging, and check-pointing can be used [3]. Using Xen, users might be able
to implement check-pointing algorithms on their parallel computing applications to
increase their fault tolerance.
In this project, several Xen virtual machines on one physical machine are installed and
configured. Afterwards, the performance of Xen is evaluated using two benchmark
suites: UnixBench [5], and Intel MPI Benchmark Suite [9]. Lastly, different test cases are
designed and executed to analyze the feasibility of using the save/restore functionalities
of Xen to checkpoint parallel applications. This project was performed alone, and was
supervised by Matei Ripeanu.
This project specifically deals with Xen and with checkpoint techniques using Xen’s
save/restore functionalities. This report is divided into the following primary sections:
methodology, experiments, results, and conclusions.
2.0 METHODOLOGY
This project naturally divides into two parts. Firstly, a performance evaluation of Xen is
to be executed. Secondly, a feasibility analysis of using Xen's save/restore
functionalities for check-pointing parallel applications is to be designed and carried out.
In order to design the appropriate tests, the different modes under which Xen can create a
virtual machine should be considered. There are two modes – para-virtualization and full
virtualization. Para-virtualization requires the operating system to be explicitly ported to
run on top of Xen, which provides a software interface similar to that of the
underlying hardware; full-virtualization provides a complete simulation of the
underlying hardware, but requires special hardware support. In order to evaluate both
modes, the test hardware chosen has the support required for full virtualization. For this
project, we have chosen a Dell E520 with an Intel processor that supports full
virtualization, which is the only real constraint on the selection of the machine.
Before conducting the experiments, a correct testbed must be configured and set up. The
testbed is a Dell E520 with virtualization technology, running the Fedora Core 6
distribution of Linux. First, a new partition of the hard drive is created using GParted [4]
so that Linux can be installed. Xen is then installed, using the fc6-kernel (2.6.19-
1.2911.fc6) for native tests, the xen-kernel (2.6.19-1.2911.fc6xen) for para-virtualized
tests, and the xen-hvm-loader (/xen/boot/hvmloader) for full-virtualized tests.
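Each guest is described to Xen by a small configuration file. As an illustration, a
minimal para-virtualized guest configuration might look as follows; every name, path,
and value here is hypothetical, not the project’s actual settings:

# Hypothetical guest configuration (e.g. /etc/xen/nodeA) -- values illustrative
name    = "nodeA"
kernel  = "/boot/vmlinuz-2.6.19-1.2911.fc6xen"
ramdisk = "/boot/initrd-2.6.19-1.2911.fc6xen.img"
memory  = 256                                 # MB of RAM given to the guest
disk    = ['file:/var/xen/nodeA.img,xvda,w']  # file-backed disk image
vif     = ['ip=192.168.1.11']                 # static IP on the virtual network
root    = "/dev/xvda ro"

A guest described this way is then started with “xm create nodeA”.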
2.1 Performance Evaluation of Xen
The performance evaluation of Xen can be further categorized into the following areas:
computation, process creation and execution, file system operations, concurrency,
process I/O, and network I/O. In order to cover all these areas of performance, two
benchmarks were chosen: UnixBench 4.0.1, and Intel MPI Benchmark Suite (IMB) v3.0.
UnixBench consists of ten different tests that cover all the areas mentioned except I/O,
whereas IMB consists of over ten tests that measure I/O performance; IMB is based on
the MPICH2 implementation of the Message Passing Interface (MPI) standard. IMB is
chosen because MPI is needed for the network I/O tests, as will be detailed later. While
other benchmarks could have been chosen, these two benchmark suites were chosen
because they are free, complete, and simple to run.
In order to evaluate network I/O, a virtual cluster, which is a network of virtual machines,
is configured and set up using Xen on a single physical machine. Using a single physical
machine instead of multiple physical machines reduces the amount of equipment that
must be purchased for testing. In addition, Xen provides built-in functions for saving and
restoring a virtual
machine's state. For coordination between the virtual machine nodes during parallel
computing, MPICH2, which is an implementation of the Message Passing Interface
(MPI) Standard, is used. This implementation is widely used in the parallel computing
community.
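For illustration, MPICH2 at the time used the mpd process manager, so a two-node
session would typically be started along the following lines (the host-file name and
benchmark invocation are illustrative, not the project’s exact commands):

mpdboot -n 2 -f mpd.hosts          # start the mpd ring on the 2 hosts in mpd.hosts
mpdtrace                           # verify that both nodes joined the ring
mpiexec -n 2 ./IMB-MPI1 PingPong   # launch the benchmark across the ring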
2.2 Checkpoint techniques for parallel applications
Parallel computing is often used to speed up computation problems that could take days,
months, or even years to complete. Some practical applications of parallel computing in
the scientific and engineering computing field include computational electromagnetics,
industrial environmental flows, and groundwater flow models [2].
As noted in the introduction, node failures during parallel computing are normal and
would lead to failure of the entire computation; duplication, logging, and check-pointing
are common techniques to retain the computational effort spent before a node fails [3].
Check-pointing is a common technique that allows a user to save the current state of an
operation, and then later restore to a pre-failure state if an error ever occurs. Xen has
built-in functions, namely “save” and “restore”, that allow a user to save the state of a
virtual machine to a file, and to restore that virtual machine to the saved state from the
file at a later time.
A difficulty with check-pointing for parallel applications arises because every node needs
to have the same state. In other words, for a parallel application with node A and node B,
where the user invokes a checkpoint at time t, there could be an inconsistent state
between nodes A and B. For example, at the point of the checkpoint, a message might be
in transit from node A to B, where node A knows about the transfer, but B does not know
about the message since it has not yet received it – the two nodes would have different
views of the system and hence save different states.
There are checkpoint techniques that can be done without Xen’s save/restore. For
example, synchronization can be done at the application level, where the application
would have functionality to signal its internal functions for a checkpoint. Although this
method is more reliable, it also makes the job of the developer more difficult. In the rest
of this report, all references to check-pointing refer to check-pointing using Xen’s
save/restore functionalities unless otherwise specified.
The original intent was to design and implement various checkpoint techniques using
Xen’s save/restore functionalities. Because of time constraints, only one checkpoint
technique is analyzed, which is to checkpoint naively by scripts without any kind of
explicit synchronization between the nodes. Specifically, “save” would be called for the
nodes involved at the same instant, and then “restore” would be called immediately
after “save” completes – which is equivalent to a simple checkpoint with no guarantees.
In order to evaluate the success/failure condition of a checkpoint technique, a modified
version of IMB’s PingPong test is used. The test runs continuously, is check-pointed,
and resumes running after various idle intervals. If the test continues to run, the
checkpoint is considered successful.
3.0 EXPERIMENTS
3.1 Performance Evaluation of Xen
To evaluate Xen, two benchmark suites are used: UnixBench 4.0.1 and the Intel MPI
Benchmark Suite (IMB) v3.0. In the following sections, the tests within these suites are
described briefly.
3.1.1 UnixBench v4.0.1
UnixBench is an open-source benchmarking tool [5] consisting of 27 tests that cover the
areas of computation, process creation and execution, file-system operations,
and concurrency. The metrics given are either bytes per second or loops per second,
where a higher number denotes better performance.
Some of the computation benchmarks include Dhrystone [6], arithmetic tests of integer,
double, float, and various other types, a compiler throughput test, and a Tower of Hanoi
recursion test [7]. For processes, tests that measure system call overhead, process
creation, execl throughput [8], pipe throughput, and context switching are used. For file-
system operations, various block sizes are tested for both reads and writes. For
concurrency, shell scripts running concurrently are used. Overall, this benchmark
suite is a good tool to evaluate the areas mentioned for a system.
The benchmark suite is run 10 times each on the native machine (i.e. with no
virtualization), on one para-virtualized guest, and on one full-virtualized guest separately.
The average and standard deviation for each test is computed, and the performance
measurements for the virtualized guests are normalized to the native performance such
that a comparison can be done easily.
In order to run the tests and capture the results, simple shell scripts were used. The
following figure is a screenshot of the raw output from running the suite once.
Figure 1. Raw Output of the Benchmark Suite
In order to format the data and compute averages and standard deviations, a Python
script was written to parse the results and generate a summary file. The script opens all
the related result files, extracts the corresponding value for each test, and
computes the average and standard deviation of each test. The script file is included in
Appendix A.
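As a simplified illustration of the summary computation (the run values below are
made-up placeholders; the real values are parsed from the UnixBench output files by the
script in Appendix A):

# Hypothetical scores from 10 runs of one test; real values come from
# the UnixBench output files parsed by the script in Appendix A.
native_runs = [1480.2, 1475.9, 1490.1, 1482.7, 1478.3,
               1485.0, 1479.6, 1488.4, 1481.1, 1476.8]
para_runs = [741.3, 738.6, 744.0, 739.8, 742.5,
             740.1, 743.2, 738.9, 741.7, 740.4]

def summarize(samples):
    # Return the average and standard deviation of a list of scores.
    average = sum(samples) / len(samples)
    variance = sum([(x - average) ** 2 for x in samples]) / len(samples)
    return average, variance ** 0.5

nativeAvg, nativeStd = summarize(native_runs)
paraAvg, paraStd = summarize(para_runs)
# Normalizing to native: a value of 1.0 means parity with native performance.
print "normalized para-virtualized score:", "%.2f" % (paraAvg / nativeAvg)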
3.1.2 Intel MPI Benchmark Suite (IMB) v3.0
The Intel MPI Benchmark Suite (IMB) is an open-source benchmarking tool that is
targeted towards benchmarking the I/O of a system. Specifically, it is based on the
MPICH2 implementation of the Message Passing Interface (MPI) standard, which is
often used in parallel applications. It consists of thirteen benchmarks; for this project,
only the most basic one, PingPong, was used to compare the I/O performance of Xen.
The following diagram is an illustration of PingPong, where X bytes is the variable size
of the message to be sent in a ping-pong. The time for the message to be sent and
received again is used to measure the performance of the operation. Hence, a shorter
time difference denotes better performance.
Figure 2. PingPong Operation [9]
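The benchmark itself is the C-based IMB implementation; purely to make the pattern
concrete, a minimal sketch of a PingPong in Python (using the mpi4py bindings, which
are an assumption of this sketch and were not used in the project) could look like this:

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
msg = bytearray(1024)              # X bytes; the message size is illustrative

comm.Barrier()                     # line both processes up before timing
t0 = MPI.Wtime()
if rank == 0:
    comm.Send(msg, dest=1)         # ping: send X bytes to the other process
    comm.Recv(msg, source=1)       # ...then wait for the echo
else:
    comm.Recv(msg, source=0)       # pong: receive the message
    comm.Send(msg, dest=0)         # ...and echo it back
t1 = MPI.Wtime()

if rank == 0:
    # Half the round-trip time approximates the one-way message time.
    print "one-way time (s):", (t1 - t0) / 2

Such a script would be launched with, e.g., “mpiexec -n 2 python pingpong.py”.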
As with UnixBench, the tests are run on the native machine, a para-virtualized guest,
and a full-virtualized guest. Unlike the UnixBench benchmarks, the PingPong
benchmark requires two processes to operate. Hence, the benchmark is run ten times in
each of the following setups:
- 2 processes on 1 machine (no virtualization)
- 2 processes on 1 para-VM
- 2 processes on 2 para-VMs
- 2 processes on 1 full-VM
- 2 processes on 2 full-VMs
By performing the above tests, we can evaluate the I/O of Xen. Ideally, a test case for 2
processes on 2 physical machines would also be run, but such a case requires two
physical machines.
Again, scripts were written to deploy the tests and format the results, which can be found
in Appendix B and Appendix C.
3.2 Checkpoint techniques for parallel applications
First, the space/time tradeoff for Xen’s save/restore functionalities is analyzed. By using
“time <command>”, the user time and the CPU usage can be determined.
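For example, a single save operation on a guest named nodeA (as in the scripts below)
can be timed with:

time xm save nodeA

where time reports the elapsed and user time consumed by the command.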
In order to evaluate different checkpoint techniques, a modified version of the Intel MPI
Benchmark Suite’s PingPong test is again used. Specifically, the messages used in the
PingPong are configured to be large (150 MB) so that the test runs for more than 5
minutes, such that failure conditions can be observed. For simplicity, only two nodes are
used in the evaluation.
In order to determine the success criterion of a checkpoint technique, the failure
condition of the PingPong test is first identified. Xen has pause/unpause functionality for
virtual machines. The transient failure of a node can be simulated by pausing the
execution of a node for a desired period of time. In order to measure the time required
for the PingPong test to detect a failure, one node is paused for an indefinite amount of
time until an error occurs on the other node. Over five trials, the test detects a failure in
the range of 20 to 22 minutes, which can be explained by the default time-out period of
20 minutes for a Transmission Control Protocol (TCP) connection.
As noted above, only one checkpoint technique is analyzed because of time constraints:
naive check-pointing by scripts without any explicit synchronization between the nodes.
Specifically, “save” would be called for the nodes involved at the same instant.
This can be done by the following code,
xm save nodeA &
xm save nodeB &
which allows the saving of both virtual machine nodes to begin at the same time in the
background. The saving of the nodes is handled by the operating system and Xen, which
provide no guarantees on the states. The full script can be found in Appendix D.
The nodes can then be restored after a desired amount of time with the following code,
xm restore nodeA &
xm restore nodeB &
which restores both virtual machine nodes at the same time in the background. Again,
the two nodes are not guaranteed to be restored simultaneously. The full script can be
found in Appendix E. Because of the nature of TCP, packets can be lost and
retransmitted. Hence, the application can tolerate some loss naturally, and the failure
condition can still be evaluated.
The following test cases are performed to evaluate the naive check-pointing, where the
two nodes are identified as node A and node B (a driver sketch for one case follows the
list):
- save and stop node B, then restore it after 5 minutes
- save and stop node B, then restore it after 10 minutes
- save and stop node B, then restore it after 20 minutes
- save and stop nodes A and B, then restore them after 5 minutes
- save and stop nodes A and B, then restore them after 10 minutes
- save and stop nodes A and B, then restore them after 1 hour
- save and stop nodes A and B, then restore them after 1 day
By performing the above tests, the feasibility of a checkpoint technique for a PingPong
application can be determined.
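As a sketch, the first nodes-A-and-B case can be driven by a small script that combines
the save and restore helpers of Appendices D and E:

import os, time

nodes = ["nodeA", "nodeB"]
for n in nodes:
    os.system("xm save " + n + " &")     # save both nodes in the background

time.sleep(5 * 60)                       # idle interval: 5 minutes

for n in nodes:
    os.system("xm restore " + n + " &")  # restore both nodes in the background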
4.0 RESULTS
4.1 Performance Evaluation of Xen
4.1.1 UnixBench Results
The following graph shows the results for the UnixBench benchmark, where the
different colors represent the different testbeds – native, para-virtualized, and
full-virtualized. The x-axis is the benchmark test number, denoted in Figure 4 below;
the y-axis is the normalized performance value. A higher number denotes better
performance.
[Bar chart: normalized UnixBench values (0 to 2) for tests 1-26, with series Native,
Para-Virtualized, and Fully Virtualized.]
Figure 3. UnixBench Results
As seen in the above graph, the performance of para-virtualization is close to native for
computation, while the file-system performance of para-virtualization even exceeds
native performance. A possible explanation is that, since para-virtualization is performed
in software, Xen likely caches the operation request and sends a “complete” to the
application before actually completing the task.
However, for performance relating to processes/pipes (tests 5-7) and concurrency (tests
17-19), para-virtualization achieves only half the native performance, while
full-virtualization performs even worse.
Figure 4. UnixBench Results Test Legend
4.1.2 Intel MPI Benchmark (IMB) results
The following graph shows the results for the MPI PingPong benchmark, where the
different colors represent the different testbeds:
- 2 processes on 1 machine (native1Machine)
- 2 processes on 1 para-VM (para1VM)
- 2 processes on 2 para-VMs (para2VM)
- 2 processes on 1 full-VM (Full1VM)
- 2 processes on 2 full-VMs (Full2VM)
The x-axis is the message block size; the y-axis is the throughput value. A higher
number denotes better performance.
Figure 5. MPI PingPong results
As seen in the above graph, para-virtualization performs slightly slower than native,
while full-virtualization is significantly slower than native for all block sizes.
4.2 Checkpoint techniques for parallel applications
4.2.1 save/restore space/disk results
The “save” function of Xen on the test machine takes on average less than one second of
user time and 3% CPU, but requires ~133 MB of disk space; the “restore” function of
Xen likewise takes less than one second of user time.
4.2.2 Naïve checkpoint results
Test case                                                        Result
save and stop node B, then restore it after 5 minutes            No Failure
save and stop node B, then restore it after 10 minutes           No Failure
save and stop node B, then restore it after 25 minutes           Failure
save and stop nodes A and B, then restore them after 5 minutes   No Failure
save and stop nodes A and B, then restore them after 10 minutes  No Failure
save and stop nodes A and B, then restore them after 1 hour      No Failure
save and stop nodes A and B, then restore them after 1 day       No Failure
Table 1. Naïve checkpoint results
According to the results above, naïve check-pointing using Xen’s save/restore
functionalities is feasible for applications similar to the PingPong benchmark. Because
total failure occurs after roughly 20 minutes, checkpoints should be made at least once
every 20 minutes.
4.3 Difficulties and Challenges
There have been numerous challenges and roadblocks in the process of setting up the
testbed. Several factors contributed to the problems: hardware incompatibility, my
unfamiliarity with the platform and software, and the immaturity of Xen. The details of
the challenges are described in this section.
At first, the testbed was to be installed and deployed on an IBM ThinkPad laptop, which
was chosen for its portability. However, during the course of setup, it was found that Xen
3.0 required Physical Address Extension (PAE), a feature the laptop did not support.
This requirement was not obvious in the documentation of Xen at the time the laptop
was purchased for this project. Consequently, identifying the problem and looking for
workarounds delayed the original schedule of the project. After considerable effort to get
Xen operational on the laptop, a new desktop computer was purchased instead – the Dell
E520 with full virtualization support.
The installation of Linux on the test hardware (Dell E520) did not go as smoothly as
expected. There was trouble partitioning the disk using a previously used method
(QtParted [11]) because the Linux rescue CD would not mount, for hardware-
configuration reasons. An alternative method (GParted [4]) was found, but the problem
had already delayed the schedule.
There were also problems during the installation and configuration of Xen. Because
para-virtualization requires a software interface to mimic the underlying hardware, a
modification to the Xen kernel was required for my hardware. Originally, third-party
pre-built disk images [10] were to be used to reduce development time. However, due to
hardware differences, these images did not function. Finally, after many failed attempts
and trials of different workarounds suggested by the Xen community, it was found that
the recent Xen kernel was missing essential modules for my setup; as a workaround, a
new initrd was built, based on the original initrd, with the missing modules added.
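On Fedora, such an initrd can be rebuilt with mkinitrd; the module and version names
below are illustrative of the kind of command used, not the exact one:

# Rebuild the Xen kernel's initrd, forcing a missing module into it
# (the module name and kernel version are illustrative)
mkinitrd -f --with=sata_nv /boot/initrd-2.6.19-1.2911.fc6xen.img 2.6.19-1.2911.fc6xen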
Problems were also encountered with the virtual Ethernet hardware. To date, there is still
no active Internet connection from the nodes to the outside world. However, to work
around this problem, a virtual local area network between the nodes has been set up
using static IP addresses, which is sufficient for this project's purposes.
4.4 Future work
In this project, the Fedora Core 6 distribution of Linux was used. Future experiments
could instead use different versions of Windows, as well as other operating systems that
support Xen. Furthermore, in this project, only one physical machine was used; in the
future, the effects of using two physical machines for testing network I/O could be
investigated.
Since only one checkpoint technique was investigated, more techniques should be
designed and tested in the future. In addition, instead of using a third-party test for
checkpoint feasibility, custom software could be developed. Such software would give
the tester more control over the tests, as well as provide more debugging information,
such as the number of messages lost and retransmitted.
5.0 CONCLUSIONS
In this project, the performance of Xen and its use in parallel applications were
investigated. Firstly, Xen was installed on the Fedora Core 6 distribution of Linux, and the
performance differences of native, para-virtualized, and full-virtualized machines were
compared using benchmark suites. Secondly, different test cases were designed and
executed to analyze the feasibility of using the save/restore functionalities of Xen to
checkpoint parallel applications. Specifically, a naïve checkpoint approach was used.
As Xen is still a maturing product, difficulties were faced while setting up the testbed
due to bugs in Xen and its lack of documentation.
For the performance evaluation, it was found that para-virtualization and full-
virtualization performed close to the native performance in the area of computation.
However, para-virtualization performed only half as well in the areas of processes/pipes
and concurrency compared to native; whereas full-virtualization performed only half as
well in the same areas as para-virtualization. In the area of I/O, para-virtualization
performed close to native performance, while full-virtualization performed ten times
worse. While Xen performs well in the area of computation, the overheads introduced in
the Xen virtualization, especially full virtualization, might be significant to user
applications.
For the checkpoint evaluation, it was found that the naïve checkpoint technique can be
used for parallel applications similar to the PingPong test performed. A modification to
the feasibility test could be developed, and more checkpoint techniques could be
investigated in the future.
6.0 REFERENCES
[1] “Parallel Computing”, http://en.wikipedia.org/wiki/Parallel_computing, April 2007.
[2] Chao, Wellie. "The Pros and Cons of Virtual Machines in the Datacenter",
http://www.devx.com/vmspecialreport/Article/30383, January 2006.
[3] “The difference between Xen & VMware”.
http://linux.inet.hr/the_difference_between_xen_and_vmware.html, November 2006.
[4] “GParted”. http://gparted.sourceforge.net/, April 2007.
[5] “UnixBench”. http://www.unixbench.org/, April 2007.
[6] “Dhrystone”. http://en.wikipedia.org/wiki/Dhrystone, April 2007.
[7] “Tower of Hanoi”. http://en.wikipedia.org/wiki/Tower_of_Hanoi, April 2007.
[8] “execl()”. http://mkssoftware.com/docs/man3/execl.3.asp, April 2007.
[9] Intel Corporation. “Intel Cluster Toolkit 3.0 for Linux”, April 2007.
[10] “Jailtime.org: Downloadable Images for Xen”. http://www.jailtime.org, April 2007.
[11] “QTParted”. http://qtparted.sourceforge.net/, April 2007.
APPENDICES
Appendix A: UnixBenchResultsParse.py
import sys

# Parse each UnixBench result file given on the command line, then print
# the average, standard deviation, and relative error of every test.
testList = []
for filename in sys.argv[1:]:
    newList = []
    curFile = open(filename)
    fileList = curFile.readlines()
    curFile.close()
    start = False
    for line in fileList:
        # Results start at the Dhrystone line and end at the Recursion Test line.
        if line.find('Dhrystone') != -1 and line.find('lps') != -1:
            start = True
        if start is True:
            line = line.replace(" lps", "lps").replace(" KBps", "KBps").replace(" lpm", "lpm")
            splitLine = line.split(" ")
            i = 0
            for eachItem in splitLine:
                if eachItem.find("lps") != -1 or eachItem.find("KBps") != -1 or \
                   eachItem.find("lpm") != -1:
                    number = splitLine[i]
                    newList.append([" ".join(splitLine[0:i]).rstrip(), number])
                    break
                i = i + 1
        if line.find('Recursion Test') != -1 and line.find('lps') != -1:
            break
    testList.append(newList)

numTests = len(testList)
for i in range(0, len(testList[0])):
    total = 0.0
    for j in range(0, len(testList)):
        number = float(testList[j][i][1].replace("lps", "").replace("KBps", "").replace("lpm", ""))
        total = total + number
    average = total / numTests
    total = 0.0  # reset the accumulator before summing squared deviations
    for j in range(0, len(testList)):
        number = float(testList[j][i][1].replace("lps", "").replace("KBps", "").replace("lpm", ""))
        total = total + pow((number - average), 2)
    stddev = pow(total / numTests, 0.5)
    errorRange = 0.0
    if average != 0.0:
        errorRange = stddev / average
    print testList[0][i][0], "\t", "%.2f" % average, "\t", "%.2f" % stddev, "\t", "%.2f" % errorRange
Appendix B: genShellScript.py
test = "";
for i in range(10):
test = test + "mpirun -n 2 ./IMB-MPI1 | tee test" + str(i) + ".txt;";
print test
Appendix C: MPIResultsParse.py
import sys

# Parse IMB result files given on the command line and print, for each
# benchmark, the first data column of every run.
fullList = []
current = 0
benchmark = ["PingPong", "PingPing", "Sendrecv"]
for i in range(0, len(benchmark)):
    testList = []
    for filename in sys.argv[1:]:
        newList = []
        curFile = open(filename)
        fileList = curFile.readlines()
        curFile.close()
        start1 = False
        start2 = False
        for line in fileList:
            if line.find(benchmark[current]) != -1 and line.find('Benchmarking') != -1:
                start1 = True
            if start2 is True:
                splitLine = line.replace("\n", "").split(" ")
                if len(splitLine) <= 3:
                    break
                splitLineMod = []
                for item in splitLine:
                    if item != '':
                        splitLineMod.append(item)
                value = splitLineMod[0]
                # NOTE: i is reused here as a row counter within the table.
                if len(newList) <= i:
                    newList.append(value)
                else:
                    newList[i] = value
                i = i + 1
            if start2 is False and start1 is True:
                if line.find("sec") != -1:
                    start2 = True
                    i = 0
        testList.append(newList)
    fullList.append([benchmark[current], testList])
    current = current + 1

for i in fullList:
    print i[0]
    for j in i[1]:
        for k in j:
            print k
Appendix D: save.py
import sys, os

# Save each guest named on the command line; the trailing "&" runs each
# "xm save" in the background so all saves are issued at the same time.
for i in sys.argv[1:]:
    os.system("xm save " + i + " &")
Appendix E: restore.py
import sys, os

# Restore each guest named on the command line; the trailing "&" runs each
# "xm restore" in the background so all restores are issued at the same time.
for i in sys.argv[1:]:
    os.system("xm restore " + i + " &")