making geni experiments repeatable and replicable
TRANSCRIPT
Sponsored by the National Science Foundation
Making GENI Experiments Repeatable and Replicable
Vic Thomas GENI Project Office
Sponsored by the National Science Foundation 2 Repeatable and Replicable Experiments in GENI www.geni.net
MAKING EXPERIMENTS REPEATABLE
Experiment Repeatability
Experiment is repeatable if: The measurement can be obtained with stated precision by the same team using the same measurement procedure, the same measuring system, under the same operating conditions, in the same location on multiple trials. For computational experiments, this means that a researcher can reliably repeat her own computation.
– The ACM Policy on Artifact Review and Badging
Real stories:
“My advisor wants me to collect more measurements but I don’t remember how to re-create my setup from four months ago.”
“My slice expired and I did not keep track of the software I installed and configured in my VMs to get things to work.”
Making GENI Experiments Repeatable
• Script your resource setup
  – GENI install/execute scripts (aka post-boot scripts)
  – Sysadmin tools such as Ansible
• Create custom OS images
  – Snapshot your VM after it is configured
  – Boot this VM instead of a standard GENI OS image
Don’t be a Bart Simpson!! Use a script!
From: Itchy & Scratchy & Marge
The Simpsons, Season 2, Episode 9
Install/Execute Scripts
• Scripts that run after your VM has booted
• Used to configure the VM for your experiment
  – Install and configure software
  – Start services
  – Change firewall settings
  – Configure network interfaces
  – …
Example: HelloGENI Experiment [http://groups.geni.net/geni/wiki/GENIExperimenter/Tutorials/RunHelloGENI]
1. Reserve 2 node topology
2. Install web server on server
   sudo apt-get -y install apache2
3. Edit web server config file (apache2.conf)
   …
   <Location /server-status>
   SetHandler server-status
   Allow from all
   </Location>
   ExtendedStatus On
   …
4. Start web server
   sudo /etc/init.d/apache2 force-reload
   sudo service apache2 restart
5. Install iperf on nodes
   sudo apt-get -y install iperf
6. Run iperf
   Server: iperf -s -i 10 &> $iperf_server_log
   Client: iperf -c 10.10.10.1 -P $i &> /tmp/iperf-logs/iperf_client.log
Not all steps in the experiment are shown here
HelloGENI using Install/Execute Scripts
#!/bin/bash
# hn, iperf_server_log and i are assumed to be set in the part of the script not shown
# Install iperf on VMs
sudo apt-get -y install iperf
if [ "$hn" == "server" ]
then
    # This is the server: configure and start the http server and start iperf server
    sudo apt-get -y install apache2
    # Edit Apache config file
    echo "<Location /server-status>" | sudo tee -a /etc/apache2/apache2.conf > /dev/null
    echo " SetHandler server-status" | sudo tee -a …
    echo " Allow from all" | sudo tee -a …
    echo "</Location>" | sudo tee -a …
    echo "ExtendedStatus On" | sudo tee -a …
    # Start the webserver
    sudo /etc/init.d/apache2 force-reload
    sudo service apache2 restart
    # Start the iperf server
    sudo bash -c "iperf -s -i 10 &> $iperf_server_log"
else
    # This is the client: run iperf client
    # Wait 60 seconds for server to come up
    sleep 60
    iperf -c 10.10.10.1 -P $i &> /tmp/iperf-logs/iperf_client.log
fi
Put all commands in a file.
Called an install/execute script in GENI.
Typically a shell or Python script.
Can have multiple install/execute scripts for each VM or different scripts for different VMs.
Scripts associated with a VM run automatically after the VM boots.
No assurance on the order in which these scripts will be run.
Snippet from the actual HelloGENI install script
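The client branch of the snippet sleeps for a fixed 60 seconds and hopes the server is up by then. One possible alternative, sketched here in Python (which the slide names as an option for install/execute scripts), is to poll the server's port until it accepts a connection. The function name wait_for_port, the timeout values, and the use of port 5001 (iperf's default) are assumptions for illustration and are not part of the HelloGENI script.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=2.0):
    """Poll until a TCP connection to (host, port) succeeds or timeout expires.

    Returns True if the port accepted a connection, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Hypothetical use on the client VM (10.10.10.1 is the server address
# used in this deck; 5001 is assumed to be the iperf server port):
#
#     if wait_for_port("10.10.10.1", 5001):
#         ...start the iperf client...
```

A probe like this replaces a guess about timing with a check, at the cost of one extra short-lived connection to the server.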
Adding Script to RSpec
Use the “node details” box in the Portal Add Resources page to enter:
– Script source (URL)
– Path to directory on VM where script is to be installed
– Command to execute on boot
Resulting RSpec (snippet):
…
<node client_id="server" exclusive="false">
  <sliver_type name="default-vm"/>
  <services>
    <install install_path="/local" url="https://myweb.com/hellogeni-install.tar.gz"/>
    <execute command="sudo /local/install-script.sh" shell="sh"/>
  </services>
  <interface client_id="server:if0">
    <ip address="10.10.10.1" type="ipv4" netmask="255.255.255.0"/>
  </interface>
</node>
…
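The services part of the RSpec snippet on this slide can also be generated rather than typed by hand. The following is a minimal sketch using only Python's standard library; the helper name build_services is hypothetical, and the sketch leaves out the XML namespace and the rest of the node element that a real RSpec needs.

```python
import xml.etree.ElementTree as ET


def build_services(tarball_url, install_path, command, shell="sh"):
    """Return a <services> element with one <install> and one <execute> child."""
    services = ET.Element("services")
    # Where to fetch the script archive from and where to unpack it on the VM.
    ET.SubElement(services, "install",
                  {"install_path": install_path, "url": tarball_url})
    # Command to run once the VM has booted.
    ET.SubElement(services, "execute", {"command": command, "shell": shell})
    return services


# Values copied from the RSpec snippet on this slide.
services = build_services(
    "https://myweb.com/hellogeni-install.tar.gz",
    "/local",
    "sudo /local/install-script.sh",
)
print(ET.tostring(services, encoding="unicode"))
```

Generating the element from three values keeps the script URL, install path, and boot command in one place if the same services block is needed on several nodes.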
Install/Execute Scripts
• Difficult to debug
  – The write-execute-debug cycle can be long
    • Need to delete and recreate resources to run script
  – Script runs as a special user (not you) and may not be able to access resources in your home directory
• Tip: Log in and run script manually
  – Speeds up write-execute-debug cycle
    • Eliminates need to delete and recreate resources
  – Script runs as you; may not catch errors that result from the script trying to access resources in your home directory when it is deployed
For more info and tips: http://groups.geni.net/geni/wiki/HowTo/WriteInstallScript
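Because a post-boot script runs unattended and as a different user, a record of what each step printed can shorten the debug cycle. The sketch below, for an install script written in Python, appends every command's output and exit code to a log file. The helper name run_step and the log location are assumptions; in a real script the log would go to an absolute path outside any home directory.

```python
import os
import subprocess
import tempfile


def run_step(description, argv, log_path):
    """Run one setup command, append its output to log_path, return the exit code."""
    result = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    with open(log_path, "a") as log:
        log.write(f"== {description}: {' '.join(argv)}\n")
        log.write(result.stdout)
        log.write(f"== exit code {result.returncode}\n")
    return result.returncode


# Demonstration with a harmless command and a throwaway log file.
# A real install script would pass commands such as
# ["sudo", "apt-get", "-y", "install", "iperf"] instead.
demo_log = os.path.join(tempfile.mkdtemp(), "install-script.log")
exit_code = run_step("say hello", ["echo", "hello"], demo_log)
```

After the resources come up, logging in and reading the log shows which step failed without rerunning the script.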
Scripting using Ansible
Ansible: Open Source IT automation tool
Scripting GENI Experiments using Ansible
- name: Configure server
  hosts: server
  # "sudo" is the older keyword; newer Ansible releases use "become: yes"
  sudo: True
  tasks:
    - name: install apache2
      apt: name=apache2 update_cache=yes
    - name: install iperf
      apt: name=iperf update_cache=yes
    - name: Make sure Apache config contains "ExtendedStatus On"
      lineinfile: line='ExtendedStatus On' dest=/etc/apache2/conf.d/extendedstatus create=yes state=present
    - name: Make sure Apache config contains Location information
      lineinfile: line='<Location /server-status>\n SetHandler server-status\n Allow from all\n</Location>' dest=/etc/apache2/sites-available/default create=yes state=present insertafter=EOF backup=yes
    - name: restart apache2 service
      service: name=apache2 state=restarted
    - name: Start iperf as a daemon
      action: command /usr/bin/iperf --server --daemon
Create an Ansible “playbook”.
In YAML format.
Shortcuts (called Ansible modules) for many common tasks such as s/w updates, sudo, ping, getting VM configuration.
No additional software needed in your VMs.
Ansible uses ssh to run commands on remote hosts.
You do need Ansible installed on your machine (Mac or Linux).
Snippet of an Ansible playbook used to set up the HelloGENI server
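The playbook's lineinfile tasks with state=present only add a line when it is missing, so rerunning the playbook does not change the file again. The shell version, which appends with tee -a, adds the same lines on every run. The Python sketch below imitates that "add only if absent" behavior for a single line. It is a simplified illustration, not Ansible's implementation, and the helper name ensure_line is hypothetical.

```python
import os
import tempfile


def ensure_line(path, line):
    """Append line to the file at path unless an identical line is already there.

    Returns True if the file was changed, False if it was left alone.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as f:
            content = f.read()
    if line in content.splitlines():
        return False
    with open(path, "a") as f:
        # Start on a fresh line if the file does not end with a newline.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(line + "\n")
    return True


# Demonstration on a throwaway file; running the step twice changes it once.
conf = os.path.join(tempfile.mkdtemp(), "extendedstatus")
first = ensure_line(conf, "ExtendedStatus On")
second = ensure_line(conf, "ExtendedStatus On")
```

Steps written this way can be rerun safely, which matters when a setup script is executed again on an already configured VM.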
Scripting using Ansible
• Benefits
  – More compact than shell scripts
  – Easier to debug
    • Faster write-execute-debug cycle
  – No need to modify RSpec if install/execute script is moved to a different web server
• Disadvantages
  – Mac/Linux only
    • Ansible uses ssh; Windows does not have ssh built-in
geni-lib
Python library for interacting with the GENI federation, or any federation that uses the GENI APIs.
• Useful for scripting resource management
  – Discovery, reservation, deletion
• No support for scripting execution
  – Use install/execute scripts or Ansible for this
Making GENI Experiments Repeatable
• Script your resource setup
  – GENI install/execute scripts (aka post-boot scripts)
  – Sysadmin tools such as Ansible
• Create custom OS images
  – Snapshot your VM after it is configured
  – Boot this VM instead of a standard GENI OS image
Custom OS Images
• Configuring a VM can sometimes take a long time or be difficult to script…
  – Download and install of many or large packages
  – Multiple paths in script
• …resulting in resources taking a long time to be ready
  – Resources are not ready until VM has booted and all install scripts have run
Custom OS Images
1. Configure VM as needed for experiment
   – Install and configure packages
2. “Snapshot” the VM
   – You will get a URL to your custom image
3. Specify this URL as the image to be loaded the next time you create your topology
Creating a Snapshot
1. Select the VM to be “snapshotted”
2. Click Snapshot
Specifying a Custom Image
1. Select the VM to be booted with the custom image
2. In the Disk Image part of node details, select Other and give the URL to the custom image
Custom Image: An Example
• The OVS switch used in your OpenFlow tutorials
  – Custom image available to all experimenters
• The tutorial uses the OVS custom image + install/execute scripts
  – Scripts install and configure wireshark since tutorial includes learning how to debug switch-controller communications
Custom Images
• Benefits
  – Boot up faster than images with scripts to install packages
  – Better repeatability
    • Version of OS and all installed packages and libraries will not change unbeknownst to the experimenter
• Disadvantages
  – May be difficult to recreate: Experimenter may not remember what he/she did to get to a working image
  – Version of OS and installed packages/libraries will not change
    • Including bugs and vulnerabilities; experimenter responsible for patching the images
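One way to soften the "may not remember what he/she did" problem is to save a list of installed packages and versions next to the snapshot. The sketch below assumes a Debian or Ubuntu image where dpkg-query is available. The function names are hypothetical, and the sample package versions are made up purely to show the parsing step.

```python
import subprocess


def parse_package_list(text):
    """Turn 'name<TAB>version' lines into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        name, sep, version = line.partition("\t")
        if sep and name:
            packages[name] = version
    return packages


def installed_packages():
    """Ask dpkg-query for installed packages (Debian/Ubuntu images only)."""
    out = subprocess.run(
        ["dpkg-query", "-W", "-f=${Package}\t${Version}\n"],
        check=True, stdout=subprocess.PIPE, text=True,
    ).stdout
    return parse_package_list(out)


# Made-up sample output, used here so the parsing can be shown
# without needing dpkg-query on the machine running this sketch.
sample = "apache2\t2.4.7-1ubuntu4\niperf\t2.0.5-3\n"
manifest = parse_package_list(sample)
```

Saving the result of installed_packages() just before taking a snapshot gives a starting point for rebuilding or patching the image later.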
MAKING EXPERIMENTS REPLICABLE
Experiment Replicability
Experiment is replicable if: The measurement can be obtained with stated precision by a different team using the same measurement procedure, the same measuring system, under the same operating conditions, in the same or a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using the author’s own artifacts.
– The ACM Policy on Artifact Review and Badging
Repeatable: Researcher can reliably repeat her experiment
Replicable: Others can run her experiment using her artifacts and get the same results
Experiment Artifacts
• An experiment can be properly evaluated only if its artifacts are published
• Artifact
  – Digital object that was either created by the authors to be used as part of the study or generated by the experiment. Artifacts can be software systems, scripts used to run experiments, input datasets, raw data collected in the experiment, or scripts used to analyze results. [ACM]
• Name some artifacts specific to GENI experiments -
Examples of GENI Experiment Artifacts
• RSpecs
• Install/execute scripts
• Custom images
• Ansible scripts
• Geographic location of resources used
• Experiment software
• Input data sets
• Instructions for running the experiment
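When artifacts like these are published, a checksum manifest lets another team confirm they have the same files the authors used. The following sketch walks a directory of artifacts and records a SHA-256 checksum for each file. The directory layout, file name, and helper names are assumptions for illustration.

```python
import hashlib
import json
import os
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (as a relative path) to its SHA-256 checksum."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


# Demonstration on a throwaway directory holding one stand-in artifact.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "install-script.sh"), "w") as f:
    f.write("#!/bin/bash\n")
demo_manifest = build_manifest(demo_dir)
print(json.dumps(demo_manifest, indent=2, sort_keys=True))
```

The manifest can be published alongside the RSpecs and scripts so that a replication attempt starts from verifiably identical inputs.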
Replicable Experiments on GENI
All tutorials are experiments designed to be replicable:
– Detailed instructions
– RSpecs, scripts and software available
You can share your RSpecs on the GENI Portal!
MAKING EXPERIMENTS REPRODUCIBLE
Experiment Reproducibility
Experiment is reproducible if: The measurement can be obtained with stated precision by a different team, a different measuring system, in a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using artifacts which they develop completely independently.
– The ACM Policy on Artifact Review and Badging
Repeatable: Researcher can reliably repeat her experiment
Replicable: Others can run her experiment using her artifacts and get the same results
Reproducible: Others can run the experiment using artifacts they create and get similar results. Reproducibility is the ultimate goal.
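The three terms differ only in who runs the experiment and whose artifacts are used. The small Python sketch below encodes that distinction as summarized on this slide. The function name acm_term is hypothetical, and the case of the same team using new artifacts is left unclassified because the definitions quoted in this deck do not cover it.

```python
def acm_term(same_team, same_artifacts):
    """Name the property being tested, following the summary on this slide.

    same_team:      True if the original researchers rerun the experiment.
    same_artifacts: True if the original authors' artifacts are used.
    Returns None for the combination the quoted definitions do not cover.
    """
    if same_team and same_artifacts:
        return "repeatable"
    if not same_team and same_artifacts:
        return "replicable"
    if not same_team and not same_artifacts:
        return "reproducible"
    return None


# Another group rerunning the experiment with the authors' RSpecs and scripts:
label = acm_term(same_team=False, same_artifacts=True)
```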
Popper
A convention for conducting experiments and writing academic articles following a DevOps approach that allows researchers to automate the re-execution and validation of an experiment.
- http://falsifiable.us
Summary
• Make sure your experiment is repeatable
  – For your own sake
• Replicability and reproducibility increasingly important to get published
Badges for articles in ACM Publications
Artifacts:
– Artifacts are documented, consistent, complete, exercisable, and include evidence of verification and validation.
– Artifacts are carefully documented and well-structured so reuse and repurposing is facilitated.
– Author-created artifacts have been placed on a publicly accessible archival repository.
Results:
– Main results have been obtained by a person or team other than the authors, using, in part, artifacts provided by the author.
– Main results have been independently obtained by a person or team other than the authors, without the use of author-supplied artifacts.
https://www.acm.org/publications/policies/artifact-review-badging
QUESTIONS?