NASA Nebula: Past, Present, and Future
A Story in Three Parts
James F. Williams
CIO, NASA Ames Research Center
April 26, 2011
2
‣ One of the first cloud computing platforms built for the Federal Government by the Federal Government
‣ Publicly launched IaaS with the White House as the first customer in production
‣ Basis of OpenStack Compute, aka “Nova”
‣ There are over 300 users across nine NASA Centers + JPL + HQ
NASA Nebula Cloud Computing
3
Lots Been Said About Nebula… some of it crazy
“What do you get when you combine cloud computing and data center containers? You get NASA’s Nebula, the space agency’s new data powerhouse, which provides on-demand computing power for NASA researchers.”
http://www.datacenterknowledge.com/archives/2009/12/02/nasas-nebula-the-cloud-in-a-container/
“This can help solve NASA’s real compute issues”
New NASA Center CIO
“Will create world peace.”
Future Miss America
“Putting the ‘Space’ in Rackspace”
Start-up Co-founder
“The world will end if we don’t do this.”
Former NASA Center CIO
“Because we need more developers”
Well known cloud architect
4
‣ Tell you the unofficial story of how NASA Nebula started… from my perspective. I was there for most of it. I’ve blacked out for some of it.
‣ Where we are today and what NASA is doing
‣ Our vision for NASA Nebula for the future
Getting Past the Hype
5
NASA Nebula, Part 1
6
1000s of other NASA websites
"A long time ago in a US Agency far, far away.... (well, DC is far)"
www.nasa.gov
Photo Credit: www.starwars.com
7
Why wouldn’t they join NASA.gov?
www.nasa.gov
Perception:
‣ Control
‣ Issues with CMS
‣ Flexibility
‣ Cost
Photo Credit: www.starwars.com
8
There’s a better way to do this…
‣ Problem: How do we get these web developers to stop building out their own sites?
‣ Solution: Give developers a better alternative to the status quo
NASA.net was born
9
NASA.net
‣ Setting: Basement of NASA Ames Research Center (ARC) Building 200, in an old conference room
‣ Imagine: Small team of developers working on Platform as a Service
‣ Code hosting
‣ Continuous integration
‣ Bug Tracking
‣ Best Practices in code development
‣ Making unicorns happy across NASA
10
But after working on Platform as a Service prototypes…
‣ We learned that to run a web application framework properly as a service, we need elastic infrastructure
11
Over Indian Food in Mountain View
Photo Credit: http://www.tandooribistrosj.com/
‣ Joshua McKenty pitched the “cloud” idea to us. I just ate curry.
‣ We decided to build out an IaaS capability just to support NASA.net
‣ Chris didn’t pay for lunch.
12
Problem
Everyone: “Great Idea!”
Us: “Thanks! We need funding..”
13
14
Why?
FY06:
Plan FY09 Budget
FY09:
Line item for cloud?
How do we fund IT innovation?
15
At the same time, the White House “cloud first” initiative was gaining traction
Federal CIO Vivek Kundra evangelized the idea of cloud computing
“In coordination with the data center consolidations, agencies should evaluate the potential to adopt cloud computing solutions by analyzing computing alternatives for IT investments in FY 2012. Agencies will be expected to adopt cloud computing solutions where they represent the best value at an acceptable level of risk.”
http://www.whitehouse.gov/sites/default/files/omb/assets/memoranda_2010/m10-19.pdf
16
A team said “No”
‣ They just said No.
‣ A wise woman said, “Drop back. Punt. And wait to get the ball back.”
17
‣ We didn’t have any money but we had popsicle sticks and string so we started anyway.
‣ Josh liked Indian food and was friends with Jesse
‣ These guys were the first IT hippies I met. I gave them free headbands.
‣ There eventually was a guy named Vish (from Iowa? Really?) and someone from ZZTop (“He’s got cloud, and he knows how to use them..”)
‣ I bought cases of Red Bull to cover up the fact I didn’t have money for chairs.
But ARC is creative…..
18
USASpending.gov
‣ Then the White House wanted to use our cloud. Then they came to ARC and some NASA officials were there…
‣ Somehow we got a container and a couple $$’s to do something but still no money for chairs.
‣ Lots of cats… no milk… Herding was a problem… Enter Soo and Ray and William… Finally a semblance of a team that started to make things happen.
19
Then things started to happen
20
Massive Technical Challenges
Ran into every problem you can think of:
‣ Hardware, Database, Software, Client management
Details:
‣ Jumbo frames causing VMs to kernel panic
‣ 2 minute network separation between cloud controller & VMs caused controller to decide to terminate instances
‣ Prepare your container for the cloud… it rains
‣ Pieces of their stack weren’t really cloud ready. Indexes didn’t fit in RAM (even with 96GB allocated to VMs)
21
Weekend Hack-a-Thon
‣ Decided to spend the weekend hacking a new open source cloud controller framework. This became “Nova”
‣ Some thoughts:
• Monolithic is bad – each component should scale independently
• APIs are good – you shouldn’t have to use a web UI to configure
• Simple things should be simple, hard things possible…
• The hard parts (hypervisor, storage system, networking) are done by others. Build a cloud the same way you build a scalable web application…
‣ Nova has been empowering NASA users for a year this May.
22
Launched USAspending.gov on 5/21/2010
23
OpenStack History
2005: Rackspace Cloud developed
March 2010: Rackspace decides to open source cloud software
May 2010: NASA open sources Nebula platform
June 2010: OpenStack formed with contributions from Rackspace & NASA
July 2010: Inaugural Design Summit in Austin
24
Rackspace Called NASA
Rackspace:
‣ “Wow. Can we meet your team of 400 developers?”
NASA:
‣ “Sure. We got 8 developers. And not all of them full time.”
25
Why not a NASA-Driven Foundation?
‣ Resources
‣ Expertise
‣ Focus
‣ Not NASA’s mission
26
NASA Nebula’s contributions to OpenStack align with the Administrator's strategic goals
‣ Facilitate the success of a viable commercial space industry to provide assured U.S. access to low Earth orbit for cargo and crew and acquire, mature, and infuse commercial capabilities across all NASA activities
‣ Promote enhanced cooperation with international, industry, other U.S. government agency, and academic partners in the pursuit of our missions.
27
NASA Nebula + OpenStack supports the Agency’s Goals:
‣ Goal 6: Share NASA with the public, educators, and students to provide opportunities to participate in our mission, foster innovation and contribute to a strong National economy
28
NASA Nebula, Part 2
29
‣ Developers from all over the world want to contribute code because they “want to be part of the space exploration.”
‣ Overheard at the OpenStack Design Summit, “My code could be part of NASA Nebula. This is as close to being an astronaut as I am ever going to get!”
Overheard at Design Summit
30
IaaS Status as of the start of FY11
Nebula is maturing from an Agency innovation project to a new OCIO service offering
In April 2010, the Alpha phase began with 20 users
Even though we’ve been concentrating on ARC and GSFC, Beta closed with 240 IaaS users across 9 Centers, HQ, JPL, and NSSC through word of mouth
Nebula requires OCIO sponsorship to expand and become an institutionalized service for the entire Agency
ARC: 111
GSFC: 104
JSC: 4
MSFC: 2
HQ: 8
GRC: 4
JPL: 3
KSC: 1
LaRC: 1
NSSC: 2
SSC: 1
31
Our major challenge
32
How do we get here?
33
Missions
Requirements*
‣ Science-scale application development
‣ Very large data set processing
‣ Compute intensive processing
‣ Timely sharing of results with collaborators and the public
Current Options*
‣ BUILD IT: Build my own IT infrastructure that may/may not comply with Federal/Agency IT security standards.
‣ BUY IT: Go through a lengthy procurement and provisioning process for basic IT services.
‣ DO NOTHING: The current basic IT services model is cost prohibitive and I cannot afford to process my data and share with collaborators and the public at large.
* Requirements and Options documented in more than 30 interviews with Ames scientists as part of the 2009 NASA Workstation project.
34
TARGET COMPUTE PLATFORM: High-end Compute, Vast Storage, High Speed Networking
[Figure labels: Desktop; Server-based compute resources; Super Computer]
Excellent example of how OCIO-sponsored innovation can be rapidly transformed into services that address Agency mission needs
Offer scientists services to address the gap
35
Ames Research Center
2,600 Civil Servants and Contractors
*2,600 Servers • Non-enterprise applications
• 600 servers in traditional data center environments
• 1,000 servers or desktops/workstations being used as servers in lab environments
• 1,000 “under the desk” desktops and workstations being used as additional compute resources
CHALLENGES
• Dedicated use (non-shared)
• **Underutilized (average of 15% utilization)
• Inefficient space and power use
• Numerous security plans
• Significant system administration expense
*Number of servers: Estimate based on data collected from NASA Workstation Project, inventory of Ames institutional data center, review of Ames IP address allocation, and consultation with Ames Network Engineers.
**15% utilization based on two reports from Gartner Group, Cost of Traditional Data Centers (2009), and Data Center Efficiency (2010).
ROI and ARC Case Study
36
[Chart: Utilization, on a 0% to 90% axis, for Cloud, Virtualized, and Dedicated servers; bar labels: 15%, 75%, 85%]
*15% utilization based on two reports from Gartner Group, Cost of Traditional Data Centers (2009), and Data Center Efficiency (2010).
POWER: Computers typically require 70% of their total power requirements to run at just 15% utilization.
ROI and ARC Case Study
37
‣ This does not include power, cooling, networking, overhead, or system administration costs.
‣ This is one Center…
2,600 Servers • Non-enterprise applications
High-end Compute, Vast Storage, High Speed Networking
Ames estimated server cost: 2,600 servers x $3,000 a server = $7.8M
For the equivalent amount of servers: 2,600 servers x $3,000 a server x .15 utilization = $1.15M (85% savings)
*$3,000 based on 1) the average cost of a sample set of low to medium range of servers and high-end desktops and 2) a hosting service case study from www.rentaserver.com. This does not include power, cooling, networking, overhead, or system administration costs.
ROI and ARC Case Study
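The server-cost comparison on this slide is simple enough to check by hand. The sketch below reproduces it using only the slide's own inputs (2,600 servers, $3,000 per server, 15% utilization); the function and variable names are illustrative and not from the original deck. Note that the exact product works out to $1.17M, slightly above the $1.15M shown on the slide, while the savings figure is 85% either way.

```python
# Inputs taken from the slide; names are illustrative.
SERVERS = 2_600
COST_PER_SERVER = 3_000      # dollars per server (slide's estimate)
UTILIZATION_PCT = 15         # average utilization of dedicated servers


def dedicated_cost(servers: int, unit_cost: int) -> int:
    """Cost of buying every server outright."""
    return servers * unit_cost


def utilized_cost(servers: int, unit_cost: int, utilization_pct: int) -> int:
    """Cost of only the capacity actually used, per the slide's formula."""
    return servers * unit_cost * utilization_pct // 100


dedicated = dedicated_cost(SERVERS, COST_PER_SERVER)
equivalent = utilized_cost(SERVERS, COST_PER_SERVER, UTILIZATION_PCT)
savings_pct = 100 - UTILIZATION_PCT

print(f"Dedicated servers:   ${dedicated:,}")
print(f"Equivalent capacity: ${equivalent:,}")
print(f"Savings:             {savings_pct}%")
```

As the slide says, this leaves out power, cooling, networking, overhead, and system administration, so it is a floor on the comparison and not a full cost model.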
38
‣ Cost Avoidance
• Improved visibility of NASA server resources achieved by cloud-based resource pooling.
• Promotes standardization of both hardware and operating systems
• Security compliance and technical integration of new capabilities
‣ IT Security Enhancements:
• Implementing NASA’s Security Program will be much easier in a standardized environment
• We know where the servers are located: Physically locate assets involved in incidents
• Security posture will be improved simply due to enhanced visibility (by the NASA Security Team) into the resources being secured
ROI and ARC Case Study
39
‣ Operational Enhancements:
• Strict standardization of hardware and infrastructure software components
• Small numbers of system administrators to manage the cloud and applications due to the cookie-cutter design of cloud components and support processes
• Failure of any single component within the Nebula cloud will not become reason for alarm
ROI and ARC Case Study
40
Mission Objectives: Explore, Understand, and Share
MISSION: Exploration, Space Ops, Science, Aeronautics
Mission Support
USE CASES:
• Process Large Data Sets
• Scale-out for one-time events
• Require infrastructure on-demand
• Store mission & scientific data
• Share information with the public
• Run Compute Intensive Workloads
OCIO INNOVATION: High Compute, Vast Storage, High Speed Networking
NASA has direct access to the Nebula cloud computing platform
41
Conversation with Scientists
“I love your cloud and want to start. How much does it cost?”
“Test it for free.”
“Cool. Let’s see what this baby can do…”
42
Use Case: SERVIR.net
43
Use Case: SERVIR.net
44
Use Case: SERVIR.net
45
Customer Example: WISE (Wide-Field IR Survey Explorer)
‣ WISE: Images the sky with greater than 8X redundancy
‣ Helping NASA find the most luminous galaxies in the universe and the closest stars to the sun.
‣ Issue: The scientist encountered a short-term need for a large number of small servers and also needed a server with a large memory footprint, but did not have access to either and could not justify the cost for his needs alone
‣ Nebula Project #1: 2000 distant galaxies
• Increase resolution with processing
• 100 CPU hrs per galaxy
• We upped instance quota to get started
‣ Nebula Project #2: Some sky areas require huge RAM-based processing.
• We set up an 80 GB RAM instance
• Finished first phase on Nebula in two days.
Use Case: WISE
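To put Project #1 in perspective, the slide's figures (2000 galaxies at 100 CPU hours each) come to 200,000 CPU hours in total. The sketch below works that out and shows how wall-clock time shrinks as more cores run in parallel. The core counts are hypothetical, chosen only for illustration; the deck does not state how many instances or cores WISE actually used, and the model assumes the galaxies can be processed independently.

```python
# Figures from the slide.
GALAXIES = 2_000
CPU_HOURS_PER_GALAXY = 100


def total_cpu_hours(galaxies: int, hours_each: int) -> int:
    """Total compute needed if every galaxy takes the same CPU time."""
    return galaxies * hours_each


def wall_clock_days(cpu_hours: float, cores: int) -> float:
    """Elapsed days if the work is spread evenly over `cores` cores."""
    return cpu_hours / cores / 24


total = total_cpu_hours(GALAXIES, CPU_HOURS_PER_GALAXY)
print(f"Total: {total:,} CPU hours")

# Hypothetical core counts, not taken from the deck.
for cores in (1, 100, 1_000):
    days = wall_clock_days(total, cores)
    print(f"{cores:>5} cores: {days:,.1f} days")
```

This is why a short-term burst of many small servers fits the workload: the total compute is fixed, but the elapsed time falls in proportion to the number of cores available.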
46
WISE Region Processing
47
WISE Cloud Processing
48
WISE Galaxy Processing
49
Possible Cloud Applications
‣ Hundreds of galaxies can be hi-res processed to provide higher angular resolution images for better studies of star formation, galactic structure, etc.
‣ Thousands of galaxies could be hi-res processed to resolve & measure source sizes, etc.
‣ Hundreds of thousands of galaxies can be hi-res processed to differentiate point-like (active galactic nucleus or nuclear star formation) from distributed (merger or spiral galaxy) emission.
‣ Several large regions nearby can be surveyed for distributed star formation in our galaxy.
‣ A few large regions can be processed to tremendous depth.
‣ The whole sky can be hi-res processed.
“conducted calculations that could not be done on our project's large server farm.”
50
Use case example – NASA WISE
Use case name: NASA WISE Project Calculations
Specific service: WISE, or Wide-field Infrared Survey Explorer, images the sky with greater than 8X redundancy helping NASA find the most luminous galaxies in the universe and the closest stars to the sun. Data collected by WISE provides an important catalog for the James Webb Space Telescope as well as a lasting legacy for the astronomical community.
Contact name, title and responsibility: Dominic Benford, Scientist, WISE
What the service currently does: Uses enormous processing power to interpret data received from space.
What the problem is: Within the scope of this project, Benford encountered a short-term need for a large number of small servers. Making the best use of WISE data requires low-resolution galaxies to be processed by many small virtual servers to resolve and measure their source sizes, and differentiate point-like emissions (such as from active nuclei or nuclear star formation) from distributed emissions (such as from merger or spiral galaxies). Additionally, WISE surveys several large regions of the sky for distributed star formations in our own galaxy. Using instances with enormous computational capacity, a few large regions can be processed to tremendous depth.
51
Use case example – NASA WISE
How Nebula solves the problem: “With the recent addition of a large-RAM instance,” says Benford, “I am now able to conduct calculations that could not be done on our project's large server farm. Nebula has provided me with a tool for science data analysis that far surpasses anything that I could envision in a single-user context. NASA Cloud computing may be the way forward for our data-intensive projects in the future, since only a NASA system could provide the necessary reliability and proprietary controls on our data.”
Considerations & time line: Benford says he sees other possible WISE cloud applications. Nebula, for example, can be used for processing thousands of galaxies to produce higher angular resolution of images and better assist studies of star formation and galactic structure. Benford has also suggested that Nebula can be used to process the entire 41,253 square degrees of the sky in high resolution. The initial use case is being validated through an advanced POC.
End result: This proof of concept is now processing data that could not be addressed in their current compute farm.
52
NASA Nebula, Part 3
53
‣ Nebula will continue to be part of the open source community
‣ Nebula is being incorporated as an option in the overarching computational services for the agency. And yes… we are looking at commercial clouds too. Operational management is being implemented.
‣ Nebula continues to be supported by the Agency and the Agency is working to define strategies for cloud use.
‣ The Agency is defining its current use cases for cloud to ensure adoption.
‣ ARC is currently developing a cloud migration strategy for business and operational organizations. Are you cloud ready?
‣ ARC is beginning to incorporate cloud into their mission proposal process.
‣ Security in the cloud continues to be a driver. Pushing towards moderate and ITAR usage.
What NASA is doing………
54
Continue Collaboration
OpenStack
Nebula
• Hundreds of contributions from large and small companies have improved OpenStack since release
• NASA has benefited from OpenStack contributions, integrating ideas and patches into Nebula after appropriate review.
55
Key OpenStack Community Contributions powering NASA Nebula
• Virtual Firewall Service: Implemented by Ubuntu developer
• Windows Support: Disk support added by San Francisco startup FathomDB
• IPv6: Developed by Japanese telecom NTT
56
‣ How NASA can ensure OpenStack works for NASA
• Initially very active participant, ensuring NASA’s requirements are in the DNA of OpenStack
• As OpenStack matures, NASA moves from contributor to user. “As they stand up, we stand down”
OpenStack & NASA
Define
Share
Contribute
Use
57
‣ Regardless of the road taken to get here…..
‣ NASA does support and fund Nebula… at all levels, from the administrator suite to the stock room. A group of very innovative people working together to make it happen… technical, business, and political.
‣ There are lots of stories of attribution… but the truth is… it was done here with all of you. So thanks to all the contributors that are here.
What NASA has done……
58
‣ Open source means moving faster and smarter
‣ Overcoming inertia takes great strength.
‣ People aren’t wrong in the way they do things. It still works for them. They just have to be shown why another way is better for the future.
‣ Messaging and direction should always be clear. The why needs to be solid. The how agreed upon. And the message accurate.
‣ Passion for the mission outweighs the paycheck. Yes, some of us really do work for NASA because it is NASA.
‣ Never ask hippie developers to show up to an 8 a.m. meeting.
What have I learned…..