SETI@Home
Sunny Gleason
COM S 717
November 29, 2001
(Based on the article, “SETI@Home: Massively Distributed
Computing for SETI.”)
In This Presentation
• What is SETI?
• Partitioning the Job
• The SETI@Home Client
• Server Post-processing
• Project Status
SETI@Home
• SETI: Search for Extra-Terrestrial Intelligence
  – Private / Academic efforts
  – NASA
  – SETI Institute
  – SETI@Home
• SETI@Home : Project led by researchers at University of California - Berkeley (1997)
• “Piggyback SETI” receiver at Arecibo radio telescope
SETI: The Task
• What is the complexity of detecting signals sent by an extra-terrestrial civilization?
• Category: massively difficult
  – Signal parameters unknown
  – Sensitivity of analysis depends on available computing power
SETI: Task Assumptions
• Aliens would broadcast a signal that is easily detectable and distinguishable from natural radio emission
• Narrowband signals stand out from natural broadband sources of noise
• Thus, SETI efforts concentrate on narrowband signals
• The hydrogen line: 1420 MHz
Narrowband Signals
• Use a narrow search window (channel) around a given frequency
• Earlier systems:
  – Analog narrow bandpass filters
• Newer systems:
  – Dedicated banks of Fast Fourier Transform (FFT) processors
  – Separate signal into up to 1 billion 1-Hz channels
Signal Problems
• Signals are unlikely to be stable in frequency
  – Example:
    • A listener on Earth’s surface receiving a 1.4 GHz signal undergoes acceleration of up to 3.4 cm/s^2 due to Earth’s rotation
    • Corresponding Doppler drift rate: 0.16 Hz/s
    • Alien transmission would drift out of a 1-Hz channel in about 6 seconds
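The drift figures on this slide follow from the Doppler relation (drift rate = frequency x acceleration / speed of light). A minimal sketch of that arithmetic, assuming a 1.42 GHz signal and a 1-Hz channel; the function names are illustrative only:

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def doppler_drift_rate(freq_hz, accel_m_s2):
    """Frequency drift (Hz/s) for a receiver accelerating along the line of sight."""
    return freq_hz * accel_m_s2 / SPEED_OF_LIGHT


def seconds_in_channel(channel_width_hz, drift_hz_s):
    """Time for a drifting tone to cross one channel of the given width."""
    return channel_width_hz / abs(drift_hz_s)


drift = doppler_drift_rate(1.42e9, 0.034)  # 1.42 GHz, 3.4 cm/s^2
dwell = seconds_in_channel(1.0, drift)     # 1-Hz channel
print(f"{drift:.2f} Hz/s, {dwell:.1f} s in channel")
```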
Signal Problems
• We can compensate for Earth’s rotation, but what about remote planet?
• Solution:
  – Correct for Doppler drift at the receiving end
  – Search for signals at multiple Doppler drift rates
• Computation-intensive!
• Allowed remote drift rates are between -10 Hz/s and +10 Hz/s (extended to -50/+50 Hz/s at coarser resolution)
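Searching over multiple drift rates can be pictured as "de-chirping" the data at each trial rate and checking which rate concentrates the power into a single frequency bin. The sketch below is a toy illustration, not the SETI@Home client's code: it uses a noiseless synthetic tone, a naive DFT, and made-up parameters (64 Hz sample rate, 128 samples, 2 Hz/s trial steps).

```python
import cmath
import math


def chirped_tone(n_samples, sample_rate, freq_hz, drift_hz_s):
    """Complex tone whose frequency drifts linearly at drift_hz_s."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        phase = 2 * math.pi * (freq_hz * t + 0.5 * drift_hz_s * t * t)
        out.append(cmath.exp(1j * phase))
    return out


def dechirp(samples, sample_rate, trial_drift_hz_s):
    """Remove a linear drift of trial_drift_hz_s from the samples."""
    return [s * cmath.exp(-1j * math.pi * trial_drift_hz_s * (n / sample_rate) ** 2)
            for n, s in enumerate(samples)]


def peak_power(samples):
    """Largest bin power from a naive DFT (fine for a toy-sized input)."""
    n = len(samples)
    best = 0.0
    for k in range(n):
        acc = sum(s * cmath.exp(-2j * math.pi * k * m / n)
                  for m, s in enumerate(samples))
        best = max(best, abs(acc) ** 2)
    return best


def best_drift(samples, sample_rate, trial_rates):
    """Trial drift rate that yields the strongest single-bin peak."""
    return max(trial_rates,
               key=lambda r: peak_power(dechirp(samples, sample_rate, r)))


signal = chirped_tone(128, 64.0, 8.0, 4.0)      # drifting at 4 Hz/s
trials = [float(r) for r in range(-10, 11, 2)]  # -10 ... +10 Hz/s
found = best_drift(signal, 64.0, trials)
print(found)
```

Each trial rate costs a full set of transforms, which is why the slide calls the search computation-intensive.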
Other Parameters
• Signal frequency / bandwidth?
• Is it pulsed? If so, what period?
• Solving over the full range of parameters is beyond even the world’s most powerful supercomputers
• Fortunately, the task is easily partitioned
Distributing the Load
• Break the data up into separate frequency bands
• Observations of different portions of the sky are essentially independent
• Partition the huge dataset into smaller chunks that ordinary PCs can handle
Data Collection
• Observations come from 305-meter radio telescope in Arecibo, Puerto Rico
• Dedicated instrumentation within telescope
• Passively monitors the telescope’s field of view (0.1 degrees)
• Stationary telescope: objects pass through in 24 seconds
• When telescope is tracking: 12 s
Data Collection
• Over the course of the project, SETI@Home will see visible portions of the sky 3 or more times
• Covers stars with declinations from -2 to 38 degrees
• Approximately 25% of the sky
Data Collection
• System records a 2.5MHz band, centered at the 1,420MHz hydrogen line
• Records 2-bit samples onto 35GB DLT tapes (Recall: Nyquist Rate)
• Each tape: 15.5 h of data
• 39 TB of data total
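The tape figures can be cross-checked with a little arithmetic. This sketch assumes the 2.5 MHz band is sampled at 2.5 million samples per second with 2 bits per sample (the sampling scheme is an assumption here, not stated on the slide) and uses decimal gigabytes:

```python
def tape_bytes(sample_rate_hz, bits_per_sample, hours):
    """Raw bytes recorded over the given number of hours."""
    return sample_rate_hz * bits_per_sample / 8 * hours * 3600


size = tape_bytes(2.5e6, 2, 15.5)
tapes_needed = 39e12 / 35e9  # 39 TB spread over 35 GB tapes

print(f"{size / 1e9:.1f} GB per 15.5 h tape")
print(f"about {tapes_needed:.0f} tapes for 39 TB")
```

Under these assumptions a 15.5-hour recording comes out just under the 35 GB tape capacity.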
Data Collection
• Data tapes shipped to Berkeley
• Split into work units using 4 splitter workstations
  – Divide 2.5 MHz data into 256 subbands using 2048-point FFT followed by 256 8-point inverse transforms
  – Subbands are 9,766 Hz wide
  – 2^20 samples, thus each work unit is ~10 kHz wide and 107 s long
  – Work units overlap so that signals straddling a boundary are not missed
• Work units are stored on separate server for distribution
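The work-unit dimensions follow directly from the numbers on this slide. A short sketch of the arithmetic, assuming each subband is sampled at a rate equal to its width and, for the payload size, 2 bits per sample as on the recording slide:

```python
BAND_HZ = 2.5e6            # recorded bandwidth
SUBBANDS = 256             # number of subbands after splitting
SAMPLES_PER_UNIT = 2 ** 20

subband_hz = BAND_HZ / SUBBANDS              # width of one subband
duration_s = SAMPLES_PER_UNIT / subband_hz   # time span of one work unit
payload_bytes = SAMPLES_PER_UNIT * 2 // 8    # raw sample data only, no header

print(f"{subband_hz:.1f} Hz wide, {duration_s:.1f} s long, {payload_bytes} bytes")
```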
Data Collection
• Main SETI@Home Server
  – 3 Sun Enterprise 450 Series Computers
• User Database
  – Contains account information for each of the 2.4 million users
  – Also aggregates statistics by platform
• Science Database
  – Contains information about each work unit
    » Time, sky coords, frequency range
    » How many times each work unit has been downloaded
  – Stores parameters of candidate signals
    » Signal power, frequency, arrival time, sky coords
    » 1.1 billion candidates (Oct. 2000)
• Work unit storage
Data Collection
• Work unit storage server
  – Distribution of work units, storage of results
• Client communications via HTTP
  – Important to get through firewalls
  – Request to download new work unit
    • Work units that have not been downloaded yet have priority
    • Then, work units for which no results have been returned
  – Request to post results
    • Data contains signal characteristics
    • Updates user statistics
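The download priority described above can be sketched as a two-tier ordering. The `WorkUnit` record and its fields are hypothetical, and the fallback tier (units that already have results) is an assumption added so the function always returns something:

```python
from dataclasses import dataclass


@dataclass
class WorkUnit:
    name: str
    downloads: int = 0   # times this unit has been sent to a client
    results: int = 0     # results returned so far


def priority(unit):
    """Lower value = handed out sooner."""
    if unit.downloads == 0:
        return 0          # never downloaded
    if unit.results == 0:
        return 1          # downloaded, but nothing returned yet
    return 2              # assumed fallback: already has results


def next_work_unit(units):
    """Pick the highest-priority unit and record the download."""
    chosen = min(units, key=priority)  # first unit wins among ties
    chosen.downloads += 1
    return chosen


units = [WorkUnit("a", downloads=2, results=1),
         WorkUnit("b", downloads=1, results=0),
         WorkUnit("c")]
first = next_work_unit(units).name
second = next_work_unit(units).name
print(first, second)
```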
The SETI@Home Client
• Available for 47 different combinations of CPU and OS
• Dominant platforms: Windows, Mac
  – Feature graphical “screensaver” display
  – UNIX works as daemon (display program available for X)
The SETI@Home Client
• Downloads work unit from server
• Performs “baseline smoothing” to eliminate wideband features, help reduce false signals
• Performs main data analysis loop (shown on next page)
Main Data Analysis Loop
for Doppler drift rates from -50 to +50 Hz/s {
  for bandwidths from 0.075 to 1220 Hz in 2x steps {
    Generate time-ordered power spectra
    Search for short-duration signals above a constant threshold (spikes)
    for each frequency {
      Search for faint signals matching beam parameters (Gaussians)
      Search for groups of 3 evenly spaced signals (triplets)
      Search for faint repeating pulses (pulses)
    }
  }
}
The SETI@Home Client
• Client examines signal at various drift rates
  – 10 to -10 Hz/s (fine-grained)
  – 50 to -50 Hz/s (~twice as coarse)
• Although drift rates are most likely negative, examine both sides
  – For statistical comparison
  – To detect deliberately chirped signals
The SETI@Home Client
• For each drift rate, examines the signal at different bandwidths between 0.075 and 1,221 Hz
  – Using FFTs of various lengths
  – Not all bandwidths are examined at every drift rate (only when the drift rate becomes significant compared to the frequency resolution)
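The bandwidth range quoted above is consistent with power-of-two FFT lengths applied to a 9,765.625 Hz subband. The specific lengths used below (8 through 131,072 points) are inferred from those endpoints and should be read as an assumption, not as documented client behavior:

```python
SUBBAND_HZ = 2.5e6 / 256  # 9765.625 Hz, one work-unit subband


def fft_resolutions(min_len=8, max_len=131072):
    """(FFT length, frequency resolution in Hz) for each power-of-two length."""
    out = []
    n = min_len
    while n <= max_len:
        out.append((n, SUBBAND_HZ / n))
        n *= 2
    return out


table = fft_resolutions()
for length, resolution in table:
    print(f"{length:>6}-point FFT -> {resolution:.3f} Hz")
```

Doubling the FFT length halves the channel width, which matches the "2x steps" in the analysis loop.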
The SETI@Home Client
• Transformed signals are examined for spikes exceeding 22 times the mean noise power
• Threshold: 7.2 x 10^-25 W/m^2 (at the finest frequency resolutions)
• “Detecting a cell phone on one of the moons of Saturn”
• These spikes are what the client reports
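The spike test amounts to comparing each bin of a power spectrum against a multiple of the mean power. A minimal sketch, assuming the mean is taken over the whole spectrum (spike included), which is a simplification; the function name and return format are illustrative:

```python
def find_spikes(power_spectrum, threshold=22.0):
    """Return (bin index, power / mean power) for bins above threshold x mean."""
    mean_power = sum(power_spectrum) / len(power_spectrum)
    return [(i, p / mean_power)
            for i, p in enumerate(power_spectrum)
            if p > threshold * mean_power]


spectrum = [1.0] * 999 + [50.0]   # flat noise floor with one strong bin
spikes = find_spikes(spectrum)
print(spikes)
```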
The SETI@Home Client
• Other transformations to detect Gaussians and pulse patterns
• Specialized algorithms (fast-folding algorithms) for detecting pulses efficiently
• Work by “folding” portions of the signal together in time, to detect gain over the pulse period
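The folding idea can be shown with plain folding at a handful of trial periods. This is not the fast-folding algorithm itself, which reuses partial sums across periods; it is a sketch on noiseless synthetic data, with the contrast measure (peak over mean of the folded profile) chosen for illustration:

```python
def fold(power, period):
    """Sum samples that share the same phase within the trial period."""
    profile = [0.0] * period
    for i, p in enumerate(power):
        profile[i % period] += p
    return profile


def best_period(power, periods):
    """Trial period whose folded profile has the highest peak-to-mean ratio."""
    def contrast(period):
        profile = fold(power, period)
        return max(profile) / (sum(profile) / period)
    return max(periods, key=contrast)


# Synthetic data: unit baseline with a pulse every 10 samples.
power = [4.0 if i % 10 == 3 else 1.0 for i in range(120)]
period = best_period(power, range(5, 16))
print(period)
```

At the correct period every pulse lands in the same phase bin, so the gain grows with the number of periods folded.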
The SETI@Home Client
• Typical workload:
  – 2.4 to 3.8 trillion floating-point operations (2.4 to 3.8 TFLOP in total)
  – Typical 500 MHz PC takes 10 to 12 hours to complete a work unit
  – Within the average work unit:
    • 4 spikes, 1 Gaussian, 1 pulsed signal, 1 triplet signal
• <Insert Demonstration Here>
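The typical workload figures imply a sustained floating-point rate for a 500 MHz PC. The sketch below simply brackets that rate from the slide's own numbers; pairing the smallest workload with the longest run time (and vice versa) is a choice made here to get outer bounds:

```python
def sustained_mflops(operations, hours):
    """Average rate in millions of floating-point operations per second."""
    return operations / (hours * 3600) / 1e6


low = sustained_mflops(2.4e12, 12)   # lightest work unit, slowest run
high = sustained_mflops(3.8e12, 10)  # heaviest work unit, fastest run
print(f"{low:.1f} to {high:.1f} MFLOPS sustained")
```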
Postprocessing
• Client uploads candidate signal data to server (exact data formats are kept quiet)
• Server examines results for errors
• Keeps track of user statistics
Error detection
• SETI@Home uses thousands of CPU years every day
• With heat, floating-point units are the first to give incorrect results
• High error rates are offset by easy error detection
• Replication of work units is the primary error detection mechanism
• 60% of work unit results must agree in order to be considered for further analysis
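Replication-based checking can be sketched as a simple quorum vote over the results returned for one work unit. How the real server compares results is not described here, so exact equality of a result summary is an assumption, as are the function name and the string summaries:

```python
from collections import Counter


def agreed_result(results, quorum=0.6):
    """Return the majority result if at least `quorum` of returns match, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count / len(results) >= quorum else None


accepted = agreed_result(["sig-A", "sig-A", "sig-A", "sig-B"])  # 75% agree
rejected = agreed_result(["sig-A", "sig-B", "sig-C"])           # no quorum
print(accepted, rejected)
```

A work unit without a quorum would simply stay eligible for further downloads until enough matching results arrive.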
Candidate Signals
• Vast majority of detected signals correspond to terrestrial RFI
  – Extra-terrestrial signals cannot last more than 12 s
  – Also, signals should repeat when viewing the same portion of the sky at a later time
Project Status
• October 2000
  – 2.4 million users
  – 520,000 active clients donating 437,000 years of CPU time (4.3 x 10^20 flop)
  – Average processing rate: 15.7 Tflops
• “Largest supercomputer in existence”
• “Largest computation ever performed”
Project Status
• 1.1 Billion signals in SETI@Home database
• Candidate signals being submitted faster than the server can confirm them
• So far, no extra-terrestrial signals
Future Work
• Expand coverage by adding new telescope in southern hemisphere
• Expand frequency bandwidth (up to double the data rate)
• Expand number of volunteers, increase SETI education efforts
Summary
• Seemingly impossible problem
• Easily partitioned
• Good publicity, marketing
• Achieves incredible performance
  – But, high latency
  – High redundancy/replication of computation
Related Work
• Distributed.net
  – Cracking of encryption keys (DES, RC5)
  – Search for optimal Golomb rulers
• Folding@Home
  – Stanford project: distributed protein folding
• PiHex
  – Distributed effort to calculate Pi
• GIMPS
  – Great Internet Mersenne Prime Search
Discussion
• Potential comments on:
  – System architecture
  – Fault-tolerance
  – Security
References
• SETI@Home Web Site
  – http://setiathome.ssl.berkeley.edu/
• NASA Science Newsletter
  – http://science.nasa.gov/newhome/headlines/ast23may99_1.htm
• Papers
  – Korpela, et al. “SETI@Home: Massively Distributed Computing for SETI.”
  – Sullivan, et al. “A new major SETI project based on Project Serendip data and 100,000 personal computers.”
  – “The SETI@Home Sky Survey.” Available from the SETI@Home web site.