SETI@Home Sunny Gleason COM S 717 November 29, 2001 (Based on the article, “SETI@Home: Massively Distributed Computing for SETI.”)

SETI@Home

Sunny Gleason
COM S 717

November 29, 2001

(Based on the article, “SETI@Home: Massively Distributed Computing for SETI.”)

In This Presentation

• What is SETI?
• Partitioning the Job
• The SETI@Home Client
• Server Post-processing
• Project Status

SETI@Home

• SETI: Search for Extra-Terrestrial Intelligence
  – Private / Academic efforts
  – NASA
  – SETI Institute
  – SETI@Home

• SETI@Home: project led by researchers at the University of California, Berkeley (1997)

• “Piggyback SETI” receiver at Arecibo radio telescope

SETI: The Task

• What is the complexity of detecting signals sent by an extra-terrestrial civilization?

• Category: massively difficult
  – Signal parameters unknown
  – Sensitivity of analysis depends on available computing power

SETI: Task Assumptions

• Aliens would broadcast a signal that is easily detectable and distinguishable from natural radio emission

• Narrowband signals stand out from natural broadband sources of noise

• Thus, SETI efforts concentrate on narrowband signals

• The hydrogen line: 1420 MHz

Narrowband Signals

• Use a narrow search window (channel) around a given frequency

• Earlier systems:
  – Analog narrow bandpass filters
• Newer systems:
  – Dedicated banks of Fast Fourier Transform (FFT) processors
  – Separate signal into up to 1 billion 1-Hz channels

Signal Problems

• Signals are unlikely to be stable in frequency
  – Example:
    • A listener on Earth’s surface for 1.4 GHz signals undergoes acceleration of up to 3.4 cm/s^2 due to Earth’s rotation
    • Corresponding Doppler drift rate: 0.16 Hz/s
    • Alien transmission would drift out of a 1-Hz channel in about 6 seconds
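The drift figures above follow from the first-order Doppler relation df/dt = f · a / c. The short sketch below reproduces the arithmetic; the 1-Hz channel width is taken from the earlier slide, and the variable names are illustrative only.

```python
# Doppler drift seen by a receiver accelerating along the line of sight.
C = 299_792_458.0   # speed of light, m/s
F_HZ = 1.42e9       # observing frequency near the hydrogen line, Hz
ACCEL = 0.034       # 3.4 cm/s^2 from the slide, expressed in m/s^2
CHANNEL_HZ = 1.0    # assumed 1-Hz channel width


def doppler_drift_rate(freq_hz, accel_m_s2):
    """First-order drift rate df/dt = f * a / c, in Hz/s."""
    return freq_hz * accel_m_s2 / C


drift = doppler_drift_rate(F_HZ, ACCEL)
time_in_channel = CHANNEL_HZ / drift

print(f"drift rate: {drift:.2f} Hz/s")
print(f"time to cross one channel: {time_in_channel:.1f} s")
```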

Signal Problems

• We can compensate for Earth’s rotation, but what about the remote planet?
• Solution:
  – Correct for Doppler drift at the receiving end
  – Search for signals at multiple Doppler drift rates
• Computation-intensive!
• Allowed remote drift rates are between -10 Hz/s and +10 Hz/s (extended to ±50 Hz/s at coarser resolution)
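One way to picture "searching at multiple drift rates" is to remove a trial drift from the data (de-chirping) and check how sharply the power concentrates in a single frequency bin. The sketch below is a toy illustration under stated assumptions, not the SETI@Home code: the sample rate, tone, trial grid, and all function names are made up, and a naive DFT stands in for an FFT.

```python
import cmath
import math


def chirp(n_samples, sample_rate, f0, drift):
    """Complex tone starting at f0 Hz that drifts linearly at `drift` Hz/s."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        phase = 2 * math.pi * (f0 * t + 0.5 * drift * t * t)
        out.append(cmath.exp(1j * phase))
    return out


def dechirp(samples, sample_rate, drift):
    """Undo an assumed linear drift by multiplying with the conjugate chirp."""
    out = []
    for n, s in enumerate(samples):
        t = n / sample_rate
        out.append(s * cmath.exp(-1j * math.pi * drift * t * t))
    return out


def peak_power(samples):
    """Largest bin power of a naive DFT (O(N^2), acceptable for a small demo)."""
    n = len(samples)
    best = 0.0
    for k in range(n):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        best = max(best, abs(acc) ** 2)
    return best


def best_drift(samples, sample_rate, trial_drifts):
    """Trial drift whose de-chirped spectrum has the strongest single bin."""
    return max(trial_drifts,
               key=lambda d: peak_power(dechirp(samples, sample_rate, d)))


# Toy data: 1 s at 128 samples/s, tone at 20 Hz drifting at +6 Hz/s.
signal = chirp(128, 128.0, 20.0, 6.0)
found = best_drift(signal, 128.0, list(range(-10, 11, 2)))
print("best trial drift:", found, "Hz/s")
```

The cost grows with the number of trial drift rates, which is why the drift search dominates the computation.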

Other Parameters

• Signal frequency / bandwidth?
• Is it pulsed? If so, what period?
• Solving over the full range of parameters is beyond even the world’s most powerful supercomputers

• Fortunately, the task is easily partitioned

Distributing the Load

• Break the data up into separate frequency bands

• Observations of different portions of the sky are essentially independent

• Partition the huge dataset into smaller chunks that ordinary PCs can handle

Data Collection

• Observations come from 305-meter radio telescope in Arecibo, Puerto Rico

• Dedicated instrumentation within telescope

• Passively monitors the telescope’s field of view (0.1 degrees)

• Stationary telescope: objects pass through the beam in 24 seconds

• When the telescope is tracking: 12 s
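The 24-second figure for a stationary telescope can be checked from the sky's apparent rotation rate. The sketch below assumes a source on the celestial equator by default; the declination parameter and the names are illustrative additions, and the 12 s tracking case is not modelled here.

```python
import math

SIDEREAL_DAY_S = 86_164.1   # one rotation of Earth relative to the stars, s
BEAM_WIDTH_DEG = 0.1        # field of view from the slide


def transit_time_s(beam_width_deg, declination_deg=0.0):
    """Time for a fixed sky position to drift across a stationary beam."""
    rate_deg_per_s = (360.0 / SIDEREAL_DAY_S) * math.cos(math.radians(declination_deg))
    return beam_width_deg / rate_deg_per_s


stationary = transit_time_s(BEAM_WIDTH_DEG)
print(f"stationary beam transit: {stationary:.1f} s")
```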

Data Collection

• Over the course of the project, SETI@Home will see visible portions of the sky 3 or more times

• Covers stars with declinations from -2 to 38 degrees

• Approximately 25% of the sky

Data Collection

• System records a 2.5 MHz band, centered at the 1,420 MHz hydrogen line

• Records 2-bit samples onto 35 GB DLT tapes (Recall: Nyquist Rate)

• Each tape: 15.5 h of data
• 39 TB of data total
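The tape figures are mutually consistent if the 2.5 MHz band is recorded at 2.5 million samples per second with 2 bits per sample. That sampling rate is an assumption made for this check, not something stated on the slide.

```python
SAMPLE_RATE = 2.5e6     # samples/s, assumed equal to the recorded bandwidth
BITS_PER_SAMPLE = 2     # from the slide
TAPE_HOURS = 15.5       # recording time per tape, from the slide
TOTAL_BYTES = 39e12     # 39 TB total, from the slide

bytes_per_second = SAMPLE_RATE * BITS_PER_SAMPLE / 8
tape_bytes = bytes_per_second * TAPE_HOURS * 3600
tapes_needed = TOTAL_BYTES / tape_bytes

print(f"data rate: {bytes_per_second / 1e3:.0f} kB/s")
print(f"per tape:  {tape_bytes / 1e9:.1f} GB")
print(f"tapes for 39 TB: about {tapes_needed:.0f}")
```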

Data Collection

• Data tapes shipped to Berkeley
• Split into work units using 4 splitter workstations
  – Divide 2.5 MHz data into 256 subbands using a 2048-point FFT followed by 256 8-point inverse transforms
  – Subbands are 9,766 Hz wide
  – 2^20 samples, thus each work unit is ~10 kHz wide and 107 s long
  – Work units overlap so that signals spanning a work-unit boundary can still be detected

• Work units are stored on separate server for distribution
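The work-unit numbers can be reproduced with a few lines of arithmetic. The sketch assumes each subband is sampled at its own bandwidth with 2 bits per sample; the payload size printed at the end is derived from that assumption and excludes any header.

```python
BAND_HZ = 2.5e6               # recorded bandwidth
N_SUBBANDS = 256              # subbands produced by the splitters
FFT_POINTS = 2048             # forward FFT length used for splitting
SAMPLES_PER_UNIT = 2 ** 20    # samples in one work unit
BITS_PER_SAMPLE = 2           # assumed, matching the tape format

subband_hz = BAND_HZ / N_SUBBANDS               # width of one subband
bins_per_subband = FFT_POINTS // N_SUBBANDS     # size of each inverse transform
duration_s = SAMPLES_PER_UNIT / subband_hz      # time span of one work unit
payload_bytes = SAMPLES_PER_UNIT * BITS_PER_SAMPLE // 8

print(f"subband width: {subband_hz:.0f} Hz")
print(f"inverse transform size: {bins_per_subband} points")
print(f"work unit duration: {duration_s:.0f} s")
print(f"payload: {payload_bytes // 1024} KiB")
```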

Data Collection

• Main SETI@Home Server
  – 3 Sun Enterprise 450 Series Computers
• User Database
  – Contains account information for each of the 2.4 million users
  – Also aggregates statistics by platform
• Science Database
  – Contains information about each work unit
    » Time, sky coords, frequency range
    » How many times each work unit has been downloaded
  – Stores parameters of candidate signals
    » Signal power, frequency, arrival time, sky coords
    » 1.1 billion candidates (Oct. 2000)

• Work unit storage

Data Collection

• Work unit storage server
  – Distribution of work units, storage of results
• Client communications via HTTP
  – Important to get through firewalls
  – Request to download new work unit
    • Work units that have not been downloaded yet have priority
    • Then, work units for which no results have been returned
  – Request to post results
    • Data contains signal characteristics
    • Updates user statistics
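The download priority rule can be sketched as a small selection function. Everything here is hypothetical: the `WorkUnit` fields, the tie-break on download count, and the fallback to already-completed units are assumptions for illustration, not the server's actual logic.

```python
from dataclasses import dataclass


@dataclass
class WorkUnit:
    name: str
    downloads: int = 0   # times this unit has been sent to a client
    results: int = 0     # results returned for this unit so far


def priority(unit):
    """Lower value means served sooner."""
    if unit.downloads == 0:
        return 0         # never downloaded: highest priority
    if unit.results == 0:
        return 1         # downloaded, but no result has come back
    return 2             # assumed fallback: already has results


def next_work_unit(units):
    """Pick the unit to hand to the next client and record the download."""
    if not units:
        return None
    chosen = min(units, key=lambda u: (priority(u), u.downloads))
    chosen.downloads += 1
    return chosen


queue = [
    WorkUnit("wu_a", downloads=2, results=1),
    WorkUnit("wu_b", downloads=1, results=0),
    WorkUnit("wu_c"),
]
order = [next_work_unit(queue).name for _ in range(3)]
print(order)
```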

The SETI@Home Client

• Available for 47 different combinations of CPU and OS
• Dominant platforms: Windows, Mac
  – Feature graphical “screensaver” display
  – UNIX works as daemon (display program available for X)

The SETI@Home Client

• Downloads work unit from server
• Performs “baseline smoothing” to eliminate wideband features, help reduce false signals
• Performs main data analysis loop (shown on next page)

Main Data Analysis Loop

for Doppler drift rates from -50 Hz/s to +50 Hz/s {
    for bandwidths from 0.075 Hz to 1,221 Hz in 2x steps {
        Generate time-ordered power spectra
        Search for short-duration signals above a constant threshold (spikes)
        for each frequency {
            Search for faint signals matching beam parameters (Gaussians)
            Search for groups of 3 evenly spaced signals (triplets)
            Search for faint repeating pulses (pulses)
        }
    }
}

The SETI@Home Client

• Client examines signal at various drift rates
  – -10 to +10 Hz/s (fine-grained)
  – -50 to +50 Hz/s (~twice as coarse)
• Although drift rates are most likely negative, examine both signs
  – For statistical comparison
  – To detect deliberately chirped signals

The SETI@Home Client

• For each drift rate, examines the signal at different bandwidths between 0.075 and 1,221 Hz
  – Using FFTs of various lengths
  – Not all bandwidths are examined at every drift rate (only when drift rate becomes significant compared to the frequency resolution)
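The 0.075 Hz to 1,221 Hz range in 2x steps is what one gets by dividing a 9,766 Hz subband by power-of-two transform lengths. The lengths used below (8 through 131,072) are inferred from that arithmetic and should be read as an assumption rather than a quoted specification.

```python
SUBBAND_HZ = 2.5e6 / 256   # width of one work unit's subband, Hz


def resolutions(min_len=8, max_len=131_072):
    """(FFT length, frequency resolution in Hz) for each power-of-two length."""
    out = []
    n = min_len
    while n <= max_len:
        out.append((n, SUBBAND_HZ / n))
        n *= 2
    return out


table = resolutions()
for length, hz in table:
    print(f"{length:>7}-point FFT -> {hz:10.3f} Hz per bin")
```

Longer transforms give finer frequency resolution but coarser time resolution, so each step trades one for the other.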

The SETI@Home Client

• Transformed signals are examined for spikes exceeding 22 times the mean noise power

• Threshold: 7.2 × 10^-25 W/m^2 (at the finest frequency resolutions)

• “Detecting a cell phone on one of the moons of Saturn”

• These spikes are what the client reports
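A minimal sketch of the spike test, assuming the "mean noise power" is estimated as the plain average of the spectrum being searched. The function name and the toy spectrum are illustrative.

```python
def find_spikes(power_spectrum, threshold=22.0):
    """Return (bin index, power / mean power) for bins above threshold * mean."""
    mean_power = sum(power_spectrum) / len(power_spectrum)
    return [
        (i, p / mean_power)
        for i, p in enumerate(power_spectrum)
        if p > threshold * mean_power
    ]


# Toy spectrum: flat noise floor with one strong narrowband bin.
spectrum = [1.0] * 1000
spectrum[417] = 50.0

spikes = find_spikes(spectrum)
for index, ratio in spikes:
    print(f"bin {index}: {ratio:.1f}x mean power")
```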

The SETI@Home Client

• Other transformations to detect Gaussians and pulse patterns

• Specialized algorithms (fast-folding algorithms) for detecting pulses efficiently

• These work by “folding” portions of the signal together in time, so that pulses repeating at the trial period add up and gain over the noise
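The folding idea can be shown with a naive version that simply sums samples at the same phase of a trial period. This is not the fast folding algorithm itself, which reuses partial sums across trial periods; the peak / sqrt(folds) score and all names are assumptions chosen for the illustration.

```python
import math


def fold(series, period):
    """Sum samples that share the same phase of `period` (in samples)."""
    profile = [0.0] * period
    for i, x in enumerate(series):
        profile[i % period] += x
    return profile


def fold_score(series, period):
    """Peak of the folded profile, scaled by sqrt(number of folds)."""
    folds = len(series) / period
    return max(fold(series, period)) / math.sqrt(folds)


def best_period(series, trial_periods):
    """Trial period with the highest folded score."""
    return max(trial_periods, key=lambda p: fold_score(series, p))


# Toy data: a unit pulse every 12 samples, 240 samples in total.
series = [1.0 if i % 12 == 0 else 0.0 for i in range(240)]
period = best_period(series, range(5, 21))
print("best trial period:", period, "samples")
```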

The SETI@Home Client

• Typical workload:
  – 2.4 to 3.8 trillion floating-point operations (Tflop) per work unit
  – Typical 500 MHz PC takes 10 to 12 hours to complete a work unit
  – Within the average work unit:
    • 4 spikes, 1 Gaussian, 1 pulsed signal, 1 triplet signal

• <Insert Demonstration Here>

Postprocessing

• Client uploads candidate signal data to server (exact data formats are kept quiet)
• Server examines results for errors
• Keeps track of user statistics

Error detection

• SETI@Home uses thousands of CPU years every day

• When CPUs overheat, floating-point units are the first to give incorrect results

• High error rates are offset by easy error detection

• Replication of work units is the primary error detection mechanism

• 60% of work unit results must agree in order to be considered for further analysis
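The agreement rule can be sketched as a quorum check over the results returned for one work unit. Comparing results by exact equality is a simplifying assumption (real floating-point results would likely need a tolerance), and the function name and sample values are hypothetical.

```python
from collections import Counter


def validate(results, quorum=0.6):
    """Return the agreed result if at least `quorum` of replicas match, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    if count / len(results) >= quorum:
        return value
    return None


accepted = validate(["spike@1420.01", "spike@1420.01", "spike@1420.01",
                     "spike@1419.87", "none"])
rejected = validate(["spike@1420.01", "spike@1420.01", "spike@1419.87", "none"])
print(accepted, rejected)
```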

Candidate Signals

• Vast majority of detected signals correspond to terrestrial RFI (radio-frequency interference)
  – Extra-terrestrial signals cannot last more than 12 s, the time a sky position stays in the beam while tracking
  – Also, signals should repeat when viewing the same portion of the sky at a later time

Project Status

• October 2000
  – 2.4 million users
  – 520,000 active clients donating 437,000 years of CPU time (4.3 × 10^20 flop)
  – Average processing rate: 15.7 Tflops
• “Largest supercomputer in existence”
• “Largest computation ever performed”

Project Status

• 1.1 billion signals in SETI@Home database

• Candidate signals being submitted faster than the server can confirm them

• So far, no extra-terrestrial signals

Future Work

• Expand coverage by adding new telescope in southern hemisphere

• Expand frequency bandwidth (up to double the data rate)

• Expand number of volunteers, increase SETI education efforts

Summary

• Seemingly impossible problem
• Easily partitioned
• Good publicity, marketing
• Achieves incredible performance
  – But, high latency
  – High redundancy/replication of computation

Related Work

• Distributed.net
  – Cracking of encryption keys (DES, RC5)
  – Search for optimal Golomb rulers
• Folding@Home
  – Stanford project - distributed protein folding
• PiHex
  – Distributed effort to calculate Pi
• GIMPS
  – Great Internet Mersenne Prime Search

Discussion

• Potential comments on:
  – System architecture
  – Fault-tolerance
  – Security

References

• SETI@Home Web Site
  – http://setiathome.ssl.berkeley.edu/
• NASA Science Newsletter
  – http://science.nasa.gov/newhome/headlines/ast23may99_1.htm
• Papers
  – Korpela, et al. “SETI@Home: Massively Distributed Computing for SETI.”
  – Sullivan, et al. “A new major SETI project based on Project Serendip data and 100,000 personal computers.”
  – “The SETI@Home Sky Survey.” Available from the SETI@Home web site.