Processing Data with Ruby
DESCRIPTION
Brief overview of how to process scientific data using Ruby to interface with existing software.
TRANSCRIPT
![Page 1: Processing Data with Ruby](https://reader034.vdocuments.net/reader034/viewer/2022051109/54851ac25806b595588b4724/html5/thumbnails/1.jpg)
Data Processing with Ruby
Brian Chapados
http://chapados.org
SDRuby
April 3, 2008
![Page 2: Processing Data with Ruby](https://reader034.vdocuments.net/reader034/viewer/2022051109/54851ac25806b595588b4724/html5/thumbnails/2.jpg)
> Archaeoglobus PCNA
MIDVIMTGELLKTVTRAIVALVSEARIHFLEKGLHSRAVDPANVAMVIVDIPKDSFEVYNIDEEKTIGVDMDRIFDISKSISTKDLVELIVEDESTLKVKFGSVEYKVALIDPSAIRKEPRIPELELPAKIVMDAGEFKKAIAAADKISDQVIFRSDKEGFRIEAKGDVDSIVFHMTETELIEFNGGEARSMFSVDYLKEFCKVAGSGDLLTIHLGTNYPVRLVFELVGGRAKVEYILAPRIESE
Understanding Proteins
sequence: 1-D linear chain
structure: 3-D after folding
![Page 3: Processing Data with Ruby](https://reader034.vdocuments.net/reader034/viewer/2022051109/54851ac25806b595588b4724/html5/thumbnails/3.jpg)
Structures of assemblies with several components are hard to determine
![Page 4: Processing Data with Ruby](https://reader034.vdocuments.net/reader034/viewer/2022051109/54851ac25806b595588b4724/html5/thumbnails/4.jpg)
X-ray scattering
C. Trame, personal communication.
Sousa et al. 2000. Cell 103: 633-643.
![Page 5: Processing Data with Ruby](https://reader034.vdocuments.net/reader034/viewer/2022051109/54851ac25806b595588b4724/html5/thumbnails/5.jpg)
Raw Data
Distance distribution function of particle
R          P(R)       ERROR
0.0000E+00 0.0000E+00 0.0000E+00
0.5000E+00 0.3157E-02 0.0000E+00
0.1000E+01 0.6069E-02 0.0000E+00
0.1500E+01 0.8740E-02 0.0000E+00
0.2000E+01 0.1118E-01 0.0000E+00
0.2500E+01 0.1339E-01 0.0000E+00
0.3000E+01 0.1538E-01 0.0000E+00
0.3500E+01 0.1718E-01 0.0000E+00
0.4000E+01 0.1879E-01 0.0000E+00
0.4500E+01 0.2023E-01 0.0000E+00
0.5000E+01 0.2153E-01 0.0000E+00
0.5500E+01 0.2269E-01 0.0000E+00
0.6000E+01 0.2374E-01 0.0000E+00
0.6500E+01 0.2471E-01 0.0000E+00
0.7000E+01 0.2560E-01 0.0000E+00
0.7500E+01 0.2645E-01 0.0000E+00
0.8000E+01 0.2727E-01 0.0000E+00
0.8500E+01 0.2809E-01 0.0000E+00
0.9000E+01 0.2891E-01 0.0000E+00
0.9500E+01 0.2976E-01 0.0000E+00
0.1000E+02 0.3065E-01 0.0000E+00
0.1050E+02 0.3160E-01 0.0000E+00
0.1100E+02 0.3261E-01 0.0000E+00
0.1150E+02 0.3370E-01 0.0000E+00
0.1200E+02 0.3487E-01 0.0000E+00
0.1250E+02 0.3613E-01 0.0000E+00
0.1300E+02 0.3747E-01 0.0000E+00
0.1350E+02 0.3890E-01 0.0000E+00
0.1400E+02 0.4041E-01 0.0000E+00
0.1450E+02 0.4201E-01 0.0000E+00
0.1500E+02 0.4367E-01 0.0000E+00
0.1550E+02 0.4539E-01 0.0000E+00
0.1600E+02 0.4717E-01 0.0000E+00
0.1650E+02 0.4899E-01 0.0000E+00
0.1700E+02 0.5083E-01 0.0000E+00
0.1750E+02 0.5268E-01 0.0000E+00
0.1800E+02 0.5453E-01 0.0000E+00
0.1850E+02 0.5636E-01 0.0000E+00
0.1900E+02 0.5815E-01 0.0000E+00
0.1950E+02 0.5989E-01 0.0000E+00
0.2000E+02 0.6157E-01 0.0000E+00
0.2050E+02 0.6317E-01 0.0000E+00
0.2100E+02 0.6467E-01 0.0000E+00
0.2150E+02 0.6607E-01 0.0000E+00
0.2200E+02 0.6735E-01 0.0000E+00
0.2250E+02 0.6851E-01 0.0000E+00
0.2300E+02 0.6954E-01 0.0000E+00
0.2350E+02 0.7043E-01 0.0000E+00
0.2400E+02 0.7118E-01 0.0000E+00
0.2450E+02 0.7179E-01 0.0000E+00
0.2500E+02 0.7225E-01 0.0000E+00
0.2550E+02 0.7258E-01 0.0000E+00
0.2600E+02 0.7277E-01 0.0000E+00
0.2650E+02 0.7283E-01 0.0000E+00
0.2700E+02 0.7277E-01 0.0000E+00
0.2750E+02 0.7259E-01 0.0000E+00
0.2800E+02 0.7231E-01 0.0000E+00
0.2850E+02 0.7194E-01 0.0000E+00
0.2900E+02 0.7149E-01 0.0000E+00
0.2950E+02 0.7096E-01 0.0000E+00
0.3000E+02 0.7038E-01 0.0000E+00
0.3050E+02 0.6975E-01 0.0000E+00
0.3100E+02 0.6909E-01 0.0000E+00
0.3150E+02 0.6840E-01 0.0000E+00
0.3200E+02 0.6770E-01 0.0000E+00
0.3250E+02 0.6700E-01 0.0000E+00
0.3300E+02 0.6630E-01 0.0000E+00
0.3350E+02 0.6561E-01 0.0000E+00
0.3400E+02 0.6494E-01 0.0000E+00
0.3450E+02 0.6429E-01 0.0000E+00
0.3500E+02 0.6366E-01 0.0000E+00
0.3550E+02 0.6304E-01 0.0000E+00
0.3600E+02 0.6245E-01 0.0000E+00
0.3650E+02 0.6186E-01 0.0000E+00
0.3700E+02 0.6128E-01 0.0000E+00
0.3750E+02 0.6070E-01 0.0000E+00
0.3800E+02 0.6010E-01 0.0000E+00
0.3850E+02 0.5948E-01 0.0000E+00
0.3900E+02 0.5881E-01 0.0000E+00
0.3950E+02 0.5810E-01 0.0000E+00
0.4000E+02 0.5731E-01 0.0000E+00
0.4050E+02 0.5643E-01 0.0000E+00
0.4100E+02 0.5545E-01 0.0000E+00
0.4150E+02 0.5434E-01 0.0000E+00
0.4200E+02 0.5309E-01 0.0000E+00
0.4250E+02 0.5168E-01 0.0000E+00
0.4300E+02 0.5008E-01 0.0000E+00
0.4350E+02 0.4828E-01 0.0000E+00
0.4400E+02 0.4627E-01 0.0000E+00
0.4450E+02 0.4401E-01 0.0000E+00
0.4500E+02 0.4151E-01 0.0000E+00
0.4550E+02 0.3874E-01 0.0000E+00
0.4600E+02 0.3568E-01 0.0000E+00
0.4650E+02 0.3234E-01 0.0000E+00
0.4700E+02 0.2869E-01 0.0000E+00
0.4750E+02 0.2472E-01 0.0000E+00
0.4800E+02 0.2044E-01 0.0000E+00
0.4850E+02 0.1583E-01 0.0000E+00
0.4900E+02 0.1088E-01 0.0000E+00
0.4950E+02 0.5608E-02 0.0000E+00
0.5000E+02 0.0000E+00 0.0000E+00
Reciprocal space: Rg = 20.97 , I(0) = 0.2953E+02
Real space: Rg = 20.94 +- 0.026 I(0) = 0.2953E+02 +- 0.2278E+00
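As an illustration of how output like this might be read into Ruby, here is a minimal sketch that turns whitespace-separated `R P(R) ERROR` rows into objects. The names `PrPoint` and `parse_pr` are hypothetical and not from the talk, the sample is just the first three rows of the table above, and the code assumes Ruby 2.7 or later for `filter_map`.

```ruby
# One row of the distance distribution table.
PrPoint = Struct.new(:r, :p, :error)

# Parse "R P(R) ERROR" rows; lines that do not hold exactly three
# numbers (headers, summary lines) are skipped.
def parse_pr(text)
  text.each_line.filter_map do |line|
    fields = line.split
    next unless fields.size == 3

    values = fields.map { |field| Float(field, exception: false) }
    next if values.any?(&:nil?)

    PrPoint.new(*values)
  end
end

sample = <<~DATA
  R          P(R)       ERROR
  0.0000E+00 0.0000E+00 0.0000E+00
  0.5000E+00 0.3157E-02 0.0000E+00
  0.1000E+01 0.6069E-02 0.0000E+00
DATA

points = parse_pr(sample)
peak = points.max_by(&:p)
puts "#{points.size} points, largest P(R) at R = #{peak.r}"
```

Once the rows are objects, questions such as "where does P(R) peak?" become one-liners instead of manual inspection of the text file.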
![Page 6: Processing Data with Ruby](https://reader034.vdocuments.net/reader034/viewer/2022051109/54851ac25806b595588b4724/html5/thumbnails/6.jpg)
Existing Software
Svergun group @ EMBL
http://www.embl-hamburg.de/ExternalInfo/Research/Sax/software.html
“interactive” interfaces
not easily scriptable
Works well, but...
requires running each program multiple times
no really... you have to see it to believe it
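One common way to script a prompt-driven program is to feed it the answers a user would type on standard input. The sketch below is only an illustration of that idea, not the approach shown in the talk: `FAKE_PROGRAM` is a hypothetical stand-in (a Ruby one-liner that asks two questions), and `run_interactive` is an invented helper name. A real analysis program would expect its own prompts and answers.

```ruby
require "open3"
require "rbconfig"

# Stand-in for an interactive program (hypothetical): it prompts for
# two answers on stdin and then reports what it was given.
FAKE_SCRIPT = <<~'RUBY'
  print "Input file: "
  file = gets.chomp
  print "Rmax: "
  rmax = gets.chomp
  puts "\nran #{file} with rmax=#{rmax}"
RUBY
FAKE_PROGRAM = [RbConfig.ruby, "-e", FAKE_SCRIPT].freeze

# Send one answer per line to the program's stdin and return its output.
def run_interactive(command, answers)
  input = answers.join("\n") + "\n"
  output, status = Open3.capture2(*command, stdin_data: input)
  raise "command failed: #{command.first}" unless status.success?

  output
end

puts run_interactive(FAKE_PROGRAM, ["sample.dat", "50"])
```

Because the answers are plain Ruby data, running the same program many times with different parameters becomes a loop rather than a series of manual sessions.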
![Page 7: Processing Data with Ruby](https://reader034.vdocuments.net/reader034/viewer/2022051109/54851ac25806b595588b4724/html5/thumbnails/7.jpg)
Help from Ruby
We want to use Linux clusters with hundreds of CPUs
Ruby
wrap external programs
write shell scripts to run external programs
Rake
define relationships between inputs/outputs of different programs
launch external programs after dependencies are satisfied
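The input/output relationships described above map naturally onto Rake `file` tasks. The following is a minimal sketch under stated assumptions: the file names (`scattering.dat`, `pr.out`, `model.pdb`) are made up for illustration, and each task body just writes a placeholder where a real task would launch an external program (for example with Rake's `sh`). In an actual Rakefile the `require`, `extend`, and explicit `invoke` lines are unnecessary because the `rake` command supplies them.

```ruby
require "rake"
require "tmpdir"

extend Rake::DSL # makes `file` and `task` available outside a Rakefile

dir   = Dir.mktmpdir
raw   = File.join(dir, "scattering.dat") # hypothetical raw data file
pr    = File.join(dir, "pr.out")         # hypothetical intermediate output
model = File.join(dir, "model.pdb")      # hypothetical final output
File.write(raw, "raw data\n")

# Step 1: pr.out is produced from the raw data.
file pr => raw do |t|
  # A real task would run an external program here.
  File.write(t.name, "P(R) from #{File.basename(t.prerequisites.first)}\n")
end

# Step 2: model.pdb is built only after pr.out exists and is up to date.
file model => pr do |t|
  File.write(t.name, "model from #{File.basename(t.prerequisites.first)}\n")
end

# Asking for the final file runs every missing or stale step in order.
Rake::Task[model].invoke
puts File.read(model)
```

Rake compares file timestamps, so rerunning the pipeline repeats only the steps whose inputs have changed.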
![Page 8: Processing Data with Ruby](https://reader034.vdocuments.net/reader034/viewer/2022051109/54851ac25806b595588b4724/html5/thumbnails/8.jpg)
Do more with Ruby
Define input parameters in a script
Define common tasks in a library
quick and dirty...
more robust...
Evolve towards a micro-framework
Ruby API for running commands
More sophisticated information processing
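To make the idea of a Ruby API for running commands concrete, here is one possible shape for such a wrapper. It is a sketch, not the framework from the talk: the `Command` class, its `--name=value` flag convention, and the echo-style stand-in program are all assumptions chosen for illustration.

```ruby
require "open3"
require "rbconfig"

# Hypothetical wrapper: holds a program plus named parameters and
# turns them into an argument list that can be run repeatedly.
class Command
  Result = Struct.new(:output, :success)

  def initialize(*program, **params)
    @program = program
    @params = params
  end

  # Build the full argument vector; the --name=value style is an assumption.
  def argv(*inputs)
    flags = @params.map { |name, value| "--#{name}=#{value}" }
    @program + flags + inputs
  end

  # Run the command and capture stdout and stderr together.
  def run(*inputs)
    output, status = Open3.capture2e(*argv(*inputs))
    Result.new(output, status.success?)
  end
end

# Stand-in program that simply prints the arguments it received.
echo = Command.new(RbConfig.ruby, "-e", "puts ARGV.join(' ')", "--", rmax: 50)
result = echo.run("sample.dat")
puts result.output
```

Keeping parameters in a Ruby object means the same definition can be reused from a quick script or from a library of common tasks.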
![Page 9: Processing Data with Ruby](https://reader034.vdocuments.net/reader034/viewer/2022051109/54851ac25806b595588b4724/html5/thumbnails/9.jpg)
Acknowledgements
Lab (Scripps Research Institute)
John Tainer
Scott Williams
Chris Putnam
Data Collection
Beamline 12.3.1
The Advanced Light Source (ALS, LBNL)
Funding
NIH, DOE, NCI