data descriptions


Upload: ledan

Post on 19-Jan-2017


Page 1: Data Descriptions

“book”2007/9/21page 153!


7 Datasets

7.1 Tips

Source: Bryant, P. G. and Smith, M. A. (1995), Practical Data Analysis: Case Studies in Business Statistics, Richard D. Irwin Publishing, Homewood, IL.

Number of cases: 244
Number of variables: 8
Description: Food servers’ tips in restaurants may be influenced by many factors, including the nature of the restaurant, size of the party, and table locations in the restaurant. Restaurant managers need to know which factors matter when they assign tables to food servers. For the sake of staff morale, they usually want to avoid either the substance or the appearance of unfair treatment of the servers, for whom tips (at least in restaurants in the United States) are a major component of pay.

In one restaurant, a food server recorded the following data on all customers they served during an interval of two and a half months in early 1990. The restaurant, located in a suburban shopping mall, was part of a national chain and served a varied menu. In observance of local law the restaurant offered seating in a non-smoking section to patrons who requested it. Each record includes a day and time, and taken together, they show the server’s work schedule.

Variable  Explanation
obs       Observation number
totbill   Total bill (cost of the meal), including tax, in US dollars
tip       Tip (gratuity) in US dollars
sex       Sex of person paying for the meal (0=male, 1=female)
smoker    Smoker in party? (0=No, 1=Yes)
day       3=Thur, 4=Fri, 5=Sat, 6=Sun
time      0=Day, 1=Night
size      Size of the party


Primary question: What are the factors that affect tipping behavior?

Data restructuring: A new variable tiprate = tip/totbill should be calculated.
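As a minimal sketch of this restructuring step, the derived variable can be computed row by row. The sample values below are hypothetical (they are not taken from tips.csv), and the helper name `add_tiprate` is an assumption made for illustration; only the column names `totbill`, `tip`, and `size` come from the variable table.

```python
# Hypothetical sample rows: the values are illustrative, not taken from tips.csv.
rows = [
    {"totbill": 20.00, "tip": 3.00, "size": 2},
    {"totbill": 50.00, "tip": 6.00, "size": 4},
    {"totbill": 10.00, "tip": 2.00, "size": 1},
]


def add_tiprate(records):
    """Return copies of the records with tiprate = tip / totbill added."""
    out = []
    for rec in records:
        new_rec = dict(rec)  # copy so the input rows are left untouched
        new_rec["tiprate"] = rec["tip"] / rec["totbill"]
        out.append(new_rec)
    return out


with_rate = add_tiprate(rows)
for rec in with_rate:
    print(rec["size"], round(rec["tiprate"], 3))
```

Returning copies rather than modifying the rows in place keeps the raw data available for comparison with the derived variable.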

Analysis notes: This dataset is fabulously simple and yet fascinating. The original case study fits a traditional regression model, using tiprate as a response variable. The only important variable emerging from this model is size: As size increases, tiprate decreases. The reader may have noticed that restaurants seem to know about this association, because they often include a service charge for larger dining parties. (There has been at least one lawsuit regarding this service charge.) Here, this association explains only 2% of all the variation in tip rate; it is a very weak model! There are many other interesting features in the data, as described in this book.

Data files: tips.csv, tips.xml

7.2 Australian Crabs

Source: Campbell, N. A. & Mahon, R. J. (1974), A Multivariate Study of Variation in Two Species of Rock Crab of genus Leptograpsus, Australian Journal of Zoology 22, 417–425. The data was first brought to our attention by Venables & Ripley (2002) and Ripley (1996).

Number of rows: 200
Number of variables: 8
Description: Measurements on rock crabs of the genus Leptograpsus. One species L. variegatus had been split into two new species, previously grouped by color, orange and blue. Preserved specimens lose their color, so it was hoped that morphological differences would enable museum specimens to be classified. There are 50 specimens of each sex of each species, collected on site at Fremantle, Western Australia. For each specimen, five measurements were made, using vernier calipers.

Variable              Explanation
species               orange or blue
sex                   male or female
index                 1–200
frontal lip (FL)      length, in mm
rear width (RW)       width, in mm
carapace length (CL)  length of midline of the carapace, in mm
carapace width (CW)   maximum width of carapace, in mm
body depth (BD)       depth of the body; for females, measured after displacement of the abdomen, in mm


Primary question: Can we determine the species and sex of the crabs based on these five morphological measurements?

Data restructuring: A new class variable distinguishing all four groups would be useful.
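One way to build such a class variable is to combine the two existing labels into a single four-level label. This is only a sketch: the function name `crab_class` and the "species-sex" label format are assumptions, not the encoding used in the data files.

```python
def crab_class(species, sex):
    """Combine species and sex into one four-level class label (format is an assumption)."""
    return f"{species}-{sex}"


# The four species/sex combinations described in the dataset.
specimens = [
    ("blue", "male"),
    ("blue", "female"),
    ("orange", "male"),
    ("orange", "female"),
]
labels = [crab_class(species, sex) for species, sex in specimens]
print(labels)
```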

Analysis notes: All physical measurements on the crabs are strongly positively correlated, and this is the main structure in the data. For this reason, it may be helpful to sphere the data and use principal components instead of raw variables in any analysis. Despite this strong association, there are a lot of differences among the four groups. Species can be perfectly distinguished by physical characteristics, and so can the sex of the larger crabs. In previous analyses, the measurements were logged, but we have not found this to be necessary.

Data files: australian-crabs.csv, australian-crabs.xml

7.3 Italian Olive Oils

Source: Forina, M., Armanino, C., Lanteri, S. & Tiscornia, E. (1983), Classification of Olive Oils from their Fatty Acid Composition, in Martens, H. and Russwurm Jr., H., eds, Food Research and Data Analysis, Applied Science Publishers, London, pp. 189–214. It was brought to our attention by Glover & Hopke (1992).

Number of rows: 572
Number of variables: 10
Description: This data consists of the percentage composition of fatty acids found in the lipid fraction of Italian olive oils. The data arises from a study to determine the authenticity of an olive oil.


Variable  Explanation
region    Three “super-classes” of Italy: North, South, and the island of Sardinia
area      Nine collection areas: three from the region North (Umbria, East and West Liguria), four from South (North and South Apulia, Calabria, and Sicily), and two from the island of Sardinia (inland and coastal Sardinia).
palmitic, palmitoleic, stearic, oleic, linoleic, linolenic, arachidic, eicosenoic
          fatty acids, % × 100

Primary question: How do we distinguish the oils from different regions and areas in Italy based on their combinations of the fatty acids?

Data restructuring: None needed.

Analysis notes: There are nine classes (areas) in this data, too many to easily classify. A better approach is to take advantage of the hierarchical structure in the data, partitioning by region before starting.

Some of the classes are easy to distinguish, but others present a challenge. The clusters corresponding to classes all have different shapes in the eight-dimensional data space.

Data files: olive.csv, olive.xml
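The hierarchical structure mentioned in the analysis notes (areas nested within regions) can be written down explicitly and used to split the samples before any area-level classification. The sketch below is illustrative only: the area spellings follow the variable table and may differ from the labels in olive.csv, and the sample records and the function name `partition_by_region` are hypothetical.

```python
# Area labels follow the variable table; the exact spellings in olive.csv may differ.
AREAS_BY_REGION = {
    "North": ["Umbria", "East Liguria", "West Liguria"],
    "South": ["North Apulia", "South Apulia", "Calabria", "Sicily"],
    "Sardinia": ["Inland Sardinia", "Coastal Sardinia"],
}


def partition_by_region(samples):
    """Group samples by region so each subset can be classified by area separately."""
    groups = {region: [] for region in AREAS_BY_REGION}
    for sample in samples:
        groups[sample["region"]].append(sample)
    return groups


# Hypothetical samples; only the region and area labels matter here.
samples = [
    {"region": "South", "area": "Calabria"},
    {"region": "North", "area": "Umbria"},
    {"region": "South", "area": "Sicily"},
]
groups = partition_by_region(samples)
print({region: len(members) for region, members in groups.items()})
```

Within each region the classification problem shrinks from nine classes to at most four.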


7.4 Flea Beetles

Source: Lubischew, A. A. (1962), On the Use of Discriminant Functions in Taxonomy, Biometrics 18, 455–477.

Number of rows: 74
Number of variables: 7
Description: This data contains physical measurements on three species of flea beetles.

Variable  Explanation
species   Ch. concinna, Ch. heptapotamica, and Ch. heikertingeri
tars1     width of the first joint of the first tarsus in microns
tars2     width of the second joint of the first tarsus in microns
head      the maximal width of the head between the external edges of the eyes in 0.01 mm
aede1     the maximal width of the aedeagus in the fore-part in microns
aede2     the front angle of the aedeagus (1 unit = 7.5 degrees)
aede3     the aedeagus width from the side in microns

Primary question: How do we classify the three species?

Data restructuring: None needed.

Analysis notes: This straightforward dataset has three very well separated elliptically shaped clusters. It is fun to cluster the data with various algorithms and see how many get the clusters wrong.

Data files: flea.csv, flea.xml

7.5 PRIM7

Source: First used in Friedman, J. H. & Tukey, J. W. (1974), A Projection Pursuit Algorithm for Exploratory Data Analysis, IEEE Transactions on Computing C 23, 881–889. Originally from Ballam, J., Chadwick, G. B., Guiragossian, G. T., Johnson, W. B., Leith, D. W. G. S., & Moriyasu, K. (1971), Van Hove analysis of the reaction $\pi^- p \to \pi^- \pi^- \pi^+ p$ and $\pi^+ p \to \pi^+ \pi^+ \pi^- p$ at 16 GeV/c, Physics Review D 4(1), 1946–1966.

Number of cases: 500
Number of variables: 7


Description: This data contains observations taken from a high-energy particle physics scattering experiment that yielded four particles. The reaction $\pi^+_b p_t \to p \pi^+_1 \pi^+_2 \pi^-$ can be described completely by seven independent measurements.

Below, $\mu^2(A, B, \pm C) = (E_A + E_B \pm E_C)^2 - (P_A + P_B \pm P_C)^2$ and $\mu^2(A, \pm B) = (E_A \pm E_B)^2 - (P_A \pm P_B)^2$, where $E$ and $P$ represent the particle’s energy and momentum, respectively, as measured in billions of electron volts. The notation $(P)^2$ represents the inner product $P'P$. The ordinal assignment of the two $\pi^+$’s was done randomly.

Variable  Explanation
X1        $\mu^2(\pi^-, \pi^+_1, \pi^+_2)$
X2        $\mu^2(\pi^-, \pi^+_1)$
X3        $\mu^2(p, \pi^-)$
X4        $\mu^2(\pi^-, \pi^+_2)$
X5        $\mu^2(p, \pi^+_1)$
X6        $\mu^2(p, \pi^+_1, -p_t)$
X7        $\mu^2(p, \pi^+_2, -p_t)$
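To make the definition of µ² concrete, the sketch below evaluates it for signed combinations of particles, each given as an energy and a three-component momentum. The function name `mu2`, the input layout, and the numerical values are assumptions chosen for easy hand checking; they are not measurements from the experiment.

```python
def mu2(terms):
    """Squared combined energy minus squared combined momentum.

    terms is a list of (sign, E, (px, py, pz)) tuples, with sign = +1 or -1
    matching the +/- in the definition of mu^2.
    """
    e_total = sum(sign * energy for sign, energy, _ in terms)
    p_total = [sum(sign * p[i] for sign, _, p in terms) for i in range(3)]
    # (P)^2 is the inner product of the summed momentum with itself.
    return e_total ** 2 - sum(component * component for component in p_total)


# Hypothetical particles A and B, in billions of electron volts.
A = (5.0, (3.0, 0.0, 0.0))
B = (5.0, (-3.0, 0.0, 0.0))

print(mu2([(+1, *A)]))             # single particle: 5^2 - 3^2
print(mu2([(+1, *A), (+1, *B)]))   # mu^2(A, B)
print(mu2([(+1, *A), (-1, *B)]))   # mu^2(A, -B)
```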

Primary question: What are the clusters in the data?

Data restructuring: None needed, although it is helpful to sphere the data to principal component coordinates before using projection pursuit.
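Sphering means centering the data, rotating it to its principal axes, and rescaling each axis to unit variance, so that the sample covariance becomes the identity. The sketch below illustrates the idea for two variables only, where the eigen-decomposition has a closed form; the seven-variable PRIM7 data would need a general eigen-solver. The function name `sphere_2d` and the sample points are hypothetical, and the code assumes the two variables are not perfectly collinear.

```python
import math


def sphere_2d(points):
    """Center two-variable data, rotate to principal axes, scale each axis to unit variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)

    # Closed-form eigen-decomposition of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * b, a - c)   # angle of the first principal axis
    half_gap = math.hypot((a - c) / 2, b)
    lam1 = (a + c) / 2 + half_gap            # variance along the first axis
    lam2 = (a + c) / 2 - half_gap            # variance along the second axis

    cos_t, sin_t = math.cos(theta), math.sin(theta)
    sphered = []
    for x, y in centered:
        pc1 = x * cos_t + y * sin_t
        pc2 = -x * sin_t + y * cos_t
        sphered.append((pc1 / math.sqrt(lam1), pc2 / math.sqrt(lam2)))
    return sphered


# Hypothetical, strongly correlated two-variable data.
points = [(1.0, 2.0), (2.0, 3.5), (3.0, 7.0), (4.0, 7.5), (5.0, 11.0)]
sphered = sphere_2d(points)
```

After sphering, the dominant linear correlation no longer drives the projection pursuit index, which leaves room for finer structure to show up.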

Analysis notes: The case study illustrates the strength of our graphical methods in detecting sparse structure in high-dimensional space. It is a stunning look at uncovering a very geometric structure in high-dimensional space. A combination of interactive brush controls and motion graphics reveals that the points lie on a structure comprising connected low-dimensional pieces: a two-dimensional triangle, with two linear pieces extending from each vertex (Cook et al. 1995). Various graphical tools specifically facilitated the discovery of the structure: Plots of low-dimensional projections of the seven-dimensional object allowed discovery of the low-dimensional pieces, highlighting allowed the pieces to be recorded or marked, and animating many projections into a movie over time allowed the pieces to be reconstructed into the full shape. The data was 20 years old by the time these visual methods were applied to it, and the structure is known by physicists.

Data files: prim7.csv, prim7.xml


7.6 Tropical Atmosphere-Ocean Array (TAO)

Source: The data from the array, along with current updates, can be viewed on the web at http://www.pmel.noaa.gov/tao.

Number of cases: 736
Number of variables: 8
Description: The El Nino/Southern Oscillation (ENSO) cycle of 1982–1983, the strongest of the century, created many problems throughout the world. Parts of the world such as Peru and the United States experienced destructive flooding from increased rainfall, whereas countries in the western Pacific experienced drought and devastating brush fires. The ENSO cycle was neither predicted nor detected until it was near its peak, which highlighted the need for an ocean observing system to support studies of large-scale ocean-atmosphere interactions on seasonal-to-interannual time scales.

This observing system was developed by the international Tropical Ocean Global Atmosphere (TOGA) program. The Tropical Atmosphere Ocean (TAO) array consists of nearly 70 moored buoys spanning the equatorial Pacific, measuring oceanographic and surface meteorological variables critical for improved detection, understanding and prediction of seasonal-to-interannual climate variations originating in the tropics, most notably those related to the ENSO cycles.

The moorings were developed by the National Oceanic and Atmospheric Administration’s (NOAA) Pacific Marine Environmental Laboratory (PMEL). Each mooring measures air temperature, relative humidity, surface winds, sea surface temperatures, and subsurface temperatures down to a depth of 500 meters, and a few of the buoys measure currents, rainfall, and solar radiation.

The TAO array provides real-time data to climate researchers, weather prediction centers, and scientists around the world. Forecasts for tropical Pacific Ocean temperatures for one to two years in advance can be made using the ENSO cycle data. These forecasts are possible because of the moored buoys, along with drifting buoys, volunteer ship temperature probes, and sea level measurements.


Variable                Explanation
year                    1993 (a normal year), 1997 (an El Nino year), for November, December, and the January of the following year.
latitude                0°, 2°S, 5°S only.
longitude               110°W, 95°W only.
sea surface temp (SST)  measured in °C, at 1 m below the surface
air temp (AT)           measured in °C, at 3 m above the sea surface.
humidity (Hum)          relative humidity, measured 3 m above the sea surface.
uwind                   east–west component of wind, measured 4 m above sea surface: positive means the wind is blowing toward the east.
vwind                   north–south component of the wind, measured 4 m above sea surface: a positive sign means that the wind is blowing toward the north.

Primary question: Can we detect the El Nino event, based on sea surface temperature? What changes in the other observed variables occur during this event?

Data restructuring: This subset comes from a larger dataset extracted from the web site mentioned above. That data runs from March 7, 1980 to December 31, 1998, from 10°S to 10°N, and from 130°E to 90°W. There are 178,080 recorded measurements, with time, latitude, longitude, and five atmospheric variables for each record.

Longitude is measured with east as positive and west as negative units, with the prime meridian in Greenwich, UK, at 0°. The buoys are moored in the Pacific Ocean on the other side of the globe, with measurements on either side of the International Date Line (−180° = 180°)! This makes it very difficult to plot the data using numerical scales. We added new categorical variables marking the moored location of the buoys, making it easier to plot the spatial coordinates. These variables also make it easier to identify the buoys, because they tend to break free of the moorings and drift occasionally.
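The authors dealt with the date line by adding categorical location variables. A different, purely numerical workaround, shown here as a sketch and not as the book's method, is to re-express signed longitude on a 0–360 degrees-east scale, so that buoys on both sides of the date line fall in one continuous range. The function name `to_degrees_east` is an assumption.

```python
def to_degrees_east(lon):
    """Map a signed longitude (east positive, west negative) onto a 0-360 degrees-east scale."""
    return lon % 360.0


# 130 E stays at 130; 110 W (-110) and 95 W (-95) move past 180 instead of
# jumping to the opposite end of the axis.
for lon in (130.0, 180.0, -180.0, -110.0, -95.0):
    print(lon, to_degrees_east(lon))
```

On this scale, −180° and 180° map to the same value, which matches the identity noted in the text.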

Analysis notes: One hurdle in working with this data is the large number of missing values. The missingness needs to be explored as a first step, and missing values need to be imputed before an analysis.

The larger data is cumbersome to work with, because of the missing values and the spatiotemporal context, but it has some interesting features. Plotting the latitude and longitude reveals that some buoys tend to drift, quite substantially at times, and that they are eventually retrieved and reattached to the moorings!
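The two steps named in the analysis notes, exploring the missingness and then imputing, can be sketched as follows. Median imputation is used here only as a simple stand-in; the text does not say which imputation method the authors used. The records, the column names `sst` and `humidity`, and both function names are hypothetical.

```python
from statistics import median

# Hypothetical records; None marks a missing measurement.
records = [
    {"sst": 27.1, "humidity": 80.0},
    {"sst": None, "humidity": 85.0},
    {"sst": 28.3, "humidity": None},
    {"sst": 27.7, "humidity": 90.0},
]


def missing_counts(rows, variables):
    """Count missing values per variable, as a first look at the missingness."""
    return {v: sum(1 for r in rows if r[v] is None) for v in variables}


def impute_median(rows, variable):
    """Return copies of the rows with missing values of one variable set to its median."""
    observed = [r[variable] for r in rows if r[variable] is not None]
    fill = median(observed)
    return [
        dict(r, **{variable: fill if r[variable] is None else r[variable]})
        for r in rows
    ]


counts = missing_counts(records, ["sst", "humidity"])
filled = impute_median(records, "sst")
print(counts)
print(filled)
```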


There is a massive El Nino event in the last year of this larger subset, 1997–1998, and it is visible at some locations when time series of sea surface temperature are plotted. Smaller El Nino events are visible at several other years. Changes in other variables are noticeable during these events too, particularly in one of the wind components.

Data files:
tao.csv, tao.xml            Small subsets mostly used in the book.
tao-full.csv, tao-full.xml  The full data, not commonly used in the book, but included for data context.

7.7 Primary Biliary Cirrhosis (PBC)

Source: Distributed with Fleming & Harrington, Counting Processes and Survival Analysis, Wiley, New York, 1991, and available from http://lib.stat.cmu.edu/datasets. A description of the clinical background for the trial and the covariates recorded here is in Chapter 0, especially Section 0.2. It was originally from the Mayo Clinic trial in primary biliary cirrhosis (PBC) of the liver conducted between 1974 and 1984. A more extended discussion can be found in Dickson et al., Prognosis in primary biliary cirrhosis: model for decision making, Hepatology 10, 1–7 (1989) and in Markus et al., Efficiency of liver transplantation in patients with primary biliary cirrhosis, New England Journal of Medicine 320, 1709–1713 (1989).

Number of cases: 312
Number of variables: 20
Description: A total of 424 PBC patients, referred to the Mayo Clinic during that ten-year interval, met eligibility criteria for the randomized placebo controlled trial of the drug D-penicillamine. The first 312 cases in the dataset refer to subjects who participated in the randomized trial; they contain largely complete data. The additional 112 subjects did not participate in the clinical trial but consented to have basic measurements recorded and to be followed for survival; they are not represented here.


Variable  Explanation
id
fu.days   number of days between registration and the earlier of death, transplantation, or study analysis time in July 1986
status    status is coded as 0=censored, 1=censored due to liver tx, 2=death
drug      1=D-penicillamine, 2=placebo
age       in days
sex       0=male, 1=female
ascites   presence of ascites: 0=no 1=yes
hepatom   presence of hepatomegaly: 0=no 1=yes
spiders   presence of spiders: 0=no 1=yes
edema     presence of edema: 0=no edema and no diuretic therapy for edema; .5 = edema present without diuretics, or edema resolved by diuretics; 1 = edema despite diuretic therapy
bili      serum bilirubin in mg/dl
chol      serum cholesterol in mg/dl
albumin   in gm/dl
copper    urine copper in µg/day
alk.phos  alkaline phosphatase in U/l
sgot      SGOT in U/ml
trig      triglycerides in mg/dl
platelet  platelets per cubic ml/1,000
protime   prothrombin time in seconds
stage     histologic stage of disease

Primary question: How do the different drugs affect the patients?

Data restructuring: Only records corresponding to patients who were in the original clinical trial were included in this data. The remaining records had too many systematic missing values.

Analysis notes: Handling missing values is an interesting exercise in this data, as is experimenting with data transformations.

Data files: pbc.csv

7.8 Spam

Source: This was data collected at Iowa State University (ISU) by the 2003 Statistics 503 class.


Number of cases: 2,171
Number of variables: 21
Description: Every person monitored their email for a week and recorded information about each email message; for example, whether it was spam, and what day of the week and time of day the email arrived. We want to use this information to build a spam filter, a classifier that will catch spam with high probability but will never classify good email as spam.

Variable     Explanation
isuid        Iowa State U. student id (1–19)
id           email id (a unique message descriptor)
day of week  sun, mon, tue, wed, thu, fri, sat
time of day  0–23 (only integer values)
size.kb      size of email in kilobytes
box          yes if sender is in recipient’s in- or outboxes (i.e., known to recipient); else no
domain       high-level domain of sender’s email address: e.g., .edu, .ru
local        yes if sender’s email is in local domain, else no; local addresses have the form [email protected]
             number of numbers (0-9) in the sender’s name: e.g., for lot-[email protected], this is 4.
name         “name” (if first and last names are present), “single” (if only one name is present), or empty
capct        % capital letters in subject line
special      number of non-alphanumeric characters in subject
credit       yes if subject line includes one of mortgage, sale, approve, credit; else no
sucker       yes if subject line includes one of the words earn, free, save; else no
porn         yes if subject line includes one of nude, sex, enlarge, improve; else no
chain        yes if subject line includes one of pass, forward, help; else no
username     yes if subject includes recipient’s name or login; else no
large.text   yes if email is HTML R$ and includes test for large font, defined as size = +3 or size = 5 or higher; else no
spampct      probability of being spam, according to ISU spam filter.
category     extended spam/mail category: “com,” “list,” “news,” “ord”
spam         yes if spam; else no
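Several of the subject-line variables in the table above are simple enough to compute automatically. The sketch below shows one possible reading of three of them. It rests on assumptions the table does not settle: `capct` is taken as a percentage of the letters in the subject, `special` ignores whitespace, and keywords are matched as whole words regardless of case. The function names and the example subject line are hypothetical.

```python
import re

# Keyword lists copied from the variable table.
KEYWORDS = {
    "credit": ("mortgage", "sale", "approve", "credit"),
    "sucker": ("earn", "free", "save"),
    "chain": ("pass", "forward", "help"),
}


def capct(subject):
    """Percentage of letters in the subject line that are capitals (assumed definition)."""
    letters = [ch for ch in subject if ch.isalpha()]
    if not letters:
        return 0.0
    return 100.0 * sum(1 for ch in letters if ch.isupper()) / len(letters)


def special(subject):
    """Number of characters that are neither alphanumeric nor whitespace (assumed definition)."""
    return sum(1 for ch in subject if not ch.isalnum() and not ch.isspace())


def keyword_flag(subject, name):
    """Return "yes" if the subject contains any keyword for the named variable, else "no"."""
    words = set(re.findall(r"[a-z]+", subject.lower()))
    return "yes" if words & set(KEYWORDS[name]) else "no"


subject = "FREE mortgage!!!"  # hypothetical subject line
print(capct(subject), special(subject))
print({name: keyword_flag(subject, name) for name in KEYWORDS})
```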

Primary question: Can we distinguish between spam and “ham?”

Data restructuring: A lot of work was done to prepare this data for analysis! It is now quite clean, and no restructuring should be needed.


Analysis notes: The ISU mail handlers examine each email message and assign it a probability of being spam. Commonly used mail readers can use this information to file email directly into the trash, or at least to a special folder. It will be interesting to compare the results of a spam filter built on our collected data with results of the university’s algorithm. (The university’s algorithm was classifying a lot of email from the university president as spam for a short period!) Another aside is that there is a temporal trend to spam, which seems to be more frequent at some times of day and night. We have also seen that some users get more spam than others.

Careful choice of variables is needed for building the spam filter. Only those that might be automatically calculated by a mail handler are appropriate.

There are some missing values in the data due to differences between mail handlers and the availability of information about the emails.

Spammers evolve their attacks quickly, and the recognizable signs of spam of 2003 no longer exist in 2006 spam. For example, all spam now arrives with complete Caucasian-style name fields, and messages are embedded in images rather than plain text.

Data files: spam.csv, spam.xml

7.9 Wages

Source: Singer, J. D. & Willett, J. B. (2003), Applied Longitudinal Data Analysis, Oxford University Press, Oxford, UK. It is a subset of data collected in the National Longitudinal Survey of Youth (NLSY) described at http://www.bls.gov/nls/nlsdata.htm.

Number of subjects: 888
Number of variables: 15
Number of observations, across all subjects: 6,402
Description: The data was collected to track the labor experiences of male high-school dropouts. The men were between 14 and 17 years old at the time of the first survey.


Variable  Explanation
id        1–888, for each subject.
lnw       natural log of wages, adjusted for inflation, to 1990 dollars.
exper     length of time in the workforce (in years). This is treated as the time variable, with t0 for each subject starting on their first day at work. The number of time points and values of time points for each subject can differ.
ged       when/if a graduate equivalency diploma is obtained.
black     categorical indicator of race = black.
hispanic  categorical indicator of race = hispanic.
hgc       highest grade completed.
uerate    unemployment rates in the local geographic region at each measurement time.

Primary question: How do wages change with workforce experience?

Data restructuring: The data in its original form looked as follows, where time-independent variables have been repeated for each time point:

id  lnw    exper  black  hispanic  hgc  uerate
31  1.491  0.015  0      1         8    3.215
31  1.433  0.715  0      1         8    3.215
31  1.469  1.734  0      1         8    3.215
31  1.749  2.773  0      1         8    3.295
31  1.931  3.927  0      1         8    2.895
31  1.709  4.946  0      1         8    2.495
31  2.086  5.965  0      1         8    2.595
31  2.129  6.984  0      1         8    4.795
36  1.982  0.315  0      0         9    4.895
36  1.798  0.983  0      0         9    7.400
36  2.256  2.040  0      0         9    7.400
36  2.573  3.021  0      0         9    5.295
36  1.819  4.021  0      0         9    4.495
36  2.928  5.521  0      0         9    2.895
36  2.443  6.733  0      0         9    2.595
36  2.825  7.906  0      0         9    2.595
36  2.303  8.848  0      0         9    5.795
36  2.329  9.598  0      0         9    7.600

It was restructured into two tables of data. One table contains the time-independent measurements identified by subject id, and the other table contains the time-dependent variables:


id  black  hispanic  hgc
31  0      1         8
36  0      0         9

id  lnw    exper  uerate
31  1.491  0.015  3.215
31  1.469  1.734  3.215
31  1.749  2.773  3.295
31  1.931  3.927  2.895
31  1.709  4.946  2.495
31  2.086  5.965  2.595
31  2.129  6.984  4.795
36  1.982  0.315  4.895
36  1.798  0.983  7.400
36  2.256  2.040  7.400
36  2.573  3.021  5.295
36  1.819  4.021  4.495
36  2.928  5.521  2.895
36  2.443  6.733  2.595
36  2.825  7.906  2.595
36  2.303  8.848  5.795
36  2.329  9.598  7.600
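The split into a time-independent table and a time-dependent table can be sketched as below. The four input rows are copied from the original-form table; the function name `split_tables` and the tuple layout are assumptions made for illustration, and the authors' own restructuring procedure is not described in the text.

```python
# Rows in the original form: (id, lnw, exper, black, hispanic, hgc, uerate).
long_rows = [
    (31, 1.491, 0.015, 0, 1, 8, 3.215),
    (31, 1.433, 0.715, 0, 1, 8, 3.215),
    (36, 1.982, 0.315, 0, 0, 9, 4.895),
    (36, 1.798, 0.983, 0, 0, 9, 7.400),
]


def split_tables(rows):
    """Split repeated-measures rows into a per-subject table and a per-measurement table."""
    subjects = {}       # one entry per subject id, in order of first appearance
    measurements = []   # one entry per time point
    for sid, lnw, exper, black, hispanic, hgc, uerate in rows:
        subjects.setdefault(sid, (sid, black, hispanic, hgc))
        measurements.append((sid, lnw, exper, uerate))
    return list(subjects.values()), measurements


subject_table, measurement_table = split_tables(long_rows)
print(subject_table)
print(measurement_table)
```

The subject id appears in both tables, which is what allows the two to be linked again during analysis.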

Analysis notes: Singer & Willett (2003) use this data to illustrate fitting mixed linear models to ragged time indexed data. The analysis reports that the average growth in wages is about 4.7% for each year of experience. There is no difference between whites and Hispanics, but a big difference from blacks. The model uses a linear trend (on the log wages) to follow these patterns. The within-variance component of the model is significant, which indicates that the variability for each person is dramatically different. It does not tell us, however, in what ways people differ, and which people are similar.
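A linear trend on log wages translates into a constant percentage change per year of experience. The short derivation below makes the link explicit. The symbols $\alpha$, $\beta$, $w(t)$, and $t$ are notation introduced here for illustration, other covariates are ignored, and the slope shown is simply the value implied by a 4.7% yearly increase, not a coefficient quoted from Singer & Willett.

```latex
\begin{align*}
\ln w(t) &= \alpha + \beta\, t \\
\frac{w(t+1)}{w(t)} &= e^{\beta} \\
100\,(e^{\beta} - 1) = 4.7 \;&\Longrightarrow\; \beta = \ln(1.047) \approx 0.046
\end{align*}
```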

The data is fascinating from a number of perspectives. Although on average wages tend to increase with time, the temporal patterns of individual wages vary dramatically. Some men experience a decline in wages over time, others a more satisfying increase, and yet others have very volatile wage histories. There is also a strange pattern differentiating the wage histories of black men from whites and Hispanics.

Data files: wages.xml

7.10 Rat Gene Expression

Source: X. Wen, S. Fuhrman, G. S. Michaels, D. B. Carr, S. Smith, J. L. Barker & R. Somogyi (1998), Large-scale temporal gene expression mapping of central nervous system development, in Proceedings of the National Academy of Science 95, pp. 334–339, available on the web from http://pnas.org.


Number of cases: 112
Number of variables: 17
Description: This small subset of data is from a larger study of rat development using gene expression. The subset contains gene expression for nine developmental times, taken by averaging several replicates and normalizing the values using the maximum value for the gene.
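The preprocessing described here can be sketched for a single gene as follows. The sketch assumes that replicates are averaged first and that each gene is then divided by the largest of its averaged values; the text does not spell out the order of the two steps. The function name `normalize_gene` and the input values are hypothetical.

```python
def normalize_gene(replicates_by_time):
    """Average the replicates at each developmental time, then divide by the gene's maximum.

    replicates_by_time holds one list of replicate measurements per time point.
    Assumes the largest averaged value is positive.
    """
    means = [sum(reps) / len(reps) for reps in replicates_by_time]
    peak = max(means)
    return [m / peak for m in means]


# Hypothetical gene measured at three time points with two replicates each.
profile = normalize_gene([[1.0, 3.0], [4.0, 4.0], [2.0, 2.0]])
print(profile)
```

Under this reading, every gene's profile peaks at 1, so genes are compared by the shape of their expression over time, not by absolute level.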

Variable         Explanation
E11              11-day-old embryo
E13              13-day-old embryo
E15              15-day-old embryo
E18              18-day-old embryo
E21              21-day-old embryo
P0               at birth
P7               at 7 days
P14              at 14 days
A                adult
Class1 (Class2)  Functional classes representing expert’s best guess: 1 neuro-glial markers (1 markers), 2 neurotransmitter metabolizing enzymes (2 neurotransmitter receptors, 3 GABA-A receptors, 4 glutamate receptors, 5 acetylcholine receptors, 6 serotonin receptors), 3 peptide signaling (7 neurotrophins, 8 heparin-binding growth factors, 9 insulin/IGF), 4 diverse (10 intracellular signaling, 11 cell cycle, 12 transcription factor, 13 novel/EST, 14 other)
avcor            average linkage, correlation distance
wardcor          Wards linkage, correlation distance
comcor           complete linkage, correlation distance
avfluor          average linkage, fluorescence distance
wardfluor        Wards linkage, fluorescence distance
comfluor         complete linkage, fluorescence distance

Primary question: Do genes within a functional class have similar gene expression patterns? How does a clustering of genes compare with the functional classes?

Data restructuring: The data has been cleaned and heavily processed. The variables summarizing the cluster analysis were added, but beyond this no more restructuring of the data should be needed.

Analysis notes: The variables are time-ordered so parallel coordinate plots are very useful here. Brushing, particularly automatically from R, to focus on one functional class, or cluster, at a time is useful to compare patterns of gene expression between groups.

Data files: ratsm.csv, ratsm.xml

7.11 Arabidopsis Gene Expression

Source: The data was collected in Dr. Basil Nikolau’s lab at Iowa State University and it is discussed in Cook, Hofmann, Lee, Yang, Nikolau & Wurtele (2007).

Number of cases: 8,297
Number of variables: 9
Description: This data is from a two-factor, single replicate experiment of the following form:

           Treatment added
           no        yes
Mutant     M1, M2    MT1, MT2
Wild type  W1, W2    WT1, WT2

The mutant organism is defective in the ability to synthesize an essential cofactor, which is provided by the treatment.

The data was recorded on Affymetrix GeneChip Arabidopsis Genome Arrays. The raw data was processed using robust median average and quantiles normalization available in the Bioconductor suite of tools (Bioconductor 2006).

Variable  Explanation
Gene ID   Affymetrix unique identifier for each gene. This is used as a label in the data, and for linking between multiple forms of the data.
M1        Mutant, no treatment added, replicate 1
M2        Mutant, no treatment added, replicate 2
MT1       Mutant, treatment added, replicate 1
MT2       Mutant, treatment added, replicate 2
WT1       Wild type, no treatment added, replicate 1
WT2       Wild type, no treatment added, replicate 2
WTT1      Wild type, treatment added, replicate 1
WTT2      Wild type, treatment added, replicate 2


Primary question: Which genes are differentially expressed when the treatment is not added, with special interest in the mutant genotype?

Data restructuring: Two forms are provided in different tables of data so that we can examine the replicate data values in association with the overall variation.

GeneID  M1  M2  MT1  MT2  WT1  WT2  WTT1  WTT2
1
...
8297

GeneID  M  MT  WT  WTT
1
...
8297
1
...
8297

Averages across replicates are added to the short form of the data. Summaries from ANOVA models fit for each gene in the data are included in the short form. These are useful for helping to detect differentially expressed genes.

Analysis notes: Difference is measured by how the individual gene varies in the replicate and by how all the genes vary in expression value.

We would also hope to see (1) small differences in expression values in the replications, (2) small differences between expression values in wild type with and without the treatment added, and (3) little difference between expression values in the mutant with the treatment and wild type.

It is important to emphasize the difference in analysis of microarray data from many other multivariate data analysis tasks. In microarray data, it is important to find a small number of genes that are behaving differently from others in an understandable way. This task involves both multiple comparisons and outlier detection. From the perspective of a traditional statistical analysis, we are merely dealing with a problem that could be solved by a t-test for comparing means. The drawback is that we have to do a test for every single gene!
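To illustrate why one test per gene is a drawback, the sketch below computes a pooled two-sample t statistic for a single gene and the per-test significance level that a Bonferroni correction would require across all 8,297 genes. Bonferroni is used here only as a familiar example of a multiple-comparisons adjustment; the text does not say which adjustment, if any, the authors applied. The expression values and the function name `pooled_t` are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Two-sample t statistic with a pooled variance estimate.

    Assumes at least two values per group and a nonzero pooled variance.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))


# Hypothetical expression values for one gene: mutant (M1, M2) versus
# wild type (WT1, WT2), both without treatment.
t_stat = pooled_t([8.0, 8.2], [10.0, 10.2])

# With one test per gene, a Bonferroni correction divides the overall level
# by the number of genes.
n_genes = 8297
per_test_alpha = 0.05 / n_genes

print(round(t_stat, 3), per_test_alpha)
```

With only two replicates per group the test has just two degrees of freedom, which is one reason graphical comparison of replicates is a useful complement.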


Conventionally this type of data is plotted using a heatmap, shown below, but a lot of information can be obtained from linked scatterplots and parallel coordinate plots.

[Figure: heatmap of gene expression values. Columns: WT1, W1, M1, M2, WT2, W2, MT2, MT1. Rows: genes, labeled by Affymetrix identifier.]
15136_s_at15125_f_at15141_s_at16054_s_at16150_at20491_at16053_i_at16914_s_at14832_at15162_at14096_at18228_at17187_at17930_s_at14491_at20117_at14667_at19892_at16431_at16462_s_at16461_i_at20227_s_at13211_s_at13680_at18581_at16981_s_at13187_i_at13189_s_at16447_at14720_s_at15672_s_at12324_i_at19646_at17388_at15582_s_at16022_at12167_at17413_s_at15523_s_at20238_at12764_f_at19548_at15776_at17018_s_at15448_at15620_s_at12916_at13078_at12410_g_at16525_at18929_s_at12777_i_at12778_r_at16943_s_at12166_i_at15087_at12744_at12843_s_at14946_at15642_at15982_s_at16037_s_at13146_s_at18222_at14534_s_at13544_at19156_s_at13259_s_at15130_s_at18735_s_at13102_at15149_s_at15963_at12773_at16938_at13940_at19131_at19375_at16460_s_at18047_at13305_at16673_at14947_at18655_at15020_at16475_at12811_at14565_at17421_s_at13668_at20704_at18734_s_at13698_at13670_s_at17459_at16051_at15209_s_at18071_at16106_at15990_at17996_at12762_r_at13682_s_at18247_at17532_at13140_at17374_at18239_g_at20001_at12813_at15091_at15627_at15127_s_at18604_at15124_s_at14240_s_at18270_at16438_at15215_s_at14492_s_at12799_at18521_at18083_r_at17552_s_at12784_at16906_at12877_at19919_i_at15585_s_at20662_g_at18733_r_at18736_at18638_at13592_g_at14523_at13562_at18003_at20459_i_at16962_at15195_s_at15594_s_at17957_at15128_s_at18670_g_at12893_at18669_at18989_s_at18710_at18018_s_at16523_s_at12581_s_at13574_at15296_at16297_at12809_g_at16930_at18730_at17430_s_at15137_s_at12125_at18650_s_at18729_at13597_at17384_at14689_at16994_at16934_at18573_at12177_at13095_at13096_g_at15206_s_at20050_at13079_at16984_at13916_atAFFX−PheX−3_at15212_s_at16627_s_at15463_at19677_at17435_at15144_s_at15147_s_at17393_at16960_at15155_s_at16548_at18288_at18274_s_at13664_s_at20162_s_at13107_s_at17429_at15452_at12137_at15645_s_at19964_g_at18074_at17420_at13125_at15937_at16076_s_at19216_at12338_at17938_at12412_at12743_g_at15633_s_at12098_at19133_at19359_at12201_at15387_at15177_s_at16019_at18306_at16515_at14645_at15654_at18613_s_at13262_s_at18905_at14642_f_at13157_at17409_at13
105_at16435_at17030_at15643_s_at15156_at15117_at15655_s_at16420_at15677_s_at17895_i_at18651_at14018_at17573_at13242_at20180_at16598_s_at15601_s_at15203_s_at15146_s_at18565_i_at16466_s_at15192_at20017_at18330_at16098_s_at18659_at17391_at15663_s_at16510_at18439_s_at15881_at14809_at15094_s_at16480_at19917_at13548_at17921_s_at13163_s_at17858_at15197_s_at17067_s_at16451_at13247_f_at18046_s_at13237_at15171_s_at17471_at19224_at18245_s_at14075_at16975_at14580_at15003_at16146_at15617_s_at13069_at17392_at18273_at13164_at12803_s_at15174_f_at17565_s_at15626_s_at16087_s_at18694_s_at13563_at20546_at15656_at14641_at16528_at15634_s_atAFFX−LysX−3_atAFFX−DapX−5_at20164_at18219_at18937_at15210_s_at13618_s_at19118_s_at12313_g_at19386_at15058_s_at18888_at15079_at16057_s_at12796_s_at17447_at13590_at17385_at14579_at13649_at19197_at18296_at15119_s_at15552_at12787_at13086_r_at13629_s_at16052_at19406_at20713_at12245_s_at15372_at18688_at19227_at19896_s_at18976_at14986_at13099_s_at16647_s_at16513_at20450_at16504_at15398_at16101_s_at18580_at14397_at13559_at17056_s_at18712_g_at20468_at19409_at14627_i_at12830_f_at16058_s_at12808_at16433_at12232_at13657_at18315_s_at19190_g_at14401_at20187_at18682_s_at12819_at14651_at15173_f_at17112_s_at13829_at16937_at17870_at15126_s_at17450_s_at16482_s_at13186_g_at15818_at14508_at19215_at20008_at13232_s_at14674_at13614_at13335_at14954_i_at15444_at18940_at20156_at18696_s_at19940_at14955_at12776_at18886_at15186_s_at16526_at20658_s_at16160_f_at13126_g_at13152_s_at17473_at14057_at12804_at13070_at13190_s_at16546_s_at13097_at13104_s_at13116_at16063_s_at16056_s_at18227_at20434_at18283_at16913_s_at14599_at16125_s_at13158_at17006_atAFFX−PheX−M_atAFFX−LysX−M_atAFFX−PheX−5_at12857_at13636_at15608_at13151_g_at14068_s_at15667_s_at12820_at18605_s_at20378_g_at14581_s_at18610_s_at19682_at16079_s_at16925_at12751_at17994_r_at17063_s_at20629_at16992_at19665_at18025_at12885_s_at15112_s_at13573_at16508_at16538_at16524_at16912_at18689_s_at16608_s_at13098_at13656_at16999_at16529_at1823
0_at20067_at19999_s_at16492_at18611_at14082_at20300_g_at15129_at18331_s_at16441_s_at16132_s_at13180_s_at15179_s_at15506_at20261_at14697_g_at18686_at19411_at20688_at12156_at19139_at13004_at13285_at19108_at18267_at13934_g_at17105_at12431_at13274_at19464_at15404_at20323_at13275_f_at13279_at13256_s_at17445_at13933_at15005_s_at14980_at20463_s_at13277_i_at13284_at18224_s_at13273_at17113_s_at13188_r_at17600_s_at15680_s_at18668_at16617_s_at19182_at12497_at14070_at15647_s_at16817_s_at16064_s_at13617_at20689_at18300_at12726_f_at16893_at12198_at20335_s_at13705_at19869_at19992_at19263_at13706_at13920_at16232_s_at20489_at18284_at15389_at16393_s_at17300_at18995_at14468_at12958_at15271_at18662_s_at17978_s_at18885_at12912_at17094_s_at12251_at18945_at12801_at13625_s_at13270_at17579_s_at16077_s_at19186_s_at13134_s_at15519_s_at14069_at20030_at12934_at13681_s_at15591_s_at13255_i_at17990_at14622_at20210_g_at17098_s_at14066_at20372_at13641_at16635_s_at14594_at12698_at16989_at12335_at12822_at12571_s_at14608_at14030_at12293_at14591_at18683_s_at14459_at18877_at15405_at12443_at14583_at20401_at17406_i_at14133_s_at20246_s_at15541_at20244_at20185_at17916_at14559_s_at13584_at15923_at14494_at18048_at13011_at18950_at16141_s_at16554_s_at18316_at18062_at17932_at17100_s_at14578_s_at13561_at14582_at12910_s_at14731_s_at12370_at12886_s_at17411_s_at15366_at18031_r_at12048_at18667_at18588_at13579_at17530_at13596_at18634_at19627_s_at12212_at18569_at12969_at19221_at19954_at19171_at16569_s_at12332_s_at20480_s_at15629_s_at15616_s_at14638_at17485_s_at15866_s_at14116_at17963_at13212_s_at15846_at14704_s_at17917_s_at20429_at15522_i_at14964_at19759_at20239_g_at14672_at18255_at12349_s_at19974_at12233_at14111_s_at12353_at15410_at18235_at12880_at17109_s_at15042_at16108_s_at12998_at16140_s_at15886_at16365_at12354_g_at16288_at15551_at15453_at15515_r_at12406_s_at12131_at19834_at19398_at19017_at20284_at18927_at20547_at12409_at13103_at17104_s_at17499_s_at12805_s_at15142_at18971_at13534_at18698_s_at15172_s_at17089_s_at1945
0_at16448_g_at12521_at15088_at15976_at17017_s_at16536_at15954_at16427_at12815_at18036_s_at15965_at18664_at20467_at19454_at13908_s_at15531_i_at16951_i_at16023_at15554_at17322_at15967_at17428_at19463_s_at17395_s_at19361_s_at14118_i_at18948_s_at16618_s_at14078_at18653_s_at17025_s_at13108_at17908_at13083_at17092_at16640_s_at12782_r_at19410_at19198_at20195_at12180_at16301_at12597_at14417_at19863_at13594_at12794_at16516_at13121_at19422_at19845_g_at15540_at14605_at20151_at20369_s_at20456_at16592_s_at16566_s_at14239_s_at17096_s_at19183_at19900_at20032_at13667_s_at20166_at14624_at11996_at15955_s_at15658_at17934_at13545_at13927_at15941_at13165_at16955_at15534_f_at14628_r_at15009_at13286_s_at16050_at15580_at17936_s_at19563_s_at17872_at19364_at13460_at12058_at20522_at15837_at12120_at13450_at12532_at19648_at17072_s_at15503_at15975_s_at12322_at20636_at17510_at16453_s_at20631_s_at12604_at19670_at19596_at12914_s_at17580_at14088_at14020_i_at14022_s_at15520_s_at14536_s_at17009_at20618_at15571_s_at13646_at12807_at13198_i_at12205_at15894_at18251_at12705_f_at18215_at15581_s_at13143_at19116_i_at16921_at13700_at15985_at13595_at15862_at17449_at20178_at15646_s_at18742_f_at20495_s_at18561_at19839_at20037_at14463_at13244_s_at13564_at15168_at17918_at19366_at17562_at17563_g_at14486_at15595_s_at16894_at19403_at18883_g_at15583_s_at17505_s_at15694_s_at15625_at13652_at15593_s_at15980_at16118_s_at13254_at16591_s_at15466_at17457_at13655_at17458_at14724_f_at18572_at13588_at15190_s_at13236_at18711_at15180_s_at13148_at18244_g_at17114_s_at14067_at17418_s_at19703_s_at16997_at17823_s_at13529_at15412_at14080_at18301_s_at19008_s_at16490_at12075_at16491_at15931_at18285_at18548_s_at19370_at17482_at14572_at12611_g_at18898_s_at17966_at15602_f_at18262_at17129_at18253_s_at13580_at19179_s_at17000_at17949_s_at16110_s_at16069_s_at17252_at13904_s_at17401_at15664_at13591_at15900_at18726_s_at14713_s_at15422_at14432_at13546_at13554_at16977_at16553_f_at20694_s_at18484_at12183_at14897_at14539_s_at20648_at15384_at17383_at14
354_at18626_at12605_at14543_r_at12170_at20354_s_at16142_at13118_f_at18645_at17024_s_at20299_at18936_at14416_at20273_at16511_at16687_at20337_at14563_at12645_at18416_at12606_at14369_at18994_at14930_at20525_at18702_at18627_s_at18977_at16068_s_at18246_at13621_g_at16979_at14600_at13953_at19833_s_at12171_at20260_s_at17589_at13025_at14501_at20215_s_at19637_at17470_at17533_s_at18946_at16082_s_at18063_i_at19137_at12846_s_at12128_at15665_at12031_at18591_at19591_at19840_s_at12790_s_at19433_at17464_at20590_at15481_at16571_s_at17433_at20529_at16134_at19176_at18311_at13136_at15341_at12891_at18216_at17886_at14711_at13015_s_at15392_at13550_at17303_s_at13177_at18672_s_at14900_at17379_at14553_at12375_s_at16062_s_at16539_s_at16440_at13645_at17897_at20558_at15600_s_at15683_s_at14500_at17881_at18032_i_at16963_at18983_s_at20194_at18533_at19466_s_at20362_at19720_at16042_s_at18890_at17386_at13785_at12758_at16489_at19660_at12523_at12095_at20360_at13662_at18933_at16031_at16038_s_at17517_at20201_at18607_at16434_at20176_at14634_s_at13144_at15208_at13568_at16120_at13547_at12779_f_at14856_at13467_at16029_s_at13160_at16583_s_at19844_at20167_i_at17424_at13593_at15181_at17861_at14090_i_at17572_s_at16501_s_at15761_at18259_s_at18314_i_at15496_at19401_at14974_at19365_s_at17560_s_at17896_at16486_at13438_at12253_g_at16367_i_at12447_at15792_at16184_at14478_at16696_at13517_g_at13951_at15920_i_at18756_at15451_at13314_at15850_at19075_at12057_at13414_at19690_s_at20324_at19438_at19149_at12460_s_at20150_at16302_at15836_at18652_at12650_at18252_at15857_at18595_at14421_at18954_at17926_s_at12818_at18243_at14999_at16547_s_at13174_r_at20057_at18029_g_at14545_at15006_at15999_s_at16530_at15070_at12174_at14037_at12030_at19144_at13347_at19987_at18223_at13036_at20042_at12218_at18806_at14301_at20028_at18892_at20171_s_at15369_at15368_at15456_at15386_at16319_at17944_at13589_atAFFX−BioDn−5_at13106_at14592_at15370_at16902_at20536_s_at15510_r_at19910_at15651_f_at15692_s_at15057_at18624_at13896_at12052_at15901_at17904_at18635_a
t14978_at20454_at18304_s_at13967_at19436_at12189_at17466_s_at18004_at17084_s_at14557_at17454_at14688_at14063_at17415_at15868_at15596_s_at13167_at19616_s_at13117_at18281_at12860_s_at14779_at19898_at14469_at19984_at18826_at20574_at19746_at19996_at17880_s_at13172_s_at13607_at18722_s_at13553_at13604_at16596_at13441_at15200_s_at15140_s_at11987_at18685_at12178_s_at19933_g_at13246_at15570_i_at16606_at17601_at13071_at17207_at17991_g_at17567_at17961_at16712_at17416_at14924_at18034_s_at12759_at12092_at15912_at19426_s_at20190_at17860_at13283_at18673_at14703_at19231_at19128_at16936_at18646_at13138_atAFFX−LysX−5_at17955_at17085_at16152_s_at15459_s_at12964_at16716_at18351_at18967_s_at17047_at18328_at16104_s_at19402_at20015_at16143_s_at12408_at12008_at17046_s_at17887_at20701_s_at19960_at17506_s_at15586_s_at12858_at13556_i_at15166_s_at18033_r_at15598_s_at19130_at17044_at16107_s_at15032_at13581_at14686_at13173_at18269_at18632_at19920_s_at14516_at12538_at13137_at14800_at17942_s_at13552_at15695_s_at14586_at12004_at15996_at16028_at19063_at12761_s_at17463_at15120_at17901_at11998_at14126_s_at13635_at19467_at17631_at14286_at19998_at19961_at12135_at18628_at11992_at12223_s_at14477_at16306_at19945_at18656_at14820_g_at13085_i_at17857_at13209_s_at17494_s_at15101_s_at17488_at19710_s_at20576_at19656_s_at18618_s_at19686_at16468_at19700_s_at18579_at16127_s_at20199_at17490_s_at19979_at14436_at15861_at12430_at17852_g_at12786_at12821_at12454_at20396_at14596_at12114_at15441_at20191_at12141_s_at18658_at15089_at12195_at20282_at13654_at19650_at13287_at19369_at12219_at16774_at12647_s_at12118_at19681_at17865_at19902_at14134_at18782_at16672_at14498_at15978_at14072_at16003_s_at15833_at20462_at15635_s_at20229_at13555_at18918_at20545_at14513_s_at16651_s_at14507_atAFFX−BioC−5_at14919_at12666_at14481_at18906_at16074_s_at15673_at13006_at19204_s_at17962_at13208_s_at12631_at20668_at18002_at14625_g_at19376_at18962_s_at20064_at20644_at18769_at17866_at18550_at19185_s_at12485_at17479_at20183_at20510_at14135_at20255_at1
6124_s_at15109_s_at16949_s_at20305_at12515_at17577_g_at20257_at17744_s_at15909_at14103_at12042_at12968_at13509_at13514_s_at12480_s_at14476_at19991_at20129_at13382_at20134_s_at18907_s_at18021_at18706_s_at15577_s_at17398_at16908_at19651_s_at18598_at16543_s_at19854_at16454_s_at16998_at16604_s_at13261_at17126_at16154_s_at18556_at19200_s_at18225_at19302_at15263_at18417_at12459_i_at16250_at18091_at16309_at17660_at13387_at15883_at16674_at13803_at19362_at19219_at13807_at15771_at17705_i_at15379_at12112_at16260_at20379_at15255_at16236_g_at16720_at14968_at20614_s_at15497_at12531_at18728_at13240_f_at17986_s_at18622_g_at16950_s_at13139_at19191_at18751_f_at12010_i_at14654_s_at17555_s_at13023_at12457_at12975_at16335_at16322_at12060_at13950_at18475_at16229_at15446_at19252_at14312_at19392_at15352_at12493_g_at15815_at16888_at19146_at12494_at19614_at17548_s_at12679_s_at19972_at16467_s_at16385_s_at18974_at12505_s_at12039_at18313_at12560_at13905_at19752_s_at19717_at12852_s_at16939_at14095_s_at15371_at20113_s_at14479_at20615_at13493_at19832_s_at14564_at19210_at19969_at20621_at16987_s_at14430_g_at14509_g_at13648_at18612_s_at15553_at14540_at17419_at12793_at15623_f_at12576_at17939_at15270_at12110_at17275_i_at15491_at14874_at12974_at19663_at15832_at17730_at15402_at20531_at18762_at20368_at18972_at18218_at18760_at20665_s_at18639_at17906_at15681_s_at17480_at18303_at13113_at17465_at17815_at15082_at16941_at15469_at19990_at15549_at17937_at12102_at14531_at12674_at17596_at20294_at15599_s_at19192_at12034_at15592_s_at14678_f_at19708_at19208_at18238_at15081_at16863_at19358_at12175_at16117_at13480_at16067_at12810_at13669_at16133_at16518_at16089_s_at16652_s_at17057_at13044_at18956_at14595_at17053_at16252_at18555_at19642_at12879_at12333_at17840_at19704_i_at17681_at13449_at14609_at17273_at17026_s_at17008_at13217_s_at19267_s_at19193_at16073_f_at19695_at17907_s_at13322_at12345_at20698_at12577_at17204_at18597_at12574_at12500_s_at17894_at19860_at18963_at16631_s_at18978_at12897_at13842_at20144_at17752_at14431_a
t19672_at12094_at14398_at14032_at18287_at18719_r_at16088_f_at18720_s_at12539_s_at20382_s_at15040_g_at14490_at20384_at20421_at20291_at20269_at13627_at12356_at14016_s_at14620_at19762_at17775_at12115_at18250_at19993_at20420_at13880_s_at16632_at14083_at19640_at20639_at13219_at12344_at14763_at12953_at20479_i_at19789_at18201_at15383_at14495_at14005_s_at12449_s_at14009_at13906_at18231_at14665_r_at17953_at14015_s_at20263_at19447_s_at19594_i_at13622_i_at16099_at19699_i_at19446_i_at13115_at18022_at16126_s_at12369_at13041_at16629_at15622_s_at20168_s_at14660_at15199_s_at20617_at12795_at17570_g_at18318_at13068_at18928_at14614_at15660_s_at17995_s_at14061_at20066_at17592_at16641_at13147_at13605_at16171_at18713_at12511_g_at20408_at17483_at15505_at15579_s_at18675_at15488_at20285_s_at15542_r_at18684_at14320_at13014_at14673_s_at19894_at12081_at17922_at16860_at20271_at14602_at13130_at20460_s_at18109_s_at20287_at17924_at20069_at12217_at19531_at18194_i_at15423_at15882_at18346_at20002_at12959_at18154_at14926_g_at12966_s_at20656_at16978_g_at19060_at20582_s_at17378_at12392_at20189_at12280_g_at13052_s_at19906_at12284_at12675_s_at12145_at19878_at12227_at19400_at15059_at12589_at16286_at18676_g_at18512_at12989_at20138_at18348_at15388_at20012_g_at14429_at12540_at18991_s_at12394_at18813_s_at14870_at20660_s_at16310_at18402_at20100_at18750_f_at13871_at18808_at19728_at18309_s_at15418_at14942_at19988_at14995_at17905_at15029_at17033_at14542_i_at20539_at20336_at14907_at15207_s_at17040_s_at20054_at15687_f_at15685_s_at16642_s_at13637_at14677_r_at13658_at17914_at16559_s_at14630_s_at17591_at16595_s_at16968_at16650_s_at13587_at15864_at11994_at12029_at20014_at13145_at15035_at13175_s_at14128_at15676_at14097_at16576_f_at15043_at13090_at12006_at13601_at17399_at16041_s_at18574_at15610_at16168_s_at18674_at19924_at15674_at14056_at16123_s_at16980_at19989_at13960_at18286_at14021_r_at17593_r_at13560_at13290_atAFFX−CreX−3_st12926_s_at20204_at20567_at17507_at18257_at15202_s_at15606_at16555_at19948_at15065_at14679_at200
56_at20160_at15859_at19742_at15823_i_at19561_at12176_at14387_g_at19632_at13897_at14601_at19553_at18291_at17958_at18957_g_at15028_at14384_at16579_at15063_at13538_at17286_at14503_at16382_at14316_at13128_at12579_at16784_at15899_at15382_at14269_at14391_at20214_i_at19925_at18617_at20340_at12202_at19647_at12003_at16625_at14454_at13042_at20202_at13127_s_at13471_g_at18023_s_at18070_r_at19388_at13532_at20198_at19947_at18102_at17169_at12432_at14875_at15086_at18643_at18893_at18958_s_at19407_at13017_at15408_at14990_at12374_i_at19638_at15184_s_at13609_at18911_at15817_at17323_at19549_s_at20719_at19172_at18600_at18566_s_at18761_at14945_at15611_s_at15670_s_at13094_at15669_s_at18695_s_at19841_at15438_at19034_at19199_at12049_at19658_at18925_at17342_at15578_s_at17999_at14519_at16965_at20446_s_at17935_at17862_at13558_s_at15689_at16043_at12765_at14615_at18630_at14979_at19230_at16204_at20638_at14460_at16161_s_at16532_at13683_at15636_at17467_at16066_at14646_s_at20616_at19368_at12387_at20118_at17352_at19314_at19142_at15424_at12730_f_at20572_at17382_at14505_at13613_at18985_g_at12148_at12193_at20506_at20444_at16690_g_at19994_at17902_s_at18623_at20068_at18256_at19957_at17427_at16956_at20353_at20172_at17538_at13207_at14058_at13101_at14010_at14914_at16294_s_at18973_at12304_at18233_at20650_i_at20685_at12172_at11988_at19205_s_at13197_r_at13064_at18981_at18745_f_atAFFX−BioB−3_st15201_f_at20303_g_at16580_s_at16847_at12068_at13131_at19119_i_at19121_at12312_at19125_s_at16603_s_at16990_at15929_at12028_at18616_at13492_at18272_at12719_f_at12709_f_at12213_at19956_at14045_at15045_at16312_at12453_at20367_s_at20437_at19448_s_at15049_at20612_s_at19882_at19652_at20317_at20440_at19982_at12723_f_at12150_at20023_at15961_at20470_at14025_s_at18910_i_at19389_at18942_at17003_at15328_at14828_at13508_s_at19826_at14965_at16296_at19099_at16792_at15343_at16408_at17181_at18507_s_at13884_at18197_at16806_at12491_at17368_at17616_at18820_at16239_at18199_at16238_at13410_at12543_at12191_at19645_at16333_i_at14393_at17239_at19431
_s_at12616_i_at17439_g_at11986_at15403_s_at13910_at18831_at19684_at17481_at19866_at12617_s_at12159_at20652_at15933_at20121_at13974_at14861_at20152_at13027_at19942_at20627_at20325_s_at20158_at12522_at15031_at15489_at18965_at20314_s_at13798_at14956_at12020_at12613_at19930_at20173_at18966_at14081_at13577_s_at15548_at13084_at20628_at20389_at13887_s_at19483_at15852_s_at13523_s_at17215_at19512_at13948_at19258_at20070_at14768_at18420_at14771_at16675_g_at17645_s_at15781_at18097_at15849_at20149_at12592_at16693_at12222_s_at13883_at19039_at13850_i_at18755_at15299_s_at19941_at15921_s_at18536_at18409_at19734_at15281_at18344_at19978_at13457_g_at18353_at17722_at18354_at16282_s_at14770_at20217_at16245_at16248_at13671_at14772_g_at17155_at19722_at13352_at19485_at12124_at14806_at19040_at12255_at19828_at19332_at15256_at17619_at16212_at14270_at15421_at14921_at14889_at15855_at20031_at14288_at18411_at15544_at19453_at16164_s_at12300_at17032_s_at13111_at20635_s_at12471_s_at18924_at12072_at13297_at14623_g_at20342_at18176_at16972_at14064_at14052_at16502_at13628_at18265_at18030_i_at18680_s_at14060_at17019_s_at17542_s_at15650_s_at13192_at16494_at17525_at17495_s_at17948_i_at20200_at18308_i_at12610_at15437_at19622_g_at19415_at16630_s_at19177_at14786_at14039_at16045_at12932_s_at16112_at19636_at17055_s_at13406_at16085_s_at16993_at17102_s_at20344_at20568_i_at13293_s_at19187_at20312_s_at15183_s_at15178_s_at12861_s_at13252_s_at17196_at15896_at12196_at15632_s_at16971_s_at12575_s_at12037_at15666_at17037_s_at19847_s_at14712_s_at18909_s_at16522_at13530_g_at20018_at20431_at16472_at19889_at15475_s_at12373_at17443_at18681_at17373_at13063_at12001_at14735_s_at13647_at13087_at15355_at12421_at19211_at19674_s_at12074_at12627_at20588_at20264_at20486_at20245_s_at18984_at17923_s_at14664_i_at14666_s_at14983_at18585_at13575_at15007_at17461_at15188_s_at15474_at12452_at20154_at12275_s_at13488_at18943_g_at12950_g_at20651_at14011_at20517_at13695_at14934_at14867_at19875_at12962_at12347_at12476_at17476_at19471_at20455_at139
72_at19630_at16636_at17911_at12162_s_at15068_at20363_at20385_at20691_s_at19683_at20081_at12678_i_at15034_at20361_s_at19995_at15502_at16961_at19174_at18938_g_at20347_at19391_at15041_at16065_at15064_at14631_s_at18049_at16628_at18240_s_at20274_at15055_at13005_at19382_at16599_s_at14584_at15467_at18319_g_at14091_r_at12229_at17115_s_at14617_g_at19698_at12348_at16326_at18431_at15477_at17353_s_at13981_s_at14003_at20127_s_at20573_at15795_g_at12184_at17808_at20711_at19782_at16681_s_at13901_at14071_at18213_at12132_at16394_at20398_at18800_at19117_s_at13012_s_at18568_at19885_at12149_at20706_at13903_at20044_at12483_at15397_at12147_at17362_at19617_at16084_s_at20560_at19437_s_at18278_at20699_at18592_at17998_s_at17408_at19218_at12780_s_at19086_at20578_s_at17670_at14014_at12486_at18367_at19499_s_at12290_at18342_at17213_at17968_at12659_at12154_at12215_s_at19904_at14749_i_at18182_s_at20707_at16814_at19233_at17743_at14383_at12949_at17360_at15273_at15365_at20395_at12130_at16172_at12214_g_at12661_at12226_at15826_at20136_at20518_at15831_at18195_at14784_at17139_at13784_at12160_at18506_at14377_i_at19091_at12221_at15810_at12585_at19257_s_at18754_at14338_at13484_at14441_at15753_at19295_at15845_at15888_at15801_s_at18403_at17182_at15350_at12993_s_at20383_at15769_at14762_at18337_at12484_g_at17319_at17320_g_at19344_s_at19819_at18404_at19274_i_at16230_at16298_at15348_at15839_at15841_at14007_at15870_at14489_at14920_g_at18578_at18325_at19212_at17892_at15025_at19905_at12121_at19936_at15353_at18776_at18339_at17873_s_at17351_at15415_at14895_at19347_at12519_at15223_s_at16797_at13983_at13020_at18436_at12192_at14864_at15726_at17827_at17334_at16763_at17226_g_at15325_at19513_at17132_at17662_at16744_at14776_at20593_at20326_i_at19798_at13945_at19481_at14808_i_at16214_at17973_s_at17623_at13329_at15783_s_at16671_at13362_s_at19581_at18385_at20109_at18425_s_at17720_at15807_at16176_at15721_at14434_at19259_s_at17627_at18856_at19111_at14331_s_at15749_at19001_at16345_at14279_s_at14474_s_at14831_at15247_at19289_i_at161
77_at18350_at13851_s_at16751_at13447_at18917_i_at20659_at20524_at20155_s_at19397_at16593_at18369_at13229_s_at19049_at12945_at20499_at18913_s_at20594_at20179_at15691_s_at14450_at19232_atAFFX−BioC−3_at14483_at20566_s_at13429_at14873_at19012_at16093_s_at17987_at17425_at18248_at15046_s_at14603_at15407_at16535_s_at20048_at20521_at14950_at19019_i_at16083_at15524_at20438_at19430_at12727_f_at19158_at15132_s_at15867_at19189_at18232_at20670_at20089_at15511_s_at12967_at15872_at12143_at20062_at20350_s_at17039_at17564_s_at15211_s_at18010_s_at13233_at16094_s_at18323_at15501_at13630_at19858_s_at15158_s_at16102_s_at14989_at17431_at12931_s_at13824_at17170_at14386_at17568_at17666_at17237_at15928_at13651_at17985_at16533_at16103_s_at17543_s_at15464_at14121_at13606_at15161_s_at15946_s_at18880_at17474_at15517_at18621_at14131_at15631_s_at12602_at15010_at17127_at17569_at17456_at16155_s_at17566_at12940_at16558_at12938_s_at17952_at15794_at17035_at20409_g_at15490_at20674_s_at20596_s_at15092_at18305_at18045_at12315_at19373_at17595_s_at12951_at12621_at18896_at19868_at13943_at17732_at16567_s_at14996_s_at19968_at19494_at13551_at15143_s_at20661_at19901_at16966_s_at19452_at19836_at19891_at18876_at18280_at14029_at12405_at15167_s_at20432_at19113_s_at14510_at20672_at18939_at12241_at19356_at19949_at15472_at20497_at12747_r_at20315_i_at14036_at18584_at18952_at16564_s_at17432_at20684_at13024_at15139_s_at19132_at16882_i_at19607_at17101_f_at17537_s_at15004_at14588_at15401_at18487_at17001_at20060_at16138_at14571_at20626_at15024_at15829_at12265_at15574_s_at13615_at14568_s_at15514_at12930_at12002_at13672_at13958_at16328_s_at20523_at18188_at15934_i_at17405_at16266_at20449_at13868_at19209_s_at17254_at17174_at13482_at18504_at18535_at12179_at17208_at18150_at12204_at14002_g_at20585_at13965_s_at19404_at12053_at20654_at13422_at19354_at16986_at18690_at12656_at20552_at14923_at13984_g_at13874_at16802_s_at19503_at14411_at13918_at18407_at16412_s_at14310_at16210_at19569_at13389_s_at16242_at12955_at18365_s_at14371_at14918_a
t16300_at13324_at12015_at19533_at15705_i_at19308_at13383_at14854_at17183_s_at15770_at16682_at19360_at15546_at13224_s_at18944_at20605_at12235_at20020_at13176_at14124_s_at18859_at19952_at15237_at20207_at19379_atAFFX−BioB−5_at19661_at13178_at17372_at20496_at12667_at14981_at14985_s_at15518_at17492_at12082_at18546_at18532_at12022_at13161_at14043_at17993_at12362_at13963_at14284_at18829_at17959_at17400_s_at13500_at16002_at13334_at15022_at19865_g_at12211_at12899_i_at20669_at17584_at17549_at17508_at12707_f_at19888_at19412_at18798_at19222_at13366_s_at20441_at13001_at14661_at13250_s_at18575_at18289_at19170_at19427_s_at14521_at19687_g_at12972_at14961_at18463_s_at13917_at20007_at15375_at14913_at19502_at14112_at18336_at14940_i_at12427_at14427_at15802_s_at17209_s_at17738_at19724_at17838_at19213_at19871_at18897_at20538_at19939_at18951_at14858_at12541_at15478_at12306_at14389_at19962_at18987_g_at15834_at18329_at19811_at14295_s_at16820_at14282_at12559_at12289_at12224_at12646_at14957_at19570_at17636_at17317_at15873_at14824_at16822_at19253_at18415_s_at13876_at18181_s_at12325_at12626_at20526_i_at20093_i_at12619_at16871_s_at15786_i_at19729_at15359_at19757_at17675_at17257_s_at15774_at12651_at19484_s_at13355_at16206_at14635_s_at14740_i_at14774_i_at16378_at15226_at18099_i_at19730_at15816_at16271_at16268_at17194_s_at16410_at16864_i_at18845_at12961_at20106_s_at20512_at14049_at13495_s_at18870_at12851_s_at12871_s_at13928_at13501_at13047_i_at20458_at13196_i_at12710_i_at16317_at18858_i_at20148_at12663_at17988_r_at17603_r_at20277_i_at12715_f_at19610_g_at18665_r_at18968_at20209_at16562_s_at19662_s_at12737_f_at16586_s_at18044_f_at14130_at19207_at17076_s_at17550_s_at20228_s_at16610_s_at19633_at12928_r_at15693_at19935_at12578_at14293_at20687_at14648_at17553_at14435_at17989_s_at15968_at17403_at18738_f_at20610_g_at17665_at12305_i_at15994_at17036_s_at20309_at16255_at20543_at20580_at17813_at20579_at12277_at19423_at13141_at17124_s_at16287_at14905_at15236_at18757_at13987_s_at19175_at12063_at20598_at15470_at1
3676_i_at17597_at15533_at14561_at17884_at14992_at15460_at17518_s_at12812_at18137_at17731_at14352_at12264_i_at13038_i_at20266_at12501_at12570_s_at17486_at13991_at19776_at19270_at20478_at17272_at15797_at20343_s_at14647_at13436_at12701_i_at12220_at17108_s_at12668_at17833_at20563_at18697_i_at20597_at19449_at16092_s_at12366_s_at19079_at19588_at13332_at12363_at18069_at15563_s_at15526_at12865_at12721_f_at12718_f_at17559_at17147_s_at19304_i_at19047_at16684_at15701_at20680_at14819_at20681_at13368_at13826_at19013_s_at13398_at12704_f_at14446_at12563_at12512_at12266_at17270_at15302_at17337_at12206_at13057_at15346_at12418_at16072_s_at16534_s_at17498_s_at17972_at17970_i_at16616_s_at14629_r_at17038_s_at13243_r_at18068_r_at13677_r_at12564_at18737_f_at17974_at16498_at12863_r_at12729_r_at13065_i_at20476_at20377_at16623_at13599_at15951_i_at17462_at17534_at14611_s_at14653_s_at18028_at13281_s_at17598_g_at12199_at17809_at19310_at16254_at16221_at15413_at18801_at12664_at19279_i_at13828_at20086_at13374_at13343_at18586_i_at12083_at12687_at19461_at12981_g_at12276_at19980_at19408_at20080_at18204_at12371_at18136_s_at12228_at16350_at18796_at19779_at13925_at18105_at12045_at19613_i_at13802_at16189_at14817_at18145_at20265_at12498_at14939_at20267_i_at13289_s_at16946_at12717_i_at13531_at20236_s_at16776_at17762_s_at18759_at15306_at12301_at20564_at20366_at12367_at15492_at20697_at20528_at19223_at14499_at15944_at17856_at17605_at15821_i_at20453_at15926_at13468_at13149_at17502_s_at18648_at15762_at19234_s_at19938_at16006_at15509_at15000_at18872_at13216_at13463_at19120_at18103_at14317_at14351_at20251_at19334_at13931_at20083_at17602_i_at13431_at16122_s_at19377_at16877_at12699_at17371_at20696_i_at13303_at15239_at18758_at20714_at20544_at20602_at13794_at18333_at16658_at15245_at16656_at17142_at15730_at13050_i_at16750_at16228_at20208_at12311_at20549_i_at18457_at19556_at17341_at18169_at12433_s_at12134_at13371_at14745_at19527_s_at16884_s_at20375_at18454_g_at14909_at14782_at14816_at15272_s_at14322_at19309_at16812_at1
859_at18570_at13405_at16225_g_at19562_at20026_at15924_at15309_s_at16752_at18768_at13832_at16352_at17801_at16777_at20520_s_at13307_at19311_g_at18868_at12884_at16686_at20085_s_at13016_at12693_at20508_at19912_s_at12091_at20212_at14890_at19911_at12246_s_at12194_at20240_at12327_s_at18964_at19151_s_at14997_at16761_at15804_at15772_at12580_at18842_at16336_at19532_at12414_s_at12085_at12123_at14796_at19815_at19281_i_at13333_at16714_at13401_at16194_at14355_at13857_at12686_at17501_s_at20027_at12499_at19528_at15096_at18368_at20613_at19692_at20049_at19929_at14908_at18050_at15062_at20703_at12714_f_at14535_at19830_at15056_at16614_s_at17206_at16909_at14136_at14590_at18017_f_at18214_at15077_at19914_at17121_s_at14024_s_at12510_at13989_at19445_at19986_at13367_at20276_at18104_at20436_at12186_at12105_at20430_at16575_s_at15560_s_at17992_at13191_at16500_at18941_g_at20630_i_at14414_at17940_g_at12469_at20169_at20243_at15956_at19414_at17554_atAFFX−Athal−5SrRNA_at18747_f_at17110_s_at14251_f_at17920_s_at17422_at18920_at18571_at15073_at16974_at18921_g_at13154_s_at18066_r_at15044_at18602_at14050_at14115_f_at13168_i_at17557_s_at18059_i_at19611_s_at19595_s_at14484_at14530_at13612_at17513_s_at12713_f_at20328_at12814_at16426_at15858_at15930_at20404_at17889_at13926_at12139_at20676_g_at19045_at12429_at18590_at13452_at20355_at16007_at12279_at15644_at13407_s_at18955_at12487_at15851_i_at17255_at18445_at17361_s_at18187_at13520_at13028_g_at12653_at15913_at17869_at19399_at13021_at17794_at12461_at12996_i_at16343_at20157_at12615_at12088_at14335_at19068_i_at20099_at13350_at16307_at18375_at14758_at19319_at19474_at19508_at16868_at15878_at13030_s_at13412_at18464_at12073_at16790_at15447_at19093_i_at18114_at12947_at12946_at15342_at16334_s_at17779_at18175_at19026_at17745_at19250_at12260_g_at12395_at12495_s_at14380_at17520_s_at12017_at16910_s_at12018_at18641_at20090_at20387_at16886_at15760_at18549_s_at14569_at20248_at18619_at17798_at16745_at19273_at14777_i_at20338_at14639_at20535_at17581_g_at17280_at14382_at16815_at16
781_at12639_at20619_at19680_at12152_at19214_at15021_at20642_at19405_at20065_at18396_at19396_at15484_at20483_at14928_at12281_at18307_at16303_at15842_at18780_at19367_at15880_at14970_at12991_at19761_at16305_at19145_at12054_at19148_at14442_at15396_at13941_at19908_s_at17599_s_at19085_at14259_i_at13800_at12513_at19456_i_at12547_at15915_at17676_at14472_at18338_at14949_at14754_at16655_at17638_s_at16756_at15812_at17708_at13837_at15863_at17274_at14426_at16219_at15338_at13834_at13409_at19498_s_at12173_at19087_at19751_at18525_at14915_at15439_at12979_at18110_at18143_s_at19631_at19053_at17704_at16755_at15876_at15702_s_at15278_at14755_at19530_at14747_at18470_at14850_at13340_s_at15280_at15310_at17345_at13373_s_at15734_at12041_at13443_at14750_s_at15241_s_at13302_at19493_s_at20087_at19340_s_at

This field of research is evolving rapidly, and data and analysis methods change frequently.

Data files: arabidopsis.xml


7.12 Music

Source: Collected by Dianne Cook.

Number of cases: 62
Number of variables: 7
Description: Using an Apple computer, each track was read into the music editing software Amadeus II, and the first 40-second clip was snipped and saved as a WAV file. (WAV is an audio format developed by Microsoft®, commonly used on Windows but becoming less popular.) These files were read into R using the package tuneR (Ligges 2006), which converts the audio file into numeric data. All of the CDs contained left and right channels, and variables were calculated on both channels.

Variable            Explanation

artist              Abba, Beatles, Eels, Vivaldi, Mozart, Beethoven, Enya
type                rock, classical, or new wave
lvar, lave, lmax    variance, average, and maximum of the frequencies of the left channel
lfener              an indicator of the amplitude or loudness of the sound
lfreq               median of the location of the 15 highest peaks in the periodogram

Primary question: Can we distinguish between rock and classical tracks? Can we group the tracks into a small number of clusters according to their similarity on audio characteristics?

Data restructuring: This dataset is very clean and simplified. The original data contained 72 variables, most of which have been excluded.

Analysis notes: Answers to the primary question might be used to arrange tracks on a digital music player or to make recommendations. Other questions of interest might be:

• Do the rock tracks have different characteristics than classical tracks?
• How does Enya compare with rock and classical tracks?
• Are there differences between the tracks of different artists?


Data files:

music-sub.csv, music-sub.xml
    Subset of data used in this book. The last five tracks in the data (58–62) have the artist and type of music loosely disguised so that they can be used to test classifiers that students built using the rest of the data.

music-all.csv, music-all.xml
    Full dataset, with 72 variables and a few missing values.

music-clust.csv, music-clust.xml
    Subset of data, augmented with results from different cluster analyses.

music-SOM1.xml, music-SOM2.xml
    Different SOM models appended to the data.

7.13 Cluster Challenge

Source: Simulated by Dianne Cook.

Number of cases: 250
Number of variables: 5
Description: Simulated data included as a challenge to find the number of clusters.

Primary question: How many clusters are in this data?

Data files: clusters-unknown.csv

7.14 Adjacent Transposition Graph

Source: Constructed by Deborah F. Swayne.

Number of cases: 24 nodes and 36 edges in the n = 4 adjacent transposition graph; 120 nodes and 240 edges in the n = 5 graph.
Number of variables: 3 variables in the n = 4 graph; 4 variables in the n = 5 graph.
Description: The n = N adjacent transposition graph is generated as follows. Start with all permutations of the sequence 1, 2, ..., N. There are N! such sequences; make each one a vertex in the graph. Connect two vertices by an edge if one permutation can be turned into the other by transposing two adjacent elements.

Principal question: Can a graph layout algorithm be used to arrange the nodes so that it is easy to understand the different permutations of rankings?

Data files: adjtrans4.xml, adjtrans5.xml
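The construction in the description can be checked with a few lines of code. The following Python sketch is illustrative only and is not the procedure used to build the data files; the function name adjacent_transposition_graph is a hypothetical choice. It enumerates all permutations of 1, ..., n, joins two permutations whenever one results from swapping two neighbouring elements of the other, and prints the node and edge counts for n = 4 and n = 5, which can be compared with the counts stated above.

```python
from itertools import permutations


def adjacent_transposition_graph(n):
    """Build the adjacent transposition graph on permutations of 1..n.

    Returns (nodes, edges): nodes is a list of tuples, edges is a sorted
    list of unordered pairs stored as (smaller, larger) tuples.
    """
    nodes = list(permutations(range(1, n + 1)))
    edges = set()
    for p in nodes:
        for i in range(n - 1):
            # Swap the adjacent elements at positions i and i + 1.
            q = list(p)
            q[i], q[i + 1] = q[i + 1], q[i]
            q = tuple(q)
            # Store each undirected edge once, in a canonical order.
            edges.add((min(p, q), max(p, q)))
    return nodes, sorted(edges)


for n in (4, 5):
    nodes, edges = adjacent_transposition_graph(n)
    print(n, len(nodes), len(edges))
```

Each permutation has n − 1 neighbours, so the graph has n!(n − 1)/2 edges, which gives 36 for n = 4 and 240 for n = 5.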

7.15 Florentine Families

Source: This data is widely known within the social network community, and is readily available from a number of sources. It was compiled by John Padgett from historical documents such as Kent (1978). The 16 families were chosen for analysis from a much larger collection of 116 leading Florentine families because of their historical prominence. Padgett and Ansell (1993) and Breiger and Pattison (2006) extensively analyzed the data.

We obtained it from the R package SNAData, by D. Scholtens (2006), part of the Bioconductor project; Scholtens obtained it from Wasserman & Faust (1994).

Number of cases: 16 nodes; two sets of edges, one with 15 edges and the other with 20.
Number of variables: 3 variables on each node; one on each edge.
Description: The data include families who were locked in a struggle for political control of the city of Florence around 1430. Two factions were dominant in this struggle: one revolved around the infamous Medicis, and the other around the powerful Strozzis.

Variable            Explanation

Wealth              Family net wealth in 1427 (in thousands of lira)
NumberPriorates     Number of seats on the civic council held by the family between 1282 and 1344
NumberTies          Total number of business or marriage ties
AveNTies in         Average number of business (loans, credits, joint partnerships)
Business and        or marital ties per family
Marital tables

Primary question: How are the dominant families of old Florence connected to each other?

Data files: FlorentineFam.xml, constructed from SNAData.


7.16 Morse Code Confusion Rates

Source: Rothkopf, E. Z. (1957), A Measure of Stimulus Similarity and Error in some Paired-Associate Learning Tasks, Journal of Experimental Psychology 53, 94–101.

Number of pairwise distances: 1,260
Description: In an experiment, inexperienced subjects were exposed to pairs of Morse codes in rapid order. The subjects had to decide whether the two codes in a pair were identical. The data were summarized in a table of confusion rates.

Confusion rates are similarity measures: Codes that are often confused are interpreted as “similar” or “close.” Similarity measures are converted to dissimilarity measures so that multidimensional scaling can be applied.

Morse codes consist of sequences of short and long sounds, which are called “dots” and “dashes” and written using the characters “.” and “-”. Examples are:

Letter  Code    Letter  Code    Digit  Code

A       .-      F       ..-.    1      .----
B       -...    G       --.     2      ..---
C       -.-.    H       ....
D       -..     T       -
E       .       X       -..-

The codes are of varying length, with the shorter codes representing letters that are more common in English. The digits are all five-character codes.

Variable    Explanation

Length      Length of the code, rescaled to [0,1]
Dashes      Number of dashes
D           Dissimilarity between codes

Primary question: Which codes are similar and often confused with each other?

Data restructuring: The original data came as an asymmetric 36 × 36 matrix of similarities, $S_{i,j}$, $i, j = 1, \ldots, n$. The values were converted to dissimilarities $D_{i,j}$ and symmetrized, using $D_{i,j}^2 = S_{i,i} + S_{j,j} - 2 S_{i,j}$. Two variables were derived from the Morse codes themselves, Length and Dashes.


The dissimilarity matrix was reconfigured to conform to GGobi’s XML format, a set of n(n − 1) = 1,260 edges with associated dissimilarity. A second set of 33 edges was added to link similar codes; it is for display only, to aid in interpretation of the configuration, and is not used by the MDS algorithm.
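The restructuring steps can be illustrated with a small Python sketch. It is not the code used to prepare morsecodes.xml. The source does not say how the symmetrization was carried out, so averaging each similarity with its transpose before conversion is an assumption made here; the function names and the 3 × 3 matrix of toy values are likewise hypothetical. Negative squared dissimilarities, which can arise from noisy similarities, are clipped to zero as a further assumption.

```python
import math


def similarities_to_dissimilarities(S):
    """Convert a square similarity matrix S into dissimilarities D.

    Assumption: S is symmetrized first by averaging S[i][j] and S[j][i].
    Then D[i][j]**2 = S[i][i] + S[j][j] - 2 * S[i][j], clipped at zero.
    """
    n = len(S)
    sym = [[(S[i][j] + S[j][i]) / 2 for j in range(n)] for i in range(n)]
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sym[i][i] + sym[j][j] - 2 * sym[i][j]
            D[i][j] = math.sqrt(max(d2, 0.0))
    return D


def edge_list(D):
    """List all ordered pairs (i, j, D[i][j]) with i != j: n(n - 1) edges."""
    n = len(D)
    return [(i, j, D[i][j]) for i in range(n) for j in range(n) if i != j]


# Hypothetical toy similarities (not the Rothkopf data).
S = [
    [0.9, 0.2, 0.1],
    [0.4, 0.8, 0.3],
    [0.1, 0.5, 0.7],
]
D = similarities_to_dissimilarities(S)
edges = edge_list(D)
print(len(edges))  # n(n - 1) ordered pairs for n = 3
```

For the full 36 × 36 matrix, the same edge list has 36 × 35 = 1,260 entries, matching the number of pairwise distances given above.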

Analysis notes: Start with the edges turned off, and focus on the movement of the points. Add the edges when the layout is complete to understand the structure of the final configuration. Re-start MDS a few times from random starting positions, and compare the resulting configurations.

Data files: morsecodes.xml

7.17 Personal Social Network

Source: Provided by Chris Volinsky and Deborah F. Swayne.

Number of cases: 140 nodes (people) and 203 edges (contacts between people).
Number of variables: two categorical variables for each node; one categorical and two real variables for each edge.
Description: This is a personal social network, collected by selecting one person, adding that person’s contacts, each contact’s contacts, and so on. In its original form, the nodes were telephone numbers and the edges represented calls from one number to another (Cortes, Pregibon & Volinsky 2003), but the privacy of individuals has been protected by disguising the telephone numbers as names and changing the meaning of the original variables.

People:

Variable       Explanation

maritalstat    categorical: married, never married, or other
hours          binary: full time or part time

Contacts:

Variable               Explanation

interactions           a measure of the amount of time spent talking
center triangle        binary; is the point part of a 3-node cycle?
log10(interactions)    base 10 log of interactions

Data files: snetwork.xml