Statistical analysis code for analysis of CASTxB6 F2 mouse cross
2. Network analysis in liver and adipose
Peter Langfelder
March 23, 2011
Contents
1 Setting up the R session and loading of data
2 Relationships among the physiological traits
3 Network construction and module identification
  3.a Scale-free topology analysis
  3.b Network construction and module identification
  3.c Merging of closely-related modules
  3.d Trimming of genes with low module membership
  3.e Identification and removal of linkage-driven modules
  3.f Gene clustering dendrograms and module colors
4 GO enrichment analysis
  4.a Exporting lists of genes in each module
  4.b GO enrichment analysis in WGCNA
5 Modules related to physiological traits
  5.a Module-trait relationships for all modules that relate significantly to a trait
  5.b Network plots of all module eigengenes and traits
6 Output of module membership, eigengenes, and eigengene correlations
7 Overlap of liver and adipose modules
8 Gene significance and module membership in HDL-related modules are correlated
9 Module significance in male data validates association found in female data
10 Cross-referencing with genes implicated in GWA studies
1 Setting up the R session and loading of data
In this document we detail our network analysis of the CASTxB6 cross. We use the WGCNA package [1] to construct the gene co-expression network, find modules, relate them to the clinical traits, study GO enrichment, and perform other tasks. We use the pre-processed data created in part 1.
# Set working directory. This step is necessary if your data is saved in a directory other than the current
# directory. Replace the path name below with the directory where the data is stored on your drive.
# setwd("Z:/home/plangfelder/Work/Mouse-ReciprocalCXB/CxBOnly");
# Load the WGCNA library
library(WGCNA)
# This setting is important, do not leave out
options(stringsAsFactors = FALSE);
options(width = 109)
set.seed(1); #needed for .Random.seed to be defined
We now set up a few basic variables and load the preprocessed expression data. Liver and adipose will be indexed 1 and 2, respectively. The files necessary for this step have been generated in part 1 of the analysis.
nSets = 2;
setLabels = c("Female Liver", "Female Adipose");
shortLabels = c("Liver", "Adipose");
shortshortLabels = c("L", "A");
# Load expression data
files = c("../CxBOnly-Liver-outliersRemoved-exprFemaOR-pValFemaOR.RData",
"../CxBOnly-Adipose-outliersRemoved-exprFemaOR-pValFemaOR.RData");
express = list();
for (set in 1:nSets)
{
x = load(file = files[set]);
express[[set]] = list(data = exprFemaOR);
}
expr = express;
rm(express);
collectGarbage();
exprSize = checkSets(expr);
nSamples = exprSize$nSamples;
collectGarbage()
We now load the trait data and isolate numeric traits measured at the time the animals were sacrificed.
rawTr = read.csv(file = bzfile("../../../Data-AllMouse/CXB_Clinical_traits.csv.bz2"))
numTraitInd = c(15:46, 48)
numTraits = vector(mode = "list", length = nSets);
# The following is relative to numTraitInd
selTraitInd = c(5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 26, 27, 28, 29, 30, 31, 32, 33)
selTraits = vector(mode = "list", length = nSets);
for (set in 1:nSets)
{
mice = rownames(expr[[set]]$data);
expr2tr = match(mice, rawTr$Mice_id);
temp = rawTr[expr2tr, numTraitInd]
rownames(temp) = rawTr$Mice_id[expr2tr];
numTraits[[set]] = list(data = as.matrix(temp));
selTraits[[set]] = list(data = as.matrix(temp[, selTraitInd]));
}
collectGarbage();
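The alignment of trait rows to expression samples performed above with match can be illustrated on toy data. This is only a sketch: the sample IDs, the trait values, and the names toyExprMice, toyTraits and aligned are made up for illustration and are not part of the analysis.

```r
# Toy illustration (hypothetical IDs and values) of aligning a trait table
# to the sample order of an expression data set using match().
toyExprMice <- c("F2_3", "F2_1", "F2_9")          # sample order in the expression data
toyTraits <- data.frame(Mice_id = c("F2_1", "F2_2", "F2_3"),
                        weight  = c(21.5, 19.8, 23.1))

# For each expression sample, find the row of the trait table (NA if absent)
expr2tr <- match(toyExprMice, toyTraits$Mice_id)

# Reorder the trait rows; samples without trait data become rows of NA
aligned <- toyTraits[expr2tr, "weight", drop = FALSE]
rownames(aligned) <- toyExprMice
print(aligned)
```

Samples that have expression data but no trait record (here the hypothetical F2_9) end up as rows of missing values rather than being dropped, so the trait matrix keeps the same row order as the expression data.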
Some trait measurements appear to be incorrect or outliers. For example, one mouse (F2 391) has a recorded body length of nearly 30 cm (1 foot), which is clearly a measurement (or record) error. Similarly, some fat measurements are unrealistically high. We remove the unrealistic measurements from the data.
for (set in 1:nSets)
{
suspicious = numTraits[[set]]$data[,"length_cm"] > 20;
numTraits[[set]]$data[suspicious,"length_cm"] = NA;
selTraits[[set]]$data[suspicious,"length_cm"] = NA;
}
# remove the e_fat outlier
for (set in 1:nSets)
{
suspicious = numTraits[[set]]$data[,"efat_g"] > 20;
numTraits[[set]]$data[suspicious,"efat_g"] = NA;
selTraits[[set]]$data[suspicious,"efat_g"] = NA;
}
nSelTraits = length(selTraitInd)
tcInd = match("e_tc_mgdl", colnames(selTraits[[1]]$data));
Next we modify trait names to make them more descriptive.
renameTable = matrix(c("e_bweight_g", "e_fat_g", "e_mus_g", "e_fluid_g", "e_fat_per", "e_tg_mgdl",
"e_tc_mgdl", "e_hdl_mgdl", "e_uc_mgdl", "e_ffa_mgdl", "e_glu_mgdl", "length_cm", "efat_g", "rfat_g",
"vfat_g", "sfat_g", "insulin_pgml", "leptin_ngml", "bmd_mgcm2",
"weight", "fat", "muscle", "fluid", "fat.frac", "trigly", "tot.chol", "HDL",
"unest.chol", "FFA", "glucose", "length", "efat", "rfat", "vfat", "sfat",
"insulin", "leptin", "BMD"), ncol = 2, nrow = nSelTraits);
ind = match(renameTable[, 1], colnames(selTraits[[1]]$data));
renameTable = renameTable[ind, ];
2 Relationships among the physiological traits
Here we plot a heatmap of correlations among the traits.
# Calculate the matrix and order it using a hierarchical clustering dendrogram
mat = bicor(selTraits[[1]]$data, use = 'p');
order = hclust(as.dist(1-mat), method = 'a')$order;
# Open a suitably sized graphics window
sizeGrWindow(9,7);
# Alternatively, plot into a file. Make sure the directory Plots exists or change the file name
# appropriately.
# pdf(file = "Plots/Liver-allTraitCorHeatmap.pdf", width = 9, height = 7);
par(mar = c(5, 6, 2, 1));
labeledHeatmap(mat[order, order],
xLabels = renameTable[order, 2],
yLabels = renameTable[order, 2],
colors = greenWhiteRed(50),
zlim = c(-max(abs(mat)), max(abs(mat))),
setStdMargins = FALSE, cex.lab = 1.2,
main = "Correlation heatmap of physiological traits",
textMat = round(mat[order, order], 2), cex.text = 0.7)
# If plotting into a file, close it
dev.off();
The result is shown in Figure 1. Many of the traits are strongly correlated.
[Figure 1 image: heatmap titled "Correlation heatmap of physiological traits", color scale from −1 to 1. Rows and columns follow the clustered trait order fluid, insulin, trigly, FFA, unest.chol, tot.chol, HDL, BMD, muscle, length, glucose, leptin, fat, fat.frac, weight, sfat, vfat, efat, rfat. The correlations printed in the cells are listed row by row:]
1 −0.23 −0.17 −0.27 −0.19 −0.29 −0.42 0.1 0.01 −0.12 −0.3 −0.65 −0.67 −0.74 −0.45 −0.57 −0.55 −0.57 −0.6
−0.23 1 0.07 0.05 0.06 0.18 0.2 0.06 0.07 0.05 0.27 0.32 0.3 0.29 0.25 0.31 0.26 0.31 0.3
−0.17 0.07 1 0.76 0.51 0.38 0.31 0.09 0.09 −0.07 0.2 0.3 0.27 0.26 0.26 0.31 0.29 0.26 0.27
−0.27 0.05 0.76 1 0.58 0.43 0.46 0 −0.02 −0.04 0.11 0.36 0.32 0.34 0.26 0.32 0.34 0.31 0.33
−0.19 0.06 0.51 0.58 1 0.81 0.68 0 −0.05 −0.04 0.24 0.26 0.18 0.19 0.17 0.23 0.21 0.19 0.25
−0.29 0.18 0.38 0.43 0.81 1 0.84 0.02 0.04 0.06 0.33 0.41 0.32 0.35 0.3 0.33 0.35 0.3 0.38
−0.42 0.2 0.31 0.46 0.68 0.84 1 0.11 0.1 0.16 0.33 0.59 0.47 0.49 0.46 0.49 0.48 0.47 0.52
0.1 0.06 0.09 0 0 0.02 0.11 1 0.53 0.36 0.07 0.2 0.17 0.1 0.37 0.2 0.26 0.19 0.19
0.01 0.07 0.09 −0.02 −0.05 0.04 0.1 0.53 1 0.57 0.18 0.32 0.48 0.34 0.67 0.41 0.45 0.43 0.38
−0.12 0.05 −0.07 −0.04 −0.04 0.06 0.16 0.36 0.57 1 0.04 0.37 0.4 0.34 0.59 0.41 0.45 0.44 0.4
−0.3 0.27 0.2 0.11 0.24 0.33 0.33 0.07 0.18 0.04 1 0.33 0.34 0.32 0.41 0.38 0.42 0.3 0.35
−0.65 0.32 0.3 0.36 0.26 0.41 0.59 0.2 0.32 0.37 0.33 1 0.83 0.81 0.74 0.82 0.77 0.81 0.8
−0.67 0.3 0.27 0.32 0.18 0.32 0.47 0.17 0.48 0.4 0.34 0.83 1 0.97 0.85 0.89 0.86 0.9 0.86
−0.74 0.29 0.26 0.34 0.19 0.35 0.49 0.1 0.34 0.34 0.32 0.81 0.97 1 0.78 0.85 0.82 0.85 0.85
−0.45 0.25 0.26 0.26 0.17 0.3 0.46 0.37 0.67 0.59 0.41 0.74 0.85 0.78 1 0.85 0.88 0.89 0.85
−0.57 0.31 0.31 0.32 0.23 0.33 0.49 0.2 0.41 0.41 0.38 0.82 0.89 0.85 0.85 1 0.88 0.88 0.88
−0.55 0.26 0.29 0.34 0.21 0.35 0.48 0.26 0.45 0.45 0.42 0.77 0.86 0.82 0.88 0.88 1 0.89 0.89
−0.57 0.31 0.26 0.31 0.19 0.3 0.47 0.19 0.43 0.44 0.3 0.81 0.9 0.85 0.89 0.88 0.89 1 0.91
−0.6 0.3 0.27 0.33 0.25 0.38 0.52 0.19 0.38 0.4 0.35 0.8 0.86 0.85 0.85 0.88 0.89 0.91 1
Figure 1: Heatmap of correlations among the physiological traits. Many traits are strongly correlated, particularly the adiposity traits.
3 Network construction and module identification
In this section we construct the co-expression network and identify co-expression modules. We construct a “signed hybrid” network in which the adjacency $a_{ij}$ of nodes $i$, $j$ with expression profiles $x_i$, $x_j$ is defined as
$$a_{ij} = \begin{cases} \mathrm{bicor}^{\beta}(x_i, x_j) & \text{for } \mathrm{bicor}(x_i, x_j) > 0, \\ 0 & \text{otherwise,} \end{cases} \tag{1}$$
where bicor is the biweight mid-correlation [3], a type of robust (that is, outlier-insensitive) correlation.
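To make Equation (1) concrete, the following is a minimal base-R sketch of a biweight midcorrelation and of the resulting signed hybrid adjacency on toy data. It is not the WGCNA implementation: the functions bicorSketch and signedHybridAdjacency and the toy matrix are hypothetical names introduced here, missing values and zero median absolute deviation are not handled, and the pairwise loop is only suitable for a handful of genes.

```r
# Sketch of the biweight midcorrelation of two complete numeric vectors.
# Observations far from the median (in units of 9 * MAD) receive weight 0.
bicorSketch <- function(x, y) {
  tilde <- function(v) {
    med <- median(v)
    u <- (v - med) / (9 * mad(v, constant = 1))   # raw (unscaled) MAD
    w <- (1 - u^2)^2 * (abs(u) < 1)               # biweight weights
    a <- (v - med) * w
    a / sqrt(sum(a^2))
  }
  sum(tilde(x) * tilde(y))
}

# Signed hybrid adjacency: bicor^beta where bicor > 0, and 0 otherwise.
signedHybridAdjacency <- function(datExpr, beta) {
  nGenes <- ncol(datExpr)
  r <- matrix(1, nGenes, nGenes,
              dimnames = list(colnames(datExpr), colnames(datExpr)))
  for (i in seq_len(nGenes)) for (j in seq_len(nGenes))
    r[i, j] <- bicorSketch(datExpr[, i], datExpr[, j])
  adj <- r^beta
  adj[r <= 0] <- 0      # negative correlations are not connected
  adj
}

# Toy data (made up): gene2 follows gene1, gene3 opposes it, gene4 is noise.
set.seed(1)
n <- 50
gene1 <- rnorm(n)
gene2 <- gene1 + 0.3 * rnorm(n)
gene2[1] <- gene2[1] + 50        # one gross outlier, e.g. a recording error
gene3 <- -gene1 + 0.3 * rnorm(n)
gene4 <- rnorm(n)
toyExpr <- cbind(gene1, gene2, gene3, gene4)

toyAdj <- signedHybridAdjacency(toyExpr, beta = 4)
print(round(toyAdj, 2))
```

In this toy example the single outlier in gene2 strongly depresses the Pearson correlation with gene1 but barely affects the biweight midcorrelation, and the strongly negative gene1–gene3 correlation yields an adjacency of exactly 0.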
3.a Scale-free topology analysis
One of the important network construction parameters is the soft-thresholding power β. We apply the approximate scale-free topology criterion to select an appropriate power in each tissue separately. Note of caution: this code takes some time (possibly several hours) to run. Please be patient, or, if you trust our results, this part can be skipped.
powers = c(seq(1,10,by=1));
powerTables = vector(mode = "list", length = nSets);
for (set in 1:nSets)
powerTables[[set]] = list(data =
pickSoftThreshold(expr[[set]]$data, powerVector=powers,
networkType = "signed hybrid",
verbose = 2 )[[2]]);
save(powerTables, file = "CxBOnly-Female-powerTables.RData");
collectGarbage();
We plot the results of the scale-free topology analysis.
sizeGrWindow(12,9)
#pdf(file = "Plots/Female-AL-ScaleFreeTopology.pdf", width = 12, height = 9);
par(mfrow = c(2,2));
cex1 = 0.7;
for (set in 1:nSets)
{
plot(powerTables[[set]]$data[,1], -sign(powerTables[[set]]$data[,3])*powerTables[[set]]$data[,2],
xlab="Soft Threshold (power)",ylab="Scale Free Topology Model Fit,signed R^2",type="n",
main = paste("Scale independence in ", setLabels[set]));
addGrid();
text(powerTables[[set]]$data[,1], -sign(powerTables[[set]]$data[,3])*powerTables[[set]]$data[,2],
labels=powers,cex=cex1,col="red");
# this line corresponds to using an R^2 cut-off of h
abline(h=0.90,col="red")
plot(powerTables[[set]]$data[,1], powerTables[[set]]$data[,5],
xlab="Soft Threshold (power)",ylab="Mean Connectivity", type="n",
main = paste("Mean connectivity in", setLabels[set]))
addGrid();
text(powerTables[[set]]$data[,1], powerTables[[set]]$data[,5], labels=powers, cex=cex1,col="red")
}
# If plotting into a file, close it.
dev.off();
The resulting plot is shown in Figure 2. The networks become approximately scale-free when the soft-thresholding power becomes 3 to 4. We choose the power 4 for both the liver and adipose networks (but in general the powers could be different).
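The selection rule behind this choice can be sketched as follows: compute the signed fit index plotted in the left panels and take the lowest power that reaches the cut-off. The table below is entirely hypothetical (its numbers are invented for illustration and are not the liver or adipose results); only the column names Power, SFT.R.sq and slope are chosen to mirror the fit table used in the plotting code, and the 0.90 cut-off mirrors the red line drawn there.

```r
# Hypothetical fit table; the values are made up for illustration only.
toyFit <- data.frame(
  Power    = 1:6,
  SFT.R.sq = c(0.30, 0.60, 0.85, 0.92, 0.94, 0.95),
  slope    = c(0.8, -0.9, -1.2, -1.4, -1.5, -1.6)
)

# Signed R^2 as used on the y-axis of the scale-independence plots:
# a positive slope (not scale-free-like) turns the index negative.
signedR2 <- -sign(toyFit$slope) * toyFit$SFT.R.sq

# Lowest power whose signed R^2 reaches the cut-off (NA if none does)
cutoff <- 0.90
chosenPower <- toyFit$Power[which(signedR2 >= cutoff)[1]]
print(chosenPower)
```

In practice the choice is usually made by inspecting the plot as well, since the fit index often flattens out just below any fixed cut-off.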
[Figure 2 image: four panels. "Scale independence in Female Liver" and "Scale independence in Female Adipose" plot the scale-free topology model fit (signed R^2) against the soft threshold (power, 1 to 10); "Mean connectivity in Female Liver" and "Mean connectivity in Female Adipose" plot mean connectivity against the same powers.]
Figure 2: Scale-free topology analysis. The left panels show the scale-free topology fit index R^2 as a function of the soft-thresholding power. The right panels show mean network connectivity. The networks become approximately scale-free when the soft-thresholding power becomes 3 to 4.
3.b Network construction and module identification
Here we use the function blockwiseModules to construct the networks and identify modules. The function has multiple arguments and options; here we leave most of them at their default values. We save the result of the calculation so it only needs to be executed once.
Note of caution: this code assumes that the computer it runs on has enough memory to handle the full data set. This usually requires at least 16 GB, preferably 32 GB, of RAM. If your computer’s RAM is not large enough, the code will trigger an error. In that case please download the file Female-LA-mods.RData from our web site and continue the analysis below.
Second note of caution: If you do have a large-enough computer and run this code, be prepared to wait several hours. The calculation can be sped up substantially by installing a fast BLAS library such as ATLAS BLAS or GotoBLAS and compiling R against it. If you do not know what “installing a library and compiling R against it” means, your best bet is to be patient and/or run this calculation overnight. Again, you may want to download the result Female-LA-mods.RData.
# Set up basic parameters
softPower = c(4,4);
minModSize = c(25, 30);
mergeCutHeight = 0.25;
cutHeight = 0.995;
collectGarbage()
# Call the module construction function for each tissue separately
mods = list();
for (set in 1:nSets)
{
mods[[set]] = blockwiseModules(expr[[set]]$data,
maxBlockSize = 30000,
networkType = "signed hybrid",
corType = "bicor", power = softPower[set],
TOMType = "signed",
TOMDenom = "mean",detectCutHeight = cutHeight,
minModuleSize = minModSize[set],
deepSplit = 2,
mergeCutHeight = mergeCutHeight, saveTOMs = TRUE,
saveTOMFileBase = spaste("CxBOnly-", shortLabels[set], "-consensusTOM"),
reassignThreshold = 1e-6,
minCoreKME = 0.5, minKMEtoStay = 0.3,
numericLabels = TRUE, verbose = 3);
collectGarbage();
}
# Save the results
save(mods, file = "Female-LA-mods.RData");
If the above code already ran once, or if instead of executing it you simply downloaded the result Female-LA-mods.RData, load it:
load(file = "Female-LA-mods.RData");
3.c Merging of closely-related modules
Here we take a look at the eigengene network of the unmerged modules and merge modules whose eigengenes are highly correlated. We choose the thresholds for merging to be correlation 0.80 in liver and 0.90 in adipose.
# Set the cut heights (1-correlation)
mergeCut = c(0.20, 0.10)
merge = list();
# Call the module merge function on each tissue
for (set in 1:nSets)
{
merge[[set]] = mergeCloseModules(expr[[set]]$data, mods[[set]]$unmergedColors, cutHeight = mergeCut[set],
getNewUnassdME = TRUE, relabel = TRUE);
}
# Plot the eigengene dendrograms before and after merging
sizeGrWindow(12, 9);
#pdf("Plots/Female-LA-mergingDendrograms-%02d.pdf", onef = FALSE, width = 12, height = 10);
for (set in 1:nSets)
{
par(mfrow = c(2,1));
par(mar = c(0.2, 4, 2.5, 0.2));
plot(merge[[set]]$oldDendro, main = paste(setLabels[set], "modules before merging"),
sub = "", xlab = "", cex = 0.7);
abline(mergeCut[set], 0, col = "red");
plot(merge[[set]]$dendro, main = paste(setLabels[set], "modules after merging"),
sub = "", xlab = "", cex = 0.7);
abline(mergeCut[set], 0, col = "red");
}
# If plotting into a file, close it
dev.off();
# Put together variables for further use
labels = list();
colors = list();
MEs = list();
for (set in 1:nSets)
{
labels[[set]] = merge[[set]]$colors;
colors[[set]] = labels2colors(labels[[set]]);
MEs[[set]] = orderMEs(merge[[set]]$newMEs);
}
# Save the results so this code does not need re-running later.
save(merge, labels, colors, MEs, file = "Female-LA-merge-colors-labels-MEs.RData");
If the above code already ran once, the results can be loaded in one line of code:
load(file = "Female-LA-merge-colors-labels-MEs.RData");
The resulting module merging dendrograms are shown in Figures 3 and 4. Several modules have been merged in liver but none in adipose.
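The relation between a merging threshold on correlation and a cut height on the eigengene dendrogram (cut height = 1 − correlation) can be sketched in base R. This is an illustration of the idea only, not of what mergeCloseModules does internally: it uses Pearson correlation, a single pass of average-linkage clustering, and made-up toy eigengenes (toyMEs).

```r
# Toy eigengenes (made up): ME1 and ME2 are nearly identical,
# ME3 and ME4 are only moderately correlated.
set.seed(2)
n <- 60
base1 <- rnorm(n)
base2 <- rnorm(n)
toyMEs <- data.frame(ME1 = base1,
                     ME2 = base1 + 0.2 * rnorm(n),
                     ME3 = base2,
                     ME4 = base2 + 1.5 * rnorm(n))

# Dissimilarity 1 - correlation and average-linkage clustering
diss <- 1 - cor(toyMEs)
tree <- hclust(as.dist(diss), method = "average")

# Cutting at height 0.20 groups eigengenes whose correlation exceeds 0.80
mergedGroup <- cutree(tree, h = 0.20)
print(mergedGroup)
```

Eigengenes that share a group label would be candidates for merging at the 0.80 correlation threshold; a cut height of 0.10 corresponds in the same way to correlation 0.90.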
[Figure 3 image: two eigengene dendrograms, "Female Liver modules before merging" and "Female Liver modules after merging"; leaves are module eigengenes (ME labels) and the vertical axis is Height.]
Figure 3: Liver module eigengene dendrograms based on the dissimilarity 1 − bicor.
[Figure 4 image: two eigengene dendrograms, "Female Adipose modules before merging" and "Female Adipose modules after merging"; leaves are module eigengenes (ME labels) and the vertical axis is Height.]
Figure 4: Adipose module eigengene dendrograms based on the dissimilarity 1 − bicor.
3.d Trimming of genes with low module membership
The module identification method sometimes assigns a gene to a module even though the gene has very low module membership (defined as the correlation of the gene expression profile and the module eigengene). Although such module assignments could be meaningful, we aim for tighter modules and hence we remove module genes whose module membership is below the threshold of 0.30. Since removing any gene from a module in principle changes its eigengene, we iterate this process until no genes are removed.
mes = list();
origMEs = list();
trimLabs = labels;
for (set in 1:nSets)
{
changed = TRUE
threshold = 0.30;
mes[[set]] = moduleEigengenes(expr[[set]]$data, labels[[set]]);
origMEs[[set]] = mes[[set]];
trimLabs[[set]] = labels[[set]]
while (changed)
{
changed = FALSE;
nMods = ncol(mes[[set]]$eigengenes)
modNames = substring(names(mes[[set]]$eigengenes), 3)
#pind= initProgInd();
for (mod in 1:nMods) if (modNames[mod]!='0')
{
modGeneInd = (as.character(trimLabs[[set]])==modNames[mod]);
nModGenes = sum(modGeneInd);
KME = bicor(expr[[set]]$data[, modGeneInd], mes[[set]]$eigengenes[, mod], use = 'p');
remove = KME < threshold;
if (sum(remove)>0) changed = TRUE;
# Paste the message into a single string before printing
printFlush(paste("module", modNames[mod], ": removing", sum(remove), "of", length(remove), "genes."));
trimLabs[[set]][modGeneInd][remove] = 0;
#pind = updateProgInd(mod/nMods, pind);
}
#printFlush("");
# Redo module eigengene calculation
if (changed) mes[[set]] = moduleEigengenes(expr[[set]]$data, colors = trimLabs[[set]]);
}
}
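The iteration above can be illustrated on a single toy module. The sketch below makes several simplifying assumptions that differ from the actual code: the eigengene is taken as the first left singular vector of the scaled expression matrix (oriented to agree with average expression), Pearson correlation stands in for bicor, and the data and the names firstPC, trimModule and toyModule are hypothetical.

```r
# First principal component of a samples-by-genes matrix, used as a
# stand-in for the module eigengene.
firstPC <- function(x) {
  xs <- scale(x)
  e <- svd(xs, nu = 1, nv = 0)$u[, 1]
  # Orient the eigengene so that it correlates positively with mean expression
  if (cor(e, rowMeans(xs)) < 0) e <- -e
  e
}

# Repeatedly drop genes whose module membership (correlation with the
# eigengene) is below the threshold, recomputing the eigengene each time.
trimModule <- function(x, threshold = 0.30) {
  inModule <- rep(TRUE, ncol(x))
  repeat {
    eig <- firstPC(x[, inModule, drop = FALSE])
    kME <- as.vector(cor(x[, inModule, drop = FALSE], eig))
    tooLow <- kME < threshold
    if (!any(tooLow)) break
    inModule[which(inModule)[tooLow]] <- FALSE
  }
  inModule
}

# Toy module (made up): 8 genes driven by a common signal plus 2 pure-noise genes
set.seed(3)
n <- 200
driver <- rnorm(n)
toyModule <- cbind(sapply(1:8, function(i) driver + 0.5 * rnorm(n)),
                   rnorm(n), rnorm(n))
kept <- trimModule(toyModule)
print(kept)
```

In this toy example the two noise genes are removed and the eight co-expressed genes are retained; the loop stops as soon as a pass removes nothing.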
Next we check how much the eigengenes have changed. We calculate the correlations between the “original” (i.e., before gene trimming) and new eigengenes.
#out of curiosity: correlations between original and trimmed module eigengenes:
signif(diag(cor(mes[[1]]$eigengenes, origMEs[[1]]$eigengenes)), 3);
signif(diag(cor(mes[[2]]$eigengenes, origMEs[[2]]$eigengenes)), 3);
signif(min(abs(diag(cor(mes[[1]]$eigengenes, origMEs[[1]]$eigengenes))[-1])), 3);
signif(min(abs(diag(cor(mes[[2]]$eigengenes, origMEs[[2]]$eigengenes))[-1])), 3);
Excluding the eigengene of the improper module 0 (that collects the unassigned genes), the minimum correlation of old and new eigengenes is 0.999, which indicates that although some outlying genes were removed, the eigengenes have practically not changed. We now re-form eigengenes, save the results of this part and replace the module labels with the trimmed labels.
MEs = list();
ordMEs = list();
for (set in 1:nSets)
{
MEs[[set]] = moduleEigengenes(expr[[set]]$data, trimLabs[[set]]);
ordMEs[[set]] = orderMEs(MEs[[set]], greyName = "ME0");
}
# Save the results so they can be loaded in future
save(trimLabs, MEs, ordMEs, file = "Female-LA-trimLabs.RData");
#load(file = "Female-LA-trimLabs.RData");
labels = trimLabs;
colors = lapply(labels, labels2colors)
3.e Identification and removal of linkage-driven modules
Some of the smaller modules appear to be linkage-driven in the sense that they group together genes located in a single chromosomal region and their eigengene is highly correlated with a genotype at that locus. Although the genes in such modules are co-expressed in this particular CASTxB6 cross, they would likely not be co-expressed in a random (diverse) population. Therefore we identify such modules and remove them from the analysis (by setting the module labels of the corresponding genes to 0). We start by loading and formatting the genotype data.
# (re) read the gene annotation table
file = bzfile(description = "../../../Data-AllMouse/CXB_GeneAnnotation.csv.bz2");
#file = "../Data-CXB/CXB_all_gene_annotation.csv"
annotX = read.csv(file = file);
# Read the SNP data and sort them
file = bzfile(description = "../../../Data-AllMouse/CXB_GENOTYPES_numeric.csv.bz2");
gtInfo = read.csv(file = file);
file = bzfile(description = "../../../Data-AllMouse/CXB_GENOTYPES_alpha.csv.bz2");
gtAlpha = read.csv(file = file);
# Correct the coding of numeric genotypes
num2alpha = match(gtInfo$marker_name, gtAlpha$marker_name);
gtAlphaN = gtAlpha[num2alpha, ]
all.equal(names(gtAlphaN), names(gtInfo))
gtInfo[gtAlphaN=='H'] = 1;
gtInfo[gtAlphaN=='B'] = 2;
collectGarbage()
# Sort the SNPs:
snpHasAnno = is.finite(gtInfo$chro_number) & is.finite(gtInfo$marker_pos_Bp);
gtInfoA = gtInfo[snpHasAnno, ];
SNPorder = order(gtInfoA$chro_number, gtInfoA$marker_pos_Bp);
gtInfoS = gtInfoA[SNPorder, ];
gtCols = substring(names(gtInfoS), 1, 3)=="F2_";
gtSamples = names(gtInfoS)[gtCols];
Next we identify and remove modules whose highest correlation with a SNP is above 0.5.
# Identify modules whose correlation with the best SNP is above 0.5
cleanLabels = labels;
for (set in 1:nSets)
{
common = intersect(rownames(expr[[set]]$data), gtSamples);
expr2gt = match(rownames(expr[[set]]$data), gtSamples);
print(table(is.na(expr2gt)))
# all expression-measured samples have a genotype, good.
gt = t(gtInfoS[, gtCols][, expr2gt]);
gtAnnot = gtInfoS[, c(2,5,6)];
collectGarbage()
x = bicorAndPvalue(gt, MEs[[set]]$eigengenes);
bestP = apply(x$p, 2, min, na.rm = TRUE)
whichP = apply(x$p, 2, which.min)
maxCor = apply(abs(x$bicor), 2, max);
which = apply(abs(x$bicor), 2, which.max);
if (!isTRUE(all.equal(whichP, which))) stop("which and whichP do not agree.");
suspicious = maxCor > 0.5;
suspInfo = data.frame(module = substring(names(MEs[[set]]$eigengenes)[suspicious], 3),
gtAnnot[which[suspicious], ],
absCor.SNP.ME = maxCor[suspicious],
pValue.SNP.ME = bestP[suspicious]);
modules = as.numeric(substring(names(MEs[[set]]$eigengenes)[suspicious], 3))
printFlush(paste("Suspicious modules: ", paste(modules, collapse = ", ")));
cleanLabels[[set]] [is.finite(match(labels[[set]], modules))] = 0;
write.csv(suspInfo, file = spaste("CxBOnly-Female-", shortLabels[set], "-HighSNP-MEcorrelations.csv"),
quote = FALSE, row.names = FALSE);
}
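The criterion used above (flag a module when the largest absolute correlation between its eigengene and any SNP exceeds 0.5) can be sketched on toy data. The genotypes, eigengenes and all object names below are hypothetical, and Pearson correlation is used in place of bicor.

```r
# Toy genotypes (made up): 30 SNPs coded 0/1/2 with F2-like 1:2:1 frequencies
set.seed(4)
nMice <- 120
gt <- matrix(sample(0:2, nMice * 30, replace = TRUE, prob = c(0.25, 0.5, 0.25)),
             nrow = nMice, dimnames = list(NULL, paste0("snp", 1:30)))

# Toy eigengenes: ME1 is unrelated to genotype, ME2 is driven by snp7
toyMEs <- data.frame(ME1 = rnorm(nMice),
                     ME2 = gt[, "snp7"] + 0.5 * rnorm(nMice))

# Absolute SNP-eigengene correlations (SNPs in rows, eigengenes in columns)
snpCor <- abs(cor(gt, toyMEs))
maxCor <- apply(snpCor, 2, max)
bestSnp <- rownames(snpCor)[apply(snpCor, 2, which.max)]

# Modules whose best SNP correlation exceeds 0.5 are flagged as linkage-driven
linkageDriven <- maxCor > 0.5
print(data.frame(module = names(toyMEs), bestSnp, maxCor, linkageDriven))
```

Here only the hypothetical ME2 is flagged, with snp7 as its best SNP; in the analysis the genes of flagged modules are relabeled as unassigned (label 0).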
We again replace the module labels by the cleaned labels and recalculate module eigengenes for further use.
# From here on only use cleaned labels:
labels = cleanLabels;
MEs = list();
ordMEs = list();
MEs0 = list(); # Leave grey eigengene out
for (set in 1:nSets)
{
MEs[[set]] = moduleEigengenes(expr[[set]]$data, labels[[set]]);
ordMEs[[set]] = orderMEs(MEs[[set]], greyName = "ME0");
MEs0[[set]] = moduleEigengenes(expr[[set]]$data, labels[[set]], excludeGrey = TRUE, grey = 0);
}
save(labels, MEs, MEs0, ordMEs, file = "Female-LA-labels-MEs-ordMEs-afterCleaning.RData");
#load(file = "Female-LA-labels-MEs-ordMEs-afterCleaning.RData");
colors = lapply(labels, labels2colors)
3.f Gene clustering dendrograms and module colors
Here we take a look at the gene clustering trees in both tissues. This allows us to visually verify that the module identification procedure led to modules that actually correspond to distinguishable branches of the gene clustering dendrogram. We also add color-coded indicators of gene significance for the individual traits. It is better to save the plots directly into a pdf (large file size, full resolution with zoom-in) or png (smaller file size but also lower resolution), but the plot can also be viewed on-screen.
# Calculate gene significance for all traits
basePVal = 0.01;
traitGeneColors = list();
for (set in 1:nSets)
{
z = qnorm(1-basePVal)/sqrt(nSamples[set]-3);
baseCor = tanh(z);
cor = bicor(expr[[set]]$data, selTraits[[set]]$data, use = "p");
cor[abs(cor) < baseCor] = 0;
traitGeneColors[[set]] = numbers2colors(cor, signed = TRUE);
colnames(traitGeneColors[[set]]) = colnames(selTraits[[set]]$data);
}
# Plot the gene clustering trees and the gene significance
sizeGrWindow(12,9)
#pdf(file = "Plots/Female-LA-geneDendrograms-AllTraits-%02d.pdf", w = 30, h = 15, onefile = FALSE)
#png(file = "Plots/Female-LA-geneDendrograms-AllTraits-%02d.png", w = 1200, h = 600)
for (set in 1:nSets)
{
par(lheight=1.3);
plotDendroAndColors(mods[[set]]$dendrograms[[1]],
cbind(traitGeneColors[[set]], colors[[set]]),
c(renameTable[, 2], "modules"),
autoColorHeight = FALSE,
colorHeight = 0.6,
rowText = spaste(labels[[set]], ": ", colors[[set]]),
textPositions = nSelTraits + 1,
marAll = c(0, 8, 2, 3),
ylab = "", xlab = "", sub = "", dendroLabels = FALSE, hang = 0.03,
addGuide = TRUE, guideHang = 0.05, cex.rowText = 1.3, cex.colorLabels = 1.2,
rowWidths = c(rep(1, nSelTraits + 1), 15),
addTextGuide = TRUE,
main = spaste(shortLabels[set],
" gene dendrogram, association with traits and module colors"),
cex.main = 1.4);
}
dev.off()
We show the results (in the png version) in Figures 5 and 6. The dendrograms exhibit clear branches that are identified as modules. In the high-resolution pdf figures one can also see the smaller branches that correspond to smaller modules.
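The correlation cut-off applied to the trait color rows (variables basePVal and baseCor in the code above) follows from the Fisher transformation: a z quantile is divided by sqrt(n − 3) and mapped back to the correlation scale with tanh. The sketch below reproduces that arithmetic for a hypothetical sample size of 103; the function corThreshold and the toy correlations are introduced here for illustration only. As written, the quantile qnorm(1 − p) corresponds to a one-sided p-value.

```r
# Correlation whose Fisher-transformed z statistic corresponds to a given
# (one-sided) p-value for nSamples observations.
corThreshold <- function(nSamples, pValue = 0.01) {
  z <- qnorm(1 - pValue) / sqrt(nSamples - 3)
  tanh(z)
}

toyN <- 103                      # hypothetical number of samples
baseCor <- corThreshold(toyN)
print(baseCor)

# Round trip: transforming the threshold back recovers the p-value
pBack <- 1 - pnorm(atanh(baseCor) * sqrt(toyN - 3))
print(pBack)

# Correlations weaker than the threshold are set to 0 (shown as white)
toyCor <- c(-0.40, -0.10, 0.05, 0.30)
toyCor[abs(toyCor) < baseCor] <- 0
print(toyCor)
```

With more samples the threshold shrinks, so weaker correlations are displayed in color.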
Figure 5: The upper panel shows the gene clustering tree (dendrogram) in liver. Each “leaf”, i.e., a short vertical line, corresponds to one gene (more precisely, a microarray probe). Branches of the dendrogram correspond to modules. Below the dendrogram, color rows annotated by clinical traits give the gene significance for (correlation with) the corresponding trait. Red color corresponds to positive gene significance (GS), and green color corresponds to negative GS. White color indicates no gene significance; color saturation corresponds to GS strength. The last color row indicates module assignment. Module colors are annotated below the module color row.
Figure 6: The upper panel shows the gene clustering tree (dendrogram) in adipose. Each “leaf”, i.e., a short vertical line, corresponds to one gene (more precisely, a microarray probe). Branches of the dendrogram correspond to modules. Below the dendrogram, color rows annotated by clinical traits give the gene significance for (correlation with) the corresponding trait. Red color corresponds to positive gene significance (GS), and green color corresponds to negative GS. White color indicates no gene significance; color saturation corresponds to GS strength. The last color row indicates module assignment. Module colors are annotated below the module color row.
4 GO enrichment analysis
Here we perform a functional enrichment analysis of the identified modules. There are two main methods one can use: either export lists of genes in each module and use external software, or use the function GOenrichmentAnalysis in WGCNA to calculate enrichment in GO terms. We first show how to export the gene lists for each module for use with external software, then perform the actual analysis using GOenrichmentAnalysis.
4.a Exporting lists of genes in each module
We (re-)load the gene annotation table and export the matching Locus Link IDs (also known as Entrez IDs). The output is a set of text files with names such as Liver-3.txt in the subdirectory FEA of the current directory. If the directory FEA does not exist, please create it or modify the variable outFileBase below.
file = bzfile(description = "../../../Data-AllMouse/CXB_GeneAnnotation.csv.bz2");
#file = "../Data-CXB/CXB_all_gene_annotation.csv"
annot = read.csv(file = file);
# Loop over tissues
for (set in 1:nSets)
{
# Base of the file names
outFileBase = spaste("FEA/", shortLabels[set], "-");
nMods = ncol(MEs[[set]]$eigengenes)
modNames = as.numeric(substring(names(MEs[[set]]$eigengenes), 3))
# loop over modules
for (mod in 1:nMods)
{
modGeneInd = (labels[[set]]==modNames[mod]);
modProbes = colnames(expr[[set]]$data)[modGeneInd];
annotInd = match(modProbes, annot$sequence);
annotInd = annotInd[!is.na(annotInd)];
modGeneIDs = annot$LocusLinkID[annotInd];
modGeneIDs = modGeneIDs[!is.na(modGeneIDs)];
write.table(data.frame(LLID = modGeneIDs),
file = paste(outFileBase, modNames[mod], '.txt', sep = ""), quote = F, row.names = F,
col.names = F);
}
allAnnotInd = match(colnames(expr[[set]]$data), annot$sequence);
allAnnotInd = allAnnotInd[!is.na(allAnnotInd)];
GeneIDs = annot$LocusLinkID[allAnnotInd];
GeneIDs = GeneIDs[!is.na(GeneIDs)];
# Also write out a file of all genes in the network, useful as a background list in the analysis.
write.table(data.frame(LLID = GeneIDs),
file = paste(outFileBase, 'all.txt', sep = ""), quote = F, row.names = F,
col.names = F);
}
4.b GO enrichment analysis in WGCNA
Here we perform the GO enrichment analysis directly in WGCNA. This is usually much more convenient than uploading each module separately to a separate application, but is restricted to GO. This calculation will take several minutes.
# (re-)read gene annotation
file = bzfile(description = "../../../Data-AllMouse/CXB_GeneAnnotation.csv.bz2");
#file = "../Data-CXB/CXB_all_gene_annotation.csv"
annot = read.csv(file = file);
# Calculate enrichment information
bt = list();
for (set in 1:nSets)
{
expr2annot = match(colnames(expr[[set]]$data), annot$sequence);
LLID = annot$LocusLinkID[expr2annot];
table(is.na(LLID))
fin = !is.na(LLID);
finLLID = LLID[fin];
finLabels = labels[[set]][fin];
system.time ( {
bt[[set]] = GOenrichmentAnalysis(finLabels, finLLID, organism = "mouse",
nBest = 20, nBiggest = 0, includeOffspring = TRUE);
} );
}
# Save the results for future use
save(bt, file = "Female-LA-GOEnrichemnt-trimmedAndCleanedLabels.RData");
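To give a feel for what a Bonferroni-corrected enrichment p-value measures, the sketch below computes the over-representation p-value for a single module and a single GO term from the hypergeometric distribution (equivalent to a one-sided Fisher exact test) and multiplies it by the number of tests. All counts, the number of tests, and the function enrichmentP are hypothetical; this is not a description of the internals of GOenrichmentAnalysis.

```r
# Probability of seeing at least nOverlap term genes in a module of size
# nModule, when nTerm of the nBackground genes carry the term.
enrichmentP <- function(nOverlap, nModule, nTerm, nBackground) {
  phyper(nOverlap - 1, nTerm, nBackground - nTerm, nModule, lower.tail = FALSE)
}

# Hypothetical counts: 40 of 200 module genes carry a term that annotates
# 500 of 10000 background genes (10 would be expected by chance).
p <- enrichmentP(nOverlap = 40, nModule = 200, nTerm = 500, nBackground = 10000)

# The same p-value from a one-sided Fisher exact test on the 2 x 2 table
tab <- matrix(c(40, 160, 460, 9340), nrow = 2,
              dimnames = list(inTerm = c("yes", "no"), inModule = c("yes", "no")))
pFisher <- fisher.test(tab, alternative = "greater")$p.value

# Bonferroni correction for a hypothetical number of module-term tests
nTests <- 5000
pBonf <- min(1, p * nTests)
print(c(p = p, pFisher = pFisher, pBonf = pBonf))
```

The background here plays the role of the file of all network genes exported above; using a different background changes the p-values.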
Next we re-format the full information into a more manageable form and print it. To make the printout readable, please make the R console at least 100 characters wide. The table is also saved as a tab-delimited text file that can be opened using MS Excel or OpenOffice Calc.
# If necessary, load the results
load(file = "Female-LA-GOEnrichemnt-trimmedAndCleanedLabels.RData");
# Loop over tissues
for (set in 1:nSets)
{
res = bt[[set]]$bestPTerms[[4]]$enrichment;
# Write an excel sheet containing the full information
write.table(res, file = spaste("CxBOnly-Female-", shortLabels[[set]], "-GOenrichment.txt"),
row.names = FALSE, sep = "\t", quote = FALSE);
# Print a "narrower" version
res2 = res[, c(1, 2, 4, 6, 8, 12, 13)];
res2[, c(4, 5)] = signif(apply(res2[, c(4,5)], 2, as.numeric), 2)
rownames(res2) = NULL
names(res2) = c("Mod", "Size", "Rnk", "p.Bonf", "fracModSz", "ont", "termName");
terms = res2$termName;
sterms = substring(terms, 1, 60);
res2$termName = sterms;
options(width = 100);
modules = sort(as.numeric(unique(res2$Mod)));
for (m in modules)
{
printFlush(spaste("=========== Module:", m, "; module size: ", sum(labels[[set]]==m)))
print(res2[res2$Mod==m, -c(1,2)]);
}
}
The result is a long printout of Bonferroni-corrected enrichment p-values. For example, for liver module 6 we get
=========== Module:6; module size: 666
    Rnk  p.Bonf fracModSz ont termName
121   1 2.3e-31     0.460  MF catalytic activity
122   2 6.7e-31     0.210  CC mitochondrion
123   3 1.1e-28     0.150  MF oxidoreductase activity
124   4 6.3e-26     0.140  BP oxidation reduction
125   5 1.6e-20     0.500  CC cytoplasm
126   6 9.4e-20     0.340  CC cytoplasmic part
127   7 2.3e-10     0.099  BP lipid metabolic process
128   8 1.3e-09     0.480  BP metabolic process
129   9 1.4e-09     0.077  BP organic acid metabolic process
130  10 1.4e-09     0.077  BP carboxylic acid metabolic process
131  11 1.8e-09     0.037  MF oxidoreductase activity, acting on CH-OH group of donors
132  12 2.4e-08     0.035  MF electron carrier activity
133  13 7.7e-08     0.580  CC intracellular part
134  14 2.0e-07     0.032  MF oxidoreductase activity, acting on the CH-OH group of donors
135  15 4.4e-07     0.590  CC intracellular
136  16 8.8e-07     0.033  MF tetrapyrrole binding
137  17 2.0e-06     0.032  MF heme binding
138  18 2.4e-06     0.038  BP steroid metabolic process
139  19 2.6e-06     0.450  CC intracellular membrane-bounded organelle
140  20 2.9e-06     0.450  CC membrane-bounded organelle
Also of note is adipose module 6 (809 probes), which is very strongly enriched in the term mitochondrion:
=========== Module:6; module size: 809
    Rnk   p.Bonf fracModSz ont termName
121   1 2.5e-234     0.500  CC mitochondrion
122   2 9.2e-142     0.600  CC cytoplasmic part
123   3 2.7e-109     0.210  CC mitochondrial part
124   4 3.1e-100     0.190  CC mitochondrial envelope
125   5  2.8e-98     0.190  CC mitochondrial membrane
126   6  9.8e-98     0.170  CC mitochondrial inner membrane
127   7  1.9e-92     0.680  CC cytoplasm
128   8  6.9e-71     0.140  BP generation of precursor metabolites and energy
129   9  5.4e-67     0.650  CC intracellular membrane-bounded organelle
130  10  7.5e-67     0.650  CC membrane-bounded organelle
131  11  3.3e-56     0.180  BP oxidation reduction
132  12  6.4e-55     0.067  CC respiratory chain
133  13  5.7e-50     0.080  BP electron transport chain
134  14  7.7e-49     0.720  CC intracellular part
135  15  1.9e-48     0.170  MF oxidoreductase activity
136  16  2.1e-46     0.490  MF catalytic activity
137  17  3.4e-46     0.730  CC intracellular
138  18  7.4e-28     0.043  BP cellular respiration
139  19  1.6e-27     0.550  BP metabolic process
140  20  3.8e-20     0.063  MF cofactor binding
We now create text labels for the modules that reflect the name of the term with the highest enrichment. We only create a GO label if the corresponding Bonferroni-corrected p-value is below 10^-4.
# Create GO labels for modules
goLabels = list();
goModules = list();
goPvalue = list();
for (set in 1:nSets)
{
goAnn = bt[[set]]$bestPTerms[[4]]$enrichment
nModules = length(unique(goAnn$module));
best = tapply(c(1:nrow(goAnn)), goAnn$module, min);
goModules[[set]] = goAnn$module[best][-1];
goLabels[[set]] = spaste(goModules[[set]], ": ", goAnn$termName[best][-1]);
goPvalue[[set]] = goAnn$BonferoniP[best][-1];
goLabels[[set]] [goPvalue[[set]] > 1e-4] = goModules[[set]] [goPvalue[[set]] > 1e-4];
}
collectGarbage();
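The labeling logic above can be tried out on a small made-up enrichment table. The sketch below mimics only the three columns used by the code (module, termName, BonferoniP) and assumes, as the code above does, that rows are sorted by module and then by rank. All values are invented for illustration. Module 0, which in WGCNA's numeric labeling holds unassigned genes, is dropped by name here rather than by position.

```r
# Made-up enrichment table: two terms per module, best term listed first
goAnn = data.frame(
  module = rep(c(0, 1, 2), each = 2),
  termName = c("cytoplasm", "membrane", "mitochondrion", "cytoplasmic part",
               "nucleus", "intracellular"),
  BonferoniP = c(0.5, 1, 2e-31, 6e-20, 3e-3, 0.2),
  stringsAsFactors = FALSE);

# Row index of the first (best-ranked) term of each module
best = tapply(seq_len(nrow(goAnn)), goAnn$module, min);
# Drop module 0 (unassigned genes)
best = best[names(best) != "0"];

mods = goAnn$module[best];
pvals = goAnn$BonferoniP[best];
# Attach the term name only if the enrichment passes the 1e-4 threshold
goLabel = ifelse(pvals < 1e-4, paste0(mods, ": ", goAnn$termName[best]),
                 as.character(mods));
print(goLabel);
```

Module 1 receives a GO label because its best p-value is far below the threshold, while module 2 keeps its bare number.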
The labels are as follows:
> goLabels
[[1]]
 [1] "1: receptor activity"                      "10: proteasome complex"
 [3] "11"                                        "12: G-protein coupled receptor activity"
 [5] "13"                                        "14: intracellular part"
 [7] "15"                                        "16"
 [9] "17"                                        "18: nucleus"
[11] "19: extracellular matrix"                  "2: intracellular"
[13] "20: mitochondrion"                         "21"
[15] "22: ribosome"                              "23: nucleosome assembly"
[17] "24: cell adhesion"                         "25"
[19] "26: cellular amino acid metabolic process" "27: mitochondrial part"
[21] "29"                                        "3"
[23] "33: endoplasmic reticulum"                 "38"
[25] "4: G-protein coupled receptor activity"    "45"
[27] "48"                                        "5: intracellular"
[29] "50"                                        "58: cell cycle"
[31] "6: catalytic activity"                     "64"
[33] "65: serine-type peptidase activity"        "68"
[35] "7: immune response"                        "70"
[37] "71: MHC class I protein complex"           "73: nucleosome assembly"
[39] "76: hemoglobin complex"                    "8: receptor activity"
[41] "81"                                        "9"

[[2]]
 [1] "1: G-protein coupled receptor activity"  "10: mitochondrion"
 [3] "11: extracellular matrix"                "12"
 [5] "13"                                      "14: ribosome"
 [7] "15: cell cycle"                          "16: membrane fraction"
 [9] "2: nucleus"                              "3: G-protein coupled receptor activity"
[11] "4: lymphocyte activation"                "5: multicellular organismal development"
[13] "6: mitochondrion"                        "7: membrane"
[15] "8"                                       "9"
5 Modules related to physiological traits
Here we identify modules related to physiological traits. We use the robust biweight midcorrelation to measure the association between each module eigengene and each trait. We consider the association significant if the absolute correlation is above 0.35, corresponding to a p-value of roughly 10^-5. Taking into account the number of modules (42 in liver) and the number of traits (19), this translates roughly to a Bonferroni-corrected p-value threshold of 10^-2. The following rather long section of code generates a list containing information about the modules associated with each trait in each tissue.
# Set up lists to hold the information
bestModules = list();
traitGeneColors = list();
exprSize = checkSets(expr)
nSamples = exprSize$nSamples;
# Correlation thresholds
thresholds = c(0.35, 0.35);
nSelTraits = checkSets(selTraits)$nGenes;
# Loop over tissues
for (set in 1:nSets)
{
bestModules[[set]] = list();
traitGeneColors[[set]] = list();
modSizes = table(labels[[set]])[match(substring(names(MEs[[set]]$eigengenes), 3),
names(table(labels[[set]])))]
modColors = rep("grey", length(modSizes))
modNumbers = as.numeric(substring(names(MEs[[set]]$eigengenes), 3))
modColors[modNumbers!=0] = standardColors()[modNumbers[modNumbers!=0]]
# Loop over traits
for (t in 1:nSelTraits)
{
bestModules[[set]][[t]] = list();
bestModules[[set]][[t]]$trait = colnames(selTraits[[set]]$data)[t];
x = bicorAndPvalue(MEs[[set]]$eigengenes, selTraits[[set]]$data[, t])
cors = x$bicor;
pvals = x$p;
# Put the p-values into a single data frame
significance = data.frame(modSizes, cors, pvals, as.numeric(MEs[[set]]$varExplained),
modColors);
names(significance) = c("nGenes", spaste("r.", shortLabels[set]),
spaste("p.", shortLabels[set]), spaste("PVE.", shortLabels[set]), "Color");
rownames(significance) = names(MEs[[set]]$eigengenes);
order = order(significance[, 3]);
significant = significance[, 3] < 0.001;
bestModules[[set]][[t]]$significance = significance[order, ];
printSignif = significance;
printSignif[, c(2:4)] = signif(significance[, c(2:4)], 3);
bestModules[[set]][[t]]$printSignif = printSignif[order, ];
bestModules[[set]][[t]]$bestSignif = printSignif[order, ][
(abs(printSignif[order, 2]) > thresholds[set]), ]
printFlush("\n==============================================================================\n");
printFlush("Significance for", bestModules[[set]][[t]]$trait, ":");
options(width = 100);
print(bestModules[[set]][[t]]$bestSignif);
moduleList1 = as.numeric(substring(rownames(bestModules[[set]][[t]]$bestSignif), 3))
bestModules[[set]][[t]]$bestModules = moduleList1;
bestModules[[set]][[t]]$nGenesInBestModules = sum(bestModules[[set]][[t]]$bestSignif$nGenes);
}
}
# Save the results for future use
save(bestModules, file = "Female-LA-bestModules.RData");
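The threshold arithmetic quoted at the start of this section can be checked with a few lines of base R. The sketch below assumes the usual Student t approximation for the p-value of a correlation; the helper corPvalue is hypothetical, and the sample size of 150 is a made-up value for illustration, since the number of samples is not stated in this section.

```r
# Hypothetical helper: two-sided Student t p-value for a correlation r
# observed in n samples
corPvalue = function(r, n)
{
  tStat = r * sqrt((n - 2) / (1 - r^2));
  2 * pt(abs(tStat), df = n - 2, lower.tail = FALSE);
}

nModules = 42;    # liver modules, from the text
nTraits = 19;     # traits, from the text
pPerTest = 1e-5;  # approximate p-value quoted for |r| = 0.35

# Bonferroni correction over all module-trait pairs: about 0.008,
# i.e. roughly 10^-2
pBonferroni = min(1, pPerTest * nModules * nTraits);
print(pBonferroni);

# p-value for |r| = 0.35 at an assumed (hypothetical) sample size
nHypothetical = 150;
print(signif(corPvalue(0.35, nHypothetical), 2));
```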
For use in subsequent analysis, we also form a separate list of modules associated with each trait.
keepModules = list();
nKeep = rep(0, nSets);
traitGeneColors = list();
nSamples = exprSize$nSamples;
nGenes = exprSize$nGenes;
basePVal = 0.01;
bestLabels = list();
bestColors = list();
for (set in 1:nSets)
{
z = qnorm(1-basePVal)/sqrt(nSamples[set]-3);
baseCor = tanh(z);
cor = bicor(expr[[set]]$data, selTraits[[set]]$data, use = "p");
cor[abs(cor) < baseCor] = 0;
traitGeneColors[[set]] = numbers2colors(cor, signed = TRUE);
colnames(traitGeneColors[[set]]) = colnames(selTraits[[set]]$data);
keepModules[[set]] = vector();
bestLabels[[set]] = matrix(0, nGenes, nSelTraits);
for (t in 1:nSelTraits)
{
keepModules[[set]] = c(keepModules[[set]], bestModules[[set]][[t]]$bestModules);
keep = labels[[set]] %in% bestModules[[set]][[t]]$bestModules
bestLabels[[set]][keep, t] = labels[[set]][ keep ];
}
keepModules[[set]] = sort(unique(keepModules[[set]]));
nKeep[set] = length(keepModules[[set]]);
bestColors[[set]] = labels2colors(bestLabels[[set]]);
}
hdlInd = match("e_hdl_mgdl", colnames(selTraits[[1]]$data));
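The cutoff baseCor in the code above is obtained by inverting the Fisher z transformation: under the null hypothesis, atanh(r) is approximately normal with standard deviation 1/sqrt(n - 3). The following sketch wraps that calculation in a hypothetical helper, corThreshold, and evaluates it for a few made-up sample sizes to show how the cutoff shrinks as n grows.

```r
# Hypothetical helper: smallest correlation that reaches one-sided p-value p
# in n samples under the Fisher z approximation
corThreshold = function(p, n)
{
  z = qnorm(1 - p) / sqrt(n - 3);
  tanh(z);
}

nValues = c(50, 100, 150, 200);   # made-up sample sizes
thr = corThreshold(0.01, nValues);
print(data.frame(nSamples = nValues, baseCor = signif(thr, 3)));
```

Note that qnorm(1 - p) is a one-sided quantile. Because the code above applies the cutoff to abs(cor), basePVal = 0.01 corresponds to a two-sided p-value of 0.02.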
Which modules are associated with HDL? In liver we find the following:
> bestModules[[1]][[hdlInd]]$bestSignif
     nGenes r.Liver  p.Liver PVE.Liver       Color
ME6     666   0.564 3.14e-13     0.368         red
ME64     40   0.470 4.02e-09     0.484    skyblue2
ME11    379  -0.455 1.41e-08     0.333 greenyellow
ME20    161   0.419 2.28e-07     0.378   royalblue
ME10    409  -0.419 2.32e-07     0.338      purple
ME16    231   0.398 1.04e-06     0.321   lightcyan
ME21    145  -0.396 1.14e-06     0.368     darkred
ME18    201  -0.385 2.39e-06     0.350  lightgreen
In adipose, we find only one module:
> bestModules[[2]][[hdlInd]]$bestSignif
    nGenes r.Adipose p.Adipose PVE.Adipose Color
ME7    791     0.463  4.44e-10       0.424 black
5.a Module-trait relationships for all modules that relate significantly to a trait
Here we produce color-coded tables of module significance (defined as the robust correlation of the module eigengene and the trait) between modules and traits. We restrict the modules to those that relate significantly to at least one trait. We first calculate matrices holding the module significances and the corresponding p-values. We use the robust biweight midcorrelation to quantify module significance.
nTraits = dim(selTraits[[1]]$data)[2];
ordTraits = consensusOrderMEs(selTraits, greyLast = FALSE);
TraitSignif = vector(mode="list", length = nSets);
TraitCor = vector(mode="list", length = nSets);
TraitLabels = colnames(ordTraits[[1]]$data);
newTraitLabels = renameTable[ match(TraitLabels, renameTable[, 1]), 2];
MELabels = list();
for (set in 1:nSets)
{
MELabels[[set]] = colnames(ordMEs[[set]]$eigengenes);
tmp = bicorAndPvalue(ordMEs[[set]]$eigengenes, ordTraits[[set]]$data)
TraitSignif[[set]] = tmp$p
TraitCor[[set]] = tmp$bicor
}
minp = 1; maxp = 0;
for (set in 1:nSets)
{
minp = min(minp, TraitSignif[[set]]);
maxp = max(maxp, TraitSignif[[set]]);
}
if (minp
[Figure 7 image: heatmap titled "Female Liver module-trait significance", with a color scale running from -0.5 to 0.5. Trait labels: fluid, FFA, trigly, tot.chol, unest.chol, muscle, BMD, length, efat, glucose, insulin, HDL, leptin, fat, fat.frac, rfat, vfat, weight, sfat. Module labels: ME16; ME20: mitochondrion; ME64; ME6: catalytic activity; ME25; ME27: mitochondrial part; ME73: nucleosome assembly; ME19: extracellular matrix; ME7: immune response; ME58: cell cycle; ME70; ME21; ME26: cellular amino acid metabolic process; ME45; ME13; ME18: nucleus; ME10: proteasome complex; ME11; ME9; ME1: receptor activity; ME17. Each cell shows a correlation and its p-value.]
Figure 7: Module significance of selected liver modules for traits measured for this cross. Numbers in the table indicate the robust correlations and the corresponding p-values. The table is colored by correlation, with red representing positive correlation and green negative correlation.
[Figure 8 image: heatmap titled "Female Adipose module-trait significance", with a color scale running from -0.5 to 0.5. Trait labels: fluid, FFA, trigly, tot.chol, unest.chol, muscle, BMD, length, efat, glucose, insulin, HDL, leptin, fat, fat.frac, rfat, vfat, weight, sfat. Module labels: ME7: membrane; ME9; ME11: extracellular matrix; ME4: lymphocyte activation; ME8; ME13; ME16: membrane fraction. Each cell shows a correlation and its p-value.]
Figure 8: Module significance of selected adipose modules for traits measured for this cross. Numbers in the table indicate the robust correlations and the corresponding p-values. The table is colored by correlation, with red representing positive correlation and green negative correlation.
5.b Network plots of all module eigengenes and traits
We now produce plots of networks composed of module eigengenes and traits in each tissue. The plots are too big to display comfortably on screen, but can be viewed using a PDF viewer, which will usually provide a zoom function. The eigengene network plot contains two panels: one with a dendrogram of eigengenes and traits, and one with the corresponding color-coded heatmap and correlation/p-value table.
widths = c(20, 13)
for (set in 1:nSets)
{
mets = list(a = list(data = cbind(MEs[[set]]$eigengenes, selTraits[[set]]$data)));
colnames(mets$a$data) = c(colnames(MEs[[set]]$eigengenes), renameTable[, 2]);
omets = consensusOrderMEs(mets);
pdf(file = spaste("Plots/Female-", shortLabels[set], "-ME-selTraitNetworkHeatmaps.pdf"),
width = widths[set], height = 2*widths[set]);
plotEigengeneNetworks(omets, shortLabels[set], marDendro = c(0,2,2,2), zlimPreservation = c(0,1),
marHeatmap = c(5,5,2,2), setMargins = TRUE,
plotAdjacency = FALSE,
printAdjacency = TRUE, cex.adjacency = 0.5)
dev.off();
}
6 Output of module membership, eigengenes, and eigengene correlations
In this section we output much of the network information into text and CSV files that can be viewed in MS Excel, OpenOffice Calc, and similar spreadsheet software. We begin with lists of samples used in network analysis and “expressions” of module eigengenes.
# Samples that are used for network analysis
for (set in 1:nSets)
{
samples = rownames(expr[[set]]$data);
write.table(data.frame(samples), col.names = FALSE, row.names = FALSE, quote = FALSE,
file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-networkSamples.txt"));
}
# Module eigengenes
for (set in 1:nSets)
{
write.table(as.data.frame(cbind(Mice_id = rownames(expr[[set]]$data), MEs[[set]]$eigengenes)),
col.names = TRUE, row.names = FALSE, sep = ",", quote = FALSE,
file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-moduleEigengenes.csv"))
}
Next we output a table of module-trait associations.
# Module-trait relationships
moduleTraitRels = list();
for (set in 1:nSets)
{
moduleTraitRels[[set]] = bicorAndPvalue(MEs[[set]]$eigengenes, selTraits[[set]]$data)
outMat = rbind(moduleTraitRels[[set]]$bicor, moduleTraitRels[[set]]$p);
dim(outMat) = c(ncol(MEs[[set]]$eigengenes), 2*nSelTraits)
nameMat = matrix(cbind(spaste("bicor.", colnames(selTraits[[set]]$data)),
spaste("p.", colnames(selTraits[[set]]$data))),
2, nSelTraits, byrow = TRUE)
colnames(outMat) = as.vector(nameMat);
write.table(as.data.frame(cbind(Eigengene = colnames(MEs[[set]]$eigengenes), outMat)),
col.names = TRUE, row.names = FALSE, sep = ",", quote = FALSE,
file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-moduleTraitBicor.csv"))
}
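The code above relies on a compact reshaping trick to interleave correlation and p-value columns: the two matrices are stacked with rbind, and the column-major storage is then reinterpreted with dim<-. The following base-R sketch reproduces the trick on two tiny made-up matrices so that the resulting column order can be inspected. The values and the trait names are invented.

```r
# Made-up 3 x 2 matrices of correlations and p-values
cors = matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), nrow = 3,
              dimnames = list(paste0("ME", 1:3), c("HDL", "weight")));
pvals = matrix(c(0.91, 0.92, 0.93, 0.94, 0.95, 0.96), nrow = 3,
               dimnames = dimnames(cors));

# Stack the matrices, then reinterpret the column-major storage as 3 x 4.
# Each stacked column (3 correlations followed by 3 p-values) splits into
# two adjacent columns of the reshaped matrix.
outMat = rbind(cors, pvals);
dim(outMat) = c(nrow(cors), 2 * ncol(cors));

# Build the matching interleaved column names
nameMat = matrix(cbind(paste0("bicor.", colnames(cors)),
                       paste0("p.", colnames(cors))),
                 2, ncol(cors), byrow = TRUE);
colnames(outMat) = as.vector(nameMat);
rownames(outMat) = rownames(cors);
print(outMat);
```

The resulting columns are ordered bicor.HDL, p.HDL, bicor.weight, p.weight, so each trait's correlation sits next to its p-value in the exported file.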
Lastly, we output a large table of fuzzy module membership and gene significance for all traits. The module membership MM of a gene (probe), also known as module eigengene-based connectivity kME, is given by the correlation of the gene expression profile and the module eigengene. Similarly, gene significance for a trait is given by the correlation of the gene expression profile with the numeric trait.
file = bzfile(description = "../../../Data-AllMouse/CXB_GeneAnnotation.csv.bz2");
#file = "../Data-CXB/CXB_all_gene_annotation.csv"
annot = read.csv(file = file);
for (set in 1:nSets)
{
# Calculate module membership, a.k.a kME
KMEall = bicorAndPvalue(expr[[set]]$data, MEs[[set]]$eigengenes);
KMEmod = rep(NA, nGenes)
KMEmodP = rep(NA, nGenes)
modLevels = sort(unique(labels[[set]]));
nMods = length(modLevels);
for (mod in 1:nMods)
{
inMod = labels[[set]]==modLevels[mod];
# This assumes MEs[[set]]$eigengenes are sorted the same way as modLevels
KMEmod[inMod] = KMEall$bicor[inMod, mod];
KMEmodP[inMod] = KMEall$p[inMod, mod];
}
kmeMat = rbind(KMEall$bicor, KMEall$p);
dim(kmeMat) = c(nGenes, 2*nMods);
nameMat = matrix(cbind(spaste("k", colnames(MEs[[set]]$eigengenes)),
spaste("p.k", colnames(MEs[[set]]$eigengenes))),
2, nMods, byrow = TRUE)
colnames(kmeMat) = as.vector(nameMat);
# Connect probe names to gene names
genes = colnames(expr[[set]]$data)
expr2annot = match(genes, annot$sequence);
annotInfo = annot[expr2annot, c(4,5,6,7,8,9)];
# Calculate gene significance
GS = bicorAndPvalue(expr[[set]]$data, selTraits[[set]]$data);
GSmat = rbind(GS$bicor, GS$p);
dim(GSmat) = c(nGenes, 2*nSelTraits);
nameMat = matrix(cbind(spaste("GS.", colnames(selTraits[[set]]$data)),
spaste("pGS", colnames(selTraits[[set]]$data))),
2, nSelTraits, byrow = TRUE)
colnames(GSmat) = as.vector(nameMat);
# Put it all together
info = cbind(annotInfo,
moduleLabel = labels[[set]],
moduleColor = labels2colors(labels[[set]]),
KME.labelModule = KMEmod,
pKME.labelModule = KMEmodP,
GSmat,
kmeMat);
# Save the big table into a text csv file
write.table(info,
col.names = TRUE, row.names = FALSE, sep = ",", quote = FALSE,
file = spaste("DataForOthers/CxBOnly-Female-", shortLabels[set], "-moduleMembership.csv"))
collectGarbage();
}
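To make the definition of module membership concrete, the following self-contained sketch simulates a single module and computes kME for each gene. It is a simplified illustration under stated assumptions: the data are simulated, the eigengene is taken as the first singular vector of the scaled expression matrix, and the ordinary Pearson correlation stands in for the biweight midcorrelation used in the analysis.

```r
set.seed(1);
nSamples = 100; nGenes = 20;

# Simulate one module: each gene follows a shared latent signal plus noise,
# with genes ordered from weak to strong dependence on the signal
signal = rnorm(nSamples);
loadings = seq(0.5, 2, length.out = nGenes);
expr = outer(signal, loadings) +
       matrix(rnorm(nSamples * nGenes), nSamples, nGenes);
colnames(expr) = paste0("gene", 1:nGenes);

# Module eigengene: first left singular vector of the scaled expression
# matrix (samples in rows, genes in columns)
ME = svd(scale(expr))$u[, 1];
# The sign of a singular vector is arbitrary; align it with average expression
if (cor(ME, rowMeans(scale(expr))) < 0) ME = -ME;

# Module membership: correlation of each gene with the module eigengene
kME = as.vector(cor(expr, ME));
names(kME) = colnames(expr);
print(round(head(sort(kME, decreasing = TRUE), 5), 2));
```

Genes that follow the shared signal most closely obtain the highest kME and play the role of intramodular hub genes.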
7 Overlap of liver and adipose modules
We now produce a color-coded overlap table of liver and adipose modules.
# Call the overlapTable function to calculate the overlaps
overlap = overlapTable(labels[[1]], labels[[2]]);
# Prepare axis labels for the table plot
modSizes = lapply(labels, table);
xLabels = spaste("A.", sort(unique(labels[[2]])), " (", modSizes[[2]], ")");
yLabels = spaste("L.", sort(unique(labels[[1]])), " (", modSizes[[1]], ")");
# Content of the table
textMat = spaste(overlap$countTable, "|", signif(overlap$pTable, 1));
mat = overlap$pTable;
mat[mat
[Figure 9 image: heatmap titled "Overlap of adipose and liver modules", with a color scale running from 0 to 60. The cell contents are reproduced below as overlap count|p-value.]
Columns (adipose modules, with sizes): A.0 (6075), A.1 (5400), A.2 (3031), A.3 (1804), A.4 (1559), A.5 (1151), A.6 (809), A.7 (791), A.8 (442), A.9 (519), A.10 (490), A.11 (335), A.12 (358), A.13 (311), A.14 (246), A.15 (239), A.16 (63)
L.0 (9910): 3439|1e-157 1679|1 1187|1 613|1 588|1 651|1e-24 323|0.9 369|0.004 97|1 211|0.7 217|0.2 152|0.1 86|1 107|1 65|1 107|0.2 19|1
L.1 (2821): 287|1 2018|0 26|1 64|1 45|1 56|1 34|1 35|1 119|4e-18 7|1 8|1 7|1 65|4e-04 34|0.7 7|1 3|1 6|0.8
L.2 (1262): 156|1 47|1 729|0 22|1 41|1 28|1 31|1 20|1 1|1 94|7e-26 59|5e-09 16|0.7 1|1 2|1 7|1 8|0.9 0|1
L.3 (1283): 159|1 107|1 21|1 781|0 46|1 33|1 18|1 20|1 37|0.006 6|1 5|1 3|1 11|1 20|0.2 9|0.9 4|1 3|0.7
L.4 (909): 306|3e-08 328|2e-20 122|0.3 29|1 16|1 27|1 5|1 21|1 13|0.9 6|1 10|1 4|1 10|0.9 2|1 5|1 4|1 1|0.9
L.5 (777): 144|1 30|1 375|2e-133 18|1 42|0.9 22|1 35|0.06 21|0.9 2|1 30|0.002 33|9e-05 8|0.9 1|1 2|1 6|0.8 8|0.5 0|1
L.6 (666): 139|1 88|1 85|0.5 21|1 26|1 34|0.4 77|6e-21 68|3e-16 8|0.9 33|1e-05 49|2e-14 17|0.01 3|1 4|1 5|0.8 8|0.4 1|0.8
L.7 (637): 146|1 35|1 41|1 8|1 258|1e-138 27|0.8 10|1 43|1e-05 1|1 19|0.1 3|1 21|3e-04 2|1 1|1 6|0.7 15|0.002 1|0.8
L.8 (522): 136|0.4 242|6e-33 33|1 15|1 13|1 18|1 8|1 3|1 11|0.4 1|1 8|0.9 3|1 18|0.001 6|0.7 2|1 5|0.6 0|1
L.9 (435): 62|1 145|3e-07 9|1 26|0.9 14|1 11|1 6|1 10|0.9 33|1e-11 0|1 1|1 0|1 23|3e-07 89|4e-81 2|0.9 2|0.9 2|0.3
L.10 (409): 116|0.1 29|1 61|0.1 18|1 46|3e-04 32|0.006 39|1e-08 15|0.4 4|0.9 6|0.9 14|0.05 7|0.4 3|0.9 1|1 10|0.01 6|0.2 2|0.3
L.11 (379): 103|0.3 74|0.9 26|1 8|1 46|5e-05 20|0.4 28|1e-04 22|0.009 7|0.6 10|0.3 15|0.01 6|0.5 2|1 5|0.6 5|0.4 2|0.9 0|1
L.12 (343): 81|0.8 109|8e-05 51|0.1 25|0.6 18|0.9 13|0.9 4|1 6|1 9|0.2 5|0.9 6|0.7 5|0.5 3|0.9 3|0.8 1|1 2|0.9 2|0.2
L.13 (323): 99|0.03 29|1 7|1 43|3e-04 107|1e-46 14|0.7 2|1 2|1 8|0.3 2|1 0|1 1|1 3|0.9 3|0.8 1|1 2|0.8 0|1
L.14 (268): 66|0.7 16|1 74|7e-11 6|1 32|9e-04 9|0.9 11|0.3 8|0.7 0|1 18|3e-05 15|5e-04 5|0.3 1|1 2|0.9 4|0.3 1|0.9 0|1
L.15 (257): 22|1 41|1 2|1 7|1 6|1 7|1 1|1 2|1 48|9e-34 1|1 0|1 1|1 107|3e-129 8|0.02 2|0.8 0|1 2|0.1
L.16 (231): 59|0.5 19|1 19|1 3|1 35|4e-06 30|1e-06 12|0.1 21|4e-05 0|1 7|0.2 11|0.009 7|0.05 1|1 0|1 1|0.9 5|0.09 1|0.5
L.17 (190): 19|1 90|9e-14 4|1 33|7e-06 9|0.9 4|1 5|0.8 1|1 14|1e-05 0|1 0|1 1|0.9 4|0.3 4|0.2 2|0.6 0|1 0|1
L.18 (201): 74|3e-04 17|1 6|1 7|1 56|7e-21 10|0.5 6|0.7 4|0.9 5|0.3 2|0.9 1|1 2|0.8 2|0.8 1|0.9 5|0.06 1|0.9 2|0.1
L.19 (193): 32|1 9|1 15|1 1|1 24|0.002 24|2e-05 2|1 27|3e-10 0|1 17|1e-06 3|0.8 31|1e-23 0|1 0|1 2|0.6 6|0.01 0|1
L.20 (161): 40|0.6 24|1 18|0.8 0|1 10|0.6 7|0.7 22|3e-08 12|0.008 3|0.6 9|0.009 7|0.05 3|0.4 3|0.4 2|0.6 0|1 1|0.8 0|1
L.21 (145): 49|0.02 11|1 19|0.5 3|1 13|0.2 13|0.03 12|0.004 6|0.4 0|1 2|0.8 4|0.4 1|0.9 4|0.2 6|0.01 0|1 2|0.4 0|1
L.22 (135): 9|1 22|1 4|1 6|1 1|1 1|1 0|1 1|1 0|1 0|1 0|1 0|1 0|1 0|1 91|1e-153 0|1 0|1
L.23 (128): 29|0.8 9|1 34|2e-05 1|1 4|1 6|0.6 9|0.03 7|0.1 0|1 15|1e-07 6|0.05 3|0.3 0|1 0|1 0|1 5|0.01 0|1
L.24 (126): 26|0.9 3|1 4|1 5|1 9|0.5 21|7e-07 1|1 16|5e-06 1|0.9 2|0.8 1|0.9 13|3e-08 0|1 0|1 1|0.7 2|0.4 21|7e-33
L.25 (92): 33|0.02 17|0.9 8|0.9 2|1 2|1 4|0.7 10|0.001 7|0.03 3|0.2 2|0.6 2|0.6 0|1 0|1 1|0.7 0|1 1|0.6 0|1
L.26 (87): 24|0.4 22|0.3 6|1 4|0.9 8|0.2 3|0.8 10|8e-04 2|0.8 2|0.5 2|0.6 2|0.5 0|1 1|0.7 1|0.7 0|1 0|1 0|1
L.27 (80): 9|1 8|1 6|1 2|1 0|1 1|1 40|3e-37 5|0.1 2|0.4 2|0.5 4|0.08 0|1 0|1 0|1 1|0.6 0|1 0|1
L.29 (84): 36|5e-04 13|1 2|1 5|0.8 12|0.009 4|0.6 5|0.2 1|0.9 3|0.2 0|1 0|1 0|1 1|0.7 2|0.3 0|1 0|1 0|1
L.33 (63): 19|0.2 7|1 8|0.6 2|1 5|0.4 4|0.4 2|0.6 1|0.9 0|1 3|0.2 1|0.7 2|0.2 1|0.6 1|0.6 3|0.03 4|0.004 0|1
L.38 (58): 21|0.05 17|0.2 6|0.8 2|0.9 0|1 3|0.5 5|0.05 1|0.9 1|0.7 0|1 0|1 1|0.6 1|0.6 0|1 0|1 0|1 0|1
L.45 (50): 21|0.009 4|1 7|0.5 2|0.9 2|0.9 7|0.01 2|0.5 0|1 0|1 1|0.7 0|1 1|0.5 0|1 1|0.5 1|0.4 1|0.4 0|1
L.48 (45): 11|0.6 13|0.2 0|1 2|0.9 2|0.8 2|0.7 10|2e-06 0|1 3|0.05 0|1 1|0.6 0|1 0|1 1|0.4 0|1 0|1 0|1
L.50 (51): 17|0.1 4|1 4|0.9 4|0.6 1|1 2|0.7 2|0.5 2|0.5 0|1 2|0.3 1|0.7 11|1e-10 0|1 0|1 0|1 1|0.4 0|1
L.58 (42): 4|1 4|1 1|1 1|1 4|0.3 1|0.9 0|1 2|0.4 0|1 0|1 0|1 0|1 0|1 0|1 1|0.4 24|1e-37 0|1
L.64 (40): 8|0.8 24|5e-07 1|1 1|1 2|0.8 1|0.9 0|1 1|0.7 1|0.5 1|0.6 0|1 0|1 0|1 0|1 0|1 0|1 0|1
L.65 (38): 8|0.8 20|6e-05 2|1 0|1 3|0.5 0|1 0|1 2|0.4 2|0.2 1|0.6 0|1 0|1 0|1 0|1 0|1 0|1 0|1
L.68 (32): 8|0.6 14|0.007 1|1 0|1 3|0.4 0|1 0|1 1|0.7 2|0.1 0|1 1|0.5 0|1 1|0.4 1|0.3 0|1 0|1 0|1
L.70 (33): 9|0.5 2|1 1|1 0|1 1|0.9 0|1 18|2e-18 1|0.7 0|1 1|0.5 0|1 0|1 0|1 0|1 0|1 0|1 0|1
L.71 (33): 20|2e-05 0|1 3|0.8 0|1 6|0.02 0|1 1|0.7 2|0.3 0|1 0|1 0|1 1|0.4 0|1 0|1 0|1 0|1 0|1
L.73 (31): 5|0.9 3|1 3|0.8 0|1 0|1 0|1 3|0.09 2|0.3 0|1 1|0.5 2|0.1 1|0.4 0|1 1|0.3 1|0.3 9|2e-11 0|1
L.76 (28): 20|5e-07 4|0.9 0|1 0|1 2|0.6 0|1 0|1 1|0.6 0|1 0|1 0|1 1|0.3 0|1 0|1 0|1 0|1 0|1
L.81 (25): 4|0.9 3|0.9 0|1 14|4e-10 1|0.8 1|0.7 0|1 0|1 2|0.08 0|1 0|1 0|1 0|1 0|1 0|1 0|1 0|1
Figure 9: Overlap of liver (y-axis) and adipose (x-axis) modules. Each row corresponds to a liver module, indicated on the left by name, color, and number of probes in the module. Conversely, each column corresponds to an adipose module, indicated at the bottom. Numbers in the table indicate the number of probes in the overlap and the corresponding Fisher exact p-value. The table is colored according to −log10 p, with the color scale indicated on the right. The large modules 1–4 and “module” 0 overlap very strongly between the tissues. Some other, smaller modules also show strong overlaps, but HDL-related modules overlap more weakly with modules in the opposite tissue.
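The counts and Fisher exact p-values reported in the overlap table can be illustrated with a small base-R sketch. The helper overlapFisher below is hypothetical and is not the implementation of WGCNA's overlapTable; it assumes a one-sided test for enrichment, and the two label vectors are made up.

```r
# Hypothetical helper: overlap counts and one-sided Fisher exact p-values
# for every pair of modules from two labelings of the same probes
overlapFisher = function(labels1, labels2)
{
  lev1 = sort(unique(labels1));
  lev2 = sort(unique(labels2));
  counts = matrix(NA, length(lev1), length(lev2),
                  dimnames = list(as.character(lev1), as.character(lev2)));
  pvals = counts;
  for (i in seq_along(lev1)) for (j in seq_along(lev2))
  {
    in1 = factor(labels1 == lev1[i], levels = c(FALSE, TRUE));
    in2 = factor(labels2 == lev2[j], levels = c(FALSE, TRUE));
    tab = table(in1, in2);
    counts[i, j] = tab["TRUE", "TRUE"];
    pvals[i, j] = fisher.test(tab, alternative = "greater")$p.value;
  }
  list(countTable = counts, pTable = pvals);
}

# Made-up labels for 100 probes; the second labeling differs for up to 20
set.seed(2);
liverLabels = rep(c(0, 1, 2), times = c(60, 25, 15));
adiposeLabels = liverLabels;
swap = sample(100, 20);
adiposeLabels[swap] = sample(c(0, 1, 2), 20, replace = TRUE);

ov = overlapFisher(liverLabels, adiposeLabels);
print(ov$countTable);
print(signif(ov$pTable, 1));
```

Corresponding modules show large counts and small p-values on the diagonal, while unrelated pairs have p-values close to 1, which is the pattern seen for the large modules in the overlap table.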
8 Gene significance and module membership in HDL-related modules are correlated
Here we show that in HDL-related modules, highly connected genes (referred to as intramodular hub genes) also tend to have high gene significance for HDL. We use the function verboseScatterplot to plot annotated scatterplots of gene significance (GS) vs. module membership (also known as eigengene-based connectivity kME).
hdlInd = match("e_hdl_mgdl", colnames(selTraits[[1]]$data));
sizeGrWindow(7, 8);
#pdf(file = "Plots/Female-LA-HubgeneSignifForHDL.pdf", width = 7, height = 8);
par(mfrow = c(3,3));
par(mar = c(3.5, 3.5, 4, 0.5));
par(mgp = c(1.8, 0.6, 0));
for (set in 1:nSets)
{
# Select only modules related to HDL
moduleList1 = bestModules[[set]][[hdlInd]]$bestModules;
for (mod in moduleList1) # For each module...
{
# Find the module in the eigengenes
modGeneInd = (labels[[set]] == mod);
meInd = match(paste("ME", mod, sep=""), names(MEs[[set]]$eigengenes));
# Calculate GS, KME, and module eigengenes significance (MES)
nModGenes = sum(modGeneInd);
KME = bicor(expr[[set]]$data[, modGeneInd], MEs[[set]]$eigengenes[, meInd], use = 'p');
GS = bicor(expr[[set]]$data[, modGeneInd], selTraits[[set]]$data[, hdlInd], use = 'p');
MS = bicor(MEs[[set]]$eigengenes[, meInd], selTraits[[set]]$data[, hdlInd], use = 'p');
# Plot GS vs. kME
verboseScatterplot(KME, GS,
main = paste(shortLabels[set], mod, standardColors()[mod], "\nMES =",
signif(MS, 2), "\n"),
xlab = paste("kME in", shortLabels[set]),
ylab = paste("GS.HDL in", shortLabels[set]), abline = TRUE, cex.lab = 1.2,
cex.main = 1.2, cex.axis = 1.2);
}
}
# If plotting into a file, close it.
dev.off();
The result is shown in Figure 10. We observe that GS.HDL and kME are strongly correlated; that is, hub genes in HDL-related modules also tend to be strongly related to HDL.
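One way to see why GS and kME should be correlated in a trait-related module is a simple simulation. If the trait depends on the genes only through the module eigengene, then the gene significance of each gene is approximately its kME multiplied by the module significance (MS). The sketch below checks this on simulated data; it uses the Pearson correlation in place of the biweight midcorrelation, and all data, sizes, and variable names are made up.

```r
set.seed(3);
nSamples = 500; nGenes = 60;

signal = rnorm(nSamples);                      # latent module signal
loadings = seq(0.2, 2, length.out = nGenes);   # weak to strong module members
expr = outer(signal, loadings) +
       matrix(rnorm(nSamples * nGenes), nSamples, nGenes);
trait = signal + rnorm(nSamples);              # trait driven by the module

# Eigengene as the first singular vector, sign-aligned with average expression
ME = svd(scale(expr))$u[, 1];
if (cor(ME, rowMeans(scale(expr))) < 0) ME = -ME;

kME = as.vector(cor(expr, ME));     # module membership
GS = as.vector(cor(expr, trait));   # gene significance for the trait
MS = cor(ME, trait);                # module significance

print(signif(c(MS = MS, cor.GS.kME = cor(kME, GS)), 2));
# Largest deviation of GS from the approximation kME * MS
print(signif(max(abs(GS - kME * MS)), 2));
```

Under this simplified model the slope of GS against kME is roughly MS, so a module with high module significance will show a clear GS-kME trend.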
[Figure 10 image: scatterplots of gene significance for HDL (y-axis, "GS.HDL in Liver") against module membership (x-axis, "kME in Liver"), one panel per HDL-related module. Panels: Liver 6 red, MES = 0.56, cor=0.62, p=5.6e-72; Liver 64 skyblue2, MES = 0.47, cor=0.6, p=4.3e-05; Liver 11 greenyellow, MES = -0.46, cor=-0.53, p=7.8e-29; Liver 20 royalblue, MES = 0.42, cor=0.36, p=2.7e-06.]