analysis of the yeast transcriptional regulatory network

Post on 06-Feb-2016


Analysis of the yeast transcriptional regulatory network

Transcription Factor (TF)

A TF is a protein that binds to DNA sequences and regulates the transcription of the corresponding genes.

Usually, the binding site of a TF is a short segment of a specific promoter sequence.

The activity of a TF is regulated according to the cell’s need, largely through signal transduction. It may not be directly observed, but can be reflected by the genes it regulates.

Expression regulatory network

Identifying the expression regulatory network is a crucial step towards understanding the cellular regulation system.

Inferring network from microarray data alone

Inferring network from microarray data and TF-TG (Target Gene) Information

Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N. Revealing modular organization in the yeast transcriptional network. Nat Genet. 2002 Aug;31(4):370-7.

Segal E et al. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat Genet. 2003 Jun;34(2):166-76.

TF Activity

Using the TF-TG relation benefits regulatory network identification.

TF expression level is not a good measure of the TF activity. The activated protein level of a TF, rather than its expression level, is what controls gene expression.

The activity of a transcription factor is regulated according to the cell’s need, largely through signal transduction. It may not be directly observed, but can be reflected by the genes it regulates.

Identify TF Activity by NCA

Network Component Analysis

Liao JC et al. Network component analysis: reconstruction of regulatory signals in biological systems. Proc Natl Acad Sci U S A. 2003 Dec 23;100(26):15522-7.

NCA compared with PCA, ICA

NCA Model

Without further constraints, [E] cannot be uniquely decomposed to [A] and [P].

Criteria for Unique NCA [E] = [A][P]

1. The connectivity matrix [A] must have full-column rank.

2. When a node in the regulatory layer is removed along with all of the output nodes connected to it, the resulting network must be characterized by a connectivity matrix that still has full-column rank. This condition implies that each column of [A] must have at least L-1 zeros.

3. [P] must have full row rank. In other words, each regulatory signal cannot be expressed as a linear combination of the other regulatory signals.
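The three criteria above can be checked numerically for a candidate pair of matrices. The following is a minimal sketch, assuming NumPy and treating exact zeros in [A] as absent connections; the function names `has_full_column_rank` and `check_nca_criteria` are hypothetical and not taken from Liao et al.

```python
import numpy as np

def has_full_column_rank(M):
    """True if the columns of M are linearly independent."""
    rows, cols = M.shape
    if cols == 0:
        return True
    if rows < cols:
        return False  # fewer rows than columns can never give full column rank
    return bool(np.linalg.matrix_rank(M) == cols)

def check_nca_criteria(A, P):
    """Return (criterion 1, criterion 2, criterion 3) for E = A P.

    A: genes x regulators connectivity matrix (zeros mark absent edges).
    P: regulators x conditions regulatory signal matrix.
    """
    n_reg = A.shape[1]
    crit1 = has_full_column_rank(A)

    crit2 = True
    for j in range(n_reg):
        # Remove regulator j together with every gene (row) connected to it.
        keep_rows = A[:, j] == 0
        keep_cols = np.arange(n_reg) != j
        reduced = A[keep_rows][:, keep_cols]
        if not has_full_column_rank(reduced):
            crit2 = False
            break

    crit3 = bool(np.linalg.matrix_rank(P) == P.shape[0])
    return crit1, crit2, crit3

# Illustrative (made-up) matrices: a sparse network and a fully dense one.
A_sparse = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]])
A_dense = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 1.0]])
P = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])

sparse_result = check_nca_criteria(A_sparse, P)
dense_result = check_nca_criteria(A_dense, P)
```

The dense matrix has full column rank but fails criterion 2, because removing either regulator also removes every gene, which illustrates why each column of [A] needs enough zeros.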

Criterion 2

Estimation of [E]=[A][P]

Iteratively estimate [A] and [P]: A0 → P1 → A1 → P2 → … until convergence.

Convergence criterion: decrease in least-squares error < cutoff
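The alternating estimation can be sketched as follows. This is an illustrative NumPy version under stated assumptions, not the implementation of Liao et al.: the random starting matrix, the function name `nca_als`, and the default cutoff are assumptions, and the normalization step used in the published method to fix the scale of [A] and [P] is omitted.

```python
import numpy as np

def nca_als(E, mask, n_iter=200, cutoff=1e-8, seed=0):
    """Alternate least-squares updates of P and A for E ~ A P.

    E:    genes x conditions expression matrix.
    mask: boolean genes x regulators matrix; A is held at zero where mask is False.
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(size=mask.shape) * mask  # A0 respects the zero pattern
    prev_err = np.inf
    for _ in range(n_iter):
        # Update P with A fixed (ordinary least squares).
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # Update A with P fixed: one regression per gene on its allowed regulators.
        for i in range(E.shape[0]):
            idx = np.flatnonzero(mask[i])
            if idx.size:
                A[i, idx] = np.linalg.lstsq(P[idx].T, E[i], rcond=None)[0]
        err = float(np.sum((E - A @ P) ** 2))
        # Stop once the decrease in least-squares error falls below the cutoff.
        if prev_err - err < cutoff:
            break
        prev_err = err
    return A, P, err

# Made-up example: 6 genes, 2 regulators, 8 conditions.
rng = np.random.default_rng(1)
mask = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [1, 1]], dtype=bool)
A_true = rng.normal(size=mask.shape) * mask
P_true = rng.normal(size=(2, 8))
E = A_true @ P_true
A_hat, P_hat, final_err = nca_als(E, mask)
```

Each half-step solves an exact least-squares problem, so the squared error cannot increase from one iteration to the next; the estimates are still only determined up to a scaling of each regulator.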

NCA, infer TF activity in Yeast

[E] = [A] [P]

How should the restrictions on CS be defined, i.e., which CS{i,j} = 0?

Identify the TF-TG relation by ChIP-chip experiment

Yeast cell cycle regulation

441 genes vs. 33 transcription factors

Inference of regulatory network by two-stage constrained factor analysis

Yu T, Li KC. Inference of transcriptional regulatory network by two-stage constrained space factor analysis. Bioinformatics. 2005 Nov 1;21(21):4033-8.

Shortcoming of Liao et al.’s approach: E = AP

Let Cij = I{Eij}, the constraint on where the loading matrix A can be nonzero.

C comes from a very noisy source.

Estimate C, A, P simultaneously.

Model setting

Gene expression matrix: Gene x Condition

Regulation strength matrix B (to be estimated): Gene x TF

TF activity matrix T (to be estimated): TF x Condition

Error matrix

Connection constraint matrix C: Gene x TF; 1: connection; 0: no connection

Constrained by: $c_{i,j} \times b_{i,j} \equiv b_{i,j}, \ \forall i, j$, where C is $N \times K$.
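To make the matrix shapes and the connection constraint concrete, here is a small simulated instance. It is a sketch under assumptions: the additive form (expression = strengths times activities plus error) follows the NCA-style decomposition the slides refer to, and all sizes, values, and the variable names `expr` and `noise` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_tfs, n_conditions = 6, 2, 5

# Connection constraint matrix C (gene x TF): 1 = connection, 0 = no connection.
C = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [0, 0]])

# Regulation strengths B (gene x TF) may be nonzero only where C is 1.
B = rng.normal(size=(n_genes, n_tfs)) * C

# TF activities T (TF x condition) and a small error matrix (gene x condition).
T = rng.normal(size=(n_tfs, n_conditions))
noise = 0.1 * rng.normal(size=(n_genes, n_conditions))

# Assumed model form: expression = strengths @ activities + error.
expr = B @ T + noise

# The constraint c_ij * b_ij == b_ij holds for every gene-TF pair.
constraint_holds = np.array_equal(B * C, B)
```

The last gene has no connection in C, so its simulated expression consists of the error term alone.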

Up to here, it is the NCA model by Liao et al.

Model Fitting

However, we do not assume full knowledge of C. We require C to be bounded by CMIN and CMAX.

CMIN: higher-confidence set, from biological evidence

CMAX: lower-confidence set, from ChIP data

Model Fitting

Difficulties:

Simultaneous estimation of both the structure and coefficients amounts to finding optimum in a very complex function.

The number of parameters to be estimated is overwhelming.

Solution:

Find a reasonable local optimum.

Use the high-confidence set to find a starting point as close to the global optimum as possible.

Implementation:

Stepwise model fitting.

Start with a network backbone with only the high-confidence set, and grow the network gradually, drawing new connections from the low-confidence set.

(1) Set C = CMIN and estimate each activity profile tk by the consensus of the expression of the regulated genes.

(2) Estimate B and T by alternating least squares, using ridge regression.

(3) Is the reduction of total RSS in the last few steps too small?

NO: From (CMAX - C), find the TF-gene pair that best agrees with the current estimate of B and T, add it to C, and return to step (2).

YES: Fix the estimate of T and regress each gene expression profile on the activity profiles of the TFs that are associated with it in CMAX. Use BIC and p-value to select TFs.
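The first stage of this procedure (growing C from CMIN toward CMAX) can be sketched in NumPy. This is a simplified illustration, not the code of Yu and Li. Several choices are assumptions: the mean is used as the "consensus" profile, a candidate pair is scored by the absolute correlation between the gene's residual and the TF's activity, only the most recent RSS reduction is checked, and the names `ridge_rows` and `fit_stage_one` and all default parameters are hypothetical. The second stage (BIC and p-value selection) is not shown.

```python
import numpy as np

def ridge_rows(X, Y, mask, lam):
    """Ridge-regress each row Y[i] on the rows of X selected by mask[i]."""
    coef = np.zeros(mask.shape)
    for i in range(Y.shape[0]):
        idx = np.flatnonzero(mask[i])
        if idx.size:
            Xi = X[idx]
            G = Xi @ Xi.T + lam * np.eye(idx.size)
            coef[i, idx] = np.linalg.solve(G, Xi @ Y[i])
    return coef

def fit_stage_one(Y, c_min, c_max, lam=0.1, als_iters=20, min_gain=1e-3, max_edges=50):
    """Y: gene x condition; c_min, c_max: boolean gene x TF, c_min within c_max.

    Assumes every TF has at least one target in c_min.
    """
    C = c_min.astype(bool).copy()
    c_max = c_max.astype(bool)
    n_tfs = C.shape[1]
    # Initial activities: mean expression of each TF's high-confidence targets.
    T = np.array([Y[C[:, k]].mean(axis=0) for k in range(n_tfs)])
    rss_path = []
    while True:
        for _ in range(als_iters):
            B = ridge_rows(T, Y, C, lam)  # strengths, given activities
            T = np.linalg.solve(B.T @ B + lam * np.eye(n_tfs), B.T @ Y)  # activities
        resid = Y - B @ T
        rss_path.append(float(np.sum(resid ** 2)))
        # Stop growing when the latest addition barely reduced the RSS.
        if len(rss_path) > 1 and rss_path[-2] - rss_path[-1] < min_gain * rss_path[-2]:
            break
        cand = c_max & ~C
        if not cand.any() or len(rss_path) > max_edges:
            break
        # Assumed score: |correlation| between gene residual and TF activity.
        Rc = resid - resid.mean(axis=1, keepdims=True)
        Tc = T - T.mean(axis=1, keepdims=True)
        norms = np.linalg.norm(Rc, axis=1)[:, None] * np.linalg.norm(Tc, axis=1)[None, :]
        score = np.abs(Rc @ Tc.T) / (norms + 1e-12)
        score[~cand] = -1.0
        i, k = np.unravel_index(np.argmax(score), score.shape)
        C[i, k] = True
    return C, B, T, rss_path

# Made-up data: 8 genes, 2 TFs, 12 conditions.
rng = np.random.default_rng(1)
c_max = np.array([[1, 0], [1, 0], [1, 1], [1, 1], [0, 1], [0, 1], [1, 1], [1, 1]], dtype=bool)
c_min = np.array([[1, 0], [1, 0], [1, 0], [0, 0], [0, 1], [0, 1], [0, 1], [0, 0]], dtype=bool)
B_true = rng.normal(size=c_max.shape) * c_max
Y = B_true @ rng.normal(size=(2, 12)) + 0.05 * rng.normal(size=(8, 12))
C_fit, B_fit, T_fit, rss_path = fit_stage_one(Y, c_min, c_max)
```

Starting from the high-confidence backbone keeps the search close to a plausible solution, and the fitted connection set always stays between the two bounds by construction.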

Result

Data:

Regular growth ChIP data;

cell-cycle microarray data;

99 TFs enter our study.

Start with 891 evidenced relationships and 29154 lower-confidence relationships.

Final network has 3846 TF-gene connections.

TF’s that exhibit correlated expression and activity:

Time-shifting between a TF’s activity profile and its expression profile:

(1) Fit the activity profile using cubic spline

(2) interpolate the spline to get shifted profile

(3) obtain correlation between the expression profile and shifted activity profile

(4) maximize absolute correlation with regard to minute shift.
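Steps (1) to (4) can be sketched with SciPy's `CubicSpline`. This is an illustrative version under assumptions: the function name `best_time_shift`, the grid of candidate shifts, the restriction to time points that remain inside the observed range after shifting, and the synthetic profiles are all made up for the example.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def best_time_shift(times, activity, expression, shifts):
    """Return (shift, correlation) maximizing |corr(expression(t), activity(t + shift))|.

    A positive shift means the activity profile lags behind the expression profile.
    """
    spline = CubicSpline(times, activity)  # (1) fit the activity profile
    best_shift, best_corr = None, 0.0
    for s in shifts:
        shifted_t = times + s
        # Compare only where the shifted time stays inside the observed range.
        ok = (shifted_t >= times[0]) & (shifted_t <= times[-1])
        if ok.sum() < 3:
            continue
        shifted = spline(shifted_t[ok])  # (2) interpolate the shifted profile
        r = np.corrcoef(expression[ok], shifted)[0, 1]  # (3) correlate
        if best_shift is None or abs(r) > abs(best_corr):  # (4) maximize |r|
            best_shift, best_corr = s, r
    return best_shift, best_corr

# Synthetic profiles sampled every 10 minutes; activity peaks 20 minutes after expression.
times = np.arange(0.0, 120.0, 10.0)
expression = np.exp(-((times - 40.0) / 15.0) ** 2)
activity = np.exp(-((times - 60.0) / 15.0) ** 2)
shift, corr = best_time_shift(times, activity, expression, np.arange(-30, 31, 10))
```

Using the absolute correlation lets the same search pick up repressive relationships, where the shifted activity is anticorrelated with expression.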

TF’s that have activity lagging behind expression:

SWI4

Between-TF regulations:
