based on ellipsoidal calculus and optimization - unibw.de fileregression analysis for clusters in...

21
Regression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration of our Teacher and Friend Professor Alexander Rubinov Erik Kropat 1 , Gerhard-Wilhelm Weber 2 and Jan-J. R ¨ uckmann 2 1 Institute for Theoretical Computer Science, Mathematics and Operations Research Universit¨ at der Bundeswehr M ¨ unchen, Neubiberg, Germany E-mail: [email protected] 2 Institute of Applied Mathematics, Middle East Technical University, Turkey; Faculty of Economics, Business and Law, University of Siegen, Germany; Center for Research on Optimization and Control, University of Aveiro, Portugal; Faculty of Science, Universiti Teknologi Malaysia (UTM), Skudai, Malaysia E-mail: [email protected] 3 School of Mathematics University of Birmingham, United Kingdom E-mail: [email protected] Abstract. In this paper, we discuss regression models for gene-environment networks under ellipsoidal uncertainty. Functionally related groups of genes and environmental factors are identified by clustering techniques and the corresponding uncertain states are represented in terms of ellipsoids. The time- dependent expression values are determined by a regulatory system where the interactions between the clusters are defined by (affine-) linear coupling rules. Ellipsoidal calculus is applied to determine explicit representations of the uncertain multivariate states of the system. Various regression models are intro- duced for an estimation of the unknown system parameters which depend on uncertain (ellipsoidal) mea- surement data. Herewith, we offer an Elliptic Operations Research, in which we analyze the structure of the optimization problems obtained, especially, in view of their solvability by semidefinite programming and interior point methods, we discuss the structural frontiers and research challenges, and we conclude with an outlook. Keywords. Gene-Environment Networks – Ellipsoidal OR – Exact Clustering – Semidefinite Program- ming – Dynamical Systems – Computational Statistics – Uncertainty – Identification – Systems Biology – Computational Biology AMS (MOS) subject classification: 93A30, 92D10, 90C34 1 Introduction Genetic systems are often characterized by the presence of a high number of vari- ables and parameters resulting in an extraordinary complexity of the underlying

Upload: lenga

Post on 22-Aug-2019

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

Regression Analysis for Clusters in Gene-Environment Networksbased on Ellipsoidal Calculus and Optimization

In Commemoration of our Teacher and FriendProfessor Alexander Rubinov

Erik Kropat1, Gerhard-Wilhelm Weber2 and Jan-J. Ruckmann2

1Institute for Theoretical Computer Science, Mathematics and Operations ResearchUniversitat der Bundeswehr Munchen, Neubiberg, Germany

E-mail: [email protected] of Applied Mathematics, Middle East Technical University, Turkey;

Faculty of Economics, Business and Law, University of Siegen, Germany;Center for Research on Optimization and Control, University of Aveiro, Portugal;

Faculty of Science, Universiti Teknologi Malaysia (UTM), Skudai, MalaysiaE-mail: [email protected]

3School of MathematicsUniversity of Birmingham, United Kingdom

E-mail: [email protected]

Abstract. In this paper, we discuss regression models for gene-environment networks under ellipsoidaluncertainty. Functionally related groups of genes and environmental factors are identified by clusteringtechniques and the corresponding uncertain states are represented in terms of ellipsoids. The time-dependent expression values are determined by a regulatory system where the interactions between theclusters are defined by (affine-) linear coupling rules. Ellipsoidal calculus is applied to determine explicitrepresentations of the uncertain multivariate states of the system. Various regression models are intro-duced for an estimation of the unknown system parameters which depend on uncertain (ellipsoidal) mea-surement data. Herewith, we offer an Elliptic Operations Research, in which we analyze the structure ofthe optimization problems obtained, especially, in view of their solvability by semidefinite programmingand interior point methods, we discuss the structural frontiers and research challenges, and we concludewith an outlook.

Keywords. Gene-Environment Networks – Ellipsoidal OR – Exact Clustering – Semidefinite Program-ming – Dynamical Systems – Computational Statistics – Uncertainty – Identification – Systems Biology– Computational Biology

AMS (MOS) subject classification: 93A30, 92D10, 90C34

1 IntroductionGenetic systems are often characterized by the presence of a high number of vari-ables and parameters resulting in an extraordinary complexity of the underlying

Page 2: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

2 E. Kropat, G.-W. Weber, J.-J. Ruckmann

regulatory networks. The technological progress of the last decades and the devel-opment of high throughput technologies resulted in a generation of massive quanti-ties of data that can be used to enlighten the hidden network structure. Along withthese developments, the availability of extremely large data sets also challengedmathematics and bioinformatics and led to new methods for an analysis of highlyinterconnected systems. However, for many years, mathematical studies focussedon genetic items, neglecting the importance of the regulating effects of additionalenvironmental factors. In this context, the concept of gene-environment networkswas developed [42, 43, 48, 49, 50, 53, 54, 56, 57], supported by twin research andvarious other studies. These networks mathematically express and assess the im-portance of factors such as, e.g., poison, pollution, stress, but also measurementsin education, health care, environmental protection and in the development of thefinancial markets, for an estimation of the biological states. In fact, this impactalso holds true in the reverse direction, and as a special class of gene-environmentnetworks we mention eco-finance networks [24, 51].

Gene-environment networks are composed of two distinct groups of variables.The first group consists of the genes (proteins or other cell components), the mostrelevant data items to be observed. In addition, many other cell components andfactors like toxins, transcription factors, radiation, etc. play an important role ingenetic networks. This additional group of environmental items exerts a strong in-fluence on the behaviour of the genes. However, the complexity of the data, itsvolume and variability, demands for highly developed methods that allow to detectrelationships between changes in gene expression and in additional environmentalfactors. In recent years, gene-environment networks under various kinds of uncer-tainty have been analyzed. These models were based on interval arithmetics, butalso spline regression and stochastic differential equations. We note that this con-cept encompasses a broader scope of scientific inquiry than when genetic networkswere initially considered. Many other examples from system biology and life sci-ences refer to the so-called target-environment networks, where environmental ef-fects have to be considered. Among them are, e.g., metabolic networks [13, 34, 54],immunological networks [19], networks in toxicogenomics [28], but also social- andecological networks [18]. We refer to [17, 22, 38, 39, 40, 59, 60] for applications,practical examples and numerical calculations.

Gene-environment networks are generally built on measurement data from mi-croarray experiments and environmental measurements which are always effectedby errors and random noise. Error intervals can capture the deviations from specificdata values by imposing individual bounds on each variable. Various regressionmodels have been developed and studied with the help of generalized Chebychevapproximation and, equivalently to that approximation, semi-infinite and even gen-eralized semi-infinite optimization [45, 46, 47, 48, 49, 50, 54, 55, 57]. However,error intervals referring to single variables do not reflect correlations of the multi-variate data within specific clusters of genetic and environmental items. A naturalextension is a representation of uncertainty in terms of ellipsoidal sets which aredirectly related to multivariate Gaussian distributions and the corresponding covari-ance matrices. Ellipsoids have proved to be very flexible with respect to correlations

Page 3: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

Regression Analysis for Clusters in Gene-Environment Networks 3

of the data, whereas intervals and parallelpipes usually come from a perspectivewhere stochastic dependencies among any two of the errors made in the measure-ments of the expression values of genes and environmental levels are not takeninto account explicitly [6]. Moreover, paraxial sets are usually smaller than the el-lipsoids and their orthogonal projections into the 2-dimensional Cartesian planes,respectively [6]. Indeed, those confidence ellipsoids are obtained with respect tostochastic dependencies of the error variables. Those dependencies are the case inreality, e.g., in microarray experiments and in environmental studies as well. In re-verse, any ellipsoid can be inscribed into a sufficiently large parallelpipe which, inaddition, could be suitably located and directed in space around its eigenaxes.

Clustering and classification allow to identify certain clusters of data commonlyexerting influence on other groups of genes and/or environmental factors. Theuncertain expression values of these clusters are represented by ellipsoidal states.The multiple interactions of genetic and environmental items are determined by(affine-) linear coupling rules which define a time-discrete regulatory model. Thefuture states and predictions of the linear model can be calculated by ellipsoidal cal-culus and explicit representations of these predictions in terms of centers and shapematrices can be determined by an iterative procedure. Various regression mod-els are introduced for an estimation of the unknown parameters of the regulatorymodel, which heavily depend on the cluster structure. These models compare theactual measurements and the (ellipsoidal) predictions of the linear model; however,in a set-theoretic sense. Beside a notion of distance between centers of ellipsoids,nonnegative criteria functions associated with the shape matrix of the ellipsoid areused in the objective function of the regression model. The trace and the determi-nant are examples of such measures and they establish the basis for the definition ofregression models for parameter estimation of the regulatory system. Semi-definiteprogramming as well as conic programming and interior point methods can then beapplied for solution.

The paper is organized as follows: In Section 2, we review basic facts aboutellipsoidal calculus. In Section 3, the time-discrete model under ellipsoidal uncer-tainty is introduced. After formulating an algorithm for predictions of ellipsoidalfuture states of genetic and environmental factors, we turn to regression analysisin Section 4. Various regression models are introduced. Their solvability and therelated optimization methods are discussed in Section 5.

2 Ellipsoidal Calculus

The parameter-dependent time-discrete model for gene-environment networks ofthis paper and the corresponding regression models will be based on ellipsoidalcalculus. In this first section, we shortly review the basic operations needed to dealwith ellipsoidal uncertainty such as sums, intersections (fusions) and affine-lineartransformations of ellipsoids. The family of ellipsoids in Rp is closed with respectto affine-linear transformations but neither the sum nor the intersection is generallyellipsoidal, so both must be approximated by ellipsoidal sets.

Page 4: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

4 E. Kropat, G.-W. Weber, J.-J. Ruckmann

2.1 Ellipsoidal DescriptionsAn ellipsoid in Rp will be parameterized in terms of its center c ∈ Rp and a sym-metric non-negative definite configuration (or shape) matrix Σ ∈ Rp×p as

E(c, Σ) = {Σ1/2u + c | ‖u‖ ≤ 1},

where Σ1/2 is any matrix square root satisfying Σ1/2(Σ1/2)T = Σ. When Σ is offull rank, the non-degenerate ellipsoid E(c, Σ) may be expressed as

E(c, Σ) = {x ∈ Rp | (x− c)T Σ−1(x− c) ≤ 1}.

The eigenvectors of Σ point in the directions of principal semiaxes of E . The lengthsof the semiaxes of the ellipsoid E(c, Σ) are given by

√λi, where λi are the eigen-

values of Σ for i = 1, . . . , p. The volume of the ellipsoid E(c,Σ) is given byvol E(c, Σ) = Vp

√det(Σ), where Vp is the volume of the unit ball in Rp, i.e.,

Vp =

πp/2

(p/2)! , for even p,

2pπ(p−1)/2((p− 1)/2)!p! , for odd p.

2.2 Affine TransformationsThe family of ellipsoids is closed with respect to affine transformations. Given anellipsoid E(c,Σ) ⊂ Rp, matrix A ∈ Rm×p and vector b ∈ Rm we get AE(c,Σ) +b = E(Ac + b, AΣAT ). Thus, ellipsoids are preserved under affine transformation.If the rows of A are linearly independent (which implies m ≤ p), and b = 0, theaffine transformation is called projection [27].

2.3 Sums of K EllipsoidsGiven K bounded ellipsoids of Rp, Ek = E(ck,Σk), k = 1, . . . ,K, their geometric(Minkowksi) sum E1 + E1 = {z1 + z2 | z1 ∈ E1, z2 ∈ E2} is not generally anellipsoid. However, it can be tightly approximated by parameterized families ofexternal ellipsoids. We adapt the notion of the minimal trace ellipsoid from [16]and introduce the outer ellipsoidal approximation E(σ, P ) = ⊕K

k=1Ek containingthe sum S =

∑Kk=1 Ek of ellipsoids which is defined by

σ =K∑

k=1

ck

and

P =( K∑

k=1

√Tr Σk

)( K∑

k=1

Σk√Tr Σk

).

Page 5: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

Regression Analysis for Clusters in Gene-Environment Networks 5

2.4 Intersection of Ellipsoids

The intersection of two ellipsoids is generally not an ellipsoid. For this reason wereplace this set by the outer ellipsoidal approximation of minimal volume and adaptthe notion of fusion of ellipsoids from [36]. Given two non-degenerate ellipsoidsE(c1, Σ1) and E(c2,Σ2) inRp with E(c1,Σ1)∩E(c2,Σ2) 6= ∅we define an ellipsoid

Eλ(c0, Σ0) := {x ∈ Rp |λ(x− c1)T Σ−11 (x− c1)

+ (1− λ)(x− c2)T Σ−12 (x− c2) ≤ 1},

where λ ∈ [0, 1]. The ellipsoid Eλ(c0, Σ0) coincides with E(c1, Σ1) and E(c2, Σ2)for λ = 1 and λ = 0, respectively. In order to determine a tight external ellip-soidal approximation Eλ(c0,Σ0) of the intersection of E(c1,Σ1) and E(c2,Σ2), weintroduce

X := λΣ−11 + (1− λ)Σ−1

2

andτ := 1− λ(1− λ)(c2 − c1)T Σ−1

2 X−1Σ−11 (c2 − c1).

The ellipsoid Eλ(c0, Σ0) is given by the center

c0 = X−1(λΣ−11 c1 + (1− λ)Σ−1

2 c2)

and shape matrixΣ0 = τX−1.

The fusion of E(c1, Σ1) and E(c2, Σ2), whose intersection is a nonempty boundedregion, is defined as the ellipsoid Eλ(c0, Σ0) for the value λ ∈ [0, 1] that min-imizes its volume [36]. The fusion of E(c1, Σ1) and E(c2, Σ2) is E(c1, Σ1), ifE(c1, Σ1) ⊂ E(c2,Σ2); or E(c2,Σ2), if E(c2, Σ2) ⊂ E(c1,Σ1); otherwise, it isEλ(c0, Σ0) defined as above where λ is the only root in (0, 1) of the following poly-nomial of degree 2p− 1:

τ(detX ) Tr (co(X )(Σ−11 − Σ−1

2 ))− p(detX )2

× (2cT0 Σ−1

1 c1 − 2cT0 Σ−1

2 c2 + cT0 (Σ−1

2 − Σ−11 )c0 − cT

1 Σ−11 c1 + cT

2 Σ−12 c2) = 0.

Here, co(X ) denotes the matrix of cofactors of X . Since X−1 = co(X )/ detX , werepresent this polynomial as

τ(detX )2 Tr (X−1(Σ−11 − Σ−1

2 ))− p(detX )2

× (2cT0 Σ−1

1 c1 − 2cT0 Σ−1

2 c2 + cT0 (Σ−1

2 − Σ−11 )c0 − cT

1 Σ−11 c1 + cT

2 Σ−12 c2) = 0.

We note that it is also possible to define an inner ellipsoidal approximation. Themethod of finding the internal ellipsoidal approximation of the intersection of twoellipsoids is described in [44].

Page 6: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

6 E. Kropat, G.-W. Weber, J.-J. Ruckmann

3 Gene-Environment Systems under Ellipsoidal Un-certainty

The previous preparations enable us to go on with our Ellipsoidal Operations Re-search via the dynamics which we present subsequently.

3.1 Time-Discrete Gene-Environment SystemsWe consider time-discrete gene-environment regulatory systems with n genes andm environmental factors. Often, certain groups of genes and environmental itemscan be identified which exert a more or less regulating influence on other groupsof data items. For this reason, the set of genes is divided in R disjoint clustersCr ⊂ {1, . . . , n}, r = 1, . . . , R in a preprocessing step of clustering. Sim-ilarly, the set of all environmental items is divided in S disjoint clusters Ds ⊂{1, . . . , m}, s = 1, . . . , S. Here, motivated by the application in the genetic con-text, we focus on non-overlapping clusters and assume a strict sub-division of thevariables, so that the relations Cr1 ∩ Cr2 = ∅ for all r1 6= r2 and Ds1 ∩ Ds2 = ∅for all s1 6= s2 are fulfilled. We note that our method is also applicable for over-lapping clusters. This particular case will be addressed in a forthcoming paper [25].There exist many clustering techniques in the fields of unsupervised learning anddata mining and we do not focus on a specific method. One reasonable approachconsists in nearest neighbor methods [21] in combination with the discriminantadaptive nearest neighbor metric that allow an identification of ellipsoidal neigh-borhoods [20]. Also clustering methods based on minimal spanning trees [8] ormethods from nonsmooth optimization [7] are suited for the ellipsoidal approach.

Each cluster corresponds to a functionally related group of genes or environ-mental factors and the uncertain states of these cluster are represented in terms ofellipsoids

Xr = E(µr, Σr) ⊂ R|Cr|, Es = E(ρs, Πs) ⊂ R|Ds|.

The dynamics and interactions between the various clusters of genetic and envi-ronmental items are given by the linear model

X(κ+1)j = ξj0 +

( R⊕r=1

AGGjr X(κ)

r

)+

( S⊕s=1

AEGjs E(κ)

s

)

E(κ+1)i = ζi0 +

( R⊕r=1

AGEir X(κ)

r

)+

( S⊕s=1

AEEis E(κ)

s

)

(EC)

with κ ≥ 0 and j = 1, 2, . . . , R, i = 1, 2, . . . , S.In this time-discrete system, four types of cluster interactions and regulating ef-

fects are involved: (GG) gene → gene, (EG) environment → gene, (GE) gene →environment, (EE) environment → environment. The system (EC) is defined by(affine) linear coupling rules, what implies that all future states of genetic and envi-

ronmental clusters are ellipsoids themselves. In particular, the sumsR⊕

r=1

AGGjr X(κ)

r

Page 7: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

Regression Analysis for Clusters in Gene-Environment Networks 7

andS⊕

s=1

AEGjs E(κ)

s describe the cumulative effects of all genetic and environmen-

tal clusters exerted on the elements of cluster Cj in a set theoretic or ellipsoidal

sense. In the same way, the (ellipsoidal) sumsR⊕

r=1

AGEir X(κ)

r andS⊕

s=1

AEEis E(κ)

s

refer to the additive genetic and environmental effects on cluster Ei. The degreeof connectivity between the individual clusters is given by the (unknown) interac-tions matrices AGG

jr ⊂ R|Cj |×|Cr|, AEGjs ⊂ R|Cj |×|Ds|, AGE

ir ⊂ R|Di|×|Cr|, andAEE

is ⊂ R|Di|×|Ds|. These matrices are in turn sub-matrices of the general interac-tion matrices AGG ∈ Rn×n, AEG ∈ Rn×m, AGE ∈ Rm×n, AEE ∈ Rm×m.Since we assume disjoint clusters, the aforementioned sub-matrices are distinctbuilding blocks of the general interactions matrices where the elements of the corre-sponding clusters define the position within the general interactions matrices. Theintercepts ξj0 ∈ R|Cj | and ζi0 ∈ R|Di| are disjoint components of the vectorsξ0 = (ξ10, . . . , ξR0)T ∈ Rn and ζ0 = (ζ10, . . . , ζS0)T ∈ Rm, respectively. Wenote that the initial values of the linear system (EC) can be defined by the first

genetic and environmental measurements, i.e., X(0)j = X

(0)

j and E(0)i = E

(0)

i .Remark: We note that the interconnections between the various clusters of

genes and environmental factors can be represented in terms of a highly intercon-nected gene-environment network. The nodes of this network can be identified withthe clusters, weighted by the (time-dependent) ellipsoidal states. The branches be-tween the nodes (or clusters) are weighted by the matrices and intercept vectors ofthe linear coupling rules of model (EC). Although the parameters and the weightsof the branches are static, the evolution of the ellipsoidal states turns the gene-environment network in a dynamic graph model. Hereby, network analysis andconcepts from discrete mathematics become applicable and features like connect-edness, cycles and shortest paths can be investigated [23].

3.2 Algorithm

The linear system (EC) introduced in Section 3.1 refers to the time-discrete be-haviour of the ellipsoidal states of clusters and environmental variables. By apply-ing ellipsoidal calculus from Section 2, the centers and configuration matrices ofthe predictions X

(κ+1)j and E

(κ+1)s of (ellipsoidal) genetic and environmental clus-

ter states can be calculated iteratively. Throughout this section we assume κ ≥ 0.The state of the genetic cluster Cj , j = 1, 2, . . . , R, is given by the ellipsoid

X(κ+1)j = E

(κ+1)j , Σ(κ+1)

j

)

with center

µ(κ+1)j = ξj0 +

R∑r=1

AGGjr µ(κ)

r +S∑

s=1

AEGjs ρ(κ)

s

Page 8: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

8 E. Kropat, G.-W. Weber, J.-J. Ruckmann

and configuration matrix

Σ(κ+1)j =

(√TrG(κ)

j +√

TrH(κ)j

)·( G(κ)

j√TrG(κ)

j

+H(κ)

j√TrH(κ)

j

),

where

G(κ)j =

( R∑r=1

√Tr

(AGG

jr Σ(κ)r (AGG

jr )T)) ·

( R∑r=1

AGGjr Σ(κ)

r (AGGjr )T

√Tr

(AGG

jr Σ(κ)r (AGG

jr )T))

,

H(κ)j =

( S∑s=1

√Tr

(AEG

js Π(κ)s (AEG

js )T)) ·

( S∑s=1

AEGjs Π(κ)

s (AEGjs )T

√Tr

(AEG

js Π(κ)s (AEG

js )T))

.

Similarly, the states of the environmental cluster Di, i = 1, 2, . . . , S, can berepresented in terms of the ellipsoid

E(κ+1)i = E

(ρ(κ+1)i , Π(κ+1)

i

)

with center

ρ(κ+1)i = ζi0 +

R∑r=1

AGEir µ(κ)

r +S∑

s=1

AEEis ρ(κ)

s

and configuration matrix

Π(κ+1)i =

(√TrM(κ)

i +√

TrN (κ)i

)·( M(κ)

i√TrM(κ)

i

+N (κ)

i√TrN (κ)

i

),

where

M(κ)i =

( R∑r=1

√Tr

(AGE

ir Σ(κ)r (AGE

ir )T)) ·

( R∑r=1

AGEir Σ(κ)

r (AGEir )T

√Tr

(AGE

ir Σ(κ)r (AGE

ir )T))

,

N (κ)i =

( S∑s=1

√Tr

(AEE

is Π(κ)s (AEE

is )T)) ·

( S∑s=1

AEEis Π(κ)

s (AEEis )T

√Tr

(AEE

is Π(κ)s (AEE

is )T))

.

Page 9: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

Regression Analysis for Clusters in Gene-Environment Networks 9

4 Regression Analysis Under Ellipsoidal Uncertainty

4.1 The Regression ProblemThe unknown system parameters of model (EC) are given by the entries of theinteraction matrices AGG

jr , AEGjs , AGE

ir , and AEEis and the intercepts ξj0 and ζi0.

For an estimation of these parameters and, thus, for regression analysis, we have tocompare the predictions of the linear model (EC) and the observations from geneticand environmental measurements. The (ellipsoidal) measurements

X(κ)

r = E(µ(κ)

r , Σ(κ)

r

) ⊂ R|Cr|, E(κ)

s = E(ρ(κ)

s ,Π(κ)

s

) ⊂ R|Ds|,

with r = 1, 2, . . . , R, s = 1, 2, . . . , S and κ = 0, 1, . . . , T are taken at samplingtimes t0 < t1 < . . . < tT and the first T predictions of the linear model are givenby the ellipsoids

X(κ+1)j = E(

µ(κ+1)j , Σ(κ+1)

j

):= ξj0 +

( R⊕r=1

AGGjr X

(κ)

r

)+

( S⊕s=1

AEGjs E

(κ)

s

),

E(κ+1)i = E(

ρ(κ+1)i , Π(κ+1)

i

):= ζi0 +

( R⊕r=1

AGEir X

(κ)

r

)+

( S⊕s=1

AEEis E

(κ)

s

),

with j = 1, 2, . . . , R, i = 1, 2, . . . , S and κ = 0, 1, . . . , T − 1.

The main idea of our ellipsoidal regression analysis is to maximize the overlapof the predictions and measurement values (both ellipsoids). For this reason, weintroduce the ellipsoids

∆X(κ)r := X(κ)

r ∩X(κ)

r and ∆E(κ)s := E(κ)

s ∩ E(κ)

s ,

with r = 1, 2, . . . , R, s = 1, 2, . . . , S and κ = 1, . . . , T , where ∩ denotes the fusionof ellipsoids introduced in Subsection 2.4. In addition, the centers of the ellipsoidsare adjusted, so that their distance becomes minimized (cf. Figure 1). This leads usto the following regression problem:

(R) MaximizeT∑

κ=1

{ R∑r=1

∥∥∥∆X(κ)r

∥∥∥∗−

∥∥∥ µ(κ)r − µ(κ)

r

∥∥∥2

2

+S∑

s=1

∥∥∥∆E(κ)s

∥∥∥∗−

∥∥∥ ρ(κ)s − ρ(κ)

s

∥∥∥2

2

}.

Here, ‖ · ‖∗ denotes a measure that reflects the geometrical size of the intersections(fusions) and we assume that ‖∆X

(κ)r ‖∗ = 0, if ∆X

(κ)r = ∅ and ‖∆E

(κ)s ‖∗ = 0, if

∆E(κ)s = ∅. There exist various measures related to the shape of the intersections,

e.g., the volume (which corresponds to the ellipsoid matrix determinant) or the sumof squares of semiaxes (which corresponds to the trace of the configuration matrix).

Page 10: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

10 E. Kropat, G.-W. Weber, J.-J. Ruckmann

These examples lead to specific formulations of the regression problem (R) andthey depend on the shape matrices of the fusions ∆X

(κ)r and ∆E

(κ)s . For further

details on geometrical (ellipsoidal) measures and the related regression problemswe refer to [25].

X(κ)r

X(κ)r

µ(κ)r =µ(κ)

rX(κ)r

X(κ)r

µ(κ)r

µ(κ)r

Figure 1: Overlap of ellipsoids: The intersections of the two ellipsoids X(κ)r and

X(κ)

r have the same geometrical size with the same measure of fusions on the leftand the right side. On the right side, the centers µ

(κ)r and µ

(κ)r are adjusted in order

to minimize the difference between the centers of ellipsoids.

For a deeper analysis of the regression problem (R), explicit representations of

the fusions ∆X(κ)r and ∆E

(κ)s are required. The fusion ∆X

(κ)r = X

(κ)r ∩X

(κ)

Cris

an ellipsoid E(∆µ

(κ)r , ∆Σ(κ)

r

)with center

∆µ(κ)r =

[X (κ)r

]−1(λ[Σ(κ)

r

]−1µ(κ)

r + (1− λ)[Σ

(κ)

r

]−1µ(κ)

r

)

and shape matrix∆Σ(κ)

r = ξ(κ)r

[X (κ)r

]−1,

whereX (κ)

r := λ[Σ(κ)

r

]−1 + (1− λ)[Σ

(κ)

r

]−1

and

ξ(κ)r := 1− λ(1− λ)

(µ(κ)

r − µ(κ)r

)T [Σ

(κ)

r

]−1[X (κ)r

]−1[Σ(κ)r

]−1(µ(κ)

r − µ(κ)r

).

Page 11: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

Regression Analysis for Clusters in Gene-Environment Networks 11

The parameter λ is the only root in (0, 1) of the following polynomial of degree2|Cr| − 1:

ξ(κ)r

(detX (κ)

r

)2 Tr([X (κ)

r

]−1([

Σ(κ)r

]−1 − [Σ

(κ)

r

]−1))

− |Cr|(detX (κ)

r

)2

×(2[∆µ(κ)

r

]T [Σ(κ)

r

]−1µ(κ)

r − 2[∆µ(κ)

r

]T [Σ

(κ)

r

]−1µ(κ)

r

+[∆µ(κ)

r

]T([

Σ(κ)

r

]−1 − [Σ(κ)

r

]−1)∆µ(κ)

r − [µ(κ)

r

]T [Σ(κ)

r

]−1µ(κ)

r

+[µ(κ)

r

]T [Σ

(κ)

r

]−1µ(κ)

r

)= 0.

Similarly, the fusion ∆E(κ)s = E

(κ)s ∩ E

(κ)

s is an ellipsoid E(∆ρ

(κ)s , ∆Π(κ)

s

)with center

∆ρ(κ)s =

[Y(κ)s

]−1(λ[Π(κ)

s

]−1ρ(κ)

s + (1− λ)[Π

(κ)

s

]−1ρ(κ)

s

)

and shape matrix∆Π(κ)

s = η(κ)s

[Y(κ)s

]−1,

whereY(κ)

s := λ[Π(κ)

s

]−1 + (1− λ)[Π

(κ)

s

]−1

and

η(κ)s := 1− λ(1− λ)

(ρ(κ)

s − ρ(κ)s

)T [Π

(κ)

s

]−1[Y(κ)s

]−1[Π(κ)s

]−1(ρ(κ)

s − ρ(κ)s

).

The parameter λ is the only root in (0, 1) of the following polynomial of degree2|Ds| − 1:

η(κ)s

(detY(κ)

s

)2Tr

([Y(κ)s

]−1([

Π(κ)s

]−1 − [Π

(κ)

s

]−1))

− |Ds|(detY(κ)

s

)2

×(2[∆ρ(κ)

s

]T [Π(κ)

s

]−1ρ(κ)

s − 2[∆ρ(κ)

s

]T [Π

(κ)

s

]−1ρ(κ)

s

+[∆ρ(κ)

s

]T([

Π(κ)

s

]−1 − [Π(κ)

s

]−1)∆ρ(κ)

s − [ρ(κ)

s

]T [Π(κ)

s

]−1ρ(κ)

s

+[ρ(κ)

s

]T [Π

(κ)

s

]−1ρ(κ)

s

)= 0.

We now return to the objective function of regression problem (R), where thegeometrical size of the fusions ∆X

(κ)r and ∆E

(κ)s is measured. As a measure for

the size of a p-dimensional ellipsoid E(0, Q) nonnegative-valued criteria functionsψ(E(0, Q)) defined on the set of all nondegenerate ellipsoids can be applied. Thesefunctions are monotonous by increasing with respect to inclusion, i.e., ψ(E1) ≤ψ(E2) if E1 ⊆ E2. Such measures are, e.g.,

(a) the trace of Q,

ψT (E(0, Q)) := Tr Q = λ1 + . . . + λp,

where λi are the eigenvalues of Q (i.e., Tr Q is equal to the sum of the squaresof the semiaxes),

Page 12: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

12 E. Kropat, G.-W. Weber, J.-J. Ruckmann

(b) the determinant of Q,

ψDet(E(0, Q)) := detQ = λ1 · . . . · λp,

which is equal to the product of eigenvalues and proportional to the volume

vol E(0, Q) = πp2 (detQ)

12(Γ(

p2 + 1

))−1

of the ellipsoid, where Γ stands for the Gamma-function.

For further details on criteria functions we refer to [26], p. 101. The measures statedabove lead to different representations of the regression problem (R) and we studythem now in more detail.

4.2 The Trace CriterionBy measuring the size of the ellipsoids ∆X

(κ)r and ∆E

(κ)s in terms of their (squared)

lengths of semiaxes and, thus, the traces of the shape matrices, we can state thefollowing regression problem:

(RTr) MaximizeT∑

κ=1

{ R∑r=1

[Tr

(∆Σ(κ)

r

)−|Cr|∑

j=1

(κ)r,j − µ

(κ)r,j

)2]

+S∑

s=1

[Tr

(∆Π(κ)

s

)−|Ds|∑

i=1

(ρ(κ)s,i − ρ

(κ)s,i

)2]}

.

As the trace of the shape matrix of an ellipsoid is equal to the sum of the squares ofthe semiaxes, the regression problem takes the form

(R′Tr) MaximizeT∑

κ=1

{ R∑r=1

[|Cr|∑

j=1

λ(κ)r,j −

(κ)r,j − µ

(κ)r,j

)2]

+S∑

s=1

[|Ds|∑

i=1

Λ(κ)s,i −

(ρ(κ)s,i − ρ

(κ)s,i

)2]}

,

where λ(κ)r,j and Λ(κ)

s,i are the eigenvalues of ∆Σ(κ)r and ∆Π(κ)

s , respectively.

4.3 The Determinant CriterionAnother objective function of the regression model can be build with the determi-nants of the configuration matrices of the ellipsoids ∆X

(κ)r and ∆E

(κ)s :

(RDet) MaximizeT∑

κ=1

{ R∑r=1

[det

(∆Σ(κ)

r

)−|Cr|∑

j=1

(κ)r,j − µ

(κ)r,j

)2]

+S∑

s=1

[det

(∆Π(κ)

s

)−|Ds|∑

i=1

(ρ(κ)s,i − ρ

(κ)s,i

)2]}

.

Page 13: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

Regression Analysis for Clusters in Gene-Environment Networks 13

Equivalent formulations of (RDet) can be given in terms of the eigenvalues of theconfiguration matrices

(R′Det) MaximizeT∑

κ=1

{ R∑r=1

[(|Cr|∏

j=1

λ(κ)r,j

)−|Cr|∑

j=1

(κ)r,j − µ

(κ)r,j

)2]

+S∑

s=1

[(|Ds|∏

i=1

Λ(κ)s,i

)−|Ds|∑

i=1

(ρ(κ)s,i − ρ

(κ)s,i

)2]}

,

and the volumes of the ellipsoids ∆X(κ)r and ∆E

(κ)s

(R′′Det) MaximizeT∑

κ=1

{ R∑r=1

[(π

2|Cr| Γ

( |Cr|2

+ 1)

vol(∆X(κ)

r

))2

−|Cr|∑

j=1

(κ)r,j − µ

(κ)r,j

)2]

+S∑

s=1

[(π

2|Ds| Γ

( |Ds|2

+ 1)

vol(∆E(κ)

s

))2

−|Ds|∑

i=1

(ρ(κ)s,i − ρ

(κ)s,i

)2]}

.

5 Optimization MethodsWe now turn to optimization methods for the regression models of the previoussubsections. These models depend on the shape matrices Σ(κ)

r and Π(κ)s and the

distance of the centers µ(κ)r and ρ

(κ)s of the fusions ∆X

(κ)r and ∆E

(κ)s . In this sit-

uation, methods from semidefinite programming [14] can be applied, because theobjective functions of these volume-related programming problems depend on, e.g.,the determinant or eigenvalues of symmetric positive semidefinite matrices. In ad-dition, the squared Euclidean norm is conic representable ([10], p. 88) and, thus,semidefinite representable ([10], p. 132). However, some of the regression modelsintroduced above have to be slightly modified in order to obtain positive semidefi-nite representable objective functions [11]. For example, the objective function ofthe regression model (RDet) depends directly on the determinant of shape matri-ces, but the determinant det (M) considered as a function of symmetric positivesemidefinite n × n-matrices M (short: M < 0) is neither a convex nor a concavefunction of M (if n ≥ 2). However, if p is a rational number with 0 ≤ p ≤ 1

n , then

f(M) =

{−detp (M) , M < 0,

∞ , otherwise,

is positive semidefinite representable ([11], p. 81). For this reason, we introducethe regression model

Page 14: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

14 E. Kropat, G.-W. Weber, J.-J. Ruckmann

(RDet) MaximizeT∑

κ=1

{−

R∑r=1

[detp

(∆Σ(κ)

r

)+|Cr|∑

j=1

(κ)r,j − µ

(κ)r,j

)2]

−S∑

s=1

[detq

(∆Π(κ)

s

)+|Ds|∑

i=1

(ρ(κ)s,i −ρ

(κ)s,i

)2]}

,

where the rational numbers p, q fulfill the conditions 0 ≤ p ≤ 1|Cr| and 0 ≤ q ≤

1|Ds| . Since det(M) =

∏ni=1 λi(M), where λi(M) are the eigenvalues of M , we

can replace (R′Det) by

(R′Det) MaximizeT∑

κ=1

{−

R∑r=1

[(|Cr|∏

j=1

λ(κ)r,j

)p

+|Cr|∑

j=1

(κ)r,j − µ

(κ)r,j

)2]

−S∑

s=1

[(|Ds|∏

i=1

Λ(κ)s,i

)q

+|Ds|∑

i=1

(ρ(κ)s,i −ρ

(κ)s,i

)2]}

,

and instead of (R′′Det) we suggest

(R′′Det) MaximizeT∑

κ=1

{ R∑r=1

[(π

2|Cr| Γ

( |Cr|2

+ 1)

vol(∆X(κ)

r

))2p

−|Cr|∑

j=1

(κ)r,j − µ

(κ)r,j

)2]

+S∑

s=1

[(π

2|Ds| Γ

( |Ds|2

+ 1)

vol(∆E(κ)

s

))2q

−|Ds|∑

i=1

(ρ(κ)s,i − ρ

(κ)s,i

)2]}

.

We note that in case of positive definite shape matrices ∆Σ(κ)r and ∆Π(κ)

s negativepowers of the determinant can be used. If p is a positive rational, the function

f(M) =

{det−p (M) , M Â 0,∞ , otherwise,

of the symmetric n × n-matrix M is positive semidefinite representable ([11], p.83). Here, M Â 0 means that M is positive semidefinite. Now, with two positiverationals p, q we obtain a further regression model

(R′′′Det) MaximizeT∑

κ=1

{ R∑r=1

[det−p

(∆Σ(κ)

r

)+|Cr|∑

j=1

(κ)r,j − µ

(κ)r,j

)2]

+S∑

s=1

[det−q

(∆Π(κ)

s

)+|Ds|∑

i=1

(ρ(κ)s,i −ρ

(κ)s,i

)2]}

.

Page 15: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

Regression Analysis for Clusters in Gene-Environment Networks 15

In the regression model (R′Tr), sums of all eigenvalues of the configuration ma-trices ∆Σ(κ)

r and ∆Π(κ)s are considered, which can also be regarded as positive

semidefinite representable functions ([11], p. 80). In general, interior point meth-ods can applied which have a moderate complexity [29, 30, 31, 33]. Alternatively,for regression problems that consider sums of eigenvalues in the objective function,asscociated bilevel problems can be introduced which could be solved by gradientmethods. In fact, in [32] structural frontiers of conic programming are discussedwith other optimization methods compared, and future applications in machinelearning and data mining prepared. However, we would like to underline that inthe areas regression and classification of statistical learning (cf. e.g., [21, 37]), ouroptimization based methods provided and further promise very good and competi-tive results [15, 22, 41, 52, 58].

REMARK (Future Research). We represented genetic and environmental stateswith the help of ellipoids. This model addressed both data and predicted variables,and it served as an input into design or system matrices. Since we finally arrived atconic optimization problems, we may also state the corresponding dual problems.Here we will in future investigate, and also invite the reader to this, which roles theellipsoids then play in terms of the outputs of the interactions. This could establisha Dual Ellipsoidal OR.

Moreover, in [49] we expressed that we can navigate between entries and entireblock matrices, e.g., quadrants, of the interaction and design matrices, a control orregulatory measurements, e.g., in the context of Kyoto Protocol [35, 50]. A closerunderstanding of this navigating in the presence of ellipsoidal uncertainty in thestate variables and, what is more, in the blockwise (clustered) entries of our systemmatrices, is subject of future research, too.

6 ConclusionIn this paper, we analyzed a time-discrete regression model for gene-environmentregulatory networks under ellipsoidal uncertainty. The ellipsoidal approach - com-plemented by powerful optimization methods of semidefinite programming and theefficiency of interior point methods - offers a new perspective for the analysis ofregulatory systems in systems biology and life sciences, and it could even estab-lish an Ellipsoidal OR. The representation of the dynamic states of the underlyingsystem in terms of ellipsoids was motivated by our and our colleagues’ studies ongene-environment networks and eco-finance networks, where errors and uncertaintyare modeled by intervals [58, 59, 60]. Here, we further extended this approach byreplacing the intervals by more general uncertainty sets. Ellipsoids are more flexiblethan intervals and parallelpipes what also refers to stochastic dependencies betweenthe various genes and environmental factors. Moreover, ellipsoids are directly re-lated to covariance matrices and they provide good approximations of convex sets.In particular, models based on Gaussian random noise refer to the ellipsoidal ap-proach. However, Gaussian random distributions are often used as simplifications,

Page 16: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

16 E. Kropat, G.-W. Weber, J.-J. Ruckmann

and in many applications non-Gaussian probability distributions have to be applied.In future works, we will further extend our models based on ellipsoidal calculus andwe will turn to a more set-theoretic representation of uncertainty based on semi-algebraic sets [9, 12]. We will combine this new perception with refined optimiza-tion methods, and by this we will offer a further avenue for the analysis of regulatorysystems, particularly with regard to applications and real-world data. Furthermore,we refined collaborative game theory under interval uncertainty [1, 2, 3, 4] by ourellipsoidal calculus [5], herewith including correlations, as being expressed by sub-coalitions found by clustering. In terms if, e.g., joint implementation, collaborativegame theory serves for an new interpretation of the TEM model, which is some dy-namical model related with the Kyoto Protocol and a part of our gene-environmentnetworks [35, 48, 49]. With this study, we aim scientific advances and the improve-ment of living conditions on earth.

References[1] S.Z. Alparslan Gok, Cooperative Interval Games, PhD Thesis, Institute of

Applied Mathematics, Middle East Technical University, Ankara, Turkey,2009.

[2] S.Z. Alparslan Gok, R. Branzei, and S. Tijs, Convex interval games, Preprintat IAM, Middle East Technical University, Ankara, Turkey, and Center forEconomic Research, Tilburg University, The Netherlands, 2008.

[3] S.Z. Alparslan Gok, R. Branzei, and S. Tijs, Airport interval games and theirShapley value, to appear in Operations Research and Decisions, 1, 2009.

[4] S.Z. Alparslan Gok, S. Miquel, and S. Tijs, Cooperation under interval un-certainty, Math. Methods Oper. Res., 69, (2009), pp. 99–109.

[5] S.Z. Alparslan Gok, and G.-W. Weber, Cooperative games under ellipsoidaluncertainty, to appear in the Proceedings of PCO 2010, 3rd Global Confer-ence on Power Control and Optimization, February 2-4, 2010, Gold Coast,Queensland, Australia.

[6] A. Aster, B. Borchers, and C. Thurber, Parameter estimation and inverseproblems, Academic Press, 2004.

[7] A.M. Bagirov, and J. Yearwood, A new nonsmooth optimization algorithmfor minimum sum-of-squares clustering problems, European J. Oper. Res.,170, 2 (2006), pp. 578–596.

[8] Z. Barzily, Z.V. Volkovich, B. Akteke-Ozturk, and G.-W. Weber, Cluster sta-bility using minimal spanning trees, ISI Proceedings of 20th Mini-EUROConference Continuous Optimization and Knowledge-Based Technologies,Neringa, Lithuania, May 20-23, (2008), pp. 248–252.

Page 17: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

Regression Analysis for Clusters in Gene-Environment Networks 17

[9] R. Benedetti, Real algebraic and semi-algebraic sets, Hermann, Ed. des Sci-ences et des Arts, Paris, 1990.

[10] A. Ben-Tal, and A. Nemirovski, Convex Optimization in Engineer-ing - Modeling, Analyis, Algorithms. Faculty of Industrial Engineer-ing and Management Technion-City, Haifa 32000, Israel, available atwww.isa.ewi.tudelft.nl/ roos/courses/WI4218/tud00r.pdf.

[11] A. Ben-Tal, Conic and robust optimization, Lecture notes (2002)Available at http://iew3.technion.ac.il/Home/Users/morbt.phtml.

[12] J. Bochnak, M. Coste, and M.-F. Roy, Real algebraic geometry, Springer1998.

[13] E. Borenstein, and M.W. Feldman, Topological signatures of species inter-actions in metabolic networks, J. Comput. Biol., 16, 2, (2009), pp. 191–200,DOI: 10.1089/cmb.2008.06TT.

[14] S. Boyd, and L. Vandenberghe, Convex Optimization, Cambridge UniversityPress, 2004.

[15] E. Buyukbebeci, Comparison of MARS, CMARS and CART in Predict-ing Default Probabilities for Emerging Markets, MSc. Term Project Re-port/Thesis in Financial Mathematics, at IAM, METU, Ankara, August 2009.

[16] P. Durieu, E. Walter, and B. Polyak, Multi-input multi-output ellipsoidal statebounding, J. Optim. Theory Appl., 111, 2, (2001), pp. 273–303.

[17] J. Gebert, M. Latsch, E.M.P. Quek, and G.-W. Weber, Analyzing and opti-mizing genetic network structure via path-finding, Journal of ComputationalTechnologies, 9, 3, (2004), pp. 3–12.

[18] A. Gokmen, S. Kayalgil, G.-W. Weber, I. Gokmen, M. Ecevit, A. Surmeli, T.Bali, Y. Ecevit, H. Gokmen, and D.J. DeTombe, Balaban Valley Project: Im-proving the Quality of Life in Rural Area in Turkey, International ScientificJournal of Methods and Models of Complexity, 7, 1, (2004).

[19] J.R. Harris, W. Nystad, and P. Magnus, Using genes and environments to de-fine asthma and related phenotypes: applications to multivariate data, Clini-cal and Experimental Allergy, 28, 1, (1998), pp. 43–45.

[20] T. Hastie, and R. Tibshirani, Discriminant adaptive nearest neighbor classifi-cation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 18,6, (1996), pp. 607–616.

[21] T. Hastie, R. Tibshirani, and J. Friedman, The elements of statistical learning,Springer (2001).

Page 18: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

18 E. Kropat, G.-W. Weber, J.-J. Ruckmann

[22] A. Iscanoglu, G.-W. Weber, and P. Taylan, Predicting default probabilitieswith generalized additive models for emerging markets, invited lecture, Grad-uate Summer School on New Advances in Statistics, METU (2007).

[23] E. Kropat, S. Pickl, A. Rossler, and G.-W. Weber, On theoretical and practicalrelations between discrete optimization and nonlinear optimization, in specialissue Colloquy Optimization Structure and Stability of Dynamical Systems(at the occasion of the colloquy with the same name, Cologne, October 2000)of Journal of Computational Technologies, 7, 2002), pp. 27–62.

[24] E. Kropat, G.-W. Weber, and B. Akteke-Ozturk, Eco-finance networks underuncertainty, in J. Herskovits, A. Canelas, H. Cortes, and M. Aroztegui, eds.,Proceedings of the International Conference on Engineering Optimization(ISBN 978857650156-5, CD), EngOpt 2008, Rio de Janeiro, Brazil (2008).

[25] E. Kropat, G.-W. Weber, and C.S. Pedamallu, Regulatory networks under el-lipsoidal uncertainty - optimization theory and dynamical systems, Preprintat IAM, METU, Ankara, Turkey, 2009. Submitted to SIAM Journal on Opti-mization.

[26] A.B. Kurzhanski, and I. Valyi, Ellipsoidal Calculus for Estimation and Con-trol, Birkhauser, 1997.

[27] A.A. Kurzhanski, and P. Varaiya, Ellipsoidal Toolbox Manual, EECS Depart-ment, University of California, Berkeley, 2008.

[28] W.B. Mattes, S.D. Pettit, S.A. Sansone, P.R. Bushel, and M.D. Waters,Database development in toxicogenomics: issues and efforts, Environmen-tal Health Perspectives, 112, (4), (2004), pp. 495-505.

[29] A. Nemirovski, Five Lectures on Modern Convex Optimization, C.O.R.E.Summer School on Modern Convex Optimization (2002), Available athttp://iew3.technion.ac.il/Labs/Opt/opt/LN/Final.pdf.

[30] A. Nemirovski, Lectures on Modern Convex Optimization, Israel Institute ofTechnology (2002), available athttp://iew3.technion.ac.il/Labs/Opt/opt/LN/Final.pdf.

[31] A. Nemirovski, Interior Point Polynomial Time Algorithms in Convex Pro-gramming, Lecture Notes (2004), available at https://itweb.isye.gatech.edu.

[32] A. Nemirovski, Modern Convex Optimization, Lecture at PASCAL Work-shop, Thurnau, Germany, March 16-18 (2005).

[33] Y.E. Nesterov, and A.S. Nemirovskii, Interior Point Polynomial Algorithmsin Convex Programming, SIAM, 1994.

[34] M. Partner, N. Kashtan, and U. Alon, Environmental variability and modu-larity of bacterial metabolic network, BMC Evolutionary Biology, 7, (2007),169doi:10.1186/1471-2148-7-169.

Page 19: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

Regression Analysis for Clusters in Gene-Environment Networks 19

[35] S. Pickl, Der τ -value als Kontrollparameter - Modellierung und Analyseeines Joint-Implementation Programmes mithilfe der dynamischen kooper-ativen Spieltheorie und der diskreten Optimierung, Thesis, Darmstadt Uni-versity of Technology, Department of Mathematics, 1998.

[36] L. Ros, A. Sabater, and F. Thomas, An ellipsoidal calculus based on prop-agation and fusion, IEEE Transactions on Systems, Man and Cybernetics,Part B: Cybernetics, 32, 4 (2002), pp. 430–442.

[37] Y. She, Sparse regression with exact clustering, PhD Thesis, Department ofStatistics, Stanford University, USA, 2008.

[38] M. Tastan, Analysis and prediction of gene expression patterns by dynamicalsystems, and by a combinatorial algorithm, MSc Thesis, Institute of AppliedMathematics, METU, Turkey, 2005.

[39] M. Tastan, T. Ergenc, S.W. Pickl, and G.-W. Weber, Stability analysis of geneexpression patterns by dynamical systems and a combinatorial algorithm, inHIBIT – Proceedings of International Symposium on Health Informatics andBioinformatics, Turkey ’05, Antalya, Turkey, 2005, pp. 67–75.

[40] M. Tastan, S.W. Pickl, and G.-W. Weber, Mathematical modeling and stabil-ity analysis of gene-expression patterns in an extended space and with Runge-Kutta discretization, in Proceedings of Operations Research 2005, Bremen,September 2005, Springer, pp. 443–450.

[41] P. Taylan, G.-W. Weber, and A. Beck, New approaches to regression by gen-eralized additive models and continuous optimization for modern applica-tions in finance, science and techology, in the special issue in honour of Prof.Dr. Alexander Rubinov, Burachik, B., Yang, X. (guest eds.) Optimization, 56,5-6, (2007), pp. 1–24.

[42] O. Ugur, S.W. Pickl, G.-W. Weber, and R. Wunschiers, Operational researchmeets biology: An algorithmic approach to analyze genetic networks andbiological energy production, Preprint no. 50, Institute of Applied Mathe-matics, METU, 2006, submitted for the special issue of Optimization at theoccasion of the 5th Ballarat Workshop on Global and Non-Smooth Optimiza-tion: Theory, Methods, and Applications (2006).

[43] O. Ugur, and G.-W. Weber, Optimization and dynamics of gene-environmentnetworks with intervals, in the special issue at the occasion of the 5th BallaratWorkshop on Global and Non-Smooth Optimization: Theory, Methods andApplications, November 28-30, 2006, of J. Ind. Manag. Optim., 3, 2, (2007),pp. 357–379.

[44] A.Y. Vazhentsev, On internal ellipsoidal approximations for problems of con-trol synthesis with bounded coordinates, J. Comput. System Sci. Int., 39, 3,(2000), 399.

Page 20: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

20 E. Kropat, G.-W. Weber, J.-J. Ruckmann

[45] F.G. Vazques, J.-J. Ruckmann, O. Stein, and G. Still, Generalized semi-infinite programming: A tutorial, Journal of computational and appliedmathematics, 217, 2, (2008), pp. 394–419.

[46] G.-W. Weber, Charakterisierung struktureller Stabilitat in der nichtlinearenOptimierung, in Aachener Beitrage zur Mathematik 5, H.H. Bock, H.T. Jon-gen, and W. Plesken, eds., Augustinus publishing house (now: Mainz pub-lishing house) Aachen (1992).

[47] G.-W. Weber, Generalized semi-infinite optimization and related topics, Hel-dermann Publishing House, Research and Exposition in Mathematics 29,Lemgo, K.H. Hofmannn and R. Wille, R., eds. (2003).

[48] G.-W. Weber, S.Z. Alparslan-Gok, and N. Dikmen, Environmental and lifesciences: gene-environment networks - optimization, games and control - asurvey on recent achievements, Preprint at Institute of Applied Mathematics,invited paper, to appear in the special issue of Journal of OrganisationalTransformation and Social Change, 5, 3, (2008), pp. 197–233 .

[49] G.-W. Weber, S.Z. Alparslan-Gok, and B. Soyler, A new mathematical ap-proach in environmental and life sciences: gene-environment networks andtheir dynamics, Environmental Modeling & Assessment, 14, 2, April 2009,pp. 267-288.

[50] G.-W. Weber, E. Kropat, B. Akteke-Ozturk, and Z.-K. Gorgulu, A surveyon OR and mathematical methods applied on gene-environment networks,Special Issue on ”Innovative Approaches for Decision Analysis in Energy,Health, and Life Sciences” of Central European Journal of Operations Re-search (CEJOR) at the occasion of EURO XXII 2007 (Prague, Czech Repub-lic, July 8-11, 2007), 17, 3, (2009), pp. 315–341.

[51] G.-W. Weber, E. Kropat, A. Tezel, and S. Belen, Optimization applied onregulatory and eco-finance networks - survey and new developments, to ap-pear in Pac. J. Optim., 6, 3, Special Issue in memory of Professor AlexanderRubinov, M. Fukushima, M., et al., guest eds. (2010).

[52] G.-W. Weber, I. Batmaz, G. Koksal, P. Taylan and F. Yerlikaya-Ozkurt,CMARS: A new contribution to nonparametric regression with multivariateadaptive regression splines supported by continuous optimisation, Preprint atIAM, METU, Ankara, submitted to Advances in Computational and AppliedMathematics.

[53] G.-W. Weber, S. Ozogur-Akyuz, and E. Kropat, A review on data mining andcontinuous optimization applications in computational biology and medicine,Embryo Today, Birth Defects Research (Part C), 87, (2009), pp. 165–181.

[54] G.-W. Weber, P. Taylan, S.-Z. Alparslan-Gok, S. Ozogur, and B. Akteke-Ozturk, Optimization of gene-environment networks in the presence of er-rors and uncertainty with Chebychev approximation, TOP, the Operational

Page 21: based on Ellipsoidal Calculus and Optimization - unibw.de fileRegression Analysis for Clusters in Gene-Environment Networks based on Ellipsoidal Calculus and Optimization In Commemoration

Regression Analysis for Clusters in Gene-Environment Networks 21

Research journal of SEIO (Spanish Statistics and Operations Research Soci-ety) 16, 2 (2008), pp. 284-318.

[55] G.-W. Weber, and A. Tezel, On generalized semi-infinite optimization of ge-netic networks, TOP, 15, 1, (2007), pp. 65–77.

[56] G.-W. Weber, A. Tezel, P. Taylan, A. Soyler, and M. Cetin, Mathematicalcontributions to dynamics and optimization of gene-environment networks,in Special Issue: In Celebration of Prof. Dr. Dr. Hubertus Th. Jongen’s 60thBirthday, D. Pallaschke, O. Stein, guest eds., of Optimization, 57, 2, (April2008), pp. 353-377.

[57] G.-W. Weber, O. Ugur, P. Taylan, and A. Tezel, On optimization, dynamicsand uncertainty: a tutorial for gene-environment networks, in the SpecialIssue Networks in Computational Biology of Discrete Appl. Math., 157, 10,May 2009, pp. 2494–2513.

[58] F. Yerlikaya, A new contribution to nonlinear robust regression and classi-fication with MARS and its applications to data mining for quality controlin manufacturing, Thesis, Middle East Technical University, Ankara, Turkey,2008.

[59] F.B. Yılmaz, A mathematical modeling and approximation of gene expres-sion patterns by linear and quadratic regulatory relations and analysis of genenetworks, MSc Thesis, Institute of Applied Mathematics, METU, Ankara,Turkey, 2004.

[60] F.B. Yılmaz, H. Oktem, and G.-W. Weber, Mathematical modeling and ap-proximation of gene expression patterns and gene networks, in OperationsResearch Proceedings, F. Fleuren, D. den Hertog, and P. Kort, eds., 2005,pp. 280–287.