nonlinear analysls and cobb's cusp surface analysis program€¦ · catastrophe theory...

Numemus n o n l i ~ ~ a r pknotnena rhar exhibit dircondnuolls jumps in behavior have been modeled with c~llcurropk theory. CobbCobb's Cusp Surface Analysis Pmgmm (CUSP) pmvides a way to empirically urimatl and test a nonibwar cvrp ataftrvpk modeL A model o f emotion during pmblem soking is used to intmduce c a ~ n v p h c tkory modeling and is discussed in conjunction with how to nm CUSI: how to interpmt tk ouipu!, and how to impmve CUSP> perfofonnance. C o ~ o m arr madc between linear mgrusion and c a i m m p k theory modclc and between catastrvpk tkory gmdient ~ ~ M W U C S and Cobb's sraric stutistical nup modcL

NONLINEAR ANALYSLS

Catastrophe Theory Modeling and Cobb's Cusp Surface Analysis Program

BRADFORD D. ALLEN JAMES CARIFIO

University of Macsachuserts at h l l

In the past decade, there has been considerable interest, both theoretical and analytical, in nonlinear models of various phenomena (e.g., Gleick 1987). The interest has been characterized by some as a movement toward greater - sophistication in thinking about and modeling phenomena (e.g., Pagels 1989) and by others as part of a general trend toward neonaturalistic methodology in research (e.g.. Eisner 1991). Much of the discussion of nonlinear models in both the popular press and scholarly journals, however, has failed to note the lack of available statistical methods for testing whether relationships are indeed nonlinear. A number of advances have been made in this area in recent years that are "b~akthrough" in character, and they are worth describing further to practicing researchers.

The purpose of this article is to discuss nonlinear statistical analysis and catastrophe theory in general and to present CUSP, Cobb's Cusp Surface Analysis Program. CUSP, which is available from Cobb, is designed to assess whether the relationship between the variables can be modeled as a nonlinear cusp surface, one of the seven elementary catastrophe theory models

EVALUATION REVIEW, Vol. 19 NO. I. February 1995 64-83 0 1995 Sage Publiearioac, Inc

64

Men. Carifio I CATASTROPHE THEORY MODELING 65

(Zeeman 1977). How to use CUSP will be discussed after we outline the theory and provide an example.

NONLINEAR ANALYSIS

Since Poincark first worked on dynamical systems in the 17th century and until relatively recently, nonlinear equations have been viewed as essentially the same as linear equations. During that time, analytic solutions were emphasized even though few nonlinear models could be solved explicitly. It was believed that linear approximations were adequate and that nonanalytic solutions added nothing to the understanding of the phenomenon. In the early 1960s, new mathematical developments and the availability of computers led to the realization that nonlinear dynamic models are very different from linear models, and that nonlinear equations have characteristics that cannot be approximated by linear equations. Interest in nonlinear mathematical models for their own sake continued, but studies applying those models were viewed with skepticism. The description of increasingly complex phenomena required more variables and more linear approximations, which only decreased the confidence in the solutions.

In the past 25 years, however, there has been an explosion of nonlinear modeling and research. An important methodology in this research is bifurcation theory, which examines the qualitative changes that take place in the geometry or topology of solutions over time and allows many phenomena and their models to be grouped into classes according to their behavior. The classifications are especially useful when parameters are close to values where a change of form takes place. Because fewer variables are required to characterize the discontinuities than are required for smooth transitions, generally it is the discontinuities rather than smooth transitions that offer the most insight into phenomena (Bai-lin 1989). Catastrophe theory modeling, which falls under the broad umbrella of bifurcation theory, describes the change of form that can occur around certain types of discontinuities.

CATASTROPHE THEORY

Numerous nonlinear phenomena that exhibit discontinuous jumps in behavior have been modeled using catastrophe theory. The rapid changes in perception of ambiguous figures have been modeled in Poston and Stewart (1978). in Stewart and Peregoy (1983), and in Ta'eed. Ta'eed, and Wright

66 EVALUATION REVIEW / FEBRUARY 1995

(1988). &man (1977) models rapid changes in mood, the sudden crashes and surges in the stock market, prison disturbances, the influence of public opinion on the policy adopted by an administration, anorexia nervosa, and censorship in a permissive society. A model of problem solving in which the solver exits from the problem-solving process either with or without the solution is presented by Boles (1990). and misconceptions in science education are modeled by Boyes (1988). Some other catastrophe theory models include the following: attitude with respect to an election survey (Anderson 1985). research in higher education (Staman 1982). attitudes and social behavior (Flay 1978), birth rates throughout nations (Cobb 1978). attitude change and behavior (Cobb and Watson 1980), psychoanalytic phenomena (Callahan 1990). the emergence of urban slums Oendrinos 1979). patterns of blaming nurses for incidents of aggression (Carifio and Lanza 1992), nonresponse in surveys (Carifio, Biron, and Shwedel 1991), personnel selection, therapy, and policy evaluation (Guastello 1982). motivation in organizations (Guastello 1987). and accidents in an organization (Guastello 1988). For a more complete list of catastrophe theory models over a wide range of applications, see Guastello (1987).

There are several characteristics of catastrophe theory that are the promi- nent features of any cusp catastrophe theory model (see Figure 1). These features are as follows.

1. Over some part of the range of the phenomenon, the behavior is ambiguous; that is, either of two stable but distinctly different behaviors can occur. This characteristic is known as bimodality.

2. Slight changes in the phenomenon can result in discontinuity-an abrupt "jump" (or catastrophe) from one behavior to the other.

3. Over a part of the range of the phenomenon. there is a middle region between the two possible types of behavior that is inaccessible (cannot occur).

4. Because of the inaccessible region. for some range of the phenomenon, jumps from one behavior to a distinctly different behavior do not occur. The result is that jumps between different behaviors depend on the current behavior state and therefore do not always occur at the same point. If the phenomenon cycles back and forth between the two behaviors, there is always a delay between the transitions. This property is known as hysteresis.

5. The phenomenon may exhibit sensitivity or instability. Small perturbations in the phenomenon may lead to large differences in subsequent behavior. Behav- iors may diverge in the sense that two behaviors may start very close together and experience the same circumstances, yet they may take very different paths and end up far apart. This characteristic is known as divergence.

Allen, Carifio I CATASTROPHE THEORY MODELING 67

Figure 1 : Cusp Surface NOTE: The bifurcation set is the boundary of the region where the response is bimodal.

Catastrophe theory combines all five characteristics-bimodality, discontinuity, inaccessibility, hysteresis, and divergence-into one model. The model relates each characteristic to each of the others. According to Thorn (1975), the progenitor of catastrophe theory, the method has the potential for modeling the evolution of forms in all aspects of nature. Given a process in which one of the characteristics is evident, the process should be examined for the other four. With evidence for two or more characteristics, the process becomes an excellent candidate to be modeled with catastrophe theory (Zeeman 1976).

A cusp catastrophe model assumes that there are two different types of behavior that are controlled by two orthogonal factors. The two factors form a plane called the "control space." Contained in the control space is the bifurcation set. Discontinuous changes may occur at the bifurcation set Whether a discontinuous change occurs depends on how the cusp surface is oriented, and on the direction the trajectory takes in crossing the bifurcation set. The factors can be oriented so that one factor runs down the pleat in the cusp surface. A factor oriented this way is called the "splitting" or "bifurcation" factor. Increasing the value of the splitting factor causes the two

68 EVALUATION REVIEW I FEBRUARY 1995

w n i l i c t i n q /A\

bifurcation set

f a c t o r

Figure 2: Control Space Wlth l k o Posslble Axes Orlentatlons

behavior surfaces to separate. The other factor is perpendicular to the splitting factor and so is often called the "normal" or "asymmetry" factor. As this factor increases, the effect is that the behavior eventually changes (possibly discon- tinuously) from one type of behavior to the other.

The factors can be oriented in a way so that the behavior elicited by one factor is opposite to, or conflicts with, the behavior elicited by the other factor. Increasing the value of one conflicting factor tends to push the behavior onto the upper surface, but increasing the value of the other conflicting factor tends to push the behavior onto the lower surface. Inside the bifurcation area, the two factors conflict. Needless to say, the conflicting factors are formed by rotating the bifurcation and asymmetry factors. In either case, the two control factors may be actual measurable variables, or more likely, they are formed by weighted averages of several independent variables. Ln the latter case, the control factors are named according to the underlying construct that they represent.

AN EXAMPLE

Catastrophe theory modeling can be illustrated by discussing acusp model of emotion during problem solving. Rapid changes in emotion are often a part of the process of problem solving. Negative feelings of frustration,

Allen, Carifio I CATASTROPHE THEORY MODELING 69

dismay, and defeat, along with positive feelings of relief, triumph, and hope, are common and can occur repeatedly in the course of solving a single problem. If the onset of positive or negative emotion is sudden and intense, the experience is often identified as either "Aha!" (Parnes 1975; Purica 1988) or "Oh-oh!" respectively. A model of emotion during problem solving can contribute to the understanding of the emotional aspects of problem solving. An awareness of emotion during problem solving might help students moni- tor and reflect on their own feelings, thinking, and performance, and might increase students' willingness to persevere at mathematical tasks.

The model of emotion presented here is motivated by Mandler's theory of emotion (Mandler 1984). Mandler's theory states that the incongruity between what is expected and what is experienced causes the intermption of ongoing plans or planned activity. The interruption of plans leads to physiological arousal and cognitive evaluation, the two major systems of emotion. Physiological arousal occurs after any interruption of an individual's plans or planned behavior. The strength of the interruption depends on the amount of incongruity between what is expected and what is experienced. Evaluation of the interrupted event is either positive or negative, with the particular outcome depending largely on the assimilative and accommodative processes that take place as a result of the intermption. Arousal is nonspecific in that it contributes nothing to the evaluation, whether positive or negative. Arousal provides only the visceral or "gut" feel that determines the intensity of emotion. On the other hand, the evaluation of the situation, which depends on how the interruption is interpreted, determines the quality or tone of emotion. According to Mandler's theory, evaluation and arousal are the two processes that, when combined, produce emotion.

There is no need to modify the control factors of the catastrophe theory model to fit Mandler's theory of emotion. The model factors match perfectly with the constructs defined by Mandler. Physiological arousal is measured along the bifurcation axis because increasing physiological arousal does no more than increase emotion intensity, whether positive or negative. Incon- gruity is measured along the asymmetry axis because increasing incongruity tends to push emotion from positive to negative. Because of what Mandler calls "bands of tolerance," transitions from either positive to negative or negative to positive often are not smooth. The bands of tolerance explain the discontinuity and hysteresis effects that are part of the emotion process. Most important, emotion can be characterized as bimodal (Russell 1979). making emotion easy to model with catastrophe theory. Emotion is either agreeable or disagreeable. These two states are not independent of each other; they are bipolar opposites.


To empirically demonstrate the appropriateness of a catastrophe theory model of emotion during problem solving, data were collected from adult subjects while they were solving problems. In a graduate class titled Mathe- matical Problem Solving, each of the 15 students in the class was given three problems. The problems were difficult, multiple-step problems that required different skills and often were not solvable in one sitting. The students were given several questionnaires that they filled out while they were solving the problems. Each questionnaire contained 12 questions, such as "How mentally energized are you?" "How comfortable are you with your progress?"'How successful do you expect to be?' and "How frustrated are you?" Answers were chosen from a Likert-type scale with a range from 0 (not at all) to 6 (very much).

Students were requested to answer the questions at times of their own choice as long as they felt particularly frustrated or pleased with their problem-solving progress. Because of the nature of catastrophe theory, sampling at random times was not important. What was important was to catch the student in an emotional state that was stable with respect to the importance, expectation, and progress at the time the questionnaire was answered. A total of 67 questionnaires were collected. Each questionnaire was treated as an independent observation of the phenomenon for the purpose of analysis, as what was being assessed was surface fit rather than movement from point to point on the surface. This example will be referred to further as we continue to outline catastrophe theory modeling.

GRADIENT DYNAMICS

An essential part of catastrophe theory is gradient dynamics. It is from this concept that catastrophe theory arises. Agradient dynamic system is a system in which the process moves toward a local maximum or minimum and where a single function can be used to describe the movement for each of the state (independent) variables; that is, one function determines the movement in the dynamic process. In the language of catastrophe theory, the function is called the "potential function" and just as in numerous physical models, movement is determined by the gradient (slope) of the potential function.

Although the claim that "catastrophe theory applies only to gradient systems" is a natural reaction to numerous poorly constructed qualitative catastrophe theory applications (Poston and Stewart 1978). the claim comes from a misunderstanding of the broad range of catastrophe theory applicability. Although catastrophe theory is driven by the maximization or minimiza-

Men, Carifio I CATASTROPHE THEORY MODELING 71

tion of potential functions, catastrophe theory modeling is not restricted to modeling only gradient systems. Many types of dynamic phenomena that involve some type of optimization, but not necessarily the maximization of potential functions, can be related to catastrophe theory by the construction of suitable Liapunov functions.

A Liapunov function is a function of the state variables that conveys some information about the behavior of a system as it evolves in time. A function is sought that continually increases toward locally maximum values as the system evolves. Such a function condenses the behavior of the entire system to gradient-like behavior and thus can serve as a potential function. Catastro- phe theory can then be used to model the summarized behavior described by the Liapunov function. In the generalized form, Liapunov functions are perhaps the most powerful method for analyzing nonlinear systems (Luen- berger 1979).

In spite of the broad range of applicability, it is still necessary to under- stand and discuss catastrophe theory as a gradient dynamic system. Without loss of generality, the case discussed here will be one in which the process moves toward local maxima

The shape of the potential function, and therefore the movement in the dynamical system, is parameterized by a particular set of control factors. For a fixed set of control factor values (called a control point), the corresponding potential function has a particular set of maxima and minima Once the process moves to the local maximum, movement stops and the system finds itself in equilibrium. As the control parameters change with time, maxima and minima may flatten out and new extrema may appear in other places. A minimum and maximum can come together and negate each other, leaving the system no longer in equilibrium. Guided by the potential function, the system rushes toward the next local maximum. The "jump" that occurs as the system moves from equilibrium to disequilibrium and to equilibrium again is called a catastrophe (Fararo 1978). Describing these jumps is the essence and strength of catastrophe theory.

In our example of emotion during problem solving, the assumption that emotion is gradient-like is based on the Gestaltist principle of Priignanz (Koffka 1935). This principle states that a given stimulus is perceived as its simplest interpretation. "Simplest interpretation" for a given stimulus is a stable attractor. In situations in which more than one simplest interpretation is possible, simplest interpretation is the perception that is simpler than any nearby or closely similar perception. This gradient-like dynamic is used by Stewart and Peregoy (1983) in their catastrophe model of perception and is equally appropriate for the emotion model. For many social science applications, if there are attractors in the behavior and there is no periodicity (or

72 EVALUATION REMEW I FEBRUARY 1995

worse-chaos), then the assumption that the behavior is gradient-like is a reasonable one.

The principle of Mgnanz as applied to problem solving states that a particular problem-solving situation is usually seen as being either "good"- one's progress toward the solution is proceeding as hoped or planned--or "bad"--one's progress is not proceeding as planned. Certainly there are times at which one's progress is seen as a fuzzy combination of good and bad, but the gradient-like nature of perception tends to push the percept to one of the simplest interpretation attractors: good or bad.

COBB 'S CUSP SURFACE ANALYSIS PROGRAM

Catastrophe theory describes the behavior at certain points in a gradient dynamic system. Cobb (Cobb and Watson 1980; Cobb and Thrall 1981) shows that a gradient system with white noise included as part of the dynamic leads directly to solutions that are multimodal probability distributions. Under certain conditions, CUSP is able to estimate a probability density function from which a cusp catastrophe surface is calculated. CUSP assumes that the probability density function is a multimodal generalization of the Pearson system of normal densities (Ord 1972) with either one mode or two modes separated by an antimode (Cobb, Koppstein, and Chen 1983). CUSP uses a maximum likelihood procedure to find the most likely generalized normal density that would have produced the observations (Cobb 1981).

The first step taken by CUSP is to estimate the coefficients of the linear regression model. This model, which is of the form 0 = Y - C, where C is a linear combination of the independent variables, becomes the null hypothesis against which the statistical model is compared. The linear model also serves as the starting point for aNewton-Raphson-based optimization technique that searches for the probability density function parameters that maximize the likelihood of the cusp model. If the first iteration shows a decrease in likelihood, the iteration process stops and CUSP reports that the linear model is a better fit than any cusp model. This is the result that should be expected unless the empirical distribution is measurably bimodal over some region.

Prior to using Cobb's program, the emotion data was transformed into four component variables. The dependent variable, Y, is a combination of the6'how pleased" and "how frustrated" questions and is used as a measure of emotion. The first independent variable, XI, is a combination of variables related to motivation. The second independent variable, X,, is a combination of variables related to the perception of progress with the problem. The third

Allen, Carifio / CATASTROPHE THEORY MODELING 73

variable, X,, is a combination of variables related to the expectation of success. As previously stated, the questionnaire was designed as a pilot study to gather a broad range of data on variables that were suspected of being related to emotion as described by Mandler's theory. The component variables do not align themselves directly with the concepts of Mandler's theory because these concepts are not measured directly by the questionnaire.

The standardized linear model that was found for emotion is Y = -.2X, + .4X, + .4X3. This equation shows that as motivation increases, negative emotion increases, and as progress and expectation increase, positive emotion increases. One should note that progress and expectation each have txice the weight of motivation. CUSP starts with this model in its search for the cusp model.

In maximizing the likelihood of the generalized probability density, CUSP iterates toward a statistical model that has the form:

The set of modal values of the generalized density that maximizes the likelihood of the observations forms a solution to Equation 1. The statistical model comes from the canonical cusp model:

O=a+by-y3. (2)

The canonical model variables (a, b, and y) can be derived from the statistical model variables (A, B, C, D, and Y) by a smooth, invertible transformation. A, B, and C are the control factors of the model and are linear combinations of the independent variables. These factors allow for a pleat in the surface but give the statistical model a form that is different from the canonical model. All the factors are necessary, however, because the statistical model is derived from raw data that may not be perfectly centered at the origin nor aligned along axes as in the canonical model of Equation 2.

Factors A, B, and C are linear combinations of two, to up to seven, independent variables, The factors, along with the scalar coefficient D, determine a single dependent variable Y. In Cobb's methodology, Y is the predicted value of the random variable Y when the independent variables, which are known fixed numbers, are given. Factor A is the asymmetry factor, B is the bifurcation factor, C is the linear factor, and D is a scale coefficient.

In catastrophe theory modeling, it is convenient, though not necessary, to let the factors be linear combinations of the independent variables. CUSP uses linear combinations of independent variables to form the factors. Doing so lirmts the way the cusp surface can be adjusted to best fit the data Nonlinear transformations such as localized rotations, stretches, or pinches


Figure 3: Cusp Surtace Emotlon Model

are not part of CUSP'S analysis. This limitation has the advantage that the number of estimated parameters is small.

The control factors of the emotion model with standardized coefficients were found to be:

Thus the emotion model is

O=(.2-.2Xl+.2X2+.8X3)+(2.9-XI+.6X2-.9X3)(Y+.1Xl-.3XJ -4(Y + .lXI - .3xJ3.

To evaluate and interpret the above equation's fit to the observations, CUSP performs certain statistical tests that are described and discussed below. It may be stated here, however, that the cusp model is a far better model of emotion described by the data than is the linear model.

The cusp catastrophe statistical model is actually a generalization of the linear regression model. When A and D of Equation 1 are zero and for a particular value B, the statistical model reduces to the regression model. For

Men, Caritio / CATASTROPHE THEORY MODELING 75

these values. the coefficients of the linear combination of X, that equals C in Equation 1 are the same as those of the regression model's coefficients. The regression model can be seen to be a special case of catastrophe theory from the equatiofis that describe the two models (Cobb and Zacks 1985). Polyno- mial regression models have the form:

y = h + P , x + . . . + h x k + ~ , (3)

whereas the canonical catastrophe theory models are derived from equations of the form:

0 = PO(X) + pl(x)y + . .. + pd(x)yd . (4)

Equation 4 has the form of a linear polynomial regression model when d = 1, po is a polynomial function of x, and p l is a constant. Equation 4 has the form of a cusp model when d = 3, po and p l are linear functions of x, and pz and p3 are constants. The main difference in these two models is that Equation 4 is multivalued for some x but Equation 3 is always single valued.

The statistical model derived by CUSP is closely related to the dynamic model. A catastrophe cusp surface is the set of equilibrium points of a dynamic deterministic system in which the dynamic is described by a potential function. Under what is called the "delay rule," Y moves in the direction of increasing potential at a rate proportional to the slope of the potential function. When the movement stops, Y has reached equilibrium at one of possibly two local maxima on the potential function. In other words, the cusp surface is the set of points, Y, that locally maximize a possibly bimodal potential function.

In contrast, Cobb's static statistical model uses a probability density function to determine the predicted values of Y. The cusp surface is defined as the set of modes (the maxima) of a possibly bimodal probability density function. More specifically, the estimated cusp surface is calculated from the set of modes of the conditional, possibly bimodal, maximum likelihood density function for Y. The antimodes (local minima) of the density are the Y values predicted not to occur. 'This method of deriving acusp surface allows for perturbations in the system resulting from random influences. Cobb's model is not restricted to predicting exact values, and state changes do not have to occur exactly at cusp boundaries. Observations occur at predicted values only on the average.

CUSP performs three separate statistical tests to assess whether the estimated cusp model is superior to the linear model. The first test compares the likelihood of the cusp model to the likelihood of the linear model. Twice the difference in the logs of the likelihoods has approximately a chi-square


distribution (Mendenhall, Scheaffer, and Wackerly 1986). Asufficiently large chi-square statistic resulting from a large difference in likelihoods indicates that for given data, the cusp model has a significantly greater likelihood than does the linear model. Second, CUSP assesses whether the estimated model is actually a cusp surface. Cusp surfaces requires that D be positive and that either A or B be nonzero. CUSP applies a one-tailed t test to assess whether D is significantly greater than zero and uses two-tailed tests to test A and B. Third, to avoid ending up with a surface that was derived by extrapolating over the bimodal region, CUSP requires that at least 10% of the data points fall in the bimodal region of the estimated model. If all three tests are passed, it may be concluded that the cusp model describes the relationship between the dependent random variable Y and the fixed (measured without error) vector X of independent variables (Cobb 1992). These and a number of other statistics and graphs are derived by CUSP to help detcnnine the estimated model's prediction quality.

In testing the emotion model, the statistic that compares whether the cusp model is superior to the linear model by comparing the likelihood of the cusp model to the likelihood of the linear model was chi-square = 30.07 with 8 degrees of freedom (df = 2 times the number of independent variables plus 2). This test is significant at the p = .002 level. The bifurcation coefficient was significantly different from zero with t= 2.96 with 55 degrees of freedom (df= 67 data points - 3 times the number of independent variables - 3). This one-tailed t test is significant at the p = ,005 level. Also, 64% of the cases fell in the bimodal area.

Because of its topological nature, catastrophe modeling poses special problems in computing statistics such as goodness of fit. Observations within the pleated region of the surface could be deviations fiom either of the pleat's two prediction surfaces. The predicted surface for any given point should be the surface the point would move to if the model was a dynamic gradient system. For CUSP to associate each data point with the appropriate surface so that variation of the data about the estimated model can be computed, an association convention is needed. CUSP uses a modification of the delay convention. In catastrophe theory literature, the delay convention usually means that the process always moves to the local maximum rather than always moving to the global maximum. The delay convention for the maximum likelihood model, in which probability functions have replaced potential functions, means that the predicted value is the density's mode that is on the same side of the antimode as the observed value.

The coefficient of determination, 2, which gives an idea of how well a surface fits a set of data, is computed for both the linear regression model and for the cusp model. The delay convention is used in computing 2 for the cusp

AUeo. Carifio 1 CATASTROPHE THEORY MODELING 77

model, so that coefficient is called "delay 9." (The symbol ?(&@, is used for delay 2 in the formulas that follow.)

The two 2 statistics have much in common. In both cases, the predicted value is a mode of the conditional probability distribution. In computing delay 2, the conditional probability distribution is assumed to be a generalized normal density. The predicted value is the mode that is closest to the observed value. In computing the linear regression 9 , the conditional probability distribution is a traditional normal density, and the predicted value is the one and only mode of the normal conditional probability density. The delay statistic and the linear regression 2 statistic should each be interpreted as the proportion of variance accounted for by the model. CUSP computes both of these values so that the amounts of variance explained by each model can be compared.

The cusp model of emotion proved far superior to the linear model. The linear 2 statistic was .45, whereas the delay 2 statistic was 33. This difference means that 84% more variation (331.45 x 100%) is explained by the cusp model than is explained by the linear model.

Rather than looking at the proportion of variation explained by the cusp model, it may be useful to know how much raw score standard deviation is left after factoring out the effect of the model. The statistic that measures the raw score standard deviation about the model is called the standard error of estimate. The standard error of estimate for the cusp model is computed as:

where Sy is the standard deviation of the dependent variable observations. This formula is similar to the standard error of estimate ScsN~incOT, = S,(1 - ?)'Rfor linear regression. The reasoning behind these formulas is as follows. The total variance of y, written as syZ, can be separated into the part explained by the model and the part not explained by the model. Because delay 3 is the proportion of the total variance explained by the model, the amount of variance attibutable to the model is ?(hb) syZ. The rest of the variance is the residual error variance, otherwise known as variance of estimation, and is written as sZesNdehy,. Therefore

P U d & b ) = s;(l- 4&b$ from which Equation 5 above follows. CUSP does not report the raw score standard error of estimate but does report the standard error of estimate for

78 EVALUATION REVIEW / FEBRUARY 1995

the estimated model in units of the standard deviation of the dependent variable y. This statistic is computed as

S d & b / S ~ = (1 - b b d ' n - (6)

CUSP prints a histogram of the residuals from the cusp surface estimation. A symmemcal tightly disbursed distribution centered at zero is most desir- able. The histogram of the residuals gives a visual representation of how well the cusp model fits the data

The standard deviation of the dependent variable of the emotion data is S, = 1.8, so St,db, = 1.8(.17)~ = .74 versus Sdk, = 1.8(.551h = 1.33. These results reflect the variation left unexplained by the linear model. The histogram was not perfectly symmetrical, which may indicate that more data are needed to improve the fit.

CUSP prints the predicted values along with the asymmetry and bifurcation coordinates for each of the first 100 data points. A scatter diagram of the data in the contr~l space (the plane defined by the asymmetry and bifurcation axes) is also included in the output. This diagram shows how many of the data points fall inside the pleated region.

The boundary of the pleated region (the bifurcation set) is not calculated explicitly by CUSP but can be obtained from the formula:

Alpha, beta. and delta in the above equation (and also gamma, not in the equation) are parameter values reported by CUSP. These values are the cusp model factor estimates A, B, C, and D discussed above.

CUSP also prints several graphs showing the effect of varying each independent variable while the other variables are held fixed. One should expect to see a graph that looks like the intersection of the cusp surface with a vertical plane.

USING THE CUSP SURFACE ANALYSIS PROGRAM

To use CUSP, the data should be a random sample of statistically independent observations. If the a priori conditional probability is larger with a bimodal distribution than with a unirnodal normal distribution, CUSP may indicate that a cusp model is appropriate. If too small a sample size is used, however, CUSP either aborts or prints error messages indicating numerical instability.

AUeq Cariflo 1 CATASTROPHE THEORY MODELING 79

The 1987 version of CUSP allows up to seven independent variables, but using this many can lead to numerical problems in the algorithm. Lf numerical problems do arise, there are a few things that can be done. A large number of variables often contain redundancy. All the variables do not measure separate constructs, and their contents tend to overlap. A large number of correlated variables can be transformed into a set of independent variables suitable for analysis by CUSP. Transformations can be made using the method of principal components if the sample size is at least 100 and there are a minimum of five cases per independent variable. The method of principal components allows most of the information contained in the data set to be expressed with two or three components. Because the components are perfectly independent, CUSP is less likely to run into numerical problems.

Another way to improve the performance of CUSP is by using a "suppres- sion variable." As is the case with linear regression, introducing an independent variable that is uncorrelated with the dependent variable and correlated with the other independent variables can sharply improve the convergence and the statistical results.

CUSP requires that at least 10% of the data points fall in the bimodal region of the estimated model. Depending on the nature of the data set, however, 10% may not be large enough for convergence, or large enough even to avoid computational problems. If numerical problems do arise, partitioning the data into subsets along the bifurcation axis may be helpful.

CUSP is particularly susceptible to numerical problems when too many cases fall behind the pleat of the estimated cusp surface. This region is where the b coefficient is less than zero. CUSP is not so sensitive to data outside the pleat in the asymmetry direction where the absolute value of the a coefficient is large. CUSP works best when the data values adequately cover the bimodal region and when the data values are mostly from a region where the b coefficient is greater than zero. Note that the larger the b coefficient, the wider the bimodal region.

CUSP ran the first time on the emotion data. As it turned out, none of the data fell where the b coefficient of the estimated model is negative. This was encouragmg because the component that affects intensity (physiological arousal according to Mandler's theory) should never be negative. Physiologi- cal arousal was not measured directly on the questionnaire, but the results suggest that the motivation, progress, and expectation variables contain some measure of the problem solver's physiological arousal.

CUSP'S performance also depends on the number of observations and on how well they can be fit with a bimodal, generalized normal dismbution.


When CUSP has data with two independent variables taken from an actual cusp surface, the convergence is accurate with as few as 15 data values. As the number of data values decreases below 15, convergence deteriorates rapidly. CUSP may not run at all on data for which a cusp surface is a poor fit unless the data set is sufficiently large. CUSP may abort on data sets with fewer than 35 points, but it may converge to a statistically significant cusp surface when processing similar data with 35 or more points. CUSP required 40 points before it would run without aborting when processing test data with four independent variables. Using a data set with five independent variables, CUSP required 55 points to avoid aborting.

From these and other tests, a rule of thumb for the minimum number of data points required to run CUSP is the following: The minimum number N should satisfy N - 5v > 30, where v is the number of independent variables. This rule of thumb is given with the admonishment that the number of points necessary to satisfy all tests and to allow CUSPto converge to the appropriate cusp surface depends significantly on whether the data have the particular bimodal form that CUSP looks for. It is for the above reason that, given a set of data points that fall exactly on a cusp surface, CUSPdoes not find the exact coefficients of that surface. CUSP behaves this way because the conditional densities for the data do not have a generalized bimodal normal form and so do not lend themselves to maximum likelihood cusp surface estimation. It is also for the above reason that CUSP incorr&tly converges to a cusp surface when given data with a uniform distribution about a linear trend. For some reason, CUSP cannot distinguish between a uniform dismbution over a rectangular region in the control space and a- generalized bimodal dismbution. As soon as the uniform distribution is changed to an even slightly bell-shaped distribution (simulated, for example, by using a linear trend plus the sum of two uniformly distributed variables), CUSP is able to determine that a linear model is preferable for modeling the linear trend.

To use CUSP, numerical data should be in an ASCII text file in free format (values separated by spaces). It is convenient to use one data case per line. After loading CUSP, type "CUSP" to run the program. The program prints questions about how the data file is set up and what data in the file are to be used. CUSP asks for the name of the file containing the data. the number of variables on each line of data. the number of variables (independent plus dependent) to be used on that run, and the positions of the variables in each data line. The positions of the variables are entered after the prompts "l:," "2::' and so on up to the number of variables to be used. After 1:. enter the position of the first variable; after 2:, enter the position of the second variable, and so on. The important point to remember is that the position of the dependent variable is entered last.

AUus Carifio 1 CATASTROPHE THEORY MODELING 81

Some output is sent to the screen. As each data point is used, a hyphen is printed, and at each Newton-Raphson iteration toward the maximum likelihood density, the current log-likelihood is printed. If CUSP converges, the entire output is sent to a file called CUSPOUT.

To repon significant results after using CUSP, the raw cusp model itself need not be presented unless further analysis related to the raw data is to be done. Because the parameters are inmcately tied to the scale of the data, the model with standardized coefficients may be more useful. Both are included as part of CUSP'S output.

Each factor in both the standardized model and the raw coefficients model is composed of a constant term plus a linear combination of the independent variables. The coefficients and the constant can be found in the coefficients mamces included in the output. Repon the asymptotic chi-square statistic, the degrees of freedom, and the significance level from comparing the likelihood of the cusp model to the likelihood of the linear model. It is informative to know how much more of the variance is explained by the cusp model than by the linear model, so repon the delay 2 and the linear 2 values. A comparison of the standard errors of measurement can be made if the S,,,, and Smdkb, values described above are also reported. As discussed above, all these statistics were helpful in assessing the acceptability of the cusp surface as a model for emotion.

If the cusp model is indeed a valid model of emotion during problem solving, it would provide a "visual gestalt" @avid Perkins, in Brandt 1990) that is desperately needed. Emotion theorists write volumes describing a phenomenon that could be fairly well explained by a cusp surface. Further- more, the benefits for students who are often too emotion focused in their problem solving are, as yet, unexplored.

Stewart and Johnson (1984, citing Zeeman) propose that the purposes of a cusp model are to give global insight, to reduce arbitrariness of description, to synthesize unconnected observations, to explain inexplicable features, and to suggest unsuspected possibilities. Phenomena exhibiting one or more of the characteristic features of catastrophe theory--bimodality, discontinuity, inaccessibility, hysteresis, and divergence-are common. It is particularly important for researchers not to conceal these features in their designs by using techniques such as averaging and t tests, linear regression, correlation, and single-factor analysis of variance. Modeling techniques from bifurcation theory, stochastic processes, and catastrophe theory can preserve nonlinear features and may reflect critically important aspects of the phenomenon being studied. Cobb's program can be extremely helpful to researchers in analyzing their data by allowing them to use a statistically testable nonlinear model from catastrophe theory. CUSP is easy to use and should be useful in a wide


range of applications in both the physical and social sciences. Researchers now have a convenient and appropriate methodology with which to do this.

REFERENCES

Anderson, J. J. 1985. A h r y for attimde and behavior applied to anelection survey. Behavioral Science 30219-29.

Bai-lin. H. 1989. Elementary symbolic dynarmcs and chaos in dissiPam,e system. River Edge. NJ: Wodd Scientific.

Boles, S. 1990. A model of m u h e and creative problem solving. Journal of Creative Behavior 24(3): 171-89.

Boyes, E. 1988. Catastrophe misconceptions inscience education. Physics Education Z : l 05-9. Brandf R. 1990. On knowledge and cognitive slrills: A conversation with David Perkins.

Eduu~ional Lradership 47:50-3. Callahan. J. 1990. Predictive models in psychoanalysis. Behavioral Science 35:60-76. Carifio. J.. R. Biron, and A. Shwedel. 1991. A comparison of community college responders and

nomsponders to the VEDS Student Follow-Up Survey. Research in Higher Education 32469-77.

Carifio, J.. and M. Lanza 1992. Further development of patient assault vignettes. Journal of Resmxh in Educcllion 2:81-9.

Cobb. L. 1978. Stochastic catastrophe models and multimodal disbibutions. BehavioralScience 22360-74.

. 1981. Parameter estimation forthe cusp catastrophe model. Behavioyl Science 26:75-8.

. 1992. Cusp surfnce &sir user's guide. Department of Community Medicine, University of N ~ W Mexico School of Medicine. Albuquerque, NM.

Cobb. L. P. Koppstein. and N. Chen. 1983. Estimation and moment recursion relations for multimodal dismbutions of the exponential family. Journal of the Americm Smisrical Associ4hon 78:124-30.

Cobb, L, and R. M. Thrall. 1981. Stochastic diffemdal equations for the social sciences. In Marhcmah'calfronriers of the social andpoluy services, edited by L. Cobb and R. M. Thrall. 37-68. Boulder, CO: Westview.

Cobb, L. and B. Watson. 1980. Statistical catastrophe theory: An o v e ~ e w . Marhcmalical Modeling 1:311-7.

Cobb. L., and S. Zacks. 1985. Applications of catastrophe theory for statistical modeling in the biosciences. Journal ofthe American Smisrical Association 80793-802.

Dendrinos, D. S. 1979. Slums in capitalist urban setfings: Some insights from catastrophe theory. Geogmphica Polonica 42:63-75:

Eisner. E 1991. The enlightened eye: Qualitative inquiry and the enhancement of educational practice. Toronto: Macmillan.

Fararo. J. J. 1978. An introduction to catastrophes. BehavwralScience 23:291-321. Flay, B. R. 1978. Catastrophe theory in social psychology: Some applications to attihldes and

social behavior. Behavioral Science 23:335-50. Gleick. J. 1987. Chaos: Making a new science. New Yo& Springer-Verlag. Guastello, S. J. 1982. oder rat or regression and the cusp catktrophe: Applications of twestage

personnel selection, haining. therapy, and policy evaluation Behavwml Science 27259-72.

Allen, Carifio / CATASTROPHE THEQRY MODELING 83

- . 1987. A bunafly eafastrophe modd of motivation in organizations: Academic perfor- manct. Journal OfApplied Psychology 72(1): 165-82.

-. 1988. Cataraophc modtling of tk accident proccs: -od subunit size. Psychological Bullcrin 103:246-55.

Koffka. K. 1935. Principffs of g e s u psychology. New Y o k Harcourt Brace. Luenbcrger. D. G. 1979. Innuduction to dynanuc system: Theory, modemodclc and applicaiions.

New York Wdey. Mandln, G. 1984. Mind and body. New Yorlr: Norton. Mendenhall. W.. R. S c M e r , and D. Wackcdy. 1986. Mafhemarical staristics with applications.

Boston: Duxbury. Ord J. K. 1972. Families ofjkquency disfributwnr. New York. Hafner. Pagels. H. R. 1989. The dmam of mason: The computer and the rue of the science of complexify.

New Y o k Bantam Pames, J . S. 1975. Aha! In Perspectives in cmarivify, edited by I . A. Taylor and J. W. Gtmls.

Chicago: Aldine. Poston, T., and I. Stewart 1978. Catustmphe theory and its applicah'ons. London: Pritiman. Rvica I. I. 1988. Creativity, intelligence and synergetic processes in the development of science.

Scientometrics 13:ll-24. Russell, J. A. 1979. Affect space is bipolar. Journal of Personalify and Soclal Psychology

37~345-56. Staman. M. E. 1982. Catastrophe theory in higher education research. Research in Higher

EducaRaRon 16:41-53. Stewan, I.. and C. Johnson. 1984. Applications of nonlinear models. ERIC Document Repro-

duction Servia No. ED246763. Stewan, I.. and P. L. Peregoy. 1983. Catastrophe theory modeling in psychology. Psychological

Bulletin 94:336-62. Ta'eed L. K, 0 . Ta'eed and J. E. Wright 1988. Determinants involved in the perception of the

N d e r cube: An application of catastrophe theory. Behavioral Science 3397-115. Thorn, R. 1975. Structural stabilify and morphogenesir. Trans. by D. H. Fowler. Reading, MA:

Benjamin Press. Zeeman, E. C. 1976. Catasmphe theory. Scicntij'ic AmrricM 234(4): 65-83. -. 1977. GUa~hvphe theory: Selecredpapers, 1972-1977. Reading. MA: Addison-Wesley.

BrNord 0. Allen, fonncrassirtantpmfessorof&~at Keene Stcuc Collcge, is cumntly a doctoral candidate at UMASS Lawcll investigating d e c t in mathematics learning.

James C ~ f i o is an arsociare umfessor in the College of EducaR'on at UMASS Lawcll whose m a s m cognitive psycholo& Lsenrch methodolo& Md &matics education Dr CM$o hac investigated and published studies on nonlinear phenomena in several different m a s .

nonlinear analysls and cobb's cusp surface analysis program€¦ · catastrophe theory...

Documents