

Simple and Complex Identities in Similarity Clustering: Using Schelling Segregation to Explore Infinite Dimensionalization in Organizational Cluster Formation

By

Christina Fang New York University

Ji-hyun Kim Yonsei University

Joe Porac New York University


INTRODUCTION

Research in organizational theory and economic sociology suggests that organizational fields are

partly structured by collective representations that parse organizations into categorical forms or

cognitive codes (e.g., Hannan, Polos, & Carroll, 2007; Hsu, Negro, & Kocak, 2010; Vergne &

Wry, 2014). Much of this work has been focused on establishing how the socio-cognitive

category structure of organizational fields disciplines organizations by defining the

characteristics that buyers, suppliers, regulators, critics, and other audiences consider typical of

particular organizational types. Research suggests that distinctive organizations that depart from

socio-cognitive typifications are sometimes subject to legitimacy discounts in revenues (e.g., Hsu,

2006), costs (Ody-Brasier and Vermeulen, 2014), capital inflows (e.g., Pontikes, 2012), and

stock prices (e.g., Zuckerman, 1999), all of which have implications for organizational survival.

Given these effects, other research has probed how field level categories get established (e.g.,

DiMaggio, 1991; Rosa et al., 1999; Navis & Glynn, 2010), and how organizations attempt to

manipulate their categorical membership to advance their strategic interests (e.g., Santos &

Eisenhardt, 2009; Khaire & Wadhwani, 2010; Rhee, 2015).

Research on the antecedents and effects of organizational categories has done much to

explicate the cognitive structure of organizational fields. At the same time, most of the existing

research takes organizational categories as givens, and tracks the usage and frequency of lexical

phrases that have become associated with particular organizational (“French restaurants,”

“grocery stores,” etc.) or product (“comedies,” “minivans,” etc.) configurations. This has left

open many questions about the nature of organizational categories themselves, particularly their

socio-cognitive underpinnings and their internal organization. Hannan, Polos, and Carroll (2007)

suggested that category nomenclatures within a field can evolve by borrowing from other


nomenclatures and/or from recombining existing lexical phrases into new category constructions.

However, the canonical case of category formation is the creation of a new category to describe a

perceived cluster of similar organizations. In Hannan, Polos, and Carroll’s words, “Members of

audiences observe producers and products, notice similarities, try to make sense of them by

clustering similar producers/products, and possibly assign labels to clusters. These activities

comprise the first steps in the audience’s creation of what might turn into a category or even a

form” (p. 33). For Hannan and his associates, understanding this process of noticing and labeling

clusters of similar organizations “is exactly the challenge for organizational sociology” (p. 38).

In fact, a long line of research in cognitive science accords similarity among stimulus entities a

foundational role in category formation and learning (e.g., Rosch, 1978; Estes, 1994; Murphy,

2002).

However, over the years both cognitive scientists (e.g., Murphy & Medin, 1985; Sloman

& Rips, 1998) and organizational theorists interested in organizational classification (e.g.,

McKelvey, 1975) have critiqued the role of similarity in explaining category formation and

structure. At the root of these critiques is the problem of “infinite dimensionality,” or the fact

that entities and organizations can be described and compared using a very large number of

attributes and that “any two entities can be arbitrarily similar or dissimilar by changing the

criterion of what counts as a relevant attribute” (Murphy & Medin, 1985: 292). Infinitely

dimensionable stimuli have led some to limit the role of similarity in category theory by

suggesting that other cognitive processes such as causal rules (e.g., Sloman, 1995), functional

themes (e.g., Estes, Golonka & Jones, 2011), behavioral goals (e.g., Barsalou, 1983), bodily

states (e.g., Barsalou, 2008), and/or innate perceptual tendencies (Harnad, 2005) underlie

category formation and change.


Durand and Paolella (2013) recently argued that these alternative processes “stretch”

theories of organizational categories in ways that are not accounted for by pure similarity-based

accounts. As reasonable as this suggestion might be, however, similarity assessments play too

central a role in strategy and organizational theory to ignore similarity-based category models

completely. Indeed, much of organizational theory is an attempt to explain why organizations

are similar or different. And, in strategy research, as Farjoun and Lai (1997) noted, similarity

assessments implicitly underlie models of strategic groups, industry analysis, rivalry and

competition, diversification, and the evaluation of competitive advantage itself. In light of the

importance of similarity as a construct in strategy and organizations research, it is better to take

infinite dimensionality as given in organizational categorization and to explore its boundary

conditions and the socio-cognitive processes that are involved in overcoming the categorical

challenges it creates.

In this regard, cognitive scientists have proposed two general approaches to infinite

dimensionality, what Boster and D’Andrade (1989:132) called the “structured mind hypothesis”

and the “structured world hypothesis.” The structured mind hypothesis claims that prior

knowledge privileges some entity attributes over others in the formation of categories. Prior

knowledge includes implicit theories of the world (e.g., Murphy & Medin, 1985), motivational

interests (e.g., Barsalou, 1983), and innate perceptual tendencies (e.g., Harnad, 2005) that make

certain entity attributes more or less salient during similarity judgments. Salient attributes are

weighted more significantly in any resulting categorization. The structured mind hypothesis can

be extended to the analysis of organizational categories by recognizing the role that cultural

beliefs and routines, institutions and existing classification systems, regulatory strictures, and

group based interests can play in making certain organizational attributes more or less salient in


category formation (Hannan, Polos & Carroll, 2007; Latour, 2005; Lounsbury & Rao, 2004; Rao,

Monin & Durand, 2005).

The structured world hypothesis, on the other hand, proposes that external stimuli are not

structured randomly, and that some entity attributes are correlated with others. In a world of

correlated attributes, it is the role of categorization processes to “carve nature at its joints”

(Rosch, 1978) by describing clusters of entity attributes that tend to occur together. Focusing on

one of the correlated attributes carries information about other correlated attributes, and certain

entities stand out and can be used as category prototypes when they are fully representative of

the correlational structure of the domain (e.g., Rosch, 1978). For example, Porac et al. (1995)

extended the structured world hypothesis to organizational categories by mapping the accepted

categories of Scottish knitwear producers. Their results suggested that these categories were

abstracted from a set of five attributes that were most correlated with other producer attributes.

In its most abstract form, the problem of infinite dimensionality reduces to the question

of how infinitely dimensionable actors combine to form clusters of actors that are similar on only

particular attributes, eventually acquiring category labels to summarize this similarity. Infinite

dimensionality is still an open question in the organizations literature on categorization, and there

have been few attempts to explore its implications explicitly (Porac et al., 1995 is one exception).

No doubt, this is partly because of the difficulty of explicating the processes of attribute

reduction empirically in organizational environments via the archival observational and/or cross-

sectional studies that have been used in prior organizational research on categories. Coding

category labels in product reviews or directory listings, for example, is not precise enough to

identify decisively the attribute clusters that labels summarize, or how particular attributes were

selected for clustering. Cognitive scientists have typically studied attribute selection and


clustering experimentally, but at the individual level of analysis. Organizational categorization is

a social process, however, given that attribute selection and clustering are mediated by

interorganizational communication, coordination, and/or imitation over time. This endogeneity

makes intra-individual studies of similarity and categorization less applicable to organizational

contexts, although the problem of infinite dimensionality remains.

Following from the above reasoning, our goal in the present research is to explore the

structured mind and structured world hypotheses by taking advantage of a well-known paradigm

of similarity clustering in a simulated agent-based environment. Boster and D’Andrade

acknowledged that the cognitive mechanisms underlying these two hypotheses are not mutually

exclusive, and that either or both mechanisms can be deployed to reduce the complexity and

dimensionality of stimulus entities for purposes of semantic categorization. We thus explore

each mechanism independently. Our choice for an empirical platform is Schelling’s (1971) well-

known segregation simulation. In Schelling’s model, simulated agents are arrayed within a two-

dimensional grid in such a way that some grid cells contain agents and some are empty. Agents

are able to move around to empty cells according to well-specified rules. In the original

Schelling study, agents were characterized with a single binary color dimension as either black

or white. Agents were endowed with a homophily motive such that they preferred to be near

agents who were similar in color to them. Schelling manipulated the strength of this motivation

by varying the percentage of similar neighbors surrounding a given agent that was necessary to

trigger a move to another location. Agents evaluated their surrounding neighbors for similarity

during each round of the simulation, and moved to a more homophilic location when the

percentage of similar neighbors in their current location was below the pre-defined similarity

threshold. Schelling found that marked clusters of similar agents formed after a small number of


rounds with no more than moderate preferences toward homophily (movement thresholds of

60% neighborhood similarity).

The Schelling paradigm is useful for our purposes because it represents an instantiation

of a dynamic social clustering environment in which similarity assessments are the driving

motivation behind actor location and relocation. Schelling’s (1971) original paper stimulated a

large subsequent literature exploring various parameters of the paradigm, and the mechanics of

the Schelling model are well known. A major line of research has been to modify one or more of

the key model assumptions to determine the robustness of Schelling’s original insights. For

instance, some (e.g., Bruch & Mare, 2006; Pancs & Vriend, 2007; Van de Rijt, Siegel, & Macy,

2009) explored preference functions other than the discrete, threshold-based

function in Schelling (1971). Others varied the size and shape of neighborhoods, the number of

empty cells (e.g., Vinković & Kirman, 2006), and/or the rule and the order of migration (e.g.,

Pancs & Vriend, 2007). Most of this work showed a high degree of congruence with Schelling’s

basic predictions. A related stream of work tested Schelling’s predictions in various empirical

settings (e.g., Card, Mas, & Rothstein, 2008; Clark, 1991; Clark & Fossett, 2008), with similar

support. In addition, given the generality of Schelling’s insights, scholars from outside the social

sciences have built on Schelling’s model to explore segregation in their own domains. For

example, Vinković and Kirman (2006) reported an analogy between Schelling’s segregation

dynamics and particle dynamics found in physical worlds, and Nielsen, Gade, Juul, and

Strandkvist (2015) used Schelling segregation to model the clustering and segregation of

biological cells.

Despite the extensive literature on the Schelling model, to our knowledge all subsequent

work, like Schelling’s (1971), has characterized agents with a single binary attribute. The


possibility that agents are infinitely dimensionable, and the clustering challenges associated with

multi-attribute agents, have not been considered in prior research. Research in the Schelling

tradition has thus assumed that some type of attribute reduction occurs prior to clustering

dynamics. In the present research, we relax this assumption and experimentally manipulate the

number of attributes that agents consider in their location decisions. We incorporate a realistic

model of multi-attribute similarity assessments derived from Tversky (1977). In Experiment 1,

we first establish the comparability of our simulation mechanics to Schelling’s (1971) by

replicating his basic finding in the one attribute case. We next introduce three-attribute and ten-

attribute extensions in Experiment 2. We show that Schelling segregation becomes increasingly

less apparent as the number of attributes gets larger. In Experiment 3, we test the structured mind

argument by differentially and exogenously weighting random attributes and show that attribute

weighting modulates this effect and re-establishes segregation even in the multi-attribute case.

In Experiment 4, we begin our exploration of the structured world argument by comparing multi-

attribute clustering when the attributes are randomly allocated to agents and when the attributes

are non-randomly structured into mutually exclusive crisp sets. Consistent with the structured

world hypothesis, we show that segregation occurs when attributes are perfectly correlated in

crisp configurations but not when they are randomly distributed among agents. Finally, in

Experiment 5, we relax the crispness of the attribute sets and explore the effects of moderately

correlated attributes in segregation dynamics. Our results suggest that correlated attributes seed

clustering and segregation around category “prototypes” that best represent the correlational

structure of the attribute space. Together, these five experiments provide suggestive evidence

that supports both mechanisms of attribute reduction. We end our paper by drawing out the

implications of our results for research on organizational categorization.


Before proceeding with a more detailed exposition of Schelling mechanics and our

extensions, it is important to address the question of whether the Schelling paradigm is a realistic

context for studying organizational clustering, identity formation, and categorization.

Schelling’s (1971) original purpose was to show that racial segregation in city neighborhoods

could occur with only moderate preferences for homophily among neighbors. The simulation’s

two-dimensional grid represented geographic space, and the movement of actors on the grid was

considered analogous to the movement of actual individuals into and out of real neighborhood

positions. Subsequent research on the Schelling model has continued to conceptualize the two-

dimensional grid as geographic or physical space. In organizational contexts, the clearest

extension of the Schelling model is thus to identity formation and categorization in geographical

clusters of organizations (e.g., Romanelli & Khessina, 2005), or models of spatial competition

(e.g., Hotelling, 1929). Indeed, Romanelli and Khessina (2005) hint at the problem of infinite

dimensionalization in regional identity formation by suggesting that consensus around the

characteristics of the work activities within a cluster is a facilitating condition for the creation of

a regional identity categorization. This suggestion is supported by Porac et al.’s (1995) report of

geographical clustering in the Scottish knitwear industry, where each geographical cluster was

associated with particular values on a small set of consensually recognized attributes. Porac et al.

suggested that common resources, information sharing and collaboration, and imitation all

contributed to the homophilic motivations of knitwear firms in their sample.

Although geographical identity formation among clusters of organizations is a natural

extension of the Schelling paradigm, we also believe that the paradigm is of more general

relevance in modeling attribute reduction in organizational categorization. The two-dimensional

grid can be represented in other theoretically meaningful ways. For example, the grid may


represent positions on easily changeable attributes, with similarity comparisons being made

using other more stable actor attributes. White (2004), for example, suggested that markets are

characterized by producer positions along quality and volume dimensions. Given this

representation of the Schelling two-dimensional grid, movement among cells in the space might

be considered producer choices to be more or less similar to other producers occupying particular

quality and quantity positions. As Hannan, Polos, and Carroll (2007) noted, there are many

reasons why organizations may seek homophily in a characteristics space, leading to clustering.

We leave it up to the reader to consider other possible organizational representations of

Schelling’s two-dimensional grid.

A MULTIATTRIBUTE EXTENSION OF THE SCHELLING PARADIGM

The Space

Following Schelling (1971), our model uses a two-dimensional 20 x 20 lattice with a total of 400

cells. Cells in this lattice are populated by agents with m-dimensional

attributes, each of which is a randomly generated binary variable (i.e. either 1 or -1). Each agent

can occupy only one cell and a cell cannot be occupied by more than one agent. To allow agents’

migration to other parts of the lattice, randomly chosen cells of the lattice are specified to be

vacant. Figure 1a) illustrates a 10 x 10 grid. As seen, agents reside in cells which are marked by

their distinct binary attribute structures. For instance, one agent may be (-1,1, -1), and another

may be denoted as (1, -1, -1). Note that there are also vacant cells, marked by empty slots spread

out randomly in the grid.
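
The setup described above can be sketched in a few lines of Python. This is a minimal illustration rather than the simulation code used in the study: the function name `make_lattice`, the 10% vacancy rate, and the fixed random seed are assumptions introduced here, since the text does not report how many cells are left vacant.

```python
import random

def make_lattice(size=20, m=3, vacancy_rate=0.1, seed=0):
    """Return a size x size grid whose cells hold an m-tuple of +1/-1 values, or None if vacant.

    vacancy_rate and seed are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    cells = [(r, c) for r in range(size) for c in range(size)]
    n_vacant = round(vacancy_rate * len(cells))
    vacant = set(rng.sample(cells, n_vacant))      # randomly chosen vacant cells
    grid = [[None] * size for _ in range(size)]
    for r, c in cells:
        if (r, c) not in vacant:
            # each attribute is an independent random binary value (1 or -1)
            grid[r][c] = tuple(rng.choice((1, -1)) for _ in range(m))
    return grid

grid = make_lattice()
occupied = sum(cell is not None for row in grid for cell in row)
print(occupied)  # 360 agents on 400 cells under the assumed 10% vacancy rate
```

Representing a vacant cell as `None` keeps the one-agent-per-cell constraint implicit in the data structure.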

---------------------------------------- Insert FIGURE 1 about here

----------------------------------------

Similarity Measure


Each agent finds himself more or less similar to each one of his eight neighbors in his Moore

neighborhood: if an individual’s coordinates on a grid are represented as (i, j), the eight

surrounding cells ((i-1, j-1), (i, j-1), (i+1, j-1), (i-1, j), (i+1, j), (i-1, j+1), (i, j+1), (i+1, j+1)) are

its neighbors. Because agents evaluate their neighborhood from their unique position, each agent

has a unique set of neighbors. Figure 1b) provides an example of such a Moore neighborhood.

The focal agent in the center is surrounded by 8 cells, only 6 of which are occupied by actual

neighbors with the remaining two cells empty.

As in Schelling (1971), an agent’s overall similarity score to all his neighbors is

computed as the average of dyadic-level similarity scores. In Schelling’s (1971) original model,

an agent is characterized by a single binary attribute (e.g., black or white). Thus, in Schelling’s

(1971) world, a neighbor is either exactly the same or completely different. However, in our

context, since an agent’s characteristics span m dimensions, we need a similarity score that

summarizes across the m dimensions. We use Tversky’s (1977) ratio model of similarity judgment

to characterize agents’ similarity calculations with multiple attributes. The similarity between two

agents a and b is denoted as s(a, b) and determined by:

$$s(a, b) = \frac{f(A \cap B)}{f(A \cap B) + \alpha \cdot f(A - B) + \beta \cdot f(B - A)} \tag{1}$$

Here, A and B denote the sets of features associated with agents a and b, respectively. α and β

denote the weights attached to the distinctive features of a and b (A - B and B - A), respectively, where α, β ≥ 0, and f(·) is a monotonic function

(in our case, a count function). We simplify by fixing α = β = 1, and s(a, b) becomes

simply the ratio of the features the two agents share in common to the total number of features

they each have. By using the ratio model of similarity, we normalize the similarity score s so that

it lies between 0 and 1. In other words, s(a, b) reduces to:


$$s(a, b) = \frac{f(A \cap B)}{f(A \cup B)} \tag{2}$$

Consider the two agents in Figure 1c): agents a and b have {1,-1,-1} and {1,-1,1}, respectively, as their sets

of features. Since the first and second features (underscored) are in common, and there are three

features in total, the similarity between a and b is the ratio of the two features shared in common

to the total number of features, 0.67 (=2/3).
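
The worked example can be checked with a short Python sketch. It assumes, following the example in the text, that a feature is shared when both agents hold the same value at the same attribute position and that the denominator is the number of attributes m. The function name `dyadic_similarity` is a label introduced here for illustration.

```python
def dyadic_similarity(a, b):
    """Share of attribute positions on which agents a and b hold the same value.

    Assumption: the denominator is the number of attributes m, as in the
    worked example (2 shared features out of 3).
    """
    if len(a) != len(b):
        raise ValueError("agents must have the same number of attributes")
    shared = sum(1 for x, y in zip(a, b) if x == y)
    return shared / len(a)

s_ab = dyadic_similarity((1, -1, -1), (1, -1, 1))
print(round(s_ab, 2))  # 0.67
```

With a single attribute the function returns either 0 or 1, which corresponds to the same-or-different comparison in the original Schelling setting.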

Agent i’s level of similarity with all its neighbors (denoted by S(i)) is simply the average

of all dyadic similarity scores:

$$S(i) = \frac{\sum_{[j]} s(i, j)}{n([j])} \tag{3}$$

where [j] denotes the neighbors of i and n(·) represents the number of elements of the argument

set. Figure 1d) illustrates how we compute S(i) for our focal agent: we get 0.39 by simply adding

up all dyadic similarities for the 6 neighbors (since 2 cells are empty) and dividing by 6.
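
A minimal Python sketch of equation (3) follows. The attribute values in the 3 x 3 patch are hypothetical and are not those of Figure 1d), so the result differs from the 0.39 reported there. Two further assumptions are made for illustration: the lattice does not wrap around at its edges, and an agent with no occupied neighbors receives no score (`None`).

```python
def dyadic_similarity(a, b):
    """Share of attribute positions on which two agents hold the same value."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def neighborhood_similarity(grid, i, j):
    """Average dyadic similarity between the agent at (i, j) and its occupied Moore neighbors."""
    focal = grid[i][j]
    scores = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue                     # skip the focal cell itself
            r, c = i + di, j + dj
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])  # no wrap-around (assumption)
            if inside and grid[r][c] is not None:
                scores.append(dyadic_similarity(focal, grid[r][c]))
    return sum(scores) / len(scores) if scores else None

# Hypothetical patch: focal agent in the center, six neighbors, two vacant cells.
patch = [
    [(1, -1, 1), None, (-1, 1, 1)],
    [(1, 1, -1), (1, -1, -1), (-1, -1, -1)],
    [None, (-1, 1, -1), (1, -1, -1)],
]
S = neighborhood_similarity(patch, 1, 1)
print(round(S, 2))  # 0.56, i.e. (2/3 + 0 + 2/3 + 2/3 + 1/3 + 1) / 6
```

Vacant cells are skipped in both the numerator and the denominator, which matches dividing by 6 rather than 8 in the text.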

Migration

At the beginning of the simulation, agents are randomly distributed on the lattice. In each period,

a single agent is randomly chosen from the lattice population. The chosen agent decides whether

he is happy or content with the current neighbors. If the agent’s current similarity is above a

certain threshold (which we parameterize as Th, ranging from 0.0 to 1.0), he is satisfied and

will remain in his current position. If not, he attempts to migrate to another vacant position in the

conceptual space, by scanning the nearby vacant slots not yet occupied. His radius of search

expands outwards gradually until he finds a slot where his overall similarity exceeds his

threshold level of happiness. Note that it is not sufficient for the new similarity to be greater than the

current level of similarity if it is lower than the threshold. In the case that no empty cell is


satisfactory, the agent remains in his current position. We continue this until the system has

reached a steady state where migration ceases: all agents either have no incentive to migrate or

cannot find any spot that would make them content.

As seen in Figure 1d), suppose the desired threshold is 0.7. Because the computed similarity

of 0.39 is below this threshold, the focal agent is unhappy. As a result, he scans his

neighborhood for vacant slots and computes his would-be similarity if he moved to one of the

nearest neighboring slots. Suppose he first considers a possible move to the cell immediately to

his left. His new neighborhood is shown in Figures 1e) and 1f), and the associated similarity is 0.78,

which exceeds the threshold of 0.7. Because this new neighborhood makes him happy, the focal

agent migrates to occupy the cell to his immediate left. Had this new neighborhood not made him

happy, he would have progressively expanded his search for vacant slots until he exhausted all

possible slots.
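
The migration rule can be sketched as follows. This is an illustrative reading of the description above, not the authors' code. Several details are assumptions introduced here: the search radius is measured in rings of cells around the agent, vacant cells within a ring are checked in row-major order, the agent's own (soon to be vacated) cell is ignored when scoring a destination, and a destination with no occupied neighbors scores 0. The names `similarity_at` and `migrate_once` are hypothetical.

```python
def similarity_at(grid, agent, i, j, ignore=None):
    """Mean share of matching attributes between `agent` and occupied Moore neighbors of (i, j)."""
    scores = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            r, c = i + di, j + dj
            if (di, dj) == (0, 0) or (r, c) == ignore:
                continue
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] is not None:
                other = grid[r][c]
                scores.append(sum(x == y for x, y in zip(agent, other)) / len(agent))
    return sum(scores) / len(scores) if scores else 0.0   # isolated cell scores 0 (assumption)

def migrate_once(grid, i, j, th):
    """Move the agent at (i, j) to the nearest satisfactory vacant cell; return its final position."""
    agent = grid[i][j]
    if similarity_at(grid, agent, i, j) > th:
        return (i, j)                                  # content: stays put
    rows, cols = len(grid), len(grid[0])
    for radius in range(1, max(rows, cols)):           # expand the search ring by ring
        ring = [(r, c)
                for r in range(max(0, i - radius), min(rows, i + radius + 1))
                for c in range(max(0, j - radius), min(cols, j + radius + 1))
                if max(abs(r - i), abs(c - j)) == radius and grid[r][c] is None]
        for r, c in ring:
            # the destination must itself exceed the threshold, not merely improve on the status quo
            if similarity_at(grid, agent, r, c, ignore=(i, j)) > th:
                grid[r][c], grid[i][j] = agent, None
                return (r, c)
    return (i, j)                                      # no satisfactory vacancy: stays put

# Hypothetical one-attribute example: an A agent surrounded by B agents.
A, B = (1,), (-1,)
grid = [
    [B, B, None, A],
    [B, A, None, A],
    [B, B, None, A],
]
new_pos = migrate_once(grid, 1, 1, th=0.5)
print(new_pos)  # (0, 2): the first vacant cell in the nearest ring that exceeds the threshold
```

Repeating `migrate_once` for randomly chosen agents until no agent moves would correspond to running the simulation to its steady state.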

RESULTS

Experiment 1: Replicating the Schelling Model: the Threshold Effect

Before we explore how multiple attributes influence cluster formation, we first replicate the

classic Schelling model with only one attribute. Figure 2(a) plots the average level of similarity

across all threshold levels. In the classic Schelling model, there is only one attribute: whether the

agent is black or white. As seen in Figure 2(a), the threshold effect on average similarity is not

monotonic: average similarity gradually increases up until some level of threshold and declines

as threshold level increases further. The levels of similarity are very similar for the two extreme

values of threshold (Th = 0.1 and 0.9). When threshold is low (e.g., <0.2), the average level of

similarity initially is likely to be much higher than the threshold. Recall that agents are initially

distributed randomly in the grid. As such, on average, the level of similarity is about 0.5, a value

that far exceeds the low threshold levels such as 0.1 or 0.2. Virtually all the agents are content


with where they are. Hardly any agent has an incentive to migrate, so the system

reaches a steady state within a short period. On the other hand, when threshold is very high, there

is strong incentive to be with highly similar neighbors yet there are not many available places to

move to that satisfy the high thresholds. After a few agents find their perfect spots, no further

improvement is possible. At high threshold levels above 0.7, it is increasingly difficult to find

similar others above thresholds and as a result, agents remain ‘unhappy’. Thus at both extremes,

the average similarity level of the system hovers around 0.5, which is exactly the initial level of

similarity.


---------------------------------------- Insert FIGURE 2 about here ----------------------------------------

These results are consistent with the basic results of the Schelling model: even at

moderate levels of threshold (e.g. 0.6), a clear pattern of segregation emerges where groups of

similarity seeking agents are located close together in geographical clusters.

To see this emergence of ‘segregation’ visually, we track in Figure 2(b) the evolution of

agents’ spatial location which is marked by different colors. Since there is only one attribute,

black and grey represent the two possible types. The first and the second panel of this ‘heat map’

represent the map at the first and the last period, respectively. Clearly, when the threshold level

is around 0.6 or 0.7, strong segregation occurs as indicated by the presence of large clusters of

similar colors (i.e. either black or grey). Agents of the same type tend to be found with similar

others, and we see the emergence of clusters. More segregation implies a higher similarity level.

This visual pattern indicates that there is a link between the levels of similarity and the levels of

segregation. In other words, the higher the average level of similarity, the more agents tend to be

clustered.


Experiment 2: Effects of the Number of Attributes

How does adding more attributes complicate the basic story? In Figure 3(a), we plot

average similarity across different threshold levels for the 1-, 3-, and 10-attribute cases,

respectively. As seen, the number of attributes has a clear flattening impact on average

similarity. While the general inverted U shaped relationship is preserved when we introduce

more attributes, average similarity is gradually lowered at every threshold level as the number of

attributes increases. In results not reported here, we also try the number of attributes = 20, and

the resulting curve is almost entirely flat. In other words, as the number of attributes increases, it

is increasingly difficult to achieve clustering of similar others. This indicates that category

formation is a small numbers phenomenon under the condition of random attribute assignment.


---------------------------------------- Insert FIGURE 3 about here ----------------------------------------

In Figure 3(b), we again visualize cluster formation by using a ‘heat map’. With multiple

attributes, we use different colors to indicate how high or low the focal agent’s level of similarity

with its neighbors is. Cells with darker colors (i.e., close to black) represent a higher level of

similarity of the focal agent with its neighbors. A very dark region represents a cluster of agents

of similar types. With one attribute, very dark regions emerge when the threshold is between 0.6

and 0.7. This observation is consistent with the upper panel of Figure 2. Comparing across

different numbers of attributes, we observe a lightening pattern as the number of attributes

increases. This further suggests that high levels of similarity are indicative of high levels of

cluster formation.

To understand the flattening result from multiple attribute cases, it is helpful to note that

cluster formation is a result of migration, which is in turn driven by two forces. First, to migrate,

agents must be discontent, which means that their neighborhood similarity must be lower than


their threshold level of happiness. Second, for migration to take place successfully, these

discontented agents must have a place to move to. In Figure 4, we plot the contrast between 1)

the number of agents that are discontent and 2) the number of available slots that would theoretically

make them happy (normalized by 100), at the beginning of the migration process. Two effects

are clear. First, for threshold levels greater than 0.5, the greater the number of attributes, the

more agents are discontent. However, for threshold levels less than 0.5, the higher the

number of attributes, the fewer agents are discontent. As the number of attributes

increases, the number of discontent agents begins to rise at increasingly higher threshold levels

(0.1, 0.3, and 0.4 for 1, 3, and 10 attributes, respectively). We label this latter pattern the ‘slow rising

discontent’ effect. As the number of attributes increases, it is easier to satisfy any given

individual at low threshold levels: similarity along more dimensions may be increasingly easy to attain theoretically.1

Hence, as the number of attributes increases, the number of discontent agents who are motivated

to migrate is increasingly slow to rise. Fewer individuals would be motivated to move

and migrate in search of similar others as the number of attributes increases.
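
The direction of this effect can be illustrated with a small calculation before any migration takes place. The sketch below rests on simplifying assumptions that are not taken from the simulation itself: every agent has all eight neighbors occupied, and each attribute comparison with a neighbor is an independent fair coin flip, so the number of matches follows a binomial distribution. The function name `share_discontent` is hypothetical.

```python
from math import comb

def share_discontent(m, th, neighbors=8):
    """P(S(i) < th) at the random start, with m attributes and a full Moore neighborhood.

    Assumes every attribute comparison is an independent fair coin flip.
    """
    n = m * neighbors                           # attribute comparisons per agent
    below = sum(comb(n, k) for k in range(n + 1) if k / n < th)
    return below / 2 ** n

for th in (0.3, 0.7):
    shares = [round(share_discontent(m, th), 3) for m in (1, 3, 10)]
    print(th, shares)
```

Under these assumptions the share of discontent agents falls as the number of attributes grows when the threshold is 0.3 and rises when it is 0.7, which is the pattern described above for thresholds below and above 0.5.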

However, this ‘slow rising discontent’ effect is only half of the story. Motivation to

move (initially) does not necessarily mean that agents succeed in finding an ‘ideal’ spot in a

neighborhood of sufficiently similar others. The second effect is related to the availability of

spots for the discontent agents to move to.

1 The number of attributes changes the theoretical distribution of similarity before migration sets in. When the agents have only 1 attribute/dimension, i.e., either 1 or -1, the baseline distribution of similarity is spread out across all similarity levels, with the most frequent score being 0.5. When we increase the number of attributes to 3 and 10, the average level of similarity remains the same at 0.5. However, the distribution of the similarity scores is now more concentrated, and extreme values of similarity (e.g., those at around 0 or 1) become much less likely. This implies that at low threshold levels, the higher the number of attributes, the more concentrated the initial distribution of similarity, and the smaller the number of discontent agents. In other words, the discontent curve is slow to rise for low threshold levels.

Any individual who is motivated to move (i.e., whose current similarity level is lower than the

threshold) can move to the next nearest vacant spot in a new neighborhood where similarity with

new neighbors exceeds the threshold. A spot is available if it enables a focal agent to successfully migrate. We simply

divided the number of agents who are motivated to move and can find a satisfactory spot by the

number of agents who are motivated to move. Thus the availability curves in Figure 4 represent

the percentage of discontent agents who can in fact move to satisfactory locations. As seen, the

number of qualified slots starts out high but declines dramatically at higher threshold levels.

There is simply no neighborhood that would satisfy the higher thresholds of these unhappy

agents. They remain unhappy, depressing the average similarity levels for the system. We label

this second effect the ‘insufficient destination effect’. This effect means that even though agents

are motivated to move, they succeed less and less in finding similar others as the threshold levels

increases. As a result, average similarity of the system remains low regardless of the number of

attributes. Despite the motivation to migrate, average similarity of the whole system remains

‘depressed’ at the initial level of 0.5.---------------------------------------

Insert FIGURE 4 about here ---------------------------------------
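The availability ratio described above can be approximated with the following sketch. It is not the original simulation code. It assumes a bounded 20 x 20 lattice, a Moore neighborhood of eight cells, equal attribute weights, a satisfaction rule of average similarity at or above the threshold, and a score of zero for an agent with no occupied neighbors. All function names are hypothetical.

```python
import random

SIZE = 20  # 20 x 20 lattice, as in the figure notes


def similarity(a, b):
    """Share of attributes on which two agents match (equal weights)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def neighborhood_similarity(grid, cell, attrs):
    """Average similarity between `attrs` and the occupied Moore neighbors of `cell`.

    Assumption: an agent with no occupied neighbors scores 0.0.
    """
    r, c = cell
    scores = [similarity(attrs, grid[(r + dr, c + dc)])
              for dr in (-1, 0, 1) for dc in (-1, 0, 1)
              if (dr, dc) != (0, 0) and (r + dr, c + dc) in grid]
    return sum(scores) / len(scores) if scores else 0.0


def availability(grid, threshold):
    """Return (number of discontent agents, share of them with a satisfactory vacant spot)."""
    vacant = [(r, c) for r in range(SIZE) for c in range(SIZE) if (r, c) not in grid]
    discontent = [(cell, attrs) for cell, attrs in grid.items()
                  if neighborhood_similarity(grid, cell, attrs) < threshold]
    movers = 0
    for cell, attrs in discontent:
        # Evaluate destinations as if the agent had already left its old cell.
        rest = {k: v for k, v in grid.items() if k != cell}
        if any(neighborhood_similarity(rest, spot, attrs) >= threshold for spot in vacant):
            movers += 1
    return len(discontent), (movers / len(discontent) if discontent else 1.0)


def random_grid(n_agents, n_attributes, rng):
    """Place agents with random binary attributes on distinct random cells."""
    cells = rng.sample([(r, c) for r in range(SIZE) for c in range(SIZE)], n_agents)
    return {cell: tuple(rng.randint(0, 1) for _ in range(n_attributes)) for cell in cells}


rng = random.Random(0)
grid = random_grid(300, 3, rng)
for th in (0.3, 0.6, 0.9):
    n_discontent, share = availability(grid, th)
    print(th, n_discontent, round(share, 2))
```

The sketch evaluates availability on the initial random configuration only. Tracking it over the course of migration would require repeating the calculation after each round of moves.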

These two effects together explain the flattening effect of attributes. At lower threshold

levels, more attributes mean that agents are less motivated to migrate. Thus, as the number of

attributes increases, similarity scores are lower because migration simply is slower to set in. At

very high threshold levels, regardless of attributes, migration does not take place successfully as

there are simply not enough vacant slots to move to. Again, similarity remains low because

migration fails. At moderate thresholds, migration operates smoothly lifting the average

similarity of the system to a level above the baseline. The resulting pattern is therefore what we

have observed in Figure 3: as the number of attributes increases, the relationship between

similarity and threshold is increasingly less pronounced. In other words, it is increasingly

difficult to observe category formation as we increase the number of attributes.


One important question is whether the visual patterns we observe in these ‘heat maps’

indeed represent stable, meaningful clusters of agents of different types. For category structure to

emerge, clusters need to be stable and not temporary. To ensure that the clusters we observed are stable, we ran each simulation until the 10,000th period, well past the equilibrium, defined as the last time period in which migration occurs. The longest time taken to reach a steady state was 6,102 time steps, whereas the average time to steady state was 3,336. We then used the agent locations as of the 10,000th period to produce the ‘heat maps’ in Figure 3b.

To summarize, clustering or identity formation becomes more difficult as agents evaluate

more attributes. Identity clustering is a small number phenomenon: identity satisfaction or

clustering depends on not comparing oneself on many, let alone infinite, attributes or dimensions.

Experiment 3: Effects of Unequal Attribute Weighting

Next, we explore how the structured mind approach works to solve the problem of

infinite dimensionalization. Recall that in the structured mind approach, certain attributes are

made more salient in determining similarity by our common prior knowledge. So far we have assumed that, in multi-attribute cases, different attributes receive the same weight. Next, we selectively weight certain attributes. To examine the structured mind approach, we reformulate the similarity function between a and b by adding a weighting parameter ωi, which represents how much weight is given to the ith attribute, as follows.

$$S(a, b) = \frac{1}{N} \sum_{i=1}^{N} \omega_i \cdot \mathbb{1}_i$$

where

$$\mathbb{1}_i := \begin{cases} 1 & \text{if the } i\text{th attribute in } a \text{ is equal to the } i\text{th attribute in } b \\ 0 & \text{otherwise} \end{cases}$$

and

$$\sum_{i=1}^{N} \omega_i = N$$


----------------------------------------

Insert FIGURE 5 about here

----------------------------------------

Figure 5 examines the effect of unequal weighting using four illustrative cases based on

three attributes. First, there are two extreme cases: 1) each attribute is weighted equally, where (ω1, ω2, ω3) = (1, 1, 1); 2) a single favorite attribute receives all the weight, where (ω1, ω2, ω3) = (3, 0, 0). Note that case 2) reduces three attributes to one, and should produce results identical to the

one attribute case. Second, we include two intermediate cases where one or two attributes are

given more weight than others: 3) (ω1, ω2, ω3 ) = (2.5, 0.25, 0.25); and 4) (ω1, ω2, ω3 ) = (1.375,

1.375, 0.25).
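The four weighting cases can be illustrated with a small sketch of the weighted similarity function. It assumes that the weights sum to the number of attributes N and that similarity is normalized by N, which is consistent with the four weight vectors above (each sums to 3). The function name and the example agents are hypothetical.

```python
def weighted_similarity(a, b, weights):
    """Weighted share of matching attributes; weights are assumed to sum to len(a)."""
    n = len(a)
    if abs(sum(weights) - n) > 1e-9:
        raise ValueError("weights must sum to the number of attributes")
    return sum(w * (x == y) for w, x, y in zip(weights, a, b)) / n


CASES = {
    "equal": (1, 1, 1),
    "single favorite": (3, 0, 0),
    "one dominant": (2.5, 0.25, 0.25),
    "two dominant": (1.375, 1.375, 0.25),
}

a, b = (1, 0, 1), (1, 1, 0)  # the two agents match on the first attribute only
for name, w in CASES.items():
    print(name, round(weighted_similarity(a, b, w), 3))
```

For this pair, the equal weighting case gives a similarity of 1/3, whereas the (3, 0, 0) case gives 1.0, exactly what a single attribute comparison on the first attribute would return.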

As seen in Figure 5, the curves corresponding to the two intermediate cases are in

between the two extreme baselines (i.e. equal weighting and extreme dominance). As we move

from the equal weighting baseline (at the bottom) to the extreme weighting case (at the very top),

the system attains an increasingly higher level of similarity, and thus a higher degree of clustering. In other words, the unequal weighting of attributes facilitates clustering and increases the average level of similarity relative to the equal weighting case. To understand this, it is important to note an equivalence between a smaller number of attributes and unequal weighting: giving disproportionately more weight to a few select attributes should have the same effect as considering a smaller number of attributes (only those deemed important). A three attribute case

where (ω1, ω2, ω3 ) = (3, 0, 0) is identical to a single attribute case. Unequal weighting in effect

reduces the number of attributes that are included in the calculation of similarity. Given our baseline result that clusters form more easily in the single attribute case, it is therefore not surprising that unequal weighting is associated with higher similarity than equal weighting with

the same number of attributes. Clustering is not as easy as in the single attribute case, but easier

than the equal weight, multi-attribute case. These results indicate that one possible way to

facilitate the formation of clusters is to compare on many attributes but to weight the attributes

unequally.

Experiment 4: Effects of Fuzzy vs. Crisp Sets

An alternative approach to infinite dimensionalization is the structured world hypothesis.

In contrast to the structured mind approach which assumes that different actors attend to similar

salient attributes, the structured world approach argues that different attributes are correlated in

the real world and this correlation facilitates category formation. In this section, we manipulate

the degree of correlation among attributes.

In our baseline case, we randomly assigned 0’s or 1’s to each attribute. This creates a

random distribution of attributes in the population: knowing a value in an attribute dimension

does not allow any prediction about other attribute dimensions. Since there is no correlation

among attributes, we can call this baseline attribute structure ‘fuzzy’. For any randomly chosen

pair of individuals, it is unlikely that the two are different on every dimension. Very few agents

are either 100% similar or 0% similar, since the average similarity is 0.5.

A different attribute structure is the opposite of ‘fuzzy sets’, where agents’ attributes are

specified to be either 100% or 0% similar, i.e., ‘crisp’. For instance, we can assign some agents

(0,0,0,0,0,1,1,1,1,1) and the rest (1,1,1,1,1,0,0,0,0,0). If an individual holds (0,0,0,0,0,1,1,1,1,1)

and the neighbor also holds (0,0,0,0,0,1,1,1,1,1), then the level of similarity is 1. If the neighbor

holds (1,1,1,1,1,0,0,0,0,0), the similarity level is 0. Thus, individuals fall into two “crisp” sets,

and there is perfect correlation between attributes. Knowing a value in an attribute dimension


does allow a prediction about other attribute dimensions. Note that in a single attribute case, the

attribute structure is by definition crisp: there are only two types of agents.

In the example given above, two randomly chosen individuals are either 100% similar or

0% similar. A more generalizable specification is to include other intermediate levels of

crispness in agents’ attribute structures. First, we divide the entire population into two groups.

Second, for one group, when assigning actor attributes, we set the probability of receiving 1’s (vs. 0’s) equal to a parameter “c” (for the degree of crispness). The other group receives 1’s with probability (1 – c). When the degree of crispness is 0.5, all the actors receive 1’s with probability 0.5. If the degree of crispness is 0.7, half of the population receives 1’s with probability 0.7 and the other half receives 1’s with probability 0.3 (= 1 – 0.7). In this way, by varying the degree of crispness, some members become more likely to have 1’s among their attributes while other members tend to have 0’s. This degree parameter has a maximum value of 1.0, corresponding to the case where there are only two groups: one having only 1’s, and the other having only 0’s. This is the specification underlying Figure 6. Since cases between 0 and 0.5 are symmetric to the cases between 0.5 and 1, we present only results based on the latter interval.
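The crispness specification can be sketched as follows. This is an illustration under stated assumptions rather than the original code: the population is split into halves by list position, attributes are drawn independently, and the function names are hypothetical. The helper `expected_similarity` gives the per-attribute match probability implied by independence, namely c² + (1 – c)² within a group and 2c(1 – c) across groups.

```python
import random


def make_population(n_agents, n_attributes, c, rng):
    """First half draws each attribute as 1 with probability c, second half with 1 - c."""
    half = n_agents // 2
    population = []
    for i in range(n_agents):
        p_one = c if i < half else 1 - c
        population.append(tuple(int(rng.random() < p_one) for _ in range(n_attributes)))
    return population


def expected_similarity(c):
    """Expected per-attribute match rate as (within a group, across groups)."""
    within = c * c + (1 - c) * (1 - c)
    across = 2 * c * (1 - c)
    return within, across


rng = random.Random(42)
for c in (0.5, 0.7, 1.0):
    population = make_population(300, 10, c, rng)
    share_of_ones = sum(map(sum, population[:150])) / (150 * 10)
    print(c, round(share_of_ones, 2), expected_similarity(c))
```

At c = 0.5 the two groups are indistinguishable (both expected match rates equal 0.5), while at c = 1.0 agents match perfectly within a group and not at all across groups.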

----------------------------------------

Insert FIGURE 6 about here

----------------------------------------

As seen from Figure 6, clustering becomes clearer as c increases (i.e., as the attribute structure becomes more crisp): at each threshold level, average similarity increases monotonically as the degree of crispness increases. This means that a higher degree of crispness in effect facilitates the emergence of clusters even as the number of attributes has the opposite effect. It is important to note that the case of c = 1.0 (i.e., the perfectly crisp case) exactly overlaps with the single attribute case. So long as sets are crisp, the number of attributes matters much less. The small number problem is only a concern when the sets are fuzzy.2

Experiment 5: Prototypicality and Cluster Formation

As discussed earlier, scholars have argued that categories are formed around prototypical

members (e.g., Rosch and Mervis, 1975). According to Rosch and Mervis (1975), prototypical

members have the properties of other category members and tend not to have properties of

category non-members. In our set up, because each attribute can take one of two possible values

(i.e. 1 or 0), there are two fuzzy cluster configurations: one in which members tend to have more

1’s versus 0’s and vice versa. Members with all 1’s and all 0’s are, by definition, the most

prototypical members of the first and the second cluster configuration, respectively.

To understand the role of prototypical members in cluster formation, we take the

following steps. We first create members with randomly distributed attributes (i.e., a completely

“fuzzy set”). Then, we replace a fixed percentage of members (we call this the “prototypical

ratio”) with prototypical members, half of which have all 1’s and the other half all 0’s. For

example, when the prototypical ratio is 0.2, we replace 20% of the initial randomly created population with prototypical members: 10% having all 1’s and 10% having all 0’s.
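The replacement step can be sketched as below. The sketch assumes that the members to be replaced are chosen uniformly at random and that, when the number of replacements is odd, the extra prototype is an all-0’s member. Neither detail is specified in the text, and the function name is hypothetical.

```python
import random


def inject_prototypes(population, ratio, rng):
    """Replace a share `ratio` of agents with prototypes, half all-1's and half all-0's."""
    n_attributes = len(population[0])
    n_replace = int(round(ratio * len(population)))
    chosen = rng.sample(range(len(population)), n_replace)
    result = list(population)
    for rank, index in enumerate(chosen):
        value = 1 if rank < n_replace // 2 else 0
        result[index] = (value,) * n_attributes
    return result


rng = random.Random(7)
fuzzy = [tuple(rng.randint(0, 1) for _ in range(10)) for _ in range(300)]
mixed = inject_prototypes(fuzzy, 0.2, rng)
print(sum(agent == (1,) * 10 for agent in mixed),
      sum(agent == (0,) * 10 for agent in mixed))
```

The printed counts can exceed 30 each only if the random initial population already happens to contain all-1’s or all-0’s members.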

Next, to see if prototypical members are ‘centers’ of their respective clusters, we need a

measure to capture, for each individual, how prototypical he or she is. This new parameter,

‘typicality index’, measures how close an individual is to the most prototypical members, and is

calculated as follows:

Typicality Index = Number of 1’s – Number of 0’s

2 The inverted U-shaped relationship between average similarity and thresholds holds across all degrees of crispness.

The greater the difference between this index and 0 (either positive or negative), the more

prototypical the member. A larger difference implies that there are either more 1’s or more 0’s.

For example, in a four attribute case, members with all 1’s or all 0’s will have typicality indexes of 4 and -4, respectively, the maximum distance from 0. On the other hand, members with two 1’s and two 0’s will have a typicality index of 0.
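The typicality index is simple enough to state directly in code. The function name below is hypothetical, and the sketch assumes binary attributes stored as a tuple of 0’s and 1’s.

```python
def typicality_index(attrs):
    """Number of 1's minus number of 0's in a binary attribute tuple."""
    ones = sum(attrs)
    zeros = len(attrs) - ones
    return ones - zeros


for member in [(1, 1, 1, 1), (0, 0, 0, 0), (1, 1, 0, 0)]:
    print(member, typicality_index(member))
```

The three members in the loop reproduce the four attribute example: indexes of 4, -4, and 0.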

Figure 7 shows the effect of prototypical members using one typical simulation run.

Note that these color heat maps are constructed using typicality indexes of individuals located in

the grid rather than the types or similarity levels of individuals as in Figures 2b and 3b. Colors

located at the top and the bottom of the color bar indicate more prototypical members (that is,

those with the brightest and darkest colors). In the 1 attribute case (Figure 7a), there are only two

colors as there are only two types: 0’s and 1’s. Comparing heat maps across different

prototypicality ratios, it seems that having prototypes does not have a pronounced effect on

cluster formation. This is perhaps not surprising because, with a single attribute, all members are by definition prototypical from the outset. Replacing a portion of the initial population with artificially created prototypes does not enhance clustering any further than the initial condition.

The role of prototypical members becomes more pronounced as the number of attributes

increases. Consider the 10 attribute case (Figure 7c). When the prototypicality ratio is zero, there

is virtually no change in the visual pattern between the initial stage and t=10,000. However, as

the ratio of prototypical members increases to 0.2 or 0.4, more clusters emerge in the final stage.

In addition, prototypical members (as indicated by darker/lighter colors) are clustered together

surrounded by less typical members. As everyone becomes a prototype (i.e. at prototypicality

ratio = 1.0), the role of prototypical members becomes less pronounced.



----------------------------------------

Insert FIGURE 7 about here

----------------------------------------

Do prototypical members form the center of the emergent clusters? If indeed they facilitate the formation of clusters, we would expect clusters to form around prototypical members in their neighborhood. In other words, in the 10 attribute case, clusters will be formed around members with a typicality index of 10 (or -10 in the other category), and these members will be surrounded by members with an index of 8 (or -8). As the boundary of a local

neighborhood increases, the expected tendency of members with either a high or low typicality

index being clustered together would be weakened. While Figure 7 illustrates visually the effect

of prototypicality, its value is limited because we can only use one typical simulation run.

To conclusively demonstrate the role of prototypical members in cluster formation, we

need a more robust and generalizable way to capture the typicality indexes of individuals located

at the center of clusters and the typicality indexes of members as the clusters expand spatially.

For the latter, we calculate the average typicality index for each expanding neighborhood degree.

The 1st degree neighbors are the eight surrounding spots around the focal member. The 2nd

degree neighbors are the sixteen neighbors around the 1st degree neighbors. Using this rule, we

can define the expanding neighborhood of each actor. We expect the following patterns: 1)

prototypical members (i.e., members with either very high or low typicality index) are expected

to be surrounded by similar prototypical members and this tendency becomes weaker as

neighborhood boundaries increase; 2) non-prototypical members (i.e., members with mid-range

typicality index) are not expected to be surrounded by neighbors with any particular patterns.
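The expanding neighborhood rule can be sketched as concentric square rings around the focal member. The sketch assumes that the degree-d neighbors are the cells at Chebyshev distance exactly d (8 cells for d = 1, 16 for d = 2), that the lattice is bounded so rings are clipped at the edges, and that empty cells are skipped when averaging. All names are hypothetical.

```python
def typicality_index(attrs):
    """Number of 1's minus number of 0's."""
    return 2 * sum(attrs) - len(attrs)


def ring(cell, degree, size=20):
    """Cells at Chebyshev distance exactly `degree` from `cell`, clipped to the lattice."""
    r, c = cell
    return [(r + dr, c + dc)
            for dr in range(-degree, degree + 1)
            for dc in range(-degree, degree + 1)
            if max(abs(dr), abs(dc)) == degree
            and 0 <= r + dr < size and 0 <= c + dc < size]


def mean_typicality_by_degree(grid, cell, max_degree, size=20):
    """Average typicality index of the occupied cells in each ring around `cell`."""
    averages = {}
    for degree in range(1, max_degree + 1):
        values = [typicality_index(grid[n]) for n in ring(cell, degree, size) if n in grid]
        averages[degree] = sum(values) / len(values) if values else None
    return averages


grid = {(10, 10): (1, 1, 1), (10, 11): (1, 1, 1), (10, 12): (0, 0, 0)}
print(len(ring((10, 10), 1)), len(ring((10, 10), 2)))
print(mean_typicality_by_degree(grid, (10, 10), 2))
```

Averaging these per-degree values over all focal members with the same typicality index, and over many runs, would yield one curve per typicality group.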

Figure 8 is based on 500 independent runs, each of which consists of 300 members on a

20 x 20 lattice. It plots the average typicality index as a function of the neighborhood degree.

In the one attribute case, there are only two types of individuals – those with 1’s (typicality index = 1) and those with 0’s (typicality index = -1). As seen in Figure 8, as the

neighborhood degree increases (i.e. as the neighborhood expands), the average typicality index

declines from 1 to 0 for prototypical members with 1’s (and increases from -1 to 0 for

prototypical members with 0’s).3 In other words, members with typicality index of 1 are

surrounded by immediate neighbors with similarly high typicality index. As we move outwards

from this center, neighbors become much less prototypical (as their typicality index approaches

zero). The reverse pattern is observed for the members with typicality index of -1.


----------------------------------------

Insert FIGURE 8 about here

----------------------------------------

In the 3 attribute case (Figure 8b), there are six groups of individuals depending on their

typicality indexes (3,2,1,-1,-2, and -3). Six lines thus describe how average typicality changes as

a function of neighborhood degree. We find that members with Typicality = 3 are, on average, surrounded by members with Typicality = 2 in their immediate neighborhood, and the average typicality of neighbors gradually decreases as the neighborhood expands. The pattern is symmetric for members with Typicality = -3, who are surrounded by members with similarly extreme typicality values, and average typicality increases as the neighborhood expands. In other words, the prototypical members (Typicality = 3 or -3) are surrounded by similarly prototypical members. Actors with lower typicality values (e.g., Typicality = 1 or -1) are surrounded by members with similarly low prototypicality. This is why the curve for Typicality = 1 (Typicality = -1) is located lower (higher) than that for Typicality = 3 (Typicality = -3) and decreases (increases) more gradually. Therefore, prior results from Figure 7 are confirmed in a more general way: prototypical members are more likely to be surrounded by other prototypical members and they tend to be at the center of clusters.

3 Recall that there are only two types of members in a single attribute case.

Our analyses so far have revealed two potentially competing drivers for cluster formation.

In the previous section we find that a crisp attribute structure facilitates category formation. In

the current section, prototypical members also are found to facilitate emergence of structure. The

question is: which mechanism is more important? In the initial formulation of a crisp structure, we considered two attribute types: (0,0,0,0,0,1,1,1,1,1) and (1,1,1,1,1,0,0,0,0,0). By comparison, when we explored the role of prototypicality, we considered two related attribute types: (1,1,1,1,1,1,1,1,1,1) and (0,0,0,0,0,0,0,0,0,0). Note that the latter, prototypical types are also crisp, but the former crisp types are NOT prototypical. In other words, prototypicality is a theoretical subset of crispness. This suggests that crisp attributes that are not prototypical may not enhance cluster formation. In results not reported here, we find this to be true: we systematically injected the two types (1,1,1,1,1,0,0,0,0,0) and (0,0,0,0,0,1,1,1,1,1) into the population and found that there was little change in cluster formation.

Complementary and Sensitivity Analysis

So far, we assumed that members migrate to the nearest satisfactory spot, an assumption of local

migration that is consistent with the classic Schelling model. In this section, we relax this

constraint and allow members to find the most satisfactory spot anywhere in the grid without

local constraints. With this global migration rule, the thresholds we previously applied in the

baseline Schelling settings no longer apply, as individuals do not merely attempt to meet

their threshold level of happiness. Rather, they attempt to maximize their happiness. In results

not reported here, we compare the clustering patterns under these migration rules, based on

typicality indexes (other parameters are identical to the baseline 10 attribute case). We find that

clusters are larger and seem to be more salient under global migration. In the previous analysis

with fuzzy attributes (Figure 3), we reported the highest level of similarity under the 10 attribute


case to be 0.67 when threshold is 0.6. Under the global migration rule, the highest level of

similarity reaches 0.78. This seems to be a clear improvement from the baseline case.

Clustering seems more likely for two reasons. First, under global migration, members can

realize any improvement in similarity scores even if the improvement is small. In contrast, under

local migration, potential improvement may be forgone when it does not result in the individual meeting his or her threshold. This possibility of realizing incremental improvement is akin to the availability argument we made earlier: when the amount of improvement is no longer constrained to be above a certain threshold, more neighborhoods become candidate spots to migrate towards. Second, while members under the local migration rule stop searching upon finding the first spot that gives a similarity level above the threshold (no matter how small the gap is), members under global migration keep searching until they find the best fit overall.

Combining these two, migration is more likely (due to the first mechanism) and once migration

happens, the distance is farther (due to the second mechanism). This results in more cluster

formation.
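The contrast between the two migration rules can be made concrete with a small sketch. It assumes that each vacant spot has already been scored with the similarity the agent would experience there, that distance is measured as Chebyshev distance on the lattice, and that a global mover relocates only when the best spot strictly improves on its current similarity. These details and all names are hypothetical.

```python
def chebyshev(a, b):
    """Lattice distance that treats diagonal steps like straight ones."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def local_move(origin, candidates, threshold):
    """Nearest vacant spot whose similarity meets the threshold, else None.

    `candidates` maps each vacant spot to the similarity the agent would have there.
    """
    ok = [spot for spot, s in candidates.items() if s >= threshold]
    return min(ok, key=lambda spot: chebyshev(origin, spot)) if ok else None


def global_move(current_similarity, candidates):
    """Best spot anywhere on the grid, provided it beats the current similarity."""
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    return best if candidates[best] > current_similarity else None


candidates = {(0, 1): 0.55, (5, 5): 0.9, (1, 1): 0.4}
print(local_move((0, 0), candidates, 0.5))    # nearest spot that clears the threshold
print(local_move((0, 0), candidates, 0.95))   # no spot clears a very high threshold
print(global_move(0.3, candidates))           # best spot regardless of distance
```

In this example the local rule settles for the adjacent spot with similarity 0.55 and fails entirely at a threshold of 0.95, whereas the global rule moves the agent to the distant spot with similarity 0.9.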

DISCUSSION

There are strong experiential reasons to assume that categories of organizational forms are

induced from clusters of similar organizations. However, one of the underlying theoretical

conundrums of category models based on similarity clustering is the fact that organizations, and

all external entities for that matter, are infinitely dimensionable. This well-known problem has

led cognitive scientists to propose two general arguments regarding how infinite dimensionality

is resolved during the formation of categories in any knowledge domain. Domain knowledge

and belief systems can privilege certain attributes over others, and thus lead to clustering and

categories formed around those attributes that are most salient and weighted more heavily in a


particular context. Or, the external entities themselves may be structured in a way that

configurations of correlated attributes occur together, implying that attributes that are highly

correlated with others have special informational value and thus are privileged during category

formation.

We have used the robust paradigm of Schelling (1971) segregation to examine some of

the details of each of these mechanisms, and have found both to be effective in mitigating the

challenges that infinite dimensionality presents. First, our results suggest that pure Schelling

segregation with randomly assigned attributes is a small numbers phenomenon in that it only

occurs when the number of actor attributes is low. However, weighting attributes differentially

in similarity judgments mitigates this effect and reinstates segregation, albeit at a reduced level.

This supports the “structured mind” hypothesis by suggesting that attribute reduction through

knowledge filtering is at least theoretically plausible. We also found evidence supporting the

“structured world” hypothesis by showing that organizing actor attributes into perfectly crisp and

non-overlapping sets completely reinstates Schelling clustering regardless of the number of

attributes considered. However, as many have noted in the cognitive and organizations literature,

rarely do we find perfectly crisp and non-overlapping sets of attributes in the environment.

Given this, our results also suggest that imperfectly correlated attribute configurations can

reinstate clustering as well, with prototypical attribute actors moving to the center of similarity

clusters and other less prototypical actors positioned around them.

Our results thus lend support to similarity-based models of organizational category

formation, but with two important provisos. First, although it is easy to see how beliefs, cultures,

standard operating procedures, existing classification systems, institutional priorities and the like

can act to privilege certain attributes over others, the structured mind hypothesis essentially


requires that any explanation for cluster and category formation must first identify, and perhaps

explain, how various sources of knowledge act together to make some attributes more salient

than others. As Murphy and Medin (1985) acknowledged many years ago, this means that a

theory of categories must be rooted in a theory of knowledge. In organizational contexts, it is

one thing to map the structure of categories-in-use. It is quite another to explain why such

categories are put into use. For the latter, the structured mind hypothesis suggests that a theory

of organizational categories must include a theory about how sources of knowledge come

together during category formation for purposes of attribute reduction.

The structured world hypothesis brings with it different explanatory challenges. We have

shown that the problem of infinite dimensionality can be mitigated by non-randomly structuring

attributes into configurations of correlated attributes. Even imperfect correlations can be used to

privilege certain attributes, those with most informational value, over others in a bottom up

fashion, without prior knowledge acting as a filter. This was the explanation advanced by Porac

et al. (1995), for example, in their study of competitive clustering and category structure in the

Scottish knitwear industry. However, Porac et al. was a cross-sectional study, and the authors

freely admitted that an explanation for why the industry was ordered into a particular category

structure had to be based in a historical analysis of the forces that encouraged and selected out

certain configurations of attributes over others. Again, it is one thing to map the existence of

categories at any given time, and quite another to explain the historical evolution of the attribute

clusters on which that structure is based.


REFERENCES

Barsalou, L. W. 1983. Ad hoc categories. Memory & Cognition, 11(3): 211-227.

Barsalou, L. W. 2008. Grounded cognition. Annual Review of Psychology, 59: 617-645.

Boster, J., & d'Andrade, R. 1989. Natural and human sources of cross-cultural agreement in ornithological classification. American Anthropologist, 132-142.

Bruch, E. E., & Mare, R. D. 2006. Neighborhood choice and neighborhood change. American Journal of Sociology, 112(3): 667-709.

Card, D., Mas, A., & Rothstein, J. 2008. Tipping and the Dynamics of Segregation. The Quarterly Journal of Economics, 177-218.

Clark, W. A. 1991. Residential preferences and neighborhood racial segregation: A test of the Schelling segregation model. Demography, 28(1): 1-19.

Clark, W. A., & Fossett, M. 2008. Understanding the social context of the Schelling segregation model. Proceedings of the National Academy of Sciences, 105(11): 4109-4114.

DiMaggio, P. J. 1991. Constructing an organizational field as a professional project: US art museums, 1920-1940. The new institutionalism in organizational analysis, 267-292.

Durand, R., & Paolella, L. 2013. Category stretching: Reorienting research on categories in strategy, entrepreneurship, and organization theory. Journal of Management Studies, 50(6): 1100-1123.

Estes, W. K. 1994. Classification and cognition (Vol. 22). Oxford: Oxford University Press.

Estes, Z., Golonka, S., & Jones, L. L. 2011. Thematic thinking: The apprehension and consequences of thematic relations. Psychology of Learning and Motivation-Advances in Research and Theory, 54: 249.

Farjoun, M., & Lai, L. 1997. Similarity judgments in strategy formulation: role, process and implications. Strategic Management Journal (1986-1998), 18(4): 255.

Hannan, M. T., Pólos, L., & Carroll, G. R. 2007. Logics of organization theory: Audiences, codes, and ecologies. Princeton, NJ: Princeton University Press.

Harnad, S. 2005. To cognize is to categorize: Cognition is categorization. Handbook of categorization in cognitive science, 20-45.

Hotelling, H. 1928. Spaces of statistics and their metrization. Science, 67(1728): 149-150.

Hsu, G. 2006. Jacks of all trades and masters of none: Audiences' reactions to spanning genres in feature film production. Administrative Science Quarterly, 51(3): 420-450.


Khaire, M., & Wadhwani, R. D. 2010. Changing landscapes: The construction of meaning and value in a new market category—Modern Indian art. Academy of Management Journal, 53(6): 1281-1304.

Latour, B. 2005. Reassembling the Social: an Introduction to Actor-Network-Theory. Oxford: Clarendon.

Lounsbury, M., & Rao, H. 2004. Sources of durability and change in market classifications: A study of the reconstitution of product categories in the American mutual fund industry, 1944–1985. Social Forces, 82(3): 969-999.

McKelvey, B. 1975. Guidelines for the empirical classification of organizations. Administrative Science Quarterly, 509-525.

Murphy, G. L. 2002. The big book of concepts. Cambridge, MA: MIT press.

Murphy, G. L., & Medin, D. L. 1985. The role of theories in conceptual coherence. Psychological review, 92(3): 289.

Navis, C., & Glynn, M. A. 2010. How new market categories emerge: Temporal dynamics of legitimacy, identity, and entrepreneurship in satellite radio, 1990–2005. Administrative Science Quarterly, 55(3): 439-471.

Negro, G., Koçak, Ö., & Hsu, G. 2010. Research on categories in the sociology of organizations. Research in the Sociology of Organizations, 31: 3-35.

Nielsen, A. V., Gade, A. L., Juul, J., & Strandkvist, C. 2015. Schelling model of cell segregation based only on local information. Physical Review E, 92(5): 052705.

Ody-Brasier, A., & Vermeulen, F. 2014. The price you pay: Price-setting as a response to norm violations in the market for Champagne grapes. Administrative Science Quarterly, 59(1): 109-144.

Pancs, R., & Vriend, N. J. 2007. Schelling's spatial proximity model of segregation revisited. Journal of Public Economics, 91(1): 1-24.

Pontikes, E. G. 2012. Two sides of the same coin: How ambiguous classification affects multiple audiences’ evaluations. Administrative Science Quarterly, 57(1): 81-118.

Porac, J. F., Thomas, H., Wilson, F., Paton, D., & Kanfer, A. 1995. Rivalry and the industry model of Scottish knitwear producers. Administrative Science Quarterly, 203-227.

Rao, H., Monin, P., & Durand, R. 2005. Border crossing: Bricolage and the erosion of categorical boundaries in French gastronomy. American Sociological Review, 70(6): 968-991.

Rhee EY. 2015. Strategic categorization: Vertical and horizontal changes in self-categorization. Academy of Management Annual Meetings Best Papers Proceedings.

Romanelli, E., & Khessina, O. M. 2005. Regional industrial identity: Cluster configurations and economic development. Organization Science, 16(4): 344-358.


Rosa, J. A., Porac, J. F., Runser-Spanjol, J., & Saxon, M. S. 1999. Sociocognitive dynamics in a product market. The Journal of Marketing, 64-77.

Rosch, E. 1978. Principles of categorization. In E. Rosch & B. B. Lloyd (Eds.), Cognition and categorization: 27-48. New Jersey: Hillsdale.

Santos, F. M., & Eisenhardt, K. M. 2009. Constructing markets and shaping boundaries: Entrepreneurial power in nascent fields. Academy of Management Journal, 52(4): 643-671.

Schelling, T. C. 1971. Dynamic models of segregation. Journal of mathematical sociology, 1(2): 143-186.

Sloman, A. 1995. Musings on the roles of logical and non-logical representations in intelligence. Diagrammatic reasoning: Computational and cognitive perspectives, 7-33.

Sloman, S. A., & Rips, L. J. 1998. Similarity as an explanatory construct. Cognition, 65(2): 87-101.

Smith, M. D. 2011. The ecological role of climate extremes: current understanding and future prospects. Journal of Ecology, 99(3): 651-655.

Tversky, A. 1977. Features of similarity. Psychological Review, 84: 327-352.

Van de Rijt, A., Siegel, D., & Macy, M. 2009. Neighborhood chance and neighborhood change: A comment on Bruch and Mare. American Journal of Sociology, 114(4): 1166-1180.

Vergne, J. P., & Wry, T. 2014. Categorizing categorization research: Review, integration, and future directions. Journal of Management Studies, 51(1): 56-94.

Vinković, D., & Kirman, A. 2006. A physical analogue of the Schelling model. Proceedings of the National Academy of Sciences, 103(51): 19261-19265.

White, H. 2004. Markets from networks. Princeton, NJ: Princeton University Press.

Zuckerman, E. W. 1999. The categorical imperative: Securities analysts and the illegitimacy discount. American Journal of Sociology, 104(5): 1398-1438.

FIGURE 1 The Fundamentals of Schelling (1971) Model

[Panels (a) to (f). Actors A and B are each described by three attributes (Attribute 1, Attribute 2, Attribute 3), with S(A,B) = 2/3. Each neighbor of the focal actor is labeled with its similarity to the focal actor (0/3 to 3/3). In one configuration the average similarity is 0.39 (< 0.7); in another it is 0.78 (> 0.7).]
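The arithmetic in Figure 1 can be reproduced with a short sketch. It assumes that S(A,B) is the share of attributes on which two actors hold the same value, and that a focal actor is content when its average similarity to neighbors reaches the threshold of 0.7 (the figure shows only "<" and ">", so the `>=` comparison is an assumption). The attribute values of `actor_a` and `actor_b` are hypothetical, the two neighbor lists are one reading of the fractions recoverable from the figure, and all function names are illustrative rather than the authors' own.

```python
from fractions import Fraction


def similarity(a, b):
    """Share of attributes on which two actors hold the same value."""
    if len(a) != len(b):
        raise ValueError("actors must have the same number of attributes")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return Fraction(matches, len(a))


def average_similarity(neighbor_similarities):
    """Mean of the focal actor's similarity to each occupied neighbor."""
    return sum(neighbor_similarities) / len(neighbor_similarities)


def is_content(neighbor_similarities, threshold):
    """Assumed rule: content when average similarity reaches the threshold."""
    return average_similarity(neighbor_similarities) >= threshold


# Hypothetical attribute values chosen so that two of three attributes match.
actor_a = (1, 0, 1)
actor_b = (1, 0, 0)
print(similarity(actor_a, actor_b))  # 2/3

# Neighbor similarities (in thirds) as read from the figure: an assumption.
before = [Fraction(n, 3) for n in (2, 1, 0, 0, 2, 2)]
after = [Fraction(n, 3) for n in (3, 2, 2, 2, 3, 2)]
threshold = Fraction(7, 10)

print(round(float(average_similarity(before)), 2), is_content(before, threshold))  # 0.39 False
print(round(float(average_similarity(after)), 2), is_content(after, threshold))    # 0.78 True
```

Exact fractions are used so that the comparison against the threshold is not affected by floating-point rounding; values are converted to floats only for display.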

FIGURE 2 Replicating the Schelling Model

(a) The Effect of Threshold in Schelling’s Original Model

[Plot: x-axis Threshold (Th), 0.0 to 1.0; y-axis Average Similarity (%), 0 to 100.]

Note: 300 agents on 20 x 20 grid; average of 100 independent runs; error bars indicate 95% confidence intervals.

(b) Visualizing the Effect of Threshold

[Grid snapshots at the Initial State and the Steady State for Th = 0.1 to 0.9 in steps of 0.1.]

Note: 300 agents on 20 x 20 grid; black and grey cells represent the two types; white cells are empty cells.

FIGURE 3 Effects of the Number of Attributes

(a) The Effect of Threshold under Multiple Attributes

[Plot: x-axis Threshold (Th), 0.0 to 1.0; y-axis Average Similarity (%), 0 to 100. Legend: 1 attribute, 2 attributes, 10 attributes.]

Note: 300 agents on 20 x 20 grid; average of 100 independent runs; error bars indicate 95% confidence intervals.

(b) Visualizing the Effect of Threshold under Multiple Attributes

[Grid snapshots for 1 attribute, 3 attributes, and 10 attributes at Th = 0.1 to 0.9 in steps of 0.1.]

Note: White cells are empty cells. Darker cells indicate individuals surrounded by more similar individuals than lighter cells are.

FIGURE 4 Discontent and Availability (Initial Period)

[Plot: x-axis Threshold (Th), 0.0 to 1.0; y-axis %, 0 to 100. Series: Discontent, and Availability for 1, 3, and 10 attributes (data points labeled 1, 3, and 10).]

Note:
- 20 x 20 grid, 300 agents, 100 independent runs.
- Availability = 1: the chosen agent wants to migrate and is able to find a happy spot; Availability = 0: the chosen agent wants to migrate but finds no available spot (values with no data point = the chosen agent is happy).
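The two quantities in Figure 4 can be illustrated with a minimal sketch for a single chosen agent in the initial period. Everything beyond the figure note is an assumption made for illustration: attributes are coded as +1/-1, the neighborhood is the eight surrounding cells without wrap-around, an agent with no neighbors counts as content, and "happy" means average similarity at or above the threshold. The function and variable names are hypothetical.

```python
import random

SIZE, N_AGENTS = 20, 300  # grid size and population from the figure note


def make_grid(n_attributes, rng):
    """Place N_AGENTS agents with random +1/-1 attributes on a SIZE x SIZE grid."""
    cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    occupied = rng.sample(cells, N_AGENTS)
    return {cell: tuple(rng.choice((-1, 1)) for _ in range(n_attributes))
            for cell in occupied}


def neighbors(grid, cell):
    """Occupied cells among the 8 surrounding cells (no wrap-around: assumption)."""
    r, c = cell
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0) and (r + dr, c + dc) in grid]


def average_similarity(grid, attrs, cell):
    """Mean share of matching attributes between attrs and the neighbors of cell."""
    around = [grid[n] for n in neighbors(grid, cell)]
    if not around:
        return 1.0  # assumption: an isolated agent is treated as content
    k = len(attrs)
    return sum(sum(a == b for a, b in zip(attrs, other)) / k
               for other in around) / len(around)


def discontent_and_availability(grid, cell, threshold):
    """Return (discontent, availability); availability is None for a happy agent."""
    attrs = grid[cell]
    if average_similarity(grid, attrs, cell) >= threshold:
        return False, None
    # The agent vacates its cell before candidate spots are evaluated.
    rest = {k: v for k, v in grid.items() if k != cell}
    empty = [(r, c) for r in range(SIZE) for c in range(SIZE) if (r, c) not in grid]
    available = any(average_similarity(rest, attrs, spot) >= threshold
                    for spot in empty)
    return True, int(available)


rng = random.Random(0)
grid = make_grid(3, rng)
chosen = rng.choice(sorted(grid))
print(discontent_and_availability(grid, chosen, threshold=0.5))
```

Averaging the first element over many independent runs would give a discontent rate, and averaging the second over the runs in which the agent is discontent would give an availability rate, which is how the note's "values with no data point" case arises when the chosen agent is always happy.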

FIGURE 5 Effects of Unequal Attribute Weighting

[Plot: x-axis Threshold (Th), 0.0 to 1.0; y-axis Average Similarity (%), 50 to 100. Legend: 1 Attribute, Unequal Weight 1, Unequal Weight 2, Equal Weight.]

Note:
- 3 attributes / 20 x 20 grid size / results at t=5,000 / 100 independent runs.
- Equal Weight: weights on attributes 1, 2, and 3 = (1, 1, 1).
- Unequal Weight 1 (privileging 1 attribute): weights = (2.5, 0.25, 0.25).
- Unequal Weight 2 (privileging 2 attributes): weights = (2.75/2, 2.75/2, 0.25).
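To make the three weighting schemes in Figure 5 concrete, the sketch below applies them to one pair of actors. The weights are taken from the figure note; the similarity formula (matched weight divided by total weight) is an assumption, since the note lists only the weights, and the two example actors and the function name are hypothetical.

```python
def weighted_similarity(a, b, weights):
    """Weight of the matching attributes as a share of total weight (assumed form)."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("actors and weights must have the same length")
    matched = sum(w for x, y, w in zip(a, b, weights) if x == y)
    return matched / sum(weights)


# Weighting schemes from the Figure 5 note; each sums to 3.
EQUAL = (1, 1, 1)
UNEQUAL_1 = (2.5, 0.25, 0.25)            # privileges attribute 1
UNEQUAL_2 = (2.75 / 2, 2.75 / 2, 0.25)   # privileges attributes 1 and 2

# Hypothetical actors that match on attribute 1 only.
a = (1, 1, 1)
b = (1, -1, -1)

print(round(weighted_similarity(a, b, EQUAL), 2))      # 0.33
print(round(weighted_similarity(a, b, UNEQUAL_1), 2))  # 0.83
print(round(weighted_similarity(a, b, UNEQUAL_2), 2))  # 0.46
```

Under this reading, a match on a privileged attribute can carry a pair over a high threshold almost by itself, whereas with equal weights the same single match yields a similarity of only one third.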

FIGURE 6 Degree of Crispness and Average Similarity

[Plot: x-axis Threshold (Th), 0.0 to 1.0; y-axis Average Similarity (%), 50 to 100. Legend: 1 Attribute, c=1.0, c=0.9, c=0.8, c=0.7, c=0.6, c=0.5.]

Note: “c” denotes degree of crispness / number of attributes is 10 unless indicated otherwise / 20 x 20 grid size / 300 actors / results at t=5,000 / 100 independent runs.

FIGURE 7 Varying Prototype Ratio across Different Numbers of Attributes

(a) 1 attribute

(b) 3 attributes

(c) 10 attributes

[Each panel shows 20 x 20 grid maps at t = 0 and t = 10,000 for Prototype Ratio = 0.0, 0.2, 0.4, 0.6, 0.8, and 1.0. Color scale: -2 to 1 in (a), -4 to 3 in (b), and -10 to 10 in (c).]

FIGURE 8 Average Typicality Index across Expanding Neighborhood

(a) 1 Attribute (b) 3 Attributes (c) 10 Attributes

[Three line plots. x-axis: Neighborhood Degree (1st to 10th); y-axis: Typicality Index of Neighbors (-4 to 4). Series: (a) Typicality = 1 and -1; (b) Typicality = 3, 1, -1, and -3; (c) Typicality = 10, 6, 0, -6, and -10.]
