promoting emergence in information discovery by ...promoting emergence in information discovery by...

10
Promoting Emergence in Information Discovery by Representing Collections with Composition Andruid Kerne, Eunyee Koh, Steven M. Smith 2 , Hyun Choi 2 , Ross Graeber, Andrew Webb Interface Ecology Lab Computer Science Department 1 Psychology Department 2 Texas A&M University, College Station, TX 77843, USA {andruid, eunyee}@cs.tamu.edu, {stevesmith, rgraeber, hyun-choi, porosis}@tamu.edu ABSTRACT While sometimes the task that motivates searching, browsing, and collecting information resources is finding a particular fact, humans often engage in intellectual and creative tasks, such as comparison, understanding, and discovery. Information discovery tasks involve not only finding relevant information, but also seeing relationships among collected information resources, and developing new ideas. Prior studies of search have focused on time and accuracy, metrics of limited value for measuring creativity. We develop new experimental methods to evaluate the efficacy of representational systems for information discovery by measuring the emergence of new ideas. We also measure the variety of web sites that participants visit when engaging in a creative task, and gather experience report data. We compare the efficacy of the typical format for collections, the textual list with a new format, the composition of image and text surrogates. We conduct an experiment that establishes that representing collections with composition of image and text surrogates promotes emergence in information discovery. Author Keywords visual representations, creative cognition, collections ACM Classification Keywords H5.m. Information interfaces and presentation (e.g., HCI): Miscellaneous. INTRODUCTION The creative intellectual tasks that humans perform with digital information resources must be supported and investigated. These tasks are critical to research, writing, learning, and invention on all levels. They are essential to business, education, and personal life. According to Morrison, Pirolli and Card’s analysis of the Georgia Tech Graphics, Visualization, and Usability Center web usage survey [27], the reason people use the web in 69% of cases is to understand or compare/choose. The method of humans in 71% of cases is to collect, that is, to assemble information from multiple sources. Information discovery tasks involve assembling and connecting answers to open-ended questions. Performance of information discovery tasks requires finding elements of relevant information, collecting these elements, and developing understanding of the found elements and their relationships. When people see combinations of found elements in new ways, they may experience cognitive restructuring, in which mental models shift and extend, resulting in the emergence of new ideas. Emergence is the essence of creativity, and the crux of discovery. This research develops new methods for detecting and measuring emergence in the context of browsing collections of digital information resources. We compared the efficacy of representations for presenting collections to participants, and in tandem, the representations that they used for putting together answers to information discovery questions. We found that the composition of image and text surrogates promotes emergence in information discovery tasks. To investigate the contribution to emergence by representing collections of information resources with composition of image and text surrogates, and how it is supported by the creativity support tool, combinFormation [17, 20], we conducted an experiment. Undergraduate psychology students performed open-ended information discovery tasks, which involved applying principles of psychology to questions about life experience, and assembling multiple elements to form an answer. The apparatus included a curated source collection of psychology resources, and an interface for assembling answers to the information discovery questions. In one experimental condition, both the source collection and the answer interface utilized a typical linear text format. One aspect of this iterative design methodology is to handcraft representations of information in order to assess the direction for building systems that generate representations of collections automatically. Thus, in the

Upload: others

Post on 13-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Promoting Emergence in Information Discovery by ...Promoting Emergence in Information Discovery by Representing Collections with Composition Andruid Kerne, Eunyee Koh, Steven M. Smith2,

Promoting Emergence in Information Discovery by Representing Collections with Composition

Andruid Kerne, Eunyee Koh, Steven M. Smith2, Hyun Choi2, Ross Graeber, Andrew Webb

Interface Ecology Lab Computer Science Department1 Psychology Department2

Texas A&M University, College Station, TX 77843, USA {andruid, eunyee}@cs.tamu.edu, {stevesmith, rgraeber, hyun-choi, porosis}@tamu.edu

ABSTRACT While sometimes the task that motivates searching, browsing, and collecting information resources is finding a particular fact, humans often engage in intellectual and creative tasks, such as comparison, understanding, and discovery. Information discovery tasks involve not only finding relevant information, but also seeing relationships among collected information resources, and developing new ideas. Prior studies of search have focused on time and accuracy, metrics of limited value for measuring creativity.

We develop new experimental methods to evaluate the efficacy of representational systems for information discovery by measuring the emergence of new ideas. We also measure the variety of web sites that participants visit when engaging in a creative task, and gather experience report data. We compare the efficacy of the typical format for collections, the textual list with a new format, the composition of image and text surrogates. We conduct an experiment that establishes that representing collections with composition of image and text surrogates promotes emergence in information discovery.

Author Keywords visual representations, creative cognition, collections

ACM Classification Keywords H5.m. Information interfaces and presentation (e.g., HCI): Miscellaneous.

INTRODUCTION The creative intellectual tasks that humans perform with digital information resources must be supported and investigated. These tasks are critical to research, writing, learning, and invention on all levels. They are essential to business, education, and personal life. According to

Morrison, Pirolli and Card’s analysis of the Georgia Tech Graphics, Visualization, and Usability Center web usage survey [27], the reason people use the web in 69% of cases is to understand or compare/choose. The method of humans in 71% of cases is to collect, that is, to assemble information from multiple sources.

Information discovery tasks involve assembling and connecting answers to open-ended questions. Performance of information discovery tasks requires finding elements of relevant information, collecting these elements, and developing understanding of the found elements and their relationships. When people see combinations of found elements in new ways, they may experience cognitive restructuring, in which mental models shift and extend, resulting in the emergence of new ideas. Emergence is the essence of creativity, and the crux of discovery. This research develops new methods for detecting and measuring emergence in the context of browsing collections of digital information resources. We compared the efficacy of representations for presenting collections to participants, and in tandem, the representations that they used for putting together answers to information discovery questions. We found that the composition of image and text surrogates promotes emergence in information discovery tasks.

To investigate the contribution to emergence by representing collections of information resources with composition of image and text surrogates, and how it is supported by the creativity support tool, combinFormation [17, 20], we conducted an experiment. Undergraduate psychology students performed open-ended information discovery tasks, which involved applying principles of psychology to questions about life experience, and assembling multiple elements to form an answer. The apparatus included a curated source collection of psychology resources, and an interface for assembling answers to the information discovery questions. In one experimental condition, both the source collection and the answer interface utilized a typical linear text format.

One aspect of this iterative design methodology is to handcraft representations of information in order to assess the direction for building systems that generate representations of collections automatically. Thus, in the

Page 2: Promoting Emergence in Information Discovery by ...Promoting Emergence in Information Discovery by Representing Collections with Composition Andruid Kerne, Eunyee Koh, Steven M. Smith2,

other experimental condition, the source collection format was a set of hyperlinked image and text compositions, that was pre-assembled using combinFormation. The interface used by participants for forming answers was, likewise, a direct manipulation only version of combinFormation. This tool enables the novice user to easily put together a collection of information resources in the form of a composition of image and text surrogates.

This paper begins with an interdisciplinary review of prior work, locating this research amidst domains such as creative cognition, information science, perception, visual design, and human computer interaction. We derive and extend measures of emergence from creative cognition research to comparatively evaluate the collections developed by participants in the linear text and composition of image and text surrogates conditions. We describe the experimental method. We present and analyze results of the experiment, and discuss implications.

BACKGROUND

Creative Ideation: Emergence and Combination Creativity has been a difficult concept to define precisely. In spite of the broad range of notions about creativity, a consensus has nonetheless developed among creative cognition researchers, that creative ideas and products must be novel in some way, and that they must have value [11]. The creative cognition approach to understanding creativity focuses on the cognitive processes that underlie the production of creative ideas, processes involved in activities such as memory retrieval, visualization, categorization, problem solving, and analogical transfer. Of primary interest are the cognitive operations involved in ideation, the process of generating new and sometimes creative

ideas. The creative cognition approach states that there is a family of features that are shared by most creative ideas, qualities such as insightfulness, imaginativeness, and emergence [36]. In the present study, we focus on emergence in creative products, particularly as it relates to combinations of ideas drawn from digital information resources.

Figure 1: Biopsychology area source information resource collection: image-text composition and linear text formats.

Emergence refers to qualities that come newly into existence as a result of novel combinations of elements [9, 35, 42, 43]. Even when the elemental components, themselves, are not novel, new qualities that emerge from combinations comprise important creative discoveries in science, art, and business [31, 37]. Emergent properties can be seen in many domains. In chemistry, compounds can have properties that do not exist in any of the component elements, in visual perception complex objects can have emergent properties, such as three-dimensionality, that are not found in the featural components, and in language, words can have meanings that are not qualities of the component phonemes or letters. In the course of creative ideation, novel ideas often emerge when component ideas are combined. Combining concepts has been important in theories that deal with creative thinking, such as problem solving, idea generation, and insight experiences [7, 26, 28]. A number of studies have shown that novel properties can emerge from conceptual combinations [10, 11, 16, 38, 41]. These studies have primarily examined the cognitive processes that are involved when people comprehend combinations of concepts (such as computer dog), or when people imagine creative interpretations of ideas randomly combined by experimenters. The present study tests predictions and implications of these studies, examining the usefulness of a representational form that encourages and

2

Page 3: Promoting Emergence in Information Discovery by ...Promoting Emergence in Information Discovery by Representing Collections with Composition Andruid Kerne, Eunyee Koh, Steven M. Smith2,

enables development of unusual combinations of information resources, particularly, combinations that have emergent properties.

Measures of Creative Ideation Creative cognition researchers have observed that the cognitive components of creative thinking are different than those engaged in deductive reasoning [11, 36]. Thus, they have defined different tasks and measures to investigate ideation. Most prior research at the intersection of human computer interaction and information retrieval has addressed convergent thinking tasks, which involve closed-form questions that have a single correct answer. A problem is explicitly specified, and the criteria for the solution are very clear. The accuracy of the answer, and the latency, or time to form it, are appropriate measures of performance.

To investigate creative ideation, divergent thinking tasks have been designed, in which one quests for many possible answers to open-ended questions [11, 33]. Divergent thinking tasks are objectively assessed with ideation metrics [34], such as fluency (i.e., quantity of ideas), flexibility (number of different categories of ideas), originality (i.e., statistical infrequency of an idea), practicality/quality, and emergence [10, 35, 41]. In addition to these standard measures of divergent thinking, the products of the divergent tasks can also be assessed subjectively by participants, their peers, and by experts. Among the subjective measures we have developed are ratings for informative, communicative, and expressive.

Information Discovery and its Precursors Creative cognition provides a perspective for considering related developments in information science and human computer interaction. The information discovery approach [20] builds on berrypicking [3], anomalous states of knowledge (ASK) [4], psychological relevance [15], sensemaking [32, 2], information foraging [30], information seeking [23], and exploratory search [39]. Exploratory search, for example, addresses situations in which users “lack the knowledge or contextual awareness to formulate

queries or navigate complex information spaces,” or “the search task requires browsing and exploration” [39]. Like others [12], exploratory search researchers have recognized that human information needs are not necessarily constant and convergent over the course of a search session.

Figure 2: Two answers to the information discovery question: What psychological factors can influence a person's experiences

dating? Left: composition of image and text surrogates format. Right: linear text surrogates format.

Information discovery integrates concepts and methods from the prior investigations of human interaction with information with the creative cognition approach, to develop a human centered framework. For example, changes in information needs represent not side effects, but rather, an essential stage in creative ideation. Information needs may shift as information is found, gathered, and re-cognized. Needs can change during processes of searching and finding, as a result of the stimulus of information.

Information discovery investigates creative ideation in the context of processes and practices of information finding. In an information discovery task, the human goal is to have ideas in some area. This is a divergent thinking task in which information finding supports the generation and development of ideas. The context may be an academic task, such as paper writing or thesis formulation, or a life task, such as designing a vacation or a career. Search, itself, is not the task. Rather, search is a technology that supports information discovery tasks. We need to understand and support more than simply how people find information. In information discovery, combining and understanding relevant information are as essential as finding. The human needs to find elements of relevant information, collect and combine them, and develop a sense of connections among them. Found information can stimulate seeing new perspectives and formulating new mental models. This sets the stage for the emergence of new ideas. Thus, to support information discovery, the present research develops the representation of individual information resources, and also the representation of the set of resources collected during an information discovery task.

Representing Collections with Composition With the perspective of information discovery, we consider

3

Page 4: Promoting Emergence in Information Discovery by ...Promoting Emergence in Information Discovery by Representing Collections with Composition Andruid Kerne, Eunyee Koh, Steven M. Smith2,

how the concept of surrogates, findings about image-text representations, and the form of composition can be integrated to develop a new representation for collections. The composition format is supported by combinFormation.

Surrogates A surrogate represents an information resource and enables access to that resource [5]. Hypermedia surrogates, which enable navigation, are formed systematically from metadata. One typical surrogate is the Google gist, an element of the result set returned by a search query. Another is the bookmark in a web browser. Surrogates play an important role in keeping found things found [18]. People make critical decisions based on these surrogates, such as choosing which documents to browse, and which to ignore.

Image-Text Representations Promote Cognition In the working memory system, the visuospatial buffer (which stores mental images) and the rehearsal loop used for words are complementary subsystems [1]. They support each other in combined image-text knowledge representations. Glenberg has established that the combination of an image and descriptive text promotes the formation of mental models, and extends working memory capacity [13, 14]. Moreno has found that dual coding strategies enhance cognition during educational experiences of digital media [6, 25]. Text disambiguates images while engaging complementary cognitive subsystems.

To make better use of cognitive resources, the benefits of image-text representations can be applied to the formation of surrogates. Marchionini’s group investigated the use of multimodal surrogates for video browsing [8, 40] by comparing users’ performance and experience using different surrogate formats for digital videos. Combined surrogates lead to better comprehension and reduced human processing time. Woodruff et al investigated the efficacy of “enhanced thumbnails” as navigational surrogates for documents [44]. They start with a reduced screen shot of an entire web page. Each thumbnail is annotated with a larger textual “call out,” which indicates the presence of a key phrase from a search result set. Users performed significantly better on convergent thinking search tasks with enhanced thumbnails, than they did with text summaries or plain thumbnails.

Composition The list of textual surrogates is typically used to represent collections, such as search result sets and bookmarks. Composition is an alternative to lists; literally, it means, “the act of putting together or combining … as parts or elements of a whole” [29]. Composition of image and text surrogates extends the organizing of information afforded by spatial hypertext [24] by emphasizing visual design and communication (see example, Figure 1, left). Composition uses visual design techniques that connect and layer elements [38] to form a coherent whole, including relative size relationships, colors, typefaces, text stroking, and image compositing. The present research addresses the

processes through which collections are assembled, and how the resulting forms function as artifacts for communication and navigation, and stimuli for cognition.

Previously, a method was developed for using images and text chunks clipped from documents as the visual form for individual surrogates, while maintaining referentiality to source and hyperlinked documents [19]. It was then shown that when they are combined to form a composition that represents a collection, participants experience image and text surrogates as easier to use for navigation. The present research extends this finding, by developing and invoking creative ideation metrics, such as emergence, and measuring the impact of the composition of image and surrogates not just as a source format for navigation, but as a format that participants utilize for developing answers to information discovery questions.

combinFormation combinFormation [17] is a mixed-initiative system that enables humans to easily assemble collections of information resources as compositions of image and text surrogates [20]. The system facilitates manipulation of combinations of these surrogates, with the goal of supporting the emergence of new ideas. The participant and agents work collaboratively to develop the collection and its representation in a visual composition space. The system provides a set of direct manipulation facilities for forming, editing, organizing, and distributing collections as compositions. These include the ability to drag and drop clippings from information resources into the composition space. To assist humans in sifting through the vast expanse of potentially relevant information resources, the system also includes generative agents that can proactively gather information resources, form image and text surrogates, and compose them visually, enabling participants to see more possibilities.

Prior research investigated the role of combinFormation in the performance of 182 students in an undergraduate course on invention, The Design Process [10]. In the course, interdisciplinary teams of undergraduate students create new inventions. The students’ work on two creative assignments, The Hybrid, and The Invention, was investigated. On each assignment, a different half of the class used the full, mixed-initiative version of combinFormation to collect relevant prior work. The other half used regular Google and Microsoft Word. The experimental conditions compared performance on the assignments, which were graded by teaching assistants, according to criteria of originality, novelty, practicality, broad impact, and commercial transfer ability. Students performed better on the Hybrid and Invention assignments when they used combinFormation to develop a supporting prior work collection in the composition of image and text surrogates format, than they did using regular Google and Word in the linear text format. Use of combinFormation was found to promote information discovery.

4

Page 5: Promoting Emergence in Information Discovery by ...Promoting Emergence in Information Discovery by Representing Collections with Composition Andruid Kerne, Eunyee Koh, Steven M. Smith2,

What made combinFormation effective? We need to discover the role of different components of digital tools in supporting creativity. To focus the investigation, the present research investigates the role of the representational format. Future work can investigate the role of agent components. Thus, this study employed a reduced version of combinFormation, in which only the direct manipulation capabilities were available. By reducing the set of components of the tool that were available to participants in this stage of research, we created more controlled experimental conditions, isolating the representational format for developing answers to information discovery questions, without involving the factors introduced by the generative agents.

EXPERIMENTAL METHODS

Participants: Undergraduate Psychology Students Forty-three student volunteers participated in the experiment. Undergraduate members of the “psychology subjects pool” fulfilled a requirement of their introductory psychology course by participating. Concurrently offered sections of the course had a total enrollment of more than 1000 students. The experimenters were not personally familiar with the participants. 83.7% of the participants reported that they use the internet daily.

Psychology Resources Collections (Source Format) We designed the study tasks so that they would contribute to the education of the participants. We utilized a previously developed collection of information resources that represent six areas of the psychology curriculum: clinical psychology, consciousness, biopsychology, learning, developmental psychology, and perception [22]. For each area, there were six to ten information resources, each representing a subtopic. The collection featured approximately 50 information resources. Each resource consisted not simply of a single web page, but of a set of web pages, that is a web site or portion thereof, as hyperlinked by the original authors. Each web site provided in-depth information on its subtopic. These web sites were downloaded and cached on a local server in order to remove variables such as network latency and server accessibility.

The collection was split into two mutually exclusive subcollections to isolate experimental conditions, and reduce the potential of carryover effects. Subcollection A included the clinical psychology, consciousness, and biopsychology areas, while Subcollection B consisted of learning, developmental, and perception. For each subcollection, navigation began with an overview consisting of one surrogate for each area. At the area level, the participant found six to ten surrogates, with each linked to an actual information resource (see example, Figure 1).

Two representations of the surrogate collections were developed, each of which was employed in a separate experimental condition. In one representation, navigation at the top overview level and secondary area level were represented with a list of text surrogates; the other

representation utilized compositions of image-text surrogates at both levels. These versions of the psychology resources collection constitute the source format.

Information Discovery Questions Each participant answered four information discovery questions:

• What kinds of things can cause behavioral problems for children in school?

• What psychological factors can influence a person's career choices?

• What psychological factors can influence a person's experiences dating? (See example answers, Figure 2.)

• What can cause obesity?

In order to answer each of these questions, each participant browsed the assigned subcollection. We logged their browsing activity, in order to collect navigation data for analysis across conditions. Since the subcollections were stored on our server, instead of using a proxy server, we were able to configure a Java Servlet to act as a router between the participant and the surrogate collections and information resources. This servlet logged navigation and integrated the navigation logs with answers to questions.

Answer Format and Interface In each experimental condition, the subject used one of two interfaces to develop their answer in a particular format. Participants used the text form field of a web page to form the linear text answer format. They utilized the direct manipulation only version of combinFormation to form the composition of images and text format. In this format, information could be dragged from a source collection information resource document, and dropped into the composition space. Referentiality from the source web page was automatically maintained by combinFormation in such cases, so that the dragged-in material functions as a navigational surrogate. Participants used combinFormation to add their own ideas to the collected source elements, and engage in design and editing.

Procedure At the beginning of the experiment, the experimenter introduced the direct manipulation only version of combinFormation. Participants practiced with the system for 10-15 minutes. The experimenter briefed participants to ensure they understood how to use the program. Then, the participants read the experiment instructions. Participants were asked to do initial research and provide answers for several imaginary psychology course group projects by navigating the psychology resources source material collection. To create their collections of ideas, participants were told to browse websites that might help them get possible answers to the questions, and gather elements of information from the websites. Because their imaginary professor maintained strict standards about plagiarism, they were also instructed to include the web address they

5

Page 6: Promoting Emergence in Information Discovery by ...Promoting Emergence in Information Discovery by Representing Collections with Composition Andruid Kerne, Eunyee Koh, Steven M. Smith2,

obtained each information element from. They were encouraged to create ideas by combining material from multiple websites, and to form connections between related ideas. Creative and unusual ideas were encouraged.

Next, each participant completed a short pre-questionnaire. This was followed by the four information discovery questions. As each question was presented, it was preceded by a reiteration of the instructions: “Put together ideas to answer this question. Rather than just thinking off hand, use the information in the provided websites to help you get ideas. Get as many varied and unusual ideas as you can.”

Participants were given 11 minutes to answer each information discovery question, and encouraged to use the whole time in composing the answer. Participants could see the time remaining counting down on a digital clock. 60 seconds before the end, the numbers turned yellow. Twenty seconds before the end, they turned red. At the end of 11 minutes, if the participant had not clicked submit, the program automatically submitted what they had written. Right after the answer was submitted, participants answered Likert scale questions as to how challenging answering the previous question was, and how able they were to get where they wanted. Right after this, the next question was displayed. At the end of the experiment, participants answered post-experimental questions about their preference between the two navigational source formats, and also between the two answer formats; they also reported the reasons for their choices. The whole procedure lasted approximately 60 minutes.

Score Criteria 0 The subject pulled elements from the assigned

subcollection to answer a given question, but recognizable relationships and new ideas are

minimal.1 Coherence between elements but not original or

originality of elements but no coherence.2 Original elements in a coherent group.

Emergence

3 Original elements connected with found elements in a coherent group in a way that is

clear and insightful.0 Answer seems to have no relation to the question1 Some relevance. Little or no explanation.2 Multiple perspectives through elements.

Some explanation.

Quality

3 Brilliant – Wow, that was very interesting. Better explanation.

Table 1: Criteria for creative ideation measures.

Experiment Design To streamline the experiment, source collection format and answer format were grouped together into a single independent variable, representation format. Thus, for the composition of image and text surrogates format condition, each participant encountered the psychology resources subcollection in the composition format and used the direct manipulation combinFormation to compose the answer with image and text surrogates. For the linear text surrogates condition, the participant encountered the subcollection as a list of text surrogates, and composed the answer using the text form field.

A 2 X 2 within-subjects design was employed. The two independent variables were representation format (image and text composition, linear text), and psychology subcollection (A: clinical psychology, consciousness, biopsychology; B: learning, perception, and developmental). Thus, the experimental design produced 4 different conditions over the formats and subcollections; that is, linear text format with subcollection A, linear text format with subcollection B, image and text format with subcollection A, and image and text format with subcollection B. Each participant was randomly assigned to one of the four conditions of representation and subcollection. The question order was counterbalanced between subjects across these conditions.

Applying Creative Ideation Measures As per the Background section, we utilized the following objective measures of creative ideation: emergence, quality, flexibility, quantity, and originality. To effectively apply these measures, we developed a contextualized method for determining them. The methodological goal is to define a procedure such that multiple raters can independently and consistently score each participant response to each information discovery question. This requires defining criteria with sufficient clarity to achieve a minimum of difference between ratings from question to question, reviewer to reviewer, and study to study. The process for specifying such criteria starts with discussion in context. All raters and experimental designers met to discuss the questions, and to define criteria in association with each rating.

In a preliminary version of these criteria, emergence and quality were defined in a manner that was mutually dependent. Raters were initially seduced by the idea that emergent answers are of high quality. Yet, this did not meet the design goal of clear and separate criteria. Through an iterative process of rating example sets of answers, discussion, and criteria refinement, a consensus was reached on how to define the criteria independently (see Table 1).

Once the criteria for emergence and quality were clear, each reviewer rated participants’ answers for the full set of questions. These results were then compared. The Pearson correlation between the two raters was 0.575 and it was statistically significant (N = 170, p < 0.001). To ensure correct ratings, the reviewers then met to resolve differences. Since the results were already quite consistent, this process was simple; a consensus data set of ratings for emergence and quality was quickly developed.

Emergence Emergence refers to qualities that come newly into existence as a result of novel combinations of elements. To create clear and concise criteria for measuring emergence,

6

Page 7: Promoting Emergence in Information Discovery by ...Promoting Emergence in Information Discovery by Representing Collections with Composition Andruid Kerne, Eunyee Koh, Steven M. Smith2,

several possible attributes were discussed. One factor initially considered was the groupings of elements. This included the presence of groups and their coherence. This was not found to be a criterion that could be consistently applied. Instead, emergence was rated based on the presence of ideas not found within the psychology resources source material collection, as a result of the combination of source collection elements. The presence of novel ideas and the coherence of the answer were found to be significant. Evaluating emergence also required ensuring that participants were using the source materials as factors in the new ideas, as opposed to merely recalling information not contained in the collection without using it at all. A complete specification of the criteria and scores that were defined can be found in Table 1.

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

emergence

linear text image-text composition

Figure 3: The mean emergence measure (scale 0-3), as differentiated by the representation format condition.

Flexibility – Navigational Variety and Efficiency Flexibility corresponds to the measure of variety in a participant’s answers. We correspond this with the diversity of the websites that they browsed. This was measured using the router logging mechanism described above.

Other Measures Like emergence, quality was measured through criteria established through iterative discussion and investigation (see Table 1). Fluency (quantity) and originality were measured through the compilation of a master list of all ideas given by all experimental participants on each information discovery question. The frequency of each response was computed across all subjects. Thus, the master list was used to compute normative scores that established what constitutes a response, enabling the counting of quantity. Originality of a single answer is analogous to document frequency in information retrieval. It was derived by counting the subjects that gave an answer, and using this to form a ratio. Thus, if only one subject provided an answer, originality would be one, but if more provided it, the fraction would decrease. Originality for a subject, then, was calculated by averaging these fractional scores.

DATA AND ANALYSIS

Quantitative: Creative Ideation Measures Emergence was measured according to the criteria and method described above. The mean emergence measure was greater for the image-text composition format (1.570) than for the linear text format (1.011). The results were statistically significant [F(1,42) = -4.734, p < 0.001] (Figure 3). This representation format condition includes both the source collection and the interface with which the answer to an information discovery question was developed.

Next, we examine the measure of flexibility, which is associated with navigational variety and efficiency (Figure 4). Participants used 8.172 minutes on average to complete the tasks using linear text and 8.417 minutes using composition. They used almost the same amount of time to answer questions in the two representation formats [F(1,42) = 0.700, p = 0.488]. During the same amount of time, participants were able to navigate and browse a greater number of information resource web pages when using the image and text composition format than when using linear text [F(1,42) = 2.923, p < 0.01]. There was no statistical difference in the number of overview and area surrogate collection pages they viewed, between the two representation formats [F(1,42) = 0.297, p = 0.768].

0

2

4

6

8

10

12

14

# surrogate collection pages # information resource pages

linear text image-text composition

Figure 3: Navigational Variety and Efficiency: Per participant avg. number of surrogate collection pages and

avg. number of information resource pages by representational format.

We also examined the study results for the other objective measures of creative ideation, and found there was no statistical difference between the two representation formats for these measures: quality [F(1,42) = 0.976, p = 0.335], fluency [F(1,42) = 0.429, p = 0.67], and originality [F(1,42) = 0.912, p = 0.367].

Quantitative: Participant Experience Measures In all subjective experience measures, participants preferred composition of image and text surrogates both for navigating the information resource collection surrogate pages (source collection format) and expressing their ideas (answer format). Right after participants submitted each

7

Page 8: Promoting Emergence in Information Discovery by ...Promoting Emergence in Information Discovery by Representing Collections with Composition Andruid Kerne, Eunyee Koh, Steven M. Smith2,

answer, they were asked two additional questions: how challenging was it to answer the previous question, and how able were they to get where they wanted to go. Answers were quantified using a Likert scale, from 0 (very low) to 4 (very high). Across representation formats, participants experienced a similar level of challenges while answering the information discovery questions [F(1,42) = -.643, p = 0.523]. At the same time, the self-rating data shows that participants experienced a sense of being more able to get to where they wanted to go (navigate) while answering each information discovery question when using image and text composition representations [F(1,42) = -2.203, p < 0.05].

We acquired subjective measures of participant experience separately for the source information resource collection formats and the answer formats. For the source collection formats, participants found the composition of image and text surrogates to be more helpful [Χ2 (1) = 6.4, p < 0.02] and they liked them better [Χ2 (1) = 10.256, p < 0.01], compared to the linear text.

For the format of the answers they created, participants liked composing image and text surrogates with combinFormation better than assembling textual surrogates in the linear format [Χ2 (1) = 6.721, p < 0.01]. They also preferred developing answers with image and text composition to linear text for expressing their ideas [Χ2 (1) = 19.558, p < 0.01].

The last quantitative participant experience question asked whether they had problems using each answer interface (0-2 scale). The result showed that participants had more problems creating the image and text composition format with combinFormation than they did creating the linear text format in the browser [F(1,42) = 3.277, p < 0.01]. However the averages of both ratings are lower than 0.6, indicating a fairly low overall level of problems in both formats.

Qualitative: Participant Experience Reports We collected qualitative participant experience report data through open-ended questions that accompanied the Likert-scale questions about subjective experience. The qualitative data adds dimension to our findings in conjunction with the strong quantitative results supporting the effectiveness of the composition of images and text format. One question asked about participants’ preference for the two psychology resources collection source formats. Participants said that the image and text composition gave an idea about what to click, while the linear text provided just a guess as to what the link would be about. The pictures helped inform them about which link was associated with which topic. Colors and images highlight and explain each topic, while the linear text format experience is boring. A few said that it was easier to see and understand what the linear text was offering without the distraction that the composition of image and text presented. However, they found that the blended images and text were more appealing to the eye. P431: “I liked the blended images better because they created a visual for the topic I needed. The linear text simply listed the word.

I guess the picture and the word in the blended images better helped me determine the topic that I needed.”

P432: “I liked the blended format more because it was less dull and almost prompted you mentally. When I was answering questions in this format, when I was thinking of possible answers for the questions, I was prompted mentally before I even clicked the icon. The other, the pure text format is ordinary and un-interesting.”

One question we asked was how they would compare the experience of working with composition of images and text with other formats of information resources that they had worked with (e.g. text book, other web sites…). In the answer data, we again encounter participants’ fondness for the compositional format.

P11: “Composition with images makes it easier with picture-word association. Textbooks sometimes use this with a picture to explain what they are discussing within a chapter. It’s the same basic idea that helps people like me, who remember pictures better than words on a page.”

P22: “While obviously text is the most important part to understanding a topic, it is easier for me to make an educated guess as to what I am supposed to gain from the information if there is a picture to accompany the information.”

Most participants preferred the composition of images and text answer format to the linear text answer format. We asked them to describe their reasons. Participants liked using combinFormation to assemble answers because it provides freedom to put together their ideas in best possible ways and more creatively; meanwhile, they said the linear text format is harder to combine information with. They reported finding combinFormation to be more fun to work with, more stimulating of interest, and more aesthetically pleasing. Also, they found it useful that combinFormation automatically brings the URL and metadata information to the answer when they drag and drop elements.

P321: “I liked how combinFormation used a color code so we could know what was taken from the website and what was our own text. Also how the website address was automatically saved when text or pictures were moved”

P431: “I liked the combinFormation better than the plain text form. While I am most familiar with the plain text form, I feel I can be more creative with combinFormation.”

P435: “The linear text format restricted my ideas from being expressed.”

The participants’ subjective rating also showed that the composition of image and text format was better for expressing ideas. They said that ideas are better expressed with both images and text because visual aid supports their ideas. Also, different colors helped them express connections between ideas and improve presentation.

CONCLUSION The creative intellectual information discovery tasks that humans perform with digital information resources are critical to research, writing, learning, and invention. We

8

Page 9: Promoting Emergence in Information Discovery by ...Promoting Emergence in Information Discovery by Representing Collections with Composition Andruid Kerne, Eunyee Koh, Steven M. Smith2,

have built on the extensive base of prior creative cognition research to develop methods for investigating information discovery tasks in the laboratory, including reusable objective measures for evaluating their creative products. Noteworthy among these is an emergence measure, which indicates the discovery of new ideas with a basis in found information. Other measures include flexibility, fluency, and originality. While the process for applying these measures is time consuming, it is an essential component of evaluation in research that develops tools to support and promote creative processes involving digital information.

combinFormation, a mixed-initiative system for representing collections as compositions of image and text surrogates, is a complex program, in which direct manipulation components and generative agent components are interconnected. Rather than investigate the efficacy of the program as a whole, on this occasion we isolated the variable of collection representation, in the context of the performance of information discovery tasks. This componentization of inquiry enabled us to address research questions focused on representation format and interface.

The results of applying the creative cognition measures to the performance of study participants on information discovery tasks provide strong statistical findings: the composition of image and text format promotes emergence and flexibility. We develop a hybrid picture that reinforces these findings through participant experience measures and reports. They experienced the composition of image and text surrogates format as better for expressing their ideas and more helpful, and they liked it better. They reported that they were more able to get where they needed to go, that is, to navigate, using composition. This corresponds directly with the empirical evidence of their navigational flexibility and efficiency, that they browsed a greater variety of information resources with the composition format. Participant experience reports further corroborate this picture.

There is always a concern when introducing a new tool for representation and interaction: that novices will find it hard to use, due to lack of familiarity. In fact, participants did report more problems using combinFormation to develop answers to information discovery questions as compositions, than they did using the more familiar browser form field to develop linear text. This could contribute to the lack of benefits measured with regard to the fluency and quality measures. Participants become excited about the process of arranging, organizing, and composing collected answer elements. In the time-limited experimental conditions, spending time arranging elements may become an alternative to spending time collecting elements. In a naturalistic information discovery process, a person can take sufficient time, as necessary, to engage in both of these phases of research. The role of tools and familiarity with them, as well as of representational format, on fluency and quality measures is worthy of future research.

Over time, the evaluation of creativity support tools needs to employ and integrate diverse modalities of investigation, including laboratory experiments, ethnographies, field studies, and longitudinal case studies. Each of these modalities makes different contributions. The laboratory experiment findings developed in this paper are reinforced by the ecologically valid field study of undergraduate students in The Design Process course on invention. The composition of image and text surrogates promotes emergence, expression, flexibility, navigation, and an overall sense of creative experience in information discovery. Future work will utilize different modalities of investigation, and will also develop laboratory investigations of other components of creative ideation. These modes of evaluation will be developed in tandem with new methods for creativity support.

REFERENCES 1. Baddeley, A.D., Is working memory working?,

Quarterly Journal of Exp Psych, 44A, 1-31, 1992. 2. Baldonado, M., Winograd, T., SenseMaker: an

information-exploration interface supporting the contextual evolution of a user's interests, Proc CHI 1997, 11-18.

3. Bates, M. The design of browsing and berry picking techniques for the online search interface. Online Review, 13:5, 407-431, 1989.

4. Belkin, N., ASK for Information Retrieval: Part I. Background and Theory, Journal of Documentation, Vol 38, Num 2, 1982.

5. Burke, M., Organization of Multimedia Resources, Hampshire, UK: Gower, 1999.

6. Carney, R.M., Levin, J.R., Pictorial Illustrations Still Improve Students’ Learning From Text, Educational Psychology Review, Vol. 14, No. 1, March 2002.

7. Costello, F. J., & Keane, M. T. (2000). Efficient creativity: Constraint guided conceptual combination. Cognitive Science, 24, 299–349.

8. Ding, W., Marchionini, G., Soergel, D., Multimodal Surrogates for Video Browsing, Proc DL 1999, 85-93.

9. Estes, Z., & Glucksberg, S. (2000). Interactive property attribution in concept combination. Memory & Cognition, 28, 28–34.

10.Estes, Z. & Ward, T. B. (2002). The emergence of novel attributes in concept modification. Creativity Research Journal, 14, 149-156.

11.Finke, R., Ward, T., Smith, S.M. Creative Cognition, Cambridge, MA, MIT Press, 1992.

12.Fisher, K.E., Erdelez, S., McKechnie, L., eds., Theories of Information Behavior (ASSIST Monograph), Information Today, 2005.

13.Glenberg, A.M., Langston, W.E., Comprehension of illustrated text: Pictures help to build mental models,

9

Page 10: Promoting Emergence in Information Discovery by ...Promoting Emergence in Information Discovery by Representing Collections with Composition Andruid Kerne, Eunyee Koh, Steven M. Smith2,

Journal of Memory & Language, 31(2):129-151, April 1992.

14.Glenberg, A.M., The Indexical hypothesis: meaning from language, world, and image, in N. Allen, ed, Words and Images: Working Together - Working Differently, Albex, 2002.

15.Harter, S.P., Psychological Relevance and Information Science, Journal of the American Society for Version:1.0 Information Science, 43(9):602-615, 1992.

16.Hampton, J. A. (1997). Emergent attributes in combined concepts. In T. B. Ward, S. M. Smith, & J. Vaid (Eds.), Creative thought: An investigation of conceptual structures and processes. (pp. 83–110). Washington, DC: American Psychological Association.

17.Interface Ecology Lab, combinFormation, http://ecologylab.cs.tamu.edu/combinFormation/

18.Jones, W., Dumais, S., Bruce, H., Once found, what then?: a study of ‘keeping’ behaviors in personal use of Web information. Proc ASIST 2002, November 18-21, 2002, 391-402.

19.Kerne, A., CollageMachine: An Interactive Agent of Web Recombination, Leonardo 33:5, 2000.

20.Kerne, A., Koh, E., Dworaczyk, B., Mistrot, M.J., Choi, H., Smith, S.M., Graeber, R, Caruso, D., Webb, A., Hill, R., Albea, J., combinFormation: A Mixed-Initiative System for Representing Collections as Compositions of Image and Text Surrogates, Proc JCDL 2006, 11-20.

21.Kerne, A., Smith, S., The Information Discovery Framework, Proc DIS 2004, 357-360.

22.Kerne, A., Smith, S.M., Choi, H., Graeber, R., Caruso, D., Evaluating Navigational Surrogate Formats with Divergent Browsing Tasks, Proc CHI 2005 Extended, 1537-40.

23.Marchionini, G., Information Seeking in Electronic Environments, Cambridge Univ Press, 1997.

24.Marshall, C.C., Shipman, F.M., VIKI: Spatial hypertext supporting emergent structure, Proc ECHT 94, 13-23.

25.Mayer, R.E., Moreno, R., Animation as an Aid to Multimedia Learning, Educational Psychology Review 14:1, March 2002.

26.Mobley, M. I., Doares, L. M., & Mumford, M. D. (1992). Process analytic models of creative capacities: Evidence for the combination and reorganization process. Creativity Research Journal, 5, 125–155.

27.Morrison, J.B., Pirolli, P., Card, S.K., A Taxonomic Analysis of What World Wide Web Activities Significantly Impact People's Decisions and Actions, Proc CHI 2001 Extended.

28.Mumford, M. D., Baughman, W. A., Maher, M. A., Costanza, D. P., & Supinski, E. P. (1997). Process-based measures of creative problem-solving skills: IV. Category combination. Creativity Research Rips, L. J.

(1995). The current status of research on concept combination. Mind and Language, 10, 72–104.

29.Oxford English Dictionary on Compact Disk, 2nd Edition. Oxford: Oxford University Press, 1992.

30.Pirolli, P., Card, S.K., Information Foraging, Psychological Review, 106:4, Oct 1999, 643-675.

31.Rothenberg, A. (1979). The emerging goddess. Chicago: University of Chicago Press.

32.Russell, D. M., Stefik, M. J., Pirolli, P., Card, S. K., The cost structure of sensemaking, Proc of the INTERCHI 93, pp. 269-276.

33.Shah, J.J., Smith, S.M., Vargas-Hernandez, N. Metrics for measuring ideation effectiveness. Design Studies, 24, 2003, 111-134.

34.Shah, J. J., Smith, S. M., Vargas-Hernandez, N., Gerkens, R., & Wulan, M., Empirical studies of design ideation: Alignment of design experiments with laboratory experiments, Proc Am Soc Mechanical Engineering, 2003.

35.Smith, E. E., Osherson, D. N., Rips, L. J., & Keane, M. (1988). Combining prototypes: A selective modification model. Cognitive Science, 12, 485–527.

36.Smith, S.M., Ward, T.B., & Finke, R.A. (1995). The creative cognition approach. Cambridge, MA: MIT Press.

37.Thagard, P. (1984). Conceptual combination and scientific discovery. In P. Asquith & P. Kitcher (Eds.), PSA, 1 (pp. 3–12). East Lansing, MI: Philosophy of Science Association.

38.Tufte, E., Envisioning Information, Cheshire, CT: Graphics Press, 1990.

39.White, R.W., Kules, B., Drucker, S.M., schraefel, m.c., Supporting Exploratory Search, CACM, 49:4, April 2006, 36-39.

40.Wildemuth, B., Marchionini, G., Yang, M., Geisler, G., et al., How Fast Is Too Fast? Evaluating Fast Forward Surrogates for Digital Video, Proc JCDL 2003, 221-230.

41.Wilkenfeld, M. J., & Ward, T. B. (2001). Similarity and emergence in conceptual combination. Journal of Memory and Language, 45, 21–38.

42.Wisniewski, E. J. (1996). Construal and similarity in conceptual combination. Journal of Memory and Language, 35, 434–453.

43.Wisniewski, E. J. (1997). When concepts combine. Psychonomic Bulletin & Review, 4, 167–183.

44.Woodruff, A. Rosenholtz R., Morrison, J., Faulring, A., Pirolli, P., A Comparison of the Use of Text Summaries, Plain Thumbnails, and Enhanced Thumbnails for Web Search Tasks, JASIST 53(2):172-185, 2002.

10