knowledge organization in an information retrieval task

Information Processing & Management Vol. 26, No. 4, pp. 535-542, 1990. Printed in Great Britain.

0306-4573/90 $3.00 + .00 Copyright © 1990 Pergamon Press plc

KNOWLEDGE ORGANIZATION IN AN INFORMATION RETRIEVAL TASK

BRYCE ALLEN Graduate School of Library and Information Science,

University of Illinois at Urbana-Champaign, Urbana, IL, 61801, U.S.A.

(Received 21 October 1988; accepted in final form 16 August 1989)

Abstract: One characteristic that may affect the way users interact with information systems is the way they organize their knowledge of the topic to be searched. An experiment was conducted to determine the extent and nature of this effect. Subjects who were given different advance organizers read texts and responded to questions about the topics in the texts. These questions were presented in an information retrieval context, on simulated pre-search forms. It was found that different organizing structures affected responses to questions in one topic of the three that were investigated. This demonstrates a complex interaction between the topic of the search, the organizing structure employed by users, and questions asked by intermediaries. Because the way users organize their knowledge has an impact on their interaction with information systems, these organizing structures may be candidates for inclusion in cognitive models of users. Other user characteristics that may affect information retrieval can be investigated using this type of experiment.

1. INTRODUCTION

The main goal of current research into cognitive models of information users is to make information systems more responsive to users. In general, the idea is that information systems can be developed that will respond to the characteristics of individuals or of groups of users by changing the user interface or the retrieval mechanism (or both) to conform with user characteristics (Borgman et al., 1985).

Before information systems that incorporate cognitive models of users can be designed or implemented, two basic research questions must be addressed. Which characteristics or attributes of users should be included in the user model? How do these characteristics or attributes affect the information interaction?

Identification of user characteristics

The identification of user characteristics to be modelled presents interesting methodological problems. As Daniels (1986) points out, existing research projects and experimental systems exhibit “no consensus as to what types of information an adequate user model should contain.” Daniels’ (1985) own research, following the model established by Belkin (1984), used transcripts of pre-online search interviews to identify user characteristics sought by human intermediaries. Categories of characteristics identified by this method included the status, goals and background of users and their level of knowledge of both the topic and of information retrieval systems. These characteristics appear to be elements of the context of the question and the retrieval goals of users, which Harter (1987) distinguishes from the question itself. In online bibliographic searching, these user characteristics can help to determine the kind of search that will meet the user’s needs.

Another approach to this task of identifying characteristics that should be included in user models is based on the idea that information-seeking behavior incorporates cognitive processes such as comprehension, memory, and imagination. Research in cognitive science has investigated individual characteristics that affect these cognitive processes, and it is possible to consider such characteristics as candidates for inclusion in cognitive models of information users. Some of the user characteristics identified in Daniels’ research have also been investigated in general cognitive tasks such as reading and recall. For example, Spilich, Vesonder, Chiesi, and Voss (1979) investigated the effect of the amount of domain-related knowledge on subjects’ comprehension of a narrative, and found that high-knowledge subjects integrated the narrative into a goal structure and remembered more of the important details from the narrative. Gagne, Bing, and Bing (1977) found that the expectations and goals of subjects conditioned the organization of their recall of textual materials. Anderson, Reynolds, Schallert, and Goetz (1977) found that academic background had an effect on the way subjects understood ambiguous narratives.

It is possible that other individual characteristics identified by cognitive science as affecting cognitive processes may also have an impact on users’ participation in information interactions. One such characteristic is the way individuals organize their knowledge of a topic. Research in cognitive psychology using advance organizers has demonstrated that organizing structures have an effect on comprehension and recall of textual materials.

Advance organizers are statements or instructions given to subjects who are reading texts in order to facilitate the processes of comprehension and later recall. These organizers may provide a context for the text that follows, or they may provide a structure into which readers can fit what they are reading. In some cases, they provide plans or strategies helpful in the process of comprehension of the text. In early research into the effect of advance organizers, Ausubel (1960) reported that they facilitated comprehension and recall by making the material read more familiar and meaningful, and by providing an optimal anchorage for material in memory. Lee (1965) studied the role of section headings (i.e., those headings that identify the elements of high level prose structure) in memory for prose, and found significantly better memory for text when these structural headings were included. Schumacher, Liebert, and Fass (1975) found that the usefulness of advance organizers depended on the relationship between the structure provided in the organizer and the passage being read. Not all organizing structures functioned equally well. Tyler, Delaney, and Kinnucan (1983) found that an advance organizer based on Kintsch and van Dijk’s (1978) macrostructures was of particular benefit to poor readers. This body of research indicates that advance organizers given at the time of reading have an effect on the understanding and recall of textual materials. This effect appears to operate through the activation of organizing cognitive structures (schemata) that guide the process of reading, understanding, and recall.

Since organizing structures affect the way people understand and remember what they read, it may follow that the way users organize their knowledge (or lack of knowledge) of a search topic can have an impact on their interaction with information systems. This research was designed to ascertain whether different ways of organizing knowledge would lead to differences in the ways users interact with information systems. The focus of the research was the way users respond to questions from intermediaries, and the input to information systems provided by these responses.

The effect of user characteristics on information interactions

Once one has identified user characteristics or attributes that may affect the information interaction, and which are therefore candidates for inclusion in cognitive models of users, it is necessary to identify the ways in which different values of these characteristics or attributes are associated with different ways of approaching the information interaction. For example, it seems logical to expect that users with high domain knowledge will interact with information systems differently from users with low domain knowledge. One way to identify the nature of the difference is through experimentation. Once a characteristic has been identified, it is possible to set up an experiment in which that characteristic is systematically varied while other characteristics are held constant, and to examine the effect of the different values of that characteristic on the performance of information tasks.

The research reported here used an experimental method to examine the effect on an information retrieval task of different ways of organizing users’ knowledge of a topic. Similar methods can be applied to other user characteristics. For example, Borgman (1986) investigated the effect of users’ mental models of the retrieval system on their performance in using the system. Additional research is currently under way that investigates the effect of academic background on the way users describe their information needs.

2. METHODOLOGY

Experimental tasks

Participants in the experiment read one of three texts, derived from articles in subject areas selected at random from the sciences. They then made notes on what they were reading, following one of three sets of note-taking instructions designed to act as advance organizers. The purpose of these different instructions was to influence the way participants organized the knowledge of the topic they were deriving from the text. After a delay of one week, participants responded to written questions about the topic they had read. The questions were identified as an information retrieval task by being presented on simulated pre-online search forms.

Note-taking instructions

The three sets of note-taking instructions were designed to generate different organizations of the knowledge of the topic in the minds of the participants. The first set of instructions asked participants to assume that they were going to be requesting an online search on the topic being read. The notes taken were to identify the ideas and concepts that participants considered important in completing a successful online search. This set of instructions introduced a particular goal with an information retrieval orientation, and placed constraints on the ideas and concepts to be considered important: those of potential importance in obtaining an online search on the topic. Because it was anticipated that these instructions would elicit notes that contained much of the vocabulary found in bibliographic records, they were called “bibliographic instructions.”

The second set of instructions imposed a different structure: the high-level organizing structure of scientific report articles, as identified by Kintsch and van Dijk (1978). Although developed in the context of text-linguistics and discourse analysis, this structure has attracted the attention of a number of researchers in information science. The role of this type of structure in questions or cues has been reported by Allen (1988). Liddy (1988) found a similar structure in abstracts of scientific articles, and Beghtol (1986) discussed the role of these high-level structures in classifying documents. Participants were informed that papers like the one they were reading normally contained elements reporting the purpose of the authors, the methodologies used, the findings reported, and the conclusions drawn. In their notes, they were to include the important ideas and concepts from these categories. Because of their emphasis on structural elements, these instructions were called “structural instructions.”

The final set of instructions imposed no specific structure. Participants were simply asked to make notes identifying the important ideas and concepts contained in the text they were reading. This set of instructions, which was included to provide a base against which the others could be measured, was called “free instructions.”

Questions

The questions presented on the simulated online search forms were designed to echo the structures imposed by the note-taking instructions. Participants answered bibliographic questions that asked about keywords and synonyms; structural questions that asked about purpose, methodology, findings and conclusions; or open questions that asked about the topic in general terms. Because of the experimental design, one-third of the participants answered questions that were based on the same structure used in note-taking, one-third answered questions that were based on a quite different structure, and the final third answered questions that imposed no particular structure.

Participants

A total of 108 students (77 female and 31 male) participated in this experiment. Of these, 78 were graduate students in Library and Information Science, and 30 were first-year Business students from a college of applied arts and technology. Participants were paid $10 for completing the experimental tasks. Most of the participants had little or no library work experience, and were only moderately familiar with online searching. Participants were randomly assigned to the 27 different experimental conditions: each read one topic, made notes following one set of instructions, and answered one set of questions.
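The 27 conditions follow from crossing three topics, three sets of note-taking instructions, and three sets of questions. The following Python sketch enumerates the conditions and performs a random assignment. It is an illustration only: the function names, the fixed seed, and the assumption of equal-sized cells (four participants per condition) are not taken from the study, which reports only that assignment was random.

```python
import itertools
import random

TOPICS = ["amino acids in meteorites", "sandhill cranes", "television and aggression"]
INSTRUCTIONS = ["bibliographic", "structural", "free"]
QUESTIONS = ["bibliographic", "structural", "open"]


def build_conditions():
    """Return every topic x instructions x questions combination (3 * 3 * 3 = 27)."""
    return list(itertools.product(TOPICS, INSTRUCTIONS, QUESTIONS))


def assign_participants(n_participants, seed=0):
    """Randomly assign participant ids 1..n to conditions.

    Assumption (not stated in the paper): cells are balanced, so
    n_participants must be a multiple of the number of conditions.
    """
    conditions = build_conditions()
    if n_participants % len(conditions) != 0:
        raise ValueError("participants must divide evenly among conditions")
    per_cell = n_participants // len(conditions)
    slots = conditions * per_cell      # one slot per participant
    rng = random.Random(seed)          # seeded only to make the sketch repeatable
    rng.shuffle(slots)
    return {pid: cond for pid, cond in enumerate(slots, start=1)}
```

Under the balanced-cell assumption, 108 participants give four participants per condition and 36 per set of note-taking instructions.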

Materials

Three topics were selected randomly from the sciences: amino acids in meteorites, the migration and wintering habits of Sandhill Cranes, and the effect of television viewing on aggression in children. Two articles on each topic were selected and abridged to provide a total length of approximately 1500 words.

Analysis

The notes taken by participants were analyzed to verify that the instructions made a difference. Overall length of the notes, measured by the number of unique word stems used, was calculated, and the contents of the notes were analyzed to identify evidence of the effects of the note-taking instructions.

The responses to the simulated pre-search forms were analyzed into propositions using the method of Bovair and Kieras (1985), and compared with the propositions from the text read by the participant. In addition, the vocabulary of the response was compared with the vocabulary included in the titles and abstracts of the articles that served as the basis for the texts read, and the proportion of topic keywords included in the response was calculated. Finally, the overall length of each response was determined, measured by the number of unique word stems included. These measures provided evidence about the ways in which participants performed an information task: the provision of information about a search topic to an information retrieval system. The first measure, the number of propositions from the text read that were included in the response, indicated the extent to which a general recall of the topic was included in the response. The second, the proportion of topic keywords contained in responses, indicated the extent to which the responses contained details about the information need that were of potential use in retrieval from standard document retrieval systems. The third, overall length of response, provided insight into the number of details about the information need that were included in responses.
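Two of the three measures, response length in unique word stems and the proportion of topic keywords matched, can be illustrated with a short Python sketch. The stemming rule below is a deliberately crude stand-in, since the stemming procedure used in the study is not described; the function names and the treatment of keywords as single words are likewise assumptions. The propositional analysis of Bovair and Kieras (1985) requires human coding and is not attempted here.

```python
import re


def naive_stem(word):
    """Strip a few common suffixes; a rough placeholder for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def unique_stems(text):
    """Return the set of distinct word stems in a text (case-insensitive)."""
    words = re.findall(r"[a-z]+", text.lower())
    return {naive_stem(w) for w in words}


def response_length(text):
    """Length of a response, measured as the number of unique word stems."""
    return len(unique_stems(text))


def keyword_proportion(response, topic_keywords):
    """Proportion of topic keywords whose stem appears in the response."""
    keyword_stems = {naive_stem(k.lower()) for k in topic_keywords}
    if not keyword_stems:
        return 0.0
    matched = keyword_stems & unique_stems(response)
    return len(matched) / len(keyword_stems)
```

In this sketch the keyword list would be drawn from the titles and abstracts of the source articles, as described above; how the original keyword lists were compiled in detail is not stated.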

These three variables were analyzed by ANOVA, using the topic of the text read by subjects as a random effect, the questions asked on the search forms as a fixed effect, and the note-taking instructions as a fixed effect (Allen, 1988).
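The model equation is not given in the text. As an illustration only, a three-factor design with two fixed effects and one random effect is conventionally written as follows, where the symbols are introduced here for the sketch: $\alpha_i$ for note-taking instructions, $\beta_j$ for questions, $t_k$ for topic, and $l$ indexing participants within a condition.

```latex
% Textbook mixed-effects form; an assumed reading of the design, not the
% author's stated model.
\begin{equation}
  y_{ijkl} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}
           + t_k + (\alpha t)_{ik} + (\beta t)_{jk} + (\alpha\beta t)_{ijk}
           + \varepsilon_{ijkl},
  \qquad t_k \sim N(0, \sigma_t^2), \quad
  \varepsilon_{ijkl} \sim N(0, \sigma^2)
\end{equation}
% i = 1..3 (instructions, fixed), j = 1..3 (questions, fixed),
% k = 1..3 (topic, random), l = participant within cell.
```

In a model of this form, the fixed effects are usually tested against their interactions with the random factor rather than against the residual error, which is one reason effects that appear in only one topic are difficult to generalize.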

3. FINDINGS

Note-taking task

The first question to be addressed was whether the different instructions produced differences in the notes. The average length of notes taken by participants who followed the different note-taking instructions was calculated, and it was found that notes taken following bibliographic instructions averaged 165 words in length, significantly shorter than those taken following either structural instructions (221 words in length) or free instructions (230 words in length). Other differences were noticed as well. In 19 of 36 cases, notes taken following structural instructions were organized according to the categories of the scientific report structure, and in 4 of 36 cases, notes taken following bibliographic instructions mentioned specific details about the anticipated online search. From this organizational evidence, as well as from the difference in overall length of the notes taken, it follows that the instructions produced differences in the notes. Since the instructions were designed to act as advance organizers, it can be concluded that the knowledge acquired by participants in the reading and note-taking tasks was organized differently, according to either the bibliographic structure, the text-linguistic structure, or (in the case of free instructions) an alternative structure chosen by the participant.

Content analysis of responses to questions

When the content of the responses to questions on forms was analyzed, differences were identified that show that participants’ organization of knowledge had an effect on their responses. Figure 1 illustrates these differences.

Participants who read the text on the topic of Sandhill Cranes, and who made notes following the bibliographic instructions, included an average of 18 propositions from the text they had read in response to structural questions, but only 5-6 idea units in response to open or bibliographic questions. Participants who made notes following the structural instructions included an average of 12 propositions in response to open questions, but only 4 in response to bibliographic or structural questions. It appears that, in this one topic, participants produced responses that exhibited more complete recall of the topic when they were responding to questions that presented a contrast to the way they organized their knowledge of the topic.

Participants who read the other two texts did not show this pattern of findings, and responded to all three types of questions in the same way regardless of which note-taking instructions were followed. It is not clear why the Sandhill Cranes topic should have produced such a strikingly different result, but it is apparent that in some topics the effect of the way people organize their knowledge is stronger than in others.

Supporting evidence for the interaction between users’ organization of knowledge and the questions asked on search forms was obtained from analysis of the proportion of topic keywords included in responses. Figure 2 shows the results of this analysis for the Sandhill Cranes text. The pattern of results is very similar to that found in Fig. 1.

Participants who received free instructions included 16% of the topic keywords in their responses to structural questions, but only 6% in responses to open questions and 9% in responses to bibliographic questions. These findings are very similar to those for participants who received bibliographic instructions. Again, participants who read the other two texts showed no evidence of these differences.

Data collected on the gender, level of previous degree, subject area of previous degree, familiarity with the topic, and familiarity with libraries were analyzed to determine if these variables produced the differences in responses described above. No significant effect was observed for any of these variables. This indicates that differences in responses can be attributed to the experimental conditions (the instructions given and the questions asked).

Fig. 1. Text propositions matched in responses: Cranes text. (Axes: propositions by question type, plotted separately by note-taking instructions.)

Fig. 2. Keywords matched in responses: Cranes text. (Axes: % keywords matched by question type, plotted separately for free, bibliographic, and structural instructions.)

Length of responses to questions

The average length of responses to questions on online search forms was analyzed. It was found that the participants who received bibliographic note-taking instructions, and who had produced considerably shorter notes in response to those instructions, responded no differently to the questions than other participants. The average length of response for participants who had received the bibliographic note-taking instructions was 40 words, compared to 37 words for those who received the structural note-taking instructions, and 40 words for those who received the free note-taking instructions.

4. CONCLUSIONS

This experiment demonstrated that the way users of information systems organize their knowledge of the topic of their search has an effect on the information interaction, and that this effect depends on the topic of the information need. In the case of one topic, Sandhill Cranes, there was evidence of a negative effect for consistency between users’ organization of their knowledge and questions asked by intermediaries.

In the other two topics, and in the overall length of responses to questions, there was no difference in responses that could be attributed to participants’ organization of their knowledge. In other words, a difference that was quite noticeable in the notes taken by participants disappeared in their responses to online search forms. One possible explanation of this phenomenon is that the cognitive structures imposed by the note-taking task were superseded by the structures implicit in the questions on the pre-search forms. Support for this explanation is found in the fact that participants responded in very consistent ways to these questions, providing shorter responses to bibliographic questions and longer responses to structural questions.

This explanation does not hold for those participants who were asked open questions, because the questions themselves imposed no structure on responses. It appears that the information retrieval orientation of the task may have provided the structure that superseded the structure of the participants’ organization of their knowledge. This explanation is supported by the fact that there was a significant negative correlation between length of library work experience and length of responses to open questions. Participants who were more familiar with the tasks and functions of information retrieval tended to respond to open questions with short answers that included a relatively small proportion of keywords and a relatively small number of idea units.

In the case of those participants who read the Sandhill Cranes text, the organizing structure imposed by the note-taking task interacted with the structures of the questions and of the information retrieval task. The result was that structures that provided a contrast to those used in knowledge acquisition elicited more productive responses. It is difficult to generalize from this one case, because it is not possible to identify the characteristics of the Sandhill Crane literature that contributed to this interaction between search topic, knowledge organization, and questions asked. Additional research into knowledge organization in information retrieval is needed to identify the mechanisms involved in this complex interaction.

In terms of selecting characteristics or attributes for inclusion in cognitive models of users of information systems, this research indicated that the user’s organization of knowledge about the search topic can be a useful input in selecting the type of questions to be posed by the intermediary in some topic areas. An information system that was able to include this characteristic in its user model would be able to adapt its interface with the user to ask questions that provide a contrasting structure in order to elicit additional details about the information need. In other topic areas, this cognitive characteristic can be disregarded, since it is likely to be superseded either by the structures implicit in the questions posed by the intermediary, or by the structure imposed by the user’s understanding or model of the information retrieval process.
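The kind of adaptation described here can be sketched as a simple lookup. The Python fragment below is hypothetical: the mapping records only which question type drew the fullest responses in the Sandhill Cranes results reported above, the `topic_is_sensitive` flag stands in for a judgment the study could not yet make (which topics show the effect), and the default question type is an arbitrary choice.

```python
# Question type with the highest reported score for each organizing structure
# in the Sandhill Cranes results. Illustrative only; a single-topic finding.
CONTRASTING_QUESTIONS = {
    "bibliographic": "structural",  # 18 propositions vs. 5-6 for other questions
    "structural": "open",           # 12 propositions vs. 4 for other questions
    "free": "structural",           # 16% of keywords vs. 6% and 9%
}


def select_question_type(organization, topic_is_sensitive, default="open"):
    """Pick the question type an intermediary might pose.

    organization: how the user organized topic knowledge
        ("bibliographic", "structural", or "free").
    topic_is_sensitive: hypothetical flag saying whether knowledge
        organization is expected to matter for this topic.
    default: question type used when organization can be disregarded
        (an arbitrary assumption of this sketch).
    """
    if organization not in CONTRASTING_QUESTIONS:
        raise ValueError(f"unknown organizing structure: {organization!r}")
    if not topic_is_sensitive:
        return default
    return CONTRASTING_QUESTIONS[organization]
```

A user model built this way would need some means of estimating both inputs, which is the additional research the conclusions call for.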

This research highlights the complexity of the interrelationship between users’ cognitive characteristics and their performance of information-related functions and tasks. Clearly, much additional research is required before user models can be fully employed in an operational information retrieval setting. This study used an experimental approach to examine one cognitive characteristic of users in the context of an information retrieval task. This approach can be applied to a number of possible components of cognitive models of users to validate these components (i.e., to show that they have an effect on the information interaction) and to provide clues as to the ways in which information systems can adapt to users’ cognitive characteristics.

Acknowledgement: This research was supported by the Council on Library Resources.

REFERENCES

1. Allen, B. (1988). Text structures and the user-intermediary interaction. RQ, 27(4), 535-541.

2. Allen, B. (1988). Bibliographic and text-linguistic schemata in the user-intermediary interaction. London, Ont: University of Western Ontario, Ph.D. Dissertation.

3. Anderson, R.C., Reynolds, R.E., Schallert, D.L. & Goetz, E.T. (1977). Frameworks for comprehending discourse. American Educational Research Journal, 14(4), 367-381.

4. Ausubel, D. (1960). Use of advance organizers in the learning and retention of meaningful material. Journal of Educational Psychology, 51(5), 267-272.

5. Beghtol, C. (1986). Bibliographic classification theory and text linguistics: aboutness analysis, intertextuality and the cognitive act of classifying documents. Journal of Documentation, 42(2), 84-113.

6. Belkin, N.J. (1984). Cognitive models and information transfer. Social Science Information Studies, 4(2), 111-129.

7. Borgman, C.L., Case, D.O. & Meadow, C.T. (1985). Incorporating users’ information seeking styles into the design of an information retrieval interface. Parkhurst, C.A., (Ed.), ASIS ’85. Proceedings of the 48th ASIS Annual Meeting. (pp. 324-330). White Plains, NY: Knowledge Industry Publications.

8. Borgman, C.L. (1986). The user’s mental model of an information retrieval system: an experiment on a prototype online catalog. International Journal of Man-Machine Studies, 24(1), 47-64.

9. Bovair, S. & Kieras, D.E. (1985). A guide to propositional analysis for research on technical prose. In Britton, B.K. & Black, J.B., (Eds.), Understanding expository text. (pp. 315-362). Hillsdale, NJ: Lawrence Erlbaum Assoc.

10. Daniels, P.J. (1986). Cognitive models in information retrieval: an evaluative review. Journal of Documentation, 42(4), 272-304.

11. Daniels, P.J. (1986). The user modelling function of an intelligent interface for document retrieval systems. Brookes, B.C., (Ed.), Intelligent information systems for the information society. Proceedings of the Sixth International Research Forum in Information Science (IRFIS 6), Frascati, Italy, September 16-18, 1985. (pp. 162-176). Amsterdam: North-Holland.

12. Gagne, E.D., Bing, S. & Bing, J. (1977). Combined effect of goal organization and test expectations on organization in free recall following learning from text. Journal of Educational Psychology, 69(4), 428-431.

13. Harter, S.P. (1987). Online searching as a problem-solving process. Smith, L.C., (Ed.), Questions and answers: Strategies for using the electronic reference collection. Proceedings of the 24th Annual Clinic on Library Applications of Data Processing, Urbana, IL, April 5-7, 1987. Urbana, IL: Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign; (In press).

14. Kintsch, W. & van Dijk, T.A. (1978). Toward a model of text comprehension and production. Psychological Review, 85, 363-394. For a more complete account, see van Dijk, T.A. (1980). Macrostructures: an interdisciplinary study of global structures in discourse, interaction and cognition. Hillsdale, NJ: Lawrence Erlbaum Assoc.

15. Kintsch, W. & van Dijk, T.A. (1978). Toward a model of text comprehension and production. Psychological Review, 85(5), 363-394.

16. Lee, W. (1965). Supra-paragraph prose structure: its specification, perception and effects on learning. Psychological Reports, 17(1), 135-144.

17. Liddy, E.D. (1988). Structure of information in full-text abstracts. User-oriented content-based text and image handling (RIAO 1988 Conference). Centre de Hautes Etudes Internationales d’Informatique Documentaire, 183-191.

18. Schumacher, G., Liebert, D. & Fass, W. (1975). Textual organization, advance organizers and the retention of prose material. Journal of Reading Behavior, 7(2), 173-180.

19. Spilich, G.J., Vesonder, G.T., Chiesi, H.L. & Voss, J.F. (1979). Text processing of domain-related information for individuals with high and low domain knowledge. Journal of Verbal Learning and Verbal Behavior, 18(3), 275-290.

20. Tyler, S.W., Delaney, H. & Kinnucan, M. (1983). Specifying the nature of reading ability differences and advance organizer effects. Journal of Educational Psychology, 75(3), 359-373.
