
    ACADEMIC EMERGENCY MEDICINE, November 1998, Volume 5, Number 11, p. 1101

    A Comparison of Standardized and Narrative Letters of Recommendation


    Abstract. Objective: To compare the Council of Emergency Medicine Residency Directors (CORD) standardized letters of recommendation (SLORs) with traditional narrative letters of recommendation (NLORs) with regard to interrater reliability, consistency, and time of interpretation. Methods: In part I of the study, four members of the residency selection committee each evaluated the same 20 SLORs and 20 NLORs from which all identifying characteristics had been deleted. Using Likert-type scales of the global assessment, each letter was assigned a numeric value from 1 to 7. The interrater reliability was calculated for both types of letters using the Kendall coefficient of concordance. Average time to interpretation of the letters was also determined. In part II, using the same numeric values as in part I, 207 single-author SLOR/NLOR pairs were evaluated to determine whether the global assessment of the SLOR was consistent with that of its partner NLOR. Interpretation of the NLOR was performed blinded to the SLOR. Statistical analysis was calculated using Spearman correlation coefficients. Results: In part I of the study, the interrater reliability of the SLOR was 0.97, as compared with 0.78 for the NLOR. The average time to interpret the global assessment of the SLOR was 16 seconds, vs 90 seconds for the NLOR. In part II of the study, of the 207 SLOR/NLOR pairs, 112 (54%) were assigned the same numeric value, 80 (39%) differed by one, 13 (6%) differed by two, and two (1%) differed by three, for an overall correlation of 0.58. Conclusions: Compared with NLORs, the CORD SLOR offers better interrater reliability with less interpretation time. Single-author SLOR/NLOR pairs submitted for a single applicant do not correlate well. Residency selection committees must decide whether the added work of interpreting NLORs is beneficial. Key words: letter of recommendation; postgraduate education; emergency medicine; residency; selection. ACADEMIC EMERGENCY MEDICINE 1998; 5:1101-1104

    Traditional narrative letters of recommendation (NLORs) are a factor of the resident selection process considered to be more influential than U.S. Medical Licensing Examination (USMLE) scores. Along with transcripts and the dean's letter, they are an important pre-interview source of information about an applicant's interpersonal and clinical skills.2 Accurate interpretation of NLORs requires time and a significant amount of experience, and even experienced interpreters find the task difficult. Frequently, important information is missing or worded in a manner that is subject to a range of interpretation.

    With the aim of making data extraction more precise and efficient, the Council of Emergency Medicine Residency Directors (CORD) has developed a standardized letter of recommendation (SLOR). A SLOR would be expected to require less time and experience to interpret than a NLOR. It would ensure that information considered important to residency selection committees was not omitted. The experience of the previous application cycle seems to bear this out. A separate problem has developed, however. Frequently an author of a letter of recommendation (LOR) for a single applicant submits both a SLOR and a NLOR. Both letters are usually interpreted because one cannot be certain the same information is conveyed in both formats. This increases the workload of interpreting recommendations. If it can be demonstrated that the two types of recommendations convey equivalent information, the more time-consuming NLOR would be unnecessary. This would decrease the workload of resident selection. The first objective of our study was to determine whether the SLOR conveys information equivalent to that of the NLOR. We also measured the interrater reliability of both the SLOR and the NLOR. Finally, we determined the time required to make a global assessment of both types of letters.

    From the Department of Emergency Medicine, Christ Hospital and Medical Center, Oak Lawn, IL (DVG, RCH, JD, SG). Received December 26, 1997; revision received June 11, 1998; accepted June 25, 1998. Address for correspondence and reprints: Daniel V. Girzadas Jr., MD, Department of Emergency Medicine, Christ Hospital and Medical Center, 4440 West 95th Street, Oak Lawn, IL 60453.


    Study Design. This was a retrospective review of LORs received as part of the standard application process between September and December 1996. A LOR could be submitted by a physician from any specialty. Letters reviewed included those for applicants who were rejected, interviewed, or ranked. Because of the retrospective nature of this project, it was considered exempt from institutional review board review.

    Study Protocol. In part I of our study, we established seven-point Likert-type scales for the NLOR and the SLOR.5 For the NLOR, statements were classified and were assigned a numeric value according to an unpublished classification system developed by one of the investigators (RCH), and used in the residency selection process of our department (Table 1). The SLOR was also assigned a numeric value of 1-7 (Table 2). If there were inconsistencies, the letter was assigned a numeric value according to the most positive phrase.

    TABLE 1. Narrative Letter of Recommendation (NLOR) Classification System

    Score  Classification

    7      Includes glowing statements such as "is one of the finest medical students of the year," "is one of the best medical students I have ever worked with," "richly deserves the honors awarded in the rotation," or "receives my highest recommendation."

    6      May include some honors grades, top 15-20%, near honors. Functions as an intern.

    5      Contains the obligatory "good fund of knowledge," "punctual," "hardworking," "progressed well," "should be an excellent candidate for postgraduate training," along with some superlatives.

    4      Contains mildly complimentary but noncommittal language. Pleasantly describes an average student and tries to put a good spin on the description.

    3      May be completely neutral, as if the writer has never met the student, or have some subtle descriptions of the student's averageness, or contain slightly negative comments.

    2      Contains troublesome or negative comments with little or no balancing superlatives. Almost guarantees no interview.

    1      Is hard to come by, as most students do not ask someone who dislikes them or who has been disappointed in their performance to write them a letter of recommendation. All by itself guarantees no interview.

    To establish our seven-point numeric system as stable or constant, we determined its interrater reliability. Four raters evaluated the same 20 NLORs and 20 SLORs. Two raters were very experienced and two raters were inexperienced evaluators of LORs. The letters were selected nonrandomly to encompass global assessments ranging from most positive to negative. (In part I, we chose letters that would provide a spectrum of recommendations ranging from poor to outstanding. We believed a random selection would have provided mostly letters in the 5-6 range, since these were most common.) All identifying characteristics were deleted from each letter. The NLORs were not paired with the SLORs; the raters were given a set of 20 NLORs and a different set of 20 SLORs. The raters were asked to assess one entire set prior to assessing the remaining set. They were asked to rank each letter according to the established seven-point Likert-type scale.
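    The protocol's rule for internally inconsistent letters (assign the value of the most positive phrase) can be sketched in a few lines of Python. This is an illustration only, not the investigators' procedure as implemented: the function name letter_score and the idea of first scoring each phrase separately are assumptions, as is the reading that a higher number on the 1-7 scale means a more positive assessment.

```python
def letter_score(phrase_scores):
    """Return a letter's global score under the 'most positive phrase' rule.

    phrase_scores: hypothetical per-phrase scores on the 1-7 scale, where a
    higher value is assumed to mean a more positive statement.
    """
    if not phrase_scores:
        raise ValueError("a letter needs at least one scored phrase")
    for score in phrase_scores:
        if not 1 <= score <= 7:
            raise ValueError(f"score {score} is outside the 1-7 scale")
    # Inconsistent phrases are resolved in favor of the most positive one.
    return max(phrase_scores)


# A letter with phrases scored 5, 4, and 6 is assigned a global score of 6.
print(letter_score([5, 4, 6]))
```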

    In part II of our study, we examined 207 SLOR/NLOR pairs. Virtually all paired letters that were submitted to our residency program in this application cycle were included. Each pair was written by a single author for a single applicant. The author could be from any specialty. Each NLOR/SLOR pair was interpreted by one of the same four raters as in part I using the same two ranking systems described above. Blinded to the corresponding SLOR, each NLOR was interpreted first and assigned a numeric value. Immediately after, each SLOR was interpreted and assigned a numeric value.

    Data Analysis. Interrater reliability among the four raters for both the NLOR and the SLOR was calculated using the Kendall coefficient of concordance. Time of interpretation was determined by timing one experienced rater and one junior rater for a total of 80 letters.
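    For readers unfamiliar with the Kendall coefficient of concordance (W), the following Python sketch shows one common way to compute it for several raters scoring the same letters. It is not the authors' analysis code. The function names are illustrative, and the use of average ranks with a tie correction is an assumption; the article does not say how ties on the seven-point scale were handled.

```python
from collections import Counter


def average_ranks(values):
    """Rank values ascending; tied values share the mean of their positions."""
    sorted_vals = sorted(values)
    rank_of = {}
    for v in set(values):
        first = sorted_vals.index(v)   # 0-based position of the first tie
        count = sorted_vals.count(v)   # size of the tie group
        rank_of[v] = first + (count + 1) / 2
    return [rank_of[v] for v in values]


def kendalls_w(ratings):
    """Kendall's W for ratings[rater][letter], with tie correction (assumed)."""
    m = len(ratings)       # number of raters
    n = len(ratings[0])    # number of letters
    rank_rows = [average_ranks(row) for row in ratings]
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    # Each rater contributes sum(t^3 - t) over that rater's groups of ties.
    tie_term = sum(
        sum(c ** 3 - c for c in Counter(row).values()) for row in ratings
    )
    denominator = m ** 2 * (n ** 3 - n) - m * tie_term
    if denominator == 0:
        raise ValueError("W is undefined when every rater gives all ties")
    return 12 * s / denominator


# Hypothetical scores: four raters in complete agreement on five letters.
print(kendalls_w([[7, 6, 5, 4, 3]] * 4))
```

    W ranges from 0 (no agreement) to 1 (complete agreement).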

    The numeric assignment of the SLOR/NLOR pair was correlated using the Spearman rank-order correlation coefficient.
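    The Spearman rank-order coefficient is the Pearson correlation of the two sets of ranks. The sketch below is a minimal pure-Python version under that definition, with tied scores given average ranks; the function names and the sample scores are hypothetical, and the article does not describe the software or tie handling actually used.

```python
from math import sqrt


def average_ranks(values):
    """Rank values ascending; tied values share the mean of their positions."""
    sorted_vals = sorted(values)
    rank_of = {}
    for v in set(values):
        first = sorted_vals.index(v)
        count = sorted_vals.count(v)
        rank_of[v] = first + (count + 1) / 2
    return [rank_of[v] for v in values]


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical SLOR and NLOR scores for five applicants (not study data).
slor_scores = [7, 6, 5, 4, 3]
nlor_scores = [6, 5, 4, 3, 2]
print(spearman_rho(slor_scores, nlor_scores))
```

    Because only rank order matters, the hypothetical pair above correlates perfectly even though every NLOR score is one point lower than its SLOR partner.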


    In part I of our study, we determined an interrater reliability of the SLOR of 0.97. The interrater reliability of the NLOR was 0.78. The average time required to interpret a SLOR was 16 seconds, compared with 90 seconds for the NLOR. (This average time represents the sum of the time it took for an experienced rater and an inexperienced rater to interpret each packet of 20 letters, divided by 40 total evaluations. We did not measure the time it took to interpret each letter.)
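    The averaging described in the parenthetical can be written out explicitly. The symbols T_exp and T_inexp (the total time each rater spent on one 20-letter packet) are introduced here for illustration only, and the totals are back-calculated from the reported averages, which are presumably rounded, so they are approximate.

```latex
\bar{t} = \frac{T_{\text{exp}} + T_{\text{inexp}}}{40}

\bar{t}_{\text{SLOR}} = 16\ \text{s} \;\Rightarrow\; T_{\text{exp}} + T_{\text{inexp}} \approx 40 \times 16 = 640\ \text{s}

\bar{t}_{\text{NLOR}} = 90\ \text{s} \;\Rightarrow\; T_{\text{exp}} + T_{\text{inexp}} \approx 40 \times 90 = 3600\ \text{s}
```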

    In part II of our study, 112 (54%) of the 207 SLOR/NLOR pairs were assigned the same numeric value. Eighty pairs (39%) differed by one point on the scale, 13 (6%) differed by two, and two (1%) differed by three. The overall correlation was 0.58.
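    The reported percentages can be checked against the counts with a short calculation. The variable names below are illustrative, and the "within one point" share is a derived figure that the article itself does not report.

```python
# Number of SLOR/NLOR pairs by absolute difference in score, as reported.
pairs_by_difference = {0: 112, 1: 80, 2: 13, 3: 2}

total_pairs = sum(pairs_by_difference.values())
percentages = {
    diff: round(100 * count / total_pairs)
    for diff, count in pairs_by_difference.items()
}

# Derived here, not reported in the article: pairs that agree within one point.
within_one = pairs_by_difference[0] + pairs_by_difference[1]
within_one_pct = round(100 * within_one / total_pairs)

print(total_pairs)      # 207
print(percentages)      # {0: 54, 1: 39, 2: 6, 3: 1}
print(within_one_pct)   # 93
```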


    Accurate interpretation of LORs is essential, since decisions based on these letters can profoundly af-



