Annals of Emergency Medicine
Volume 40, Issue 3, Pages 323-328, September 2002

Effect of structured workshop training on subsequent performance of journal peer reviewers

Presented at the Fourth International Congress on Peer Review in Biomedical Publication, Barcelona, Spain, September 2001.

Department of Emergency Medicine, University of California-San Francisco, San Francisco, CA (Callaham), and the Department of Emergency Medicine, University of California-Los Angeles, Los Angeles, CA (Schriger)

Received 1 February 2002; accepted 6 June 2002.

Abstract 

Study objective: We sought to determine whether peer reviewers who attend a formal interactive training session produce better reviews.

Methods: Peer reviewers were invited to attend a formal, 4-hour, highly interactive workshop on peer review. Attendees received a sample manuscript to read and review in writing in advance. The workshop included presentations on analyzing a study and the journal's expectations for a quality review, discussion of the sample manuscript's flaws and how to address them in a review, discussion of the reviews written by the attendees, and discussion of real reviews of other manuscripts illustrating key points. The performance of attendees on the basis of standard editor quality ratings (1 to 5) was assessed for the 2 years after workshop attendance. Control reviewers matched for previous review quality and volume were selected from nonattendees of the workshop. In study 1, all average reviewers received a standard written invitation. In study 2, 75 randomly selected average reviewers were personally and actively recruited with intensive follow-up by means of e-mail and telephone calls in an effort to reduce self-selection bias.

Results: In study 1, 25 reviewers volunteered for the course, were eligible for study, attended, and were compared with 25 matched control reviewers. Of attendees filling out evaluations, 19% thought the workshop somewhat helpful and 81% thought it very helpful. All thought it would improve their subsequent reviews, and 85% thought it would improve their review ratings. The mean change in rating after the workshop was 0.11 (95% confidence interval [CI] −0.25 to 0.48) for control reviewers and 0.10 (95% CI −0.20 to 0.39) for attendees. In study 2, of 75 reviewers intensively recruited, only 12 attended (41% of those who said they would attend). All of the participants thought the workshop would improve their performance and ratings. Test scores at the end of the workshop improved in 73% of participants compared with scores on pretests. The control reviewers' average rating changed by −0.10 (95% CI −0.49 to 0.29) versus −0.06 (95% CI −0.34 to 0.23) for attendees.

Conclusion: Among invited peer reviewers, voluntary attendance at a highly structured and interactive workshop was low and did not improve the quality of subsequent reviews, contrary to the predictions of attendees. Efforts to aggressively recruit average reviewers to a second workshop were time consuming, had low success rates, and showed a similar lack of effect on ratings, despite improvement in scores on a test instrument. Workshop teaching formats, although traditional, are of unproven efficacy. [Ann Emerg Med. 2002;40:323-328.]

Author contributions: MLC conceived and designed the study, collected data, drafted the manuscript, and takes responsibility for the paper as a whole. DLS designed the study, analyzed the data, and revised the manuscript. Both authors designed and conducted the workshop.

 

See related articles, p. 313, p. 317, p. 329, and p. 334, and abstracts, p. 338.

Introduction 

High-quality reviewers are a crucial component in selecting quality science for publication. Most journals do not have objective methods of screening for reviewer quality, and not all reviewers are excellent. Training methods that would improve reviewer quality might be of value, but little is known about such methods. Instructional workshops are popular but require considerable logistic effort, reach only a small fraction of the potential audience, and are of unproven efficacy. We conducted 2 randomized trials to examine the efficacy of a structured half-day workshop format in improving subsequent review quality scores.

Materials and methods 

All active peer reviewers at Annals of Emergency Medicine were screened for possible invitation to attend a formal, 4-hour, highly interactive workshop on peer review held simultaneously with a major meeting in the specialty and run by 2 senior editors of the journal (including a formally trained methodology and statistics editor). Exclusion criteria included attendance at a prior workshop, guest reviewer status, or membership on the journal's editorial board. Details of the workshop format are summarized in the Figure.

(The complete course with all educational materials and lectures is also available on CD-ROM from the authors.) The format was an informal small group with ample discussion and attendee participation throughout. Formal didactic lectures were avoided. At the conclusion of the workshop, attendees were surveyed about their assessment of the course.

All reviews at Annals of Emergency Medicine are routinely rated by editors for quality on a previously reported 1- to 5-point scale1 (Table 1), which is similar to that validated by van Rooyen et al.2

Table 1. Definitions of editor ratings of reviews.
Rating Definition
1  Unacceptable effort and content
2  Unacceptable effort or content
3  Acceptable
4  Commendable; of use to both editor and author
5  Exceptional; hard to improve (expected to describe no more than 10% to 15% of reviews)

These scores were assessed for the 2 years after workshop attendance; the current median score of all active reviewers is 4. The unit of analysis was the individual reviewer, and analysis was by intention to treat. The studies were approved by the Committee on Human Research of the University of California-San Francisco. Editors were blinded as to the identity of study participants, whether they attended the workshop, and the purpose of both studies. Reviewers were blinded to the purpose of both studies.

In study 1, all 173 peer reviewers with average review quality scores (median ≤4.0) over the preceding 2 years were sent a written invitation to attend the workshop in 1997 or 1998. No additional follow-up was made to the invitation. The workshop contents are summarized in the Figure (and listed in detail at www.acep.org/AnnEmergMed/PeerReview.html). Control reviewers matched for previous review quality and volume were selected from invited reviewers who did not attend the workshop (nonattendees).
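Matching on previous review quality and volume can be pictured as a greedy nearest-neighbor search over the nonattendee pool. The Python sketch below is illustrative only: the article does not describe the matching procedure that was actually used, and the `Reviewer` fields, the distance weights, and the example values are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reviewer:
    reviewer_id: str
    mean_rating: float  # mean editor rating (1 to 5) over the prior 2 years
    n_reviews: int      # number of rated reviews over the prior 2 years


def match_controls(attendees, pool, rating_weight=1.0, volume_weight=0.1):
    """Greedily pair each attendee with the closest unused nonattendee.

    Distance is a weighted sum of the absolute differences in prior mean
    rating and prior review volume. The weights are hypothetical.
    """
    available = list(pool)
    pairs = []
    for attendee in attendees:
        if not available:
            break  # no nonattendees left to match
        best = min(
            available,
            key=lambda c: rating_weight * abs(c.mean_rating - attendee.mean_rating)
            + volume_weight * abs(c.n_reviews - attendee.n_reviews),
        )
        available.remove(best)  # each control is used at most once
        pairs.append((attendee, best))
    return pairs


# Made-up example data, not taken from the study.
attendees = [Reviewer("A1", 3.8, 6), Reviewer("A2", 3.4, 4)]
nonattendees = [
    Reviewer("C1", 3.5, 5),
    Reviewer("C2", 3.7, 7),
    Reviewer("C3", 4.0, 2),
]
pairs = match_controls(attendees, nonattendees)
matched_ids = [(a.reviewer_id, c.reviewer_id) for a, c in pairs]
```

A greedy search depends on the order in which attendees are processed; an optimal pairing would need an assignment algorithm, which is beyond what this sketch attempts.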

A problem with the format of study 1 is that reviewers were randomized by invitation and not by attendance. We could not mandate attendance, but in an attempt to reduce self-selection bias, we conducted a second study with very intensive recruitment. In study 2, 150 reviewers with average quality scores during the previous 2 years were randomized (StatView 5.0, SAS, Cary, NC) to be invited to the previously described workshop in October 1999 or not to be invited. If invited reviewers did not respond to letters and e-mails, follow-up telephone calls were made first by journal staff and, ultimately, by the first author. Efforts at contact were continued until a specific response was obtained from those invited. Those not invited served as the group from which control reviewers matched for previous review quality and volume were selected. A test covering basic elements of peer review discussed during the workshop was administered immediately before and after this workshop (on the same day).
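The allocation step in study 2 amounts to a simple 1:1 randomization of the eligible reviewers. The following Python sketch shows one way such a split could be produced; it is not the StatView procedure the authors ran, and the reviewer IDs, the function name, and the seed are hypothetical.

```python
import random


def randomize_invitations(reviewer_ids, seed=None):
    """Shuffle the reviewers and split them in half.

    The first half is invited to the workshop; the second half is not
    invited and forms the pool from which matched controls are drawn.
    """
    rng = random.Random(seed)  # seeded generator so the split is reproducible
    shuffled = list(reviewer_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# 150 hypothetical reviewer IDs standing in for the eligible reviewers.
reviewer_ids = [f"R{i:03d}" for i in range(1, 151)]
invited, not_invited = randomize_invitations(reviewer_ids, seed=1999)
```

Because allocation is to invitation rather than attendance, any comparison that preserves randomization has to follow the invited group as a whole, which is the limitation the text describes.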

Results 

Twenty-five reviewers volunteered for the first course, were eligible for study, attended, and were compared with 25 matched control reviewers. Of attendees filling out evaluations, 19% thought it somewhat and 81% thought it very helpful. All thought it would improve their subsequent reviews, and 85% thought it would improve their review ratings. Fifty participants had sufficient data for analysis, completing 217 rated reviews before the workshop and 289 after the workshop. The quality scores of this group are reported in Table 2. All quality scores were assigned by editors blinded to workshop attendance and were analyzed blinded to study group identity.

Table 2. Study 1 (standard invitation).
Variable                                Control Reviewers (n=25)    Workshop Attendees (n=25)
Reviews in prior 2 y                    6.52                        5.68
Mean rating of reviews in prior 2 y     3.53                        3.76
Reviews after workshop                  8.68                        8.56
Mean rating of reviews after workshop   3.65                        3.86
Mean rating change (95% CI)             0.11 (−0.25 to 0.48)        0.10 (−0.20 to 0.39)

CI, Confidence interval.
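With the individual reviewer as the unit of analysis, a mean rating change and its 95% CI can be computed from each reviewer's mean rating before and after the workshop. The Python sketch below shows that arithmetic under stated assumptions: the article does not say how its intervals were calculated, the function name is hypothetical, the example ratings are made up, and the default multiplier of 1.96 is the large-sample normal value (for 25 reviewers, the t value for 24 degrees of freedom, about 2.064, gives a slightly wider interval).

```python
from math import sqrt
from statistics import mean, stdev


def mean_change_ci(before, after, critical=1.96):
    """Return the mean per-reviewer rating change and a two-sided CI.

    before, after: each reviewer's mean rating before and after the
    workshop, listed in the same reviewer order.
    critical: multiplier applied to the standard error of the mean change.
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need paired ratings for at least 2 reviewers")
    changes = [a - b for b, a in zip(before, after)]
    centre = mean(changes)
    half_width = critical * stdev(changes) / sqrt(len(changes))
    return centre, centre - half_width, centre + half_width


# Made-up ratings for four reviewers, not data from the study.
before = [3.0, 3.5, 4.0, 3.5]
after = [3.5, 3.5, 3.5, 4.0]
change, low, high = mean_change_ci(before, after)
```

An interval that spans zero, as in both columns of Table 2, is compatible with no change in rating.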

In study 2, of 75 reviewers randomized to the intervention group and invited in multiple contacts, 29 agreed to attend, and 12 attended. All concluded after attendance that the workshop would improve their subsequent reviews and review ratings. Posttest scores improved in 73% of participants (an average 23% improvement in score) compared with pretest scores. Eleven attendees had sufficient data for analysis (a total of 106 rated reviews before the workshop and 79 after the workshop), and the scores are reported in Table 3. Blinding was the same as for study 1.
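The recruitment figures above form a funnel whose percentages depend on the denominator chosen. The short Python sketch below recomputes them from the counts reported in the text; the variable and function names are illustrative.

```python
# Counts reported for study 2.
invited = 75    # randomized to the intervention group and invited
agreed = 29     # agreed to attend
attended = 12   # actually attended
analyzed = 11   # attendees with sufficient data for analysis


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


funnel = {
    "agreed / invited": pct(agreed, invited),
    "attended / agreed": pct(attended, agreed),
    "attended / invited": pct(attended, invited),
    "analyzed / invited": pct(analyzed, invited),
}
```

On these counts, 39% of those invited agreed to attend, 41% of those who agreed attended, and 16% of those invited attended.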

Table 3. Study 2 (intense recruitment).
Variable                                Control Reviewers (n=11)    Workshop Attendees (n=11)
Reviews in prior 2 y                    5.36                        4.82
Mean rating of reviews in prior 2 y     3.62                        3.67
Reviews after workshop                  3.72                        4.45
Mean rating of reviews after workshop   3.52                        3.62
Mean rating change (95% CI)             −0.10 (−0.49 to 0.29)       −0.06 (−0.34 to 0.23)

Discussion 

Workshops to educate editors and reviewers are a common educational method used by universities, journals, and professional societies, such as the Council of Science Editors (www.councilscienceeditors.org). Our journal had been offering such a course for more than 15 years, and it has been very popular. In 1998, we first assessed its effect on subsequent reviewer performance and found no benefit.3 We thought that lack of effect might be caused by the fact that its format was not sufficiently rigorous (although it was typical of other courses) and also that it included many reviewers with good performance scores who might not have had much room for improvement. Therefore, we limited subsequent studies to average reviewers only, and we revised the workshop substantially, making it much more specific and focused and enhancing the interaction with attendees. This resulted in a workshop modeled after evidence-based learning methods and requiring critical appraisal and written review of a manuscript in advance by each participant (Figure). We then assessed the results again.

Our first study invited all average reviewers and examined the effect of the workshop teaching method on those who chose to attend. Although attendees uniformly thought it was very helpful and would improve their subsequent reviews and review ratings, it did not (Table 2). We thought that one reason might be the self-selection of a particular subgroup of reviewers who attended the course, and therefore, in the second study, we randomly selected only half the group of average reviewers to invite. We then aggressively tried to ensure the attendance of this group. This can only be described as a major failure because only 29 of 75 invited reviewers actually agreed to attend despite a labor-intensive recruiting effort involving multiple personal contacts. Of those who agreed, only 12 (41%) actually attended, a response rate very different from that for previous workshops. Those who did attend were quite enthusiastic about the workshop and its benefits and improved their knowledge of peer review in a posttest. Nonetheless, there was no evidence of benefit in their reviews (Table 3).

The workshop or small discussion group format is so ingrained in scientific and medical education that our failure to find any benefit might seem to indicate a failure of proper teaching materials or technique. However, there is actually surprisingly little scientific study of this popular teaching format. Most of it has been conducted in the setting of medical school or resident journal clubs. We have not been able to identify any other studies examining the effects on experienced physicians or peer reviewers for journals. Medicine interns randomized to 5 journal club sessions on critical appraisal skills believed their reading skills were significantly improved, but their performance on an objective test of critical appraisal skills did not improve at all compared with that of a control group who received no such instruction.4 Similar results were obtained in a study of informal journal clubs over an 8-month period for pediatric residents.5 Stern et al6 have demonstrated that, in a validated objective test of critical appraisal of a manuscript, readers' self-appraisal of their skills is unrelated to their actual performance (R=0.15). Bazarian et al7 conducted a series of 12 one-hour conferences on evidence-based medicine with case-based studies, structured evaluation, close faculty supervision, and critical evaluation of a fabricated manuscript before and after the training. These were compared with traditional teaching sessions. The intervention group improved less than the control group. A recent meta-analysis of studies of teaching critical appraisal by using the small discussion group technique concluded there was no evidence supporting its efficacy.8

Our studies have limitations created by the very format itself, in that attendance was essentially voluntary (and thus self-selected), regardless of what type of recruitment occurred. Nonetheless, one would think this would attract more motivated reviewers who might be more likely to demonstrate a benefit, especially because we limited attendance to reviewers with average performance who had room for improvement. However, we do not know whether our results would apply to the great majority of average reviewers who declined our invitation. A traditional controlled trial randomized as to attendance and not invitation would be the ideal way to study this training method but would require a method that would reliably force the voluntary reviewers typical of most journals to attend a training program despite their interests and schedules. In our case, under such circumstances, only 41% of those who said they would attend did. Our format was as detailed and structured as a workshop of this duration can be and, in general, was more so than some similar courses offered by journals and scientific organizations. Perhaps a longer course lasting several days, with far more individualized attention and extensive personal tutoring on the writing of reviews, might have an effect. However, even 1-day courses are already demanding of resources, and an extended course would require a great deal of time and effort to produce. Such a course would require more time and expense on the part of reviewers and thus likely would be attended by an even smaller proportion of them.

We believe the quality of some reviewers at many journals could stand improvement. Recruiting and retaining qualified peer reviewers is essential, but few editors personally know the capabilities of most of their reviewers. Most journals do not assess reviewers' backgrounds in research methodology or critical literature review,9 and in selecting both reviewers and editorial board members, editors report they primarily seek expertise in their subject area and prior publications in the field and not methodologic expertise on the part of the reviewer.10 However, the same editors report that study quality and methodology are the most important factors in accepting manuscripts. Not only do editors not select reviewers for the qualities they say are most important, but in the only available survey, only 3% of social and behavioral sciences editors reported providing any training for reviewers of any kind.11

Although popular with those who attended (like others of its kind), our small workshop format was not successful in improving the subsequent quality ratings of reviewers for our journal. We do not believe this finding is an aberration; rather, it might indicate that this short interaction is insufficient to change behavior. More intensive efforts, perhaps provided through carefully constructed Internet- or CD-ROM-based instruction and feedback over a period of time, warrant investigation. We have already investigated simple written feedback on reviews and also found that it did not effect a change.12 It is yet to be determined whether really good reviewers are born (come already qualified to the journal) or can be made (improve their performance during their reviewer career). If it is the former, the importance of valid screening tools for initial selection of reviewers by journals is all the greater.

References 

  1. Callaham ML, Baxt WG, Waeckerle JF, et al. Reliability of editors' subjective quality ratings of peer reviews of manuscripts. JAMA. 1998;280:229–231.
  2. van Rooyen S, Black N, Godlee F. Development of the review quality instrument (RQI) for assessing peer reviews of manuscripts. J Clin Epidemiol. 1999;52:625–629.
  3. Callaham ML, Wears RL, Waeckerle JF. Effect of attendance at a training session on peer reviewer quality and performance. Ann Emerg Med. 1998;32:318–322.
  4. Linzer M, Brown J, Frazier L, et al. Impact of a medical journal club on house-staff reading habits, knowledge, and critical appraisal skills. JAMA. 1988;260:2537–2541.
  5. Langkamp DL, Pascoe JM, Nelson DB. The effect of a medical journal club on residents' knowledge of clinical epidemiology and biostatistics. Fam Med. 1992;24:528–530.
  6. Stern DT, Linzer M, O'Sullivan PS, et al. Evaluating medical residents' literature-appraisal skills. Acad Med. 1995;70:152–154.
  7. Bazarian JJ, Davis CO, Spillane LL, et al. Teaching emergency medicine residents evidence-based critical appraisal skills: a controlled trial. Ann Emerg Med. 1999;34:148–154.
  8. Norman GR, Shannon SI. Effectiveness of instruction in critical appraisal (evidence-based medicine) skills: a critical appraisal. CMAJ. 1998;158:177–181.
  9. Schulman K, Sulmasy DP, Roney D. Ethics, economics, and the publication policies of major medical journals. JAMA. 1994;272:154–156.
  10. Lebeau DL, Steinmann WC, Michaels RK. A survey of journal editors regarding the review process for original clinical research. Presented at the Third International Congress on Peer Review and BioMedical Publication; September 17-21, 1997; Prague, Czech Republic.
  11. Cicchetti D. The reliability of peer review for manuscript and grant submissions: a cross-disciplinary investigation. Behav Brain Sci. 1991;14:119–186.
  12. Callaham ML, Knopp RK, Gallagher EJ. Effect of written feedback by editors on quality of reviews: two randomized trials. JAMA. 2002;287:2781–2783.

 Reprints not available from the authors. Address for correspondence: Michael L. Callaham, MD, Department of Emergency Medicine, University of California, Box 0208, San Francisco, CA 94143-0208; 415-353-5885, fax 415-353-1799; E-mail mlc@medicine.ucsf.edu.

PII: S0196-0644(02)00047-1

doi:10.1067/mem.2002.127121

Refers to article:

  • Research into peer review and scientific publication: Journals look in the mirror

    Michael L. Callaham
    Annals of Emergency Medicine September 2002 (Vol. 40, Issue 3, Pages 313-316)

  • Graphical literacy: The quality of graphs in a large-circulation journal

    Richelle J. Cooper, David L. Schriger, Reb J.H. Close
    Annals of Emergency Medicine September 2002 (Vol. 40, Issue 3, Pages 317-322)

  • The use of dedicated methodology and statistical reviewers for peer review: A content analysis of comments to authors made by methodology and regular reviewers

    Frank C. Day, David L. Schriger, Christopher Todd, Robert L. Wears
    Annals of Emergency Medicine September 2002 (Vol. 40, Issue 3, Pages 329-333)

  • The effect of dedicated methodology and statistical review on published manuscript quality

    David L. Schriger, Richelle J. Cooper, Robert L. Wears, Joseph F. Waeckerle
    Annals of Emergency Medicine September 2002 (Vol. 40, Issue 3, Pages 334-337)
