Volume 53, Issue 4, Pages 536-543 (April 2009)



Journal Club questions

Empiric Antibiotic Therapy for Sepsis Patients: Monotherapy With β-Lactam or β-Lactam Plus an Aminoglycoside? Answers to the November 2008 Journal Club Questions

Teri A. Reynolds, MD, PhD, Tyler W. Barrett, MD (Section Editor), David L. Schriger, MD, MPH (Section Editor)

Refers to article:
Empiric Antibiotic Therapy for Sepsis Patients: Monotherapy With β-Lactam or β-Lactam Plus an Aminoglycoside? , 25 February 2008
Richard Sinert, Leah Bright
Annals of Emergency Medicine
November 2008 (Vol. 52, Issue 5, Pages 557-560)
Sinert R, Bright L. Empiric Antibiotic Therapy for Sepsis Patients: Monotherapy With β-Lactam or β-Lactam Plus an Aminoglycoside? Ann Emerg Med 2008;52:557-560; doi:10.1016/j.annemergmed.2007.12.013
Teri A. Reynolds, David L. Schriger, Tyler W. Barrett
Annals of Emergency Medicine
November 2008 (Vol. 52, Issue 5, Pages 561-562)

Article Outline

Discussion Points

Answer 1

Answer 2

Answer 3

Answer 4

Answer 5

Randomization, Allocation Concealment, and Blinding

Outcome Evaluation

Answer 6

Answer 7

References

Copyright

Discussion Points 


1. What was the impetus for conducting this meta-analysis? What are the potential drawbacks and advantages of single versus combined-agent therapy for sepsis? What is the research question of this review?

2. [Editor's note: Although the Maclure and Schneeweiss Episcope figure did not appear with the original questions, a modified version is reproduced here with the answers. Journal club leaders may wish to share this version with participants. The Episcope can be a useful way to organize thoughts and discussion on any article. If you find it helpful here, consider using it consistently during several journal club sessions.] Please examine Figure 1 of the article “Causation of Bias: The Episcope,” by Maclure and Schneeweiss (http://www.epidem.com/pt/re/epidemiology/fulltext.00001648-200101000-00019.htm, and reproduced here as Figure 1 (see below)). The figure depicts 10 layers of potential distortion that can render the results of a systematic review a biased estimate of the truth. Briefly discuss how each layer might produce bias in this particular systematic review. Which layers pose the greatest threats? When answering questions 2-5, try to identify the relevant layer(s) of the Episcope.

Figure 1. The Episcope. The user looks at the output of the device (level K) and sees the “known” risk difference (kRD) (or any other measure of effect). The known risk difference results from information transmitted, such as light waves through a telescope, from a causal (“etiologic”) risk difference (aRD) in a target population, through layers of lenses and filters. Each layer is a distinct domain in which certain types of biases operate, potentially further distorting the estimate of RD from its true values (aRD). It is only by considering the biases introduced at each of the 10 levels that we can determine to what degree kRD is an accurate proxy for aRD. This figure was produced by Maclure and Schneeweiss for their article “Causation of Bias: The Episcope.” Their original figure was developed for case-control epidemiologic studies. We have modified their figure slightly to include the bolded terms at each level. These terms indicate the kinds of bias that might be introduced at each level in a randomized trial. From Maclure M, Schneeweiss S. Causation of bias: the episcope. Epidemiology. 2001;12:114-122. Used with permission from Lippincott Williams & Wilkins, Baltimore, MD.



3.A. What is the difference between a systematic review and a narrative review? To what does the term “meta-analysis” refer? Why is it important for a systematic review to articulate its search strategy a priori?

3.B. The yield of electronic database searches may be limited even when they are performed by trained medical librarians (Felson DT. Bias in meta-analytic research. J Clin Epidemiol. 1992;45:885-92). What techniques did these authors use to ensure that they included as many eligible studies as possible?

4. Referring to the fact that this review includes trials comparing broad spectrum β-lactams to narrow-spectrum β-lactams (only 20/64 trials used the same β-lactam in both arms), Sinert writes “The ‘apples to oranges limitation’ is somewhat ameliorated by the large number of trials reviewed…” Do you agree?

5. What do the authors mean by “quasi-randomized” trials? What are the possible pitfalls of including such trials? What elements contributed to the overall “poor” quality of trials in this study?

6.A. Meta-analyses are designed to calculate a summary effect and the relevant confidence interval. Describe the results of this meta-analysis in terms of the summary effect. What are the implications for the use of antibiotics in sepsis?

6.B. What assumptions are made when study populations are combined and a summary effect reported? What is the effect of combining studies on the precision of the result? What characteristics of a combined analysis contribute to the change in precision? What is the effect of combining studies on bias? Beginning from the level of the individual published paper (layer “J” in the Episcope), describe the further layers of bias that might intervene in this article's presentation of results.

7. The protocol for a systematic review should include descriptions of the following: the research question, methods used for identifying eligible studies, methods of data abstraction, and statistical methods (the meta-analysis). Does the Annals summary of the Cochrane review fully describe all four of these components? From a reader's perspective, what are the benefits and limitations of a 4-page summary review of a 117-page review of 64 trials?

Answer 1 

Q1. What was the impetus for conducting this meta-analysis? What are the potential drawbacks and advantages of single versus combined-agent therapy for sepsis? What is the research question of this review?

As the original Cochrane review describes it, “Optimal antibiotic treatment for sepsis is imperative. Combining a β-lactam antibiotic with an aminoglycoside antibiotic may have certain advantages over β-lactam monotherapy” according to in vitro studies showing synergistic efficacy. On the other hand, “combination therapy may have several drawbacks, such as an increased rate of adverse effects.”1 The authors go on to describe previous studies of various quality and implications, including their own nonrandomized prospective study of bacteremic patients,2, 3 which suggested the adequacy of β-lactam monotherapy; a prospective observational study of patients with Pseudomonas bacteremia3 that is limited by the fact that some patients received aminoglycoside monotherapy, which may increase mortality; a meta-analysis that included nonrandomized studies and did not separately analyze β-lactam monotherapy4; and one previous meta-analysis of randomized trials that focused on febrile neutropenic patients.5 The current analysis is motivated by the lack of a comprehensive review of high-quality studies comparing β-lactam monotherapy to combination therapy in patients with sepsis.

As Sinert describes it, “The objective of this review is to examine the efficacy of monotherapy with β-lactam antibiotics versus the standard β-lactam-aminoglycoside antibiotic combination in treatment of sepsis, with regard to all-cause mortality and an estimation of the rate of adverse effects with each treatment.”

Answer 2 

Q2. Please examine Figure 1 of the article “Causation of Bias: The Episcope,” by Maclure and Schneeweiss (http://www.epidem.com/pt/re/epidemiology/fulltext.00001648-200101000-00019.htm, and reproduced here as Figure 1). The figure depicts 10 layers of potential distortion that can render the results of a systematic review a biased estimate of the truth. Briefly discuss how each layer might produce bias in this particular systematic review. Which layers pose the greatest threats? When answering questions 2-5, try to identify the relevant layer(s) of the Episcope.

Layer A of the Episcope represents the truth, the exact magnitude and direction of the relationship between choice of antibiotic treatment and patient outcome.

In layer B, random variation may introduce error, meaning that by chance alone, a particular sample's distribution differs from that of the target population. For example, there may be proportionately more deaths in this study population of monotherapy patients than in the population of all patients who received monotherapy. The threat of such random error decreases with increasing sample size and, over multiple experiments, should not consistently favor one outcome or group over another.

In layer C, nonrandom confounding may generate bias. A factor other than the intervention under study, but correlated with the intervention and outcome, might alter the apparent relationship between intervention and outcome. A classic example of systematic confounding is early research suggesting that birth order affected the rate of Down syndrome, ie, that third, fourth, or fifth children were more likely to have Down than first- or second-born children. In fact, all children born to older mothers are more likely to have Down, independent of birth order. A fifth child born to a 25-year-old mother would be at low risk, whereas a first child born to a 40-year-old mother would be at higher risk. The relationship between birth order and Down was confounded by maternal age. Maternal age was correlated with birth order because mothers having their fourth child were, on average, older than those having their first. And maternal age is correlated with the probability of having a child with Down. Thus, the seeming relationship between birth order and the probability of Down syndrome disappears once adjusted for maternal age. For a detailed discussion of confounding, see Answer 2 to the May 2008 Journal Club.6
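The logic of confounding by maternal age can be sketched with a small stratified calculation. The counts below are invented purely for illustration (they are not real Down syndrome data), and the function and variable names are hypothetical. Within each maternal-age stratum the risk is set to be identical for early and late birth order, yet the crude comparison still suggests a large effect because later-born children are concentrated among older mothers.

```python
# Hypothetical (cases, births) counts, invented for illustration only.
# Within each maternal-age stratum, risk does not depend on birth order.
strata = {
    "younger mothers": {"early": (9, 9000), "late": (1, 1000)},
    "older mothers": {"early": (10, 1000), "late": (90, 9000)},
}

def risk(cases, births):
    """Proportion of births affected."""
    return cases / births

def risk_ratio(late, early):
    """Risk in late-birth-order children divided by risk in early-birth-order children."""
    return risk(*late) / risk(*early)

def pooled(order):
    """Add up cases and births across strata, ignoring maternal age."""
    cases = sum(s[order][0] for s in strata.values())
    births = sum(s[order][1] for s in strata.values())
    return cases, births

# Stratum-specific comparison: maternal age is held fixed.
stratum_rr = {name: risk_ratio(s["late"], s["early"]) for name, s in strata.items()}

# Crude comparison: maternal age is ignored.
crude_rr = risk_ratio(pooled("late"), pooled("early"))

for name, rr in stratum_rr.items():
    print(f"{name}: RR (late vs early birth order) = {rr:.2f}")
print(f"crude, ignoring maternal age: RR = {crude_rr:.2f}")
```

In this made-up data set the stratum-specific risk ratios show no effect of birth order, while the crude ratio is well above 1; the whole apparent effect comes from the uneven distribution of maternal age across birth-order groups.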

In the current review, as discussed in question 4, the use of different β-lactams in the monotherapy and dual therapy limbs may confound the effect of the addition of an aminoglycoside. And, as discussed in question 5, inadequate forms of randomization and group assignment may introduce confounding.

In layer D, bias may be introduced simply by deficient means of evaluating outcomes. “Hard” outcomes such as mortality should be straightforward to measure if follow-up is good, whereas “softer” outcomes may be greatly affected by how they are defined or measured. For example, depending on how “clinical failure” is defined, the effect of combination therapy may be over- or underestimated.

In layer E, if the separation between trial arms is not maintained—if, for example, subjects in the monotherapy arm were actually exposed to a dose of aminoglycoside—then the intervention effect may be diluted.

In layer F, missing data can substantially compromise the value of study results because, at best, they compromise the statistical power of the study by reducing the sample size, and at worst, they introduce selection bias if “missingness” is not random.

In layer G, treating a continuous variable as a categorical variable can create an outcome difference where none exists. Imagine, for example, a continuous laboratory value that ranges from 3 to 30 and is generally considered negative below 6 and positive above. If the values in one study arm cluster tightly around 6.2 and those in the other arm cluster tightly around 5.8, a continuous treatment of this variable would lead to the conclusion that the variable is of little import, whereas a categorical analysis might suggest that this variable is highly important. When distributions are bi- or trimodal, great care must be taken in defining whether continuous or categorical approaches will lead to the most valid and robust analysis. See Royston et al7 for further discussion.
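The laboratory-value example can be made concrete with a short sketch. The individual values are hypothetical, chosen only to cluster around 6.2 and 5.8 as in the text, and the cutoff of 6 is the one described above. The same data are summarized once as a continuous variable and once as a positive/negative category.

```python
from statistics import mean

THRESHOLD = 6.0  # values above this are labelled "positive" (cutoff from the text)

# Hypothetical lab values: arm A clusters tightly around 6.2, arm B around 5.8.
arm_a = [6.1, 6.2, 6.2, 6.3, 6.2, 6.1, 6.3, 6.2]
arm_b = [5.7, 5.8, 5.8, 5.9, 5.8, 5.7, 5.9, 5.8]

def proportion_positive(values, threshold=THRESHOLD):
    """Fraction of values classified as positive after dichotomizing."""
    return sum(v > threshold for v in values) / len(values)

# Continuous treatment: compare the means on the original 3-to-30 scale.
mean_difference = mean(arm_a) - mean(arm_b)

# Categorical treatment: compare the share of "positive" results.
difference_in_proportions = proportion_positive(arm_a) - proportion_positive(arm_b)

print(f"continuous view: mean difference = {mean_difference:.2f}")
print(f"categorical view: positive in A = {proportion_positive(arm_a):.0%}, "
      f"positive in B = {proportion_positive(arm_b):.0%}")
```

A difference of a fraction of a unit on a scale running from 3 to 30 looks trivial, whereas the dichotomized version of the same data separates the arms completely.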

In layer H, systematically flawed follow-up (if, for example, patients who do better leave the hospital and are more likely to be lost to follow-up) can produce biased estimates. In addition, an effect that appears at one time interval may not appear earlier or later. For example, an intervention may affect survival to hospital discharge but have no effect on 30-day mortality, or methods of wound care may have different cosmetic results at 30 days but not at 6 months. In general, follow-up protocols should be clearly specified a priori to ensure that they are not manipulated to achieve the desired outcome. As Sinert noted in this review, “follow-up was specified in only 67% of the studies.”

Layer I represents the bias that may come from the mismodeling or misinterpretation of results. Not all modeling techniques are appropriate for all data. For example, a linear regression on 2 variables may yield a zero, or flat, slope (suggesting that there is no relationship between the variables), but if the relationship between the variables is actually U-shaped, the regression result would be meaningless.
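A minimal sketch of the U-shaped case follows. The data are hypothetical (y is simply the square of x over a symmetric range), and the helper names are assumptions introduced for illustration. A straight line is fitted by ordinary least squares to data in which y is completely determined by x.

```python
# Hypothetical data: y depends perfectly on x, but the relationship is U-shaped.
xs = [float(x) for x in range(-5, 6)]
ys = [x ** 2 for x in xs]

def least_squares_line(xs, ys):
    """Ordinary least-squares slope and intercept for a straight line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y explained by the fitted straight line."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

slope, intercept = least_squares_line(xs, ys)
fit_quality = r_squared(xs, ys, slope, intercept)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {fit_quality:.3f}")
```

The fitted slope and the explained variance are both essentially zero even though x determines y exactly, which is why a flat regression line cannot by itself be read as "no relationship."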

Layer J represents publication bias (the tendency that positive findings are published more often than negative ones), which is discussed briefly in question 3a below and will be discussed in a future Journal Club. Like most reviews, this one depends primarily on published data (though these authors also surveyed experts for any other available data; see question 3b), and a bias toward positive findings may be generated in this layer.

Answer 3 

Q3.a What is the difference between a systematic review and a narrative review? To what does the term “meta-analysis” refer? Why is it important for a systematic review to articulate its search strategy a priori?

Consider layer K of the Episcope.

A review addresses a research question by summarizing and interpreting the results of a series of relevant articles. In a narrative review, the search strategy and determination of article relevance is left to the author of the review. Authors may choose any articles they like, according to what they consider to be important journals or important articles, or articles that share their point of view. No search strategy need be articulated before the search itself, and the chosen search strategy is commonly not described in the review. The selection of articles used in a narrative review is not a reproducible process. Similarly, the determination of which articles are relevant to the research question and which results will be included in a narrative review is left to authorial judgment and may be based on criteria not available to a reader. Bias may intervene at any of these stages: the original search may be limited, articles may be excluded on any grounds, and the review author's summary may prioritize results from certain studies (according to familiarity, geography, positive results, whim, or any other criterion) or certain results within studies (selectively reporting those study outcomes that serve the author's point of view, for example). Narrative reviews, then, reflect the “expert” judgment of the author (which may be desirable in some cases), but they are not robust in their representation of the available data.

A systematic review, by contrast, is a summary of results generated by a well-described search strategy that aims to capture all relevant and methodologically sound studies addressing a given research question. Systematic reviews provide a detailed description of their search strategy and articulate all exclusion criteria. The protocol of a systematic review should be explicit enough that its search is reliably reproducible by other researchers.

The systematic nature of a review does not, of course, exclude bias. The individual studies that compose the review can be biased and, because reviews generally depend on searches of published material, both narrative and systematic reviews are subject to publication bias (the tendency that positive findings are published more often than negative ones), for example. If negative studies have been systematically suppressed and not published, no degree of analysis of published studies will account for this. Techniques for evaluating the degree—and minimizing the influence—of publication bias will be discussed in a future Journal Club.

To what does the term “meta-analysis” refer?

Although the combination of studies to increase sample size dates back to the early 1900s, the term meta-analysis was introduced in 1976 by Gene Glass, a researcher working in education and the social sciences: “My major interest currently,” he writes, “is in what we have come to call—not for want of a less pretentious name—the meta-analysis of research. The term is a bit grand, but it is precise, and apt…meta-analysis refers to the analysis of analyses. I use it to refer to the statistical analysis of a large collection of analysis results from individual studies for the purpose of integrating the findings. It connotes a rigorous alternative to the casual, narrative discussions of research studies which typify our attempts to make sense of the rapidly expanding research literature.”8

The “meta-analysis” is the statistical analytic component of a systematic review and includes an evaluation of the variability among individual studies (the heterogeneity), as well as the determination of a summary result or “summary effect.” Not all systematic reviews are meta-analyses. Meta-analyses use a variety of statistical techniques to evaluate and minimize forms of bias and error generated by the combined analysis of disparate study populations. Layer K in the Episcope represents the “combinatorial bias” that may be added to the existing individual study biases that have accumulated in layers B to J (see also the discussion of “validity” in question 6b below; combinatorial bias will be discussed in a future Journal Club.)

Why is it important for a systematic review to articulate its search strategy a priori?

A systematic search should articulate its search strategy before executing the search to ensure that the search strategy cannot be manipulated to include or exclude particular results.

Q3.b The yield of electronic database searches may be limited even when they are performed by trained medical librarians (Felson DT. Bias in meta-analytic research. J Clin Epidemiol. 1992;45:885-92). What techniques did these authors use to ensure that they included as many eligible studies as possible?

The authors searched multiple databases, including the Cochrane Central Register of Controlled Trials (CENTRAL), MEDLINE, EMBASE, and LILACS, as well as the proceedings of the Interscience Conference of Antimicrobial Agents and Chemotherapy.

To increase the yield of the initial search, they used multiple overlapping search terms combined with “OR” (aminoglycoside OR amikacin* OR tobramycin*, etc) and partial search terms (bacter* OR bacteremia).

They also evaluated all trials and any “major reviews” cited in the studies that resulted from the initial database search. This is sometimes termed “ancestral searching.”

In addition, as they note in the original Cochrane review, “We contacted the first or corresponding author of each included study, and the researchers active in the field, for information regarding unpublished trials or complementary information on their own trials.”

They included publications in any language.

Answer 4 

Q4. Referring to the fact that this review includes trials comparing broad spectrum β-lactams to narrow-spectrum β-lactams (only 20/64 trials used the same β-lactam in both arms), Sinert writes “The ‘apples to oranges limitation’ is somewhat ameliorated by the large number of trials reviewed…” Do you agree?

This review included both trials comparing monotherapy with a broad-spectrum β-lactam to combination therapy with a broad-spectrum β-lactam plus an aminoglycoside, and trials that compared a broad-spectrum β-lactam to a narrow-spectrum β-lactam plus an aminoglycoside. A narrow-spectrum β-lactam alone would not have been considered adequate therapy. With the results of these trials in mind, these apples and oranges differ substantially in that the addition of an aminoglycoside seems to confer no additional protection, whereas the narrowing of the β-lactam spectrum seems to confer additional risk (an increase in clinical failure and a suggestion of increased all-cause fatality). In some trials, then, the change in β-lactam may confound the addition of the aminoglycoside—in other words, a difference in outcome could be attributed to the change in β-lactam, to the addition of the aminoglycoside, or to both. Only the studies comparing the same (broad-spectrum) β-lactam in both arms can actually isolate and address the effect of adding an aminoglycoside to empiric therapy (the question this review is designed to address). Sinert suggests that the large number of trials ameliorates the problem. In fact, only separate analysis of the apples and oranges truly ameliorates the problem, and this separation effectively creates 2 meta-analyses. The large number of trials only allows that there are adequate trials in each group to preserve a decent sample size under separate analysis. Increasing sample size (“the large number of trials reviewed”) increases precision but does not affect validity (see question 6b below).

Answer 5 

Q5. What do the authors mean by “quasi-randomized” trials? What are the possible pitfalls of including such trials? What elements contributed to the overall “poor” quality of trials in this study?

Consider layers B and C of the Episcope.

Two studies9, 10 were “quasi-randomized”; rather than placing subjects in treatment groups by a randomly generated assignment, their protocols used the last digit of patient identification numbers (in one case, assigning treatment by evens and odds).

There are 2 crucial aspects of randomization: how random assignment is generated (allocation generation) and how the eventual group assignment is concealed from the prospective subject and the recruiters at enrollment (ie, allocation concealment).

Randomization itself is designed to ensure that treatment groups are as similar as possible so that any difference detected between the groups can be plausibly attributed to the study intervention. Although it is unlikely that patient identification numbers would introduce a confounder that systematically affected outcomes in sepsis (eg, it is unlikely that patients who have an odd final digit of their medical record number are systematically healthier or sicker than those whose number is even), other quasi-randomizing techniques might. If, for example, patients presenting on Monday, Tuesday, and Wednesday are assigned to one treatment group and those presenting on Thursday, Friday, Saturday, and Sunday, to another, then a difference in outcome might be attributable to weekend staffing (which has been shown to affect inpatient care).

More important, though, the use of medical record numbers to generate allocation precludes allocation concealment because treatment-group assignment is obvious to anyone who knows (or can deduce) the assignment strategy and has access to the patient's medical record number. A researcher who supports monotherapy, for example, may be less likely to enroll a very ill patient with a medical record number that destines him or her for the monotherapy group. An additional complication created by “quasi-randomization” techniques is that they impair blinding, which could make the study vulnerable to differential treatment of the groups (see “blinding” and “cointervention” below).
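The difference between the two allocation approaches can be illustrated with a brief sketch. The record numbers, the seed, and the function names are all hypothetical, and which parity maps to which arm is an arbitrary assumption. The point is only that a parity rule lets anyone who sees the record number predict the arm, whereas a shuffled list cannot be derived from patient identifiers. Shuffling covers allocation generation only; concealment still depends on keeping that list away from those who enroll patients.

```python
import random

ARMS = ("monotherapy", "combination")

def quasi_randomize(record_number):
    """Assign by parity of the last digit; predictable by anyone who sees the number."""
    return ARMS[record_number % 2]

def shuffled_sequence(n_patients, seed):
    """Balanced allocation list produced by shuffling; unrelated to patient identifiers."""
    rng = random.Random(seed)
    sequence = [ARMS[i % 2] for i in range(n_patients)]
    rng.shuffle(sequence)
    return sequence

# Hypothetical medical record numbers.
record_numbers = [104233, 104234, 104235, 104236]
for number in record_numbers:
    print(number, "->", quasi_randomize(number))

# Hypothetical seed; in practice the list would be held by someone not enrolling patients.
sequence = shuffled_sequence(n_patients=8, seed=2009)
print("shuffled allocation list:", sequence)
```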

What elements contributed to the overall “poor” quality of trials in this study?

Consider layers E and H of the Episcope.

Randomization, Allocation Concealment, and Blinding 

Allocation generation (randomization) was deemed “adequate” in 53% of the studies (34/64). Twenty-eight studies did not describe their allocation protocol, and, as mentioned above, 2 studies were only “quasi-randomized” by patient identification numbers.

Only 33% (21/64) of the included studies reported “adequate” allocation concealment, meaning that treatment group assignment was effectively hidden from those enrolling patients in the study. Several studies did not report on concealment (34 studies), or “envelopes were used but not described as sealed or opaque” (7 studies). [Editor's note: A brief advanced aside: Element 9 of the CONSORT statement (available at http://www.consort-statement.org/index.aspx?o=1026) states that all randomized trials should describe their method of allocation concealment. We have never seen an article that says, “we attempted allocation concealment by placing group assignments in consecutive numbered envelopes but the envelopes were cheap and flimsy and the assigned group could easily be gleaned by holding the envelope against any standard light source.” Consider whether reporting guidelines such as CONSORT increase the quality of trials or increase the white lies told by investigators as they submit their work for publication in prestigious medical journals.]

Both allocation generation and concealment were adequate in only 30% of the studies (19/64).

Similarly, “blinding,” or concealment of the intervention being used in any given patient, is an important strategy to avoid bias. Studies may be “double blinded,” meaning that the intervention is concealed from both subjects and observers, or may be blinded only to certain participants, such as patients, providers, or researchers. Only 2 studies in this review were double blinded. Researchers evaluating outcomes were blinded in 4 studies, and clinicians were blinded in 1 study.1

The failure to conceal allocation or to blind subjects and observers provides a considerable opportunity for the introduction of systematic bias because conscious and unconscious behaviors on the part of subjects, practitioners, and staff (based on beliefs about the intervention under study) may substantially affect outcomes. The failure to blind caregivers may generate systematic “cointerventions” if subjects in one group are systematically treated differently by their caregivers. It may also cause biased evaluation of outcomes.

For a discussion of randomization, allocation concealment, and blinding, see the answer to question 4 of the May 2008 Journal Club.6

Outcome Evaluation 

Sixty-seven percent (43/64) of studies in this review specified follow-up duration, whereas only 28% (18/64) of studies defined a specific period for outcome evaluation.

Failure to prespecify an outcome evaluation protocol may also introduce bias if, for example, the timing of outcome evaluation can be manipulated to affect outcomes. A reviewer's ability to evaluate methodological heterogeneity and study quality is also limited for articles with missing or partial descriptions of their outcome evaluation protocol.

See also Additional Table 03, “Study quality assessment table,” in the original review.1

Answer 6 

Q6.a Meta-analyses are designed to calculate a summary effect and the relevant confidence interval. Describe the results of this meta-analysis in terms of the summary effect. What are the implications for the use of antibiotics in sepsis?

The Cochrane review1 reports its results as follows: “In studies comparing the same β-lactam [eg, using the same β-lactam in the monotherapy and combination-therapy arms], we observed no difference between study groups with regard to all-cause fatality, RR 1.01 (95% CI 0.75-1.35) and clinical failure, RR 1.11 (95% CI 0.95-1.29). In studies comparing different β-lactams, we observed an advantage to monotherapy: all cause fatality RR 0.85 (95% CI 0.71-1.01), clinical failure RR 0.77 (95% CI 0.69-0.86).”

The authors also note that “[n]ephrotoxicity was significantly more frequent with combination therapy,” and the relative risk (RR) for the protective effects of monotherapy is described as RR 0.30 (95% confidence interval 0.23 to 0.39).

As the results are reported above, an RR of death, clinical failure, or nephrotoxicity of less than 1 would favor monotherapy. An RR of greater than 1 would favor combination therapy. See the answer to question 3 in the March Journal Club for a full discussion of confidence intervals.11
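As a rough illustration of how a relative risk and its confidence interval are read, the sketch below computes an RR from a 2-by-2 table using the common log-method approximation, then checks whether each interval reported above includes 1. The trial counts are hypothetical, the function names are assumptions, and this is not necessarily the method the Cochrane reviewers used to pool their estimates. The summary figures in `reported` are the ones quoted above.

```python
import math

def relative_risk(events_mono, n_mono, events_combo, n_combo, z=1.96):
    """RR of monotherapy versus combination therapy with an approximate 95% CI (log method)."""
    rr = (events_mono / n_mono) / (events_combo / n_combo)
    se_log_rr = math.sqrt(1 / events_mono - 1 / n_mono + 1 / events_combo - 1 / n_combo)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

def excludes_one(lower, upper):
    """True when the interval lies entirely below or entirely above RR = 1."""
    return upper < 1 or lower > 1

# Hypothetical single trial: 30/200 deaths with monotherapy, 40/200 with combination.
rr, lower, upper = relative_risk(30, 200, 40, 200)
print(f"hypothetical trial: RR = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")

# Summary estimates as quoted from the review: (RR, lower limit, upper limit).
reported = {
    "all-cause fatality, same beta-lactam": (1.01, 0.75, 1.35),
    "clinical failure, same beta-lactam": (1.11, 0.95, 1.29),
    "all-cause fatality, different beta-lactams": (0.85, 0.71, 1.01),
    "clinical failure, different beta-lactams": (0.77, 0.69, 0.86),
    "nephrotoxicity": (0.30, 0.23, 0.39),
}
for outcome, (estimate, low, high) in reported.items():
    verdict = "excludes 1" if excludes_one(low, high) else "includes 1"
    print(f"{outcome}: RR {estimate:.2f} (95% CI {low:.2f}-{high:.2f}), interval {verdict}")
```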

To the degree that they accurately represent the combined study populations, these results suggest that monotherapy with a broad-spectrum β-lactam alone is likely an adequate empiric regimen for sepsis because it does not increase rates of mortality or treatment failure in patients with sepsis compared with combination (β-lactam/aminoglycoside) therapy. As Sinert writes, “In fact, monotherapy (different β-lactams) compared to combination therapy significantly improved fatality from non-urinary tract infections. Clinical failures were less common from all causes with monotherapy (different β-lactams) than combination therapy, secondary to favorable outcomes in treating bacteremia and non-urinary tract infections.”

An additional implication regarding the difference between narrow- and broad-spectrum β-lactam therapy emerges from these results. Because the advantage to monotherapy emerges when the “oranges” are analyzed separately (as discussed above in question 4, the trials using different β-lactams in each arm typically use a broad spectrum in the monotherapy arm and a narrow spectrum in the combined group), there is likely a protective effect to the use of a broad-spectrum β-lactam.

Finally, the results also suggest that the addition of an aminoglycoside to β-lactam therapy may increase nephrotoxicity, though not the overall rate of adverse effects. The “protective” effect of monotherapy is not adequately explained, then, by increased adverse effects with combined therapy (because the overall rate of adverse effects was the same).

The applicability of these results may be affected by many factors, including changing infection and sensitivity patterns (the increasingly widespread addition of MRSA coverage to an empiric sepsis regimen, for example, might change rates of both mortality and nephrotoxicity).

Q6.b What assumptions are made when study populations are combined and a summary effect reported? What is the effect of combining studies on the precision of the result? What characteristics of a combined analysis contribute to the change in precision? What is the effect of combining studies on bias? Beginning from the level of the individual published paper (layer “J” in the Episcope), describe the further layers of bias that might intervene in this article's presentation of results.

When results from different studies are combined to generate a summary effect, the studies are assumed to be similar enough that they can be treated, for the purposes of the analysis, as part of one larger study. This assumption is, of course, always imperfect. Studies may differ in terms of the study population (demographics or other baseline characteristics), methodology (design, allocation, interventions, outcome evaluation), or results (“statistical heterogeneity”). The “forest” or “box and line” plot provides some visual representation of the relationship among individual study results. All else being equal, the more similar study populations are, the more likely they are to generate similar study results, and the more “combinable” they are. At the same time, the more narrowly defined a study population is, the less generalizable the study results.



Figure 2. The figure on the left represents scattered values that are somewhat distant from each other (low precision) and yet, on average, approach the target's center (high accuracy/low bias). The figure on the right shows several values that cluster closely together (high precision) and yet, on average, remain distant from the target's center (low accuracy/high bias).


Typically, a weighting scheme is used to determine the relative influence of each study on the combined results. The weights can be determined by sample size, within-study variation, study quality, or some combination of these. Studies can be combined by using fixed-effects or random-effects models. These models will be discussed in a future Journal Club, but in brief:

“A fixed-effects model assumes that there is a single ‘fixed’ effect that every study will approximate. That is, if every study were infinitely large, every study would yield an identical result. A random-effects model, on the other hand, assumes that the results of individual studies form a distribution of effects that has some central value and some degree of variability. The random effects model makes fewer assumptions about the variability in the analysis and so is more conservative than the fixed effects model.”12
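The contrast between the two models can be illustrated with a minimal sketch. It assumes inverse-variance weighting for the fixed-effects estimate and the DerSimonian-Laird estimator of between-study variance for the random-effects estimate; neither choice is confirmed here as the method used in the Cochrane review. The log relative risks and variances are hypothetical values chosen only for illustration.

```python
def pool_fixed(effects, variances):
    """Inverse-variance fixed-effects pooled estimate and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, 1.0 / total


def pool_random(effects, variances):
    """DerSimonian-Laird random-effects estimate, its variance, and tau^2."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed, _ = pool_fixed(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effects estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    re_weights = [1.0 / (v + tau2) for v in variances]
    re_total = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / re_total
    return pooled, 1.0 / re_total, tau2


# Hypothetical per-study log relative risks and their variances.
log_rr = [-0.40, 0.10, -0.90, 0.30]
var = [0.04, 0.09, 0.06, 0.05]

fixed_est, fixed_var = pool_fixed(log_rr, var)
random_est, random_var, tau2 = pool_random(log_rr, var)
print(f"fixed:  estimate={fixed_est:.3f}, variance={fixed_var:.4f}")
print(f"random: estimate={random_est:.3f}, variance={random_var:.4f}, tau^2={tau2:.4f}")
```

When the studies disagree more than chance alone would predict, the estimated between-study variance is positive. The random-effects variance is then larger than the fixed-effects variance, which is the sense in which the random-effects model is the more conservative of the two.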

What is the effect of combining studies on the precision of the result? What characteristics of a combined analysis contribute to the change in precision? What is the effect of combining studies on bias?

Smaller studies can be combined to generate a larger effective sample size, which will increase the precision of the combined estimate. In other words, with a larger sample size, the effect of random variation is decreased. Increasing sample size, however, does not necessarily affect bias. Repeating the same (biased) study on an ever-larger population, although it may yield an increasing level of precision, does nothing to increase the accuracy of the study, that is, the closeness of the measured outcome to the actual outcome (the truth). Bias affects accuracy and cannot be countered by increasing precision. One could design an extremely precise but biased rocket model, for example, in which all rockets fall within 10 feet of one another, though miles from the target (the truth).
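The distinction between precision and bias can be shown with a small simulation. This is a sketch under stated assumptions: the true effect (0.0), the systematic error (0.5), the unit standard deviation, and the sample sizes are all hypothetical values, and the function name is invented for illustration.

```python
import random
import statistics


def simulate_estimates(true_effect, bias, n_per_study, n_studies, seed):
    """Return one estimate per simulated study; each study measures
    true_effect + bias with random noise (standard deviation 1)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_studies):
        sample = [rng.gauss(true_effect + bias, 1.0) for _ in range(n_per_study)]
        estimates.append(statistics.fmean(sample))
    return estimates


# Hypothetical setup: the truth is 0.0, but every study is off by +0.5.
small = simulate_estimates(0.0, 0.5, n_per_study=20, n_studies=200, seed=1)
large = simulate_estimates(0.0, 0.5, n_per_study=2000, n_studies=200, seed=1)

for label, estimates in (("n=20", small), ("n=2000", large)):
    spread = statistics.stdev(estimates)   # precision: smaller spread is more precise
    center = statistics.fmean(estimates)   # accuracy: distance from the truth (0.0)
    print(f"{label}: spread={spread:.3f}, center={center:.3f}")
```

The larger studies cluster far more tightly, yet their center stays near 0.5 rather than the true value of 0.0. A larger sample narrows the spread without moving the estimates any closer to the target.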

The process of combining studies, then, is an opportunity for the introduction of systematic error and may actually increase bias (or decrease accuracy). The degree to which a study represents what it purports to is usually referred to as its validity, and this is a type of accuracy. Bias may affect the internal validity of a study (how well the study instrument measures what it is supposed to) or its external validity (how well its study sample represents other populations or how generalizable its results are).

Beginning from the level of the individual published paper (layer J in the Episcope), describe the further layers of bias that might intervene in this article's presentation of results.

Individual articles emerge from the Episcope at level I. Whether they ever see the light of day (are published) is determined in level J. Remember that publication bias (the tendency for positive findings to be published more often than negative ones) in layer J of the Episcope may exclude many relevant results and that no subsequent adjustment of published material can compensate for this. (See the discussion of publication bias in question 3a above.)

In level K of the Episcope, the review search itself can introduce bias as various formulations of search terms may systematically exclude certain studies. Beyond this, evaluation of the studies for relevance may be flawed and may reflect the bias of the individual reviewer or the review protocol. Measuring agreement among multiple blinded (to publication venue, author, date, etc) evaluators may help reduce systematic bias at this stage, but only if these evaluators are truly independent. The process of data abstraction provides yet another opportunity for the introduction of bias, and again, evaluating the degree of agreement among multiple abstractors may help minimize systematic bias. In addition, the very act of combining data may over- or underrepresent the relative importance of individual study results and may increase confounding. Finally, the discussion and elucidation of results, especially if presented, as in this case, in a summary that does not include access to the original studies, may introduce substantial interpretive bias.
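One common way to quantify agreement between two independent evaluators is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below is offered as one possible approach; the text above does not say which agreement statistic, if any, the reviewers used. The include/exclude decisions are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each label the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical inclusion decisions by two reviewers on ten candidate trials.
reviewer_1 = ["in", "in", "in", "in", "out", "out", "out", "out", "out", "out"]
reviewer_2 = ["in", "in", "in", "out", "out", "out", "out", "out", "out", "in"]
print(f"kappa = {cohens_kappa(reviewer_1, reviewer_2):.2f}")
```

A high kappa shows only that the evaluators agree with one another. If both share the same systematic tendency, for example because they are not truly independent, agreement can be high while the selection remains biased.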

Answer 7 


Q7. The protocol for a systematic review should include descriptions of the following: the research question, methods used for identifying eligible studies, methods of data abstraction, and statistical methods (the meta-analysis). Does the Annals summary of the Cochrane review fully describe all four of these components? From a reader's perspective, what are the benefits and limitations of a 4-page summary review of a 117-page review of 64 trials?

The Annals summary describes the research question, summarizes the study selection protocol, and includes a summary table of results by subgroup but does not offer a forest plot or listing of individual results. Readers gain a sense of the overall methods in this meta-analysis but are not, for example, given enough information to reproduce the search or to evaluate exclusions (beyond the categorical exclusion of trials that included “neonates and preterm babies… as well as studies in which more than 15% of the patients were neutropenic”).

The Annals summary also provides substantially less information about methodology than the original Cochrane review. In the setting of an established series, such as the Cochrane review, with an easily accessible, well-established, and oft-repeated meta-analysis methodology, this may be of limited importance if the reader has previous experience with the methodology. (Even the original Cochrane review refers in some instances to a standardized “Cochrane methodology”).13 The advantage of these elisions is to create a document that, although of limited use for a researcher who wishes to carefully evaluate its review methodology, is highly usable for the practicing physician who might be unlikely to access the unwieldy 117-page original.

Bias that intervenes “early” in layer K may be amplified by the subsequent Annals summary. The original review, though claiming to use the Cochrane methodology, may be flawed by error or neglect. If its original reviewers miss this fact (they may or may not pull and review the hundreds of original articles), the Annals summary will be similarly flawed without providing access to the original data that would allow a reader to catch the error. Because the review now appears to be endorsed by an additional reliable organization, the summary process confers further credibility and a larger readership while further insulating the study from critical analysis. Thus, although conscientiously performed systematic reviews may be helpful to clinicians and patients, there is always the danger that “validity by existence” (“it exists and it's heavy; therefore, it must be valid”) is conferred on undeserving documents.

References 


1. Paul M, Silbiger I, Grozinsky S, et al. Beta lactam antibiotic monotherapy versus beta lactam-aminoglycoside antibiotic combination therapy for sepsis. Cochrane Database Syst Rev. 2006;(1):CD003344.

2. Leibovici L, Paul M, Poznanski O, et al. Monotherapy versus beta-lactam-aminoglycoside combination treatment for gram-negative bacteremia: a prospective, observational study. Antimicrob Agents Chemother. 1997;41:1127–1133.

3. Hilf M, Yu VL, Sharp J, et al. Antibiotic therapy for Pseudomonas aeruginosa bacteremia: outcome correlations in a prospective study of 200 patients. Am J Med. 1989;87:540–546.

4. Safdar N, Handelsman J, Maki DG. Does combination antimicrobial therapy reduce mortality in Gram-negative bacteraemia? (a meta-analysis). Lancet Infect Dis. 2004;4:519–527.

5. Paul M, Soares-Weiser K, Grozinsky S, et al. Beta-lactam versus beta-lactam-aminoglycoside combination therapy in cancer patients with neutropaenia. Cochrane Database Syst Rev. 2002;(2):CD003038.

6. Barrett TW, Schriger DL. Annals of Emergency Medicine Journal Club (Acutely decompensated heart failure in a county emergency department: a double-blind randomized controlled comparison of nesiritide versus placebo treatment. Answers to May 2008 journal club questions). Ann Emerg Med. 2008;52:458–472.

7. Royston P, Altman DG, Sauerbrei W. Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med. 2006;25:127–141.

8. Glass GV. Primary, secondary, and meta-analysis of research. Educ Res. 1976;5:3–8.

9. Duff P, Keiser JF. A comparative study of two antibiotic regimens for the treatment of operative site infections. Am J Obstet Gynecol. 1982;142:996–1003.

10. Landau Z, Feld S, Krupsky M. [Ceftriaxone or combined cefazolin-gentamicin for complicated urinary tract infections]. Harefuah. 1990;118:152–153.

11. Barrett TW, Schriger DL. Annals of Emergency Medicine Journal Club (Practical considerations in HIV testing in the emergency department, characteristics of diagnostic tests, and the role of sensitivity analysis in observational studies. Answers to March 2008 Journal Club questions). Ann Emerg Med. 2008;52:170–181.

12. Lang TA, Secic M. How to Report Statistics in Medicine: Annotated Guidelines for Authors, Editors, and Reviewers. 2nd ed. New York, NY: American College of Physicians; 2006.

13. Higgins J, Green S; Cochrane Collaboration. Cochrane Handbook for Systematic Reviews of Interventions (Cochrane Book Series). Hoboken, NJ: John Wiley & Sons; 2008.

a Alameda County Medical Center-Highland Campus, Oakland, CA

b Vanderbilt University Medical Center, Nashville, TN

c University of California, Los Angeles, CA

 Editor's Note: You are reading answers to the fifth installment of Annals of Emergency Medicine Journal Club. The questions and the article they are about (Sinert and Bright. Ann Emerg Med. 2008;52:557-560) were published in the November 2008 issue.

Information about journal club can be found at http://www.annemergmed.com/content/journalclub.

Readers should recognize that these are suggested answers. We hope they are accurate; we know that they are not comprehensive. There are many other points that could be made about these questions or about the article in general. Questions are rated “novice,” “intermediate,” and “advanced” so that individuals planning a journal club can assign the right question to the right student. The “novice” rating does not imply that a novice should be able to spontaneously answer the question. “Novice” means we expect that someone with little background should be able to do a bit of reading, formulate an answer, and teach the material to others. Intermediate and advanced questions also will likely require some reading and research, and that reading will be sufficiently difficult that some background in clinical epidemiology will be helpful in understanding the reading and concepts.

We are interested in receiving feedback about this feature. Please e-mail journalclub@acep.org with your comments.

PII: S0196-0644(08)02013-1

doi:10.1016/j.annemergmed.2008.11.009

