Table 6 Quality of evidence across studies for each key outcome using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) methodology

From: The effect of surgery on the outcome of treatment for multidrug-resistant tuberculosis: a systematic review and meta-analysis

Author(s): Harris, R; Khan, M and Allen, V

Date: 25/11/2015

Question: Surgery compared to no surgery for treatment of MDR or XDR TB

Setting: Georgia, Latvia, Russia, South Africa, South Korea and Turkey

Bibliography: Dravniece et al. (2009); Gegia et al. (2012); Karagoz et al. (2009); Keshavjee et al. (2008); Kim et al. (2007); Kim et al. (2008); Kwak et al. (2015); Kwon et al. (2008); Leimane et al. (2005); Mitnick et al. (2008); Shean et al. (2008); Sklyuev et al. (2013); Tahaoglu et al. (2001); and Torun et al. (2007).

Columns "№ of studies" through "Other considerations" form the quality assessment. Lowercase letters after a rating refer to the footnotes below the table.

| Outcome | № of studies | Study design | Risk of bias | Inconsistency | Indirectness | Imprecision | Other considerations | № of patients: surgery | № of patients: no surgery | Relative effect (95 % CI) | Absolute effect (95 % CI) | Quality | Importance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cured (follow up: range 0.5 to 10 years; assessed with: WHO definition) | 5 | observational studies | serious a,b,c,d,e,f,g | not serious h | not serious i | not serious | none j | 118/157 (75.2 %) | 308/561 (54.9 %) | OR 3.03 (1.59 to 5.78) | 238 more per 1000 (from 110 more to 327 more) | VERY LOW | CRITICAL |
| Successful outcome (follow up: range 0.25 to 7 years; assessed with: Cure or treatment success, WHO definition) | 14 | observational studies | serious a,b,c,d,e,f,g,k,l,m | not serious n | not serious o | not serious | none j,p | 371/453 (81.9 %) q | 1197/2006 (59.7 %) | OR 2.62 (1.94 to 3.54) q | 198 more per 1000 (from 145 more to 243 more) | VERY LOW | CRITICAL |
| Death (follow up: range 0.5 to 10 years; assessed with: All-cause mortality or TB mortality) | 5 | observational studies | serious a,b,c,d,e,f,k,r,s,t | not serious n | serious u | serious t | none j | 11/191 (5.8 %) | 52/720 (7.2 %) | OR 0.82 (0.41 to 1.64) | 12 fewer per 1000 (from 41 fewer to 41 more) | VERY LOW | CRITICAL |
| Loss to follow up (previously default) (follow up: range 0.5 to 10 years; assessed with: WHO definition) | 4 | observational studies | serious a,b,c,d,e,f,v | not serious n | not serious w | not serious | none j,x | 6/156 (3.8 %) | 77/613 (12.6 %) | OR 0.35 (0.15 to 0.81) y | 78 fewer per 1000 (from 21 fewer to 105 fewer) | VERY LOW | CRITICAL |
| Treatment failure (follow up: range 0.5 to 10 years; assessed with: WHO definition) | 5 | observational studies | serious a,b,c,d,e,f,g,k | not serious n | not serious w | not serious | none j,x | 8/191 (4.2 %) | 82/720 (11.4 %) | OR 0.38 (0.18 to 0.81) | 67 fewer per 1000 (from 20 fewer to 91 fewer) | VERY LOW | CRITICAL |
| Transfer out (follow up: Not reported) | 2 | observational studies | serious a,b,c,f,z,aa | not serious ab | not serious | not serious ab | none aa,ac | 0/139 (0.0 %) | 6/305 (2.0 %) | not estimable | | VERY LOW | |
| Adverse events from surgery (follow up: range 1.5 to 10 years) | 1 | observational studies | serious a,b,f | not serious ad | not serious | not serious ad | publication bias strongly suspected ae | 2/66 (3 %) surgical patients died due to surgical complications. | | | | VERY LOW | |
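The "Absolute (95 % CI)" figures in the table are consistent with the usual conversion of a pooled odds ratio into a risk difference, evaluated at the crude event rate in the no-surgery group. The source does not state how the authors computed them, so the Python sketch below is an assumption about the method rather than a description of it; the function name is illustrative.

```python
def absolute_effect_per_1000(odds_ratio, control_events, control_total):
    """Risk difference per 1000 patients implied by an odds ratio.

    Assumes the baseline (assumed control) risk is the crude event
    rate in the no-surgery group; this is an assumption, not stated
    in the source.
    """
    baseline_risk = control_events / control_total
    baseline_odds = baseline_risk / (1 - baseline_risk)
    # Apply the odds ratio on the odds scale, then convert back to a risk.
    treated_odds = odds_ratio * baseline_odds
    treated_risk = treated_odds / (1 + treated_odds)
    return round(1000 * (treated_risk - baseline_risk))


# "Cured" row: OR 3.03 (1.59 to 5.78), no-surgery group 308/561.
cured = [absolute_effect_per_1000(or_, 308, 561) for or_ in (3.03, 1.59, 5.78)]
print(cured)  # point estimate, lower bound, upper bound
```

For the "Cured" row this gives 238, 110 and 327 more per 1000, matching the tabulated values. A negative return value corresponds to "fewer per 1000".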

 
  1. CI confidence interval, MD mean difference, OR odds ratio, RR relative risk
  2. a Do not address or adjust for confounders and in some studies do not fully describe population - Dravniece et al. (2009); Karagoz et al. (2009); Kim et al. (2007); Kwak et al. (2015); Kwon et al. (2008); Mitnick et al. (2008); Shean et al. (2008); Sklyuev et al. (2013); Tahaoglu et al. (2001) and Torun et al. (2007)
  3. b Retrospective observational studies do not have randomisation and have inherent bias in who is offered surgery - Dravniece et al. (2009); Karagoz et al. (2009); Keshavjee et al. (2008); Kim et al. (2007); Kim et al. (2008); Kwak et al. (2015); Kwon et al. (2008); Leimane et al. (2005); Mitnick et al. (2008); Shean et al. (2008); Tahaoglu et al. (2001) and Torun et al. (2007)
  4. c Uncertainty in representativeness of study population - Dravniece et al. (2009); Karagoz et al. (2009); Kim et al. (2007); Kwak et al. (2015); Kwon et al. (2008); Shean et al. (2008); and Tahaoglu et al. (2001)
  5. d No estimate of variability given - Dravniece et al. (2009) and Tahaoglu et al. (2001)
  6. e Number lost to follow up reported, but characteristics not described - Tahaoglu et al. (2001)
  7. f Length of follow up not described or adjusted for in analysis - Dravniece et al. (2009); Kim et al. (2007); Kwak et al. (2015); Kwon et al. (2008); Leimane et al. (2005); Mitnick et al. (2008); Shean et al. (2008); Tahaoglu et al. (2001); and Torun et al. (2007)
  8. g In surgical studies, it is not possible to blind patients or the study team. Outcome assessors could be blinded, which is somewhat important when cure is assessed by smear. However, laboratory assessment is generally conducted by personnel other than the diagnosing physician. For treatment success/failure there is a risk of reporting bias due to lack of blinding where data are programmatic, as outcomes may be over-reported because of programmatic targets and reporting could be biased by knowledge of surgical status
  9. h Moderate I-squared (54.2 %) and overlapping CIs between studies so not downgraded
  10. i Some variation in duration of follow up in the outcome definition; not downgraded, as on its own this is not classified as a serious issue for this outcome
  11. j All studies are cohort studies, so there may be some confounding due to patient allocation to surgery or no surgery. Patients who are more unwell may be more likely to be recommended for surgery (causing an underestimate of effect size); however, the sickest patients are often not offered surgery, as they may be too unwell or their disease too disseminated to allow it (causing an overestimate of effect size). In addition, there may be variation in the population offered surgery by setting or surgeon. As there is a specific window for surgery, these biases may affect the estimate of effect size, though it is unclear whether they would bias it in a particular direction, and they reflect the reality of the patient group offered surgery. Therefore, the reviewers decided not to upgrade or downgrade the rating
  12. k Reports number, but not summary statistics or precision for this specific outcome - Leimane et al. (2005) and Mitnick et al. (2008)
  13. l Abstract only, outcome and patient characteristics not clearly described - Dravniece et al. (2009)
  14. m Loss to follow up not reported for surgical vs non-surgical patients - Kim et al. (2007)
  15. n Low I-squared and overlapping CIs between studies, so not downgraded
  16. o Most studies followed WHO outcome definitions. Some variation in duration of follow up to assess outcome, but not downgraded, as on its own this is not classified as a serious issue for this outcome
  17. p Empty lower right quadrant of funnel plot. However, smaller (less precise) studies appear to report lower effect estimates, so if publication bias were to exist this would suggest the current effect estimate is conservative. Per protocol, studies with <10 surgical participants were excluded, therefore the very smallest of studies were not included. Plot is not sufficiently asymmetrical to raise serious concerns, and any bias would appear to cause an underestimate of effect, therefore quality not downgraded
  18. q n = 13 for OR estimates, but n = 11 for numbers of patients summarised in the table, as 2 studies only report the effect estimate rather than the number of patients with the outcome and the denominator
  19. r In surgical studies, it is not possible to blind patients or study team. Outcome assessors could be blinded, but unimportant in mortality outcome as no subjectivity in assessment
  20. s Time period of follow up very variable, and for patients with follow up for <2 years the follow up period is potentially insufficient for mortality outcome - Shean et al. (2008) and Torun et al. (2007)
  21. t Pooled CIs cross the null. Event rate is low and post-hoc optimal information size calculation indicated number included in assessment of this outcome is too low to give sufficient power
  22. u Variation between studies in outcome definition used (all-cause vs TB-only). Unclear/variable period over which death was assessed (e.g. died during treatment, within 6 months of completion, or after 2 years)
  23. v In surgical studies, it is not possible to blind patients or the study team. Outcome assessors could be blinded, but where data are programmatic they are unlikely to be. This could lead to under-reporting of default, but this bias is unlikely to vary between study groups
  24. w Mostly use WHO definition, minor variation in definition in some studies, but sufficiently direct not to downgrade
  25. x OR (similar to RR given infrequency of event) is <0.5 and the upper confidence limit would still provide a clinically significant benefit, therefore this would be considered a large effect size. However, the quality is not upgraded, as according to GRADE methodology this should not be done if the risk of bias is serious
  26. y N = 2 studies had no patients lost to follow-up in the surgery group, so 0.5 has been added to all cells in order that a CI can be calculated. The summary OR restricted to the 2 studies that had at least one patient lost to follow-up in each group is 0.47 (95 % CI 0.18, 1.24)
  27. z Although reported separately, unlikely that clear differentiation has been made between LTFU and transfer out
  28. aa Suspected underreporting of outcome, but uncertain as to how this would impact the conclusions
  29. ab No pooled estimate, so insufficient evidence to assess
  30. ac Only 2 publications, so not possible to assess publication bias, but given how few report this outcome publication bias may be plausible
  31. ad One study and no comparator group so not possible to estimate
  32. ae Likely that complications occurred in other studies, but have either not been reported or have been included in all-cause deaths
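
Footnote y describes adding 0.5 to all cells so that a confidence interval can be calculated when one group has no events. A minimal Python sketch of that continuity correction for a single 2x2 table follows, assuming a Woolf (log-scale) interval, which the source does not specify. The counts in the example are hypothetical and are not taken from any included study.

```python
import math


def odds_ratio_with_ci(events_trt, total_trt, events_ctl, total_ctl,
                       correction=0.5, z=1.96):
    """Odds ratio and approximate 95 % CI for one 2x2 table.

    `correction` is added to every cell (continuity correction), which
    keeps the log odds ratio and its standard error finite when a cell
    is zero. The Woolf interval used here is an assumption.
    """
    a = events_trt + correction               # treated, event
    b = total_trt - events_trt + correction   # treated, no event
    c = events_ctl + correction               # control, event
    d = total_ctl - events_ctl + correction   # control, no event
    estimate = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log_or)
    upper = math.exp(math.log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Hypothetical study: 0/40 lost to follow up with surgery, 10/100 without.
estimate, lower, upper = odds_ratio_with_ci(0, 40, 10, 100)
print(f"OR {estimate:.2f} (95 % CI {lower:.3f} to {upper:.2f})")
```

Without the correction the zero cell makes the standard error undefined. With it the interval is finite but wide, which is one reason the footnote also reports a summary OR restricted to studies with events in both groups.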