Identifying youth at high risk for sexually transmitted infections in community-based settings using a risk prediction tool: a validation study

Background Chlamydia trachomatis (CT) and Neisseria gonorrhoeae (NG) are the most common bacterial sexually transmitted infections (STIs) worldwide. In the absence of affordable point-of-care STI tests, WHO recommends STI testing based on risk factors. This study aimed to develop a prediction tool with a sensitivity of > 90% and efficiency (defined as the percentage of individuals that are eligible for diagnostic testing) of < 60%.
Methods This study offered CT/NG testing as part of a cluster-randomised trial of community-based delivery of sexual and reproductive health services to youth aged 16–24 years in Zimbabwe. All individuals accepting STI testing completed an STI risk factor questionnaire. The outcome was positivity for either CT or NG. Backwards-stepwise logistic regression was performed with p ≥ 0.05 as the criterion for exclusion. Coefficients of variables included in the final multivariable model were multiplied by 10 to generate weights for an STI risk prediction tool. A maximum likelihood Receiver Operating Characteristic (ROC) model was fitted, with the continuous variable score divided into 15 categories of equal size. Sensitivity, efficiency and number needed to screen were calculated for different cut-points.
Results From 3 December 2019 to 5 February 2020, 1007 individuals opted for STI testing, of whom 1003 (99.6%) completed the questionnaire. CT/NG prevalence was 17.5% (95% CI 15.1, 19.8) (n = 175). CT/NG positivity was independently associated with being female, number of lifetime sexual partners, relationship status, HIV status, self-assessed STI risk and past or current pregnancy. The STI risk prediction score including those variables ranged from 2 to 46 with an area under the ROC curve of 0.72 (95% CI 0.68, 0.76). Two cut-points were chosen: (i) 23 for optimised sensitivity (75.9%) and specificity (59.3%) and (ii) 19 to maximise sensitivity (82.4%) while keeping efficiency at < 60% (59.4%).
Conclusions The high prevalence of STIs among youth, even in those with no or one reported risk factor, may preclude the use of risk prediction tools for selective STI testing. At a cut-point of 19 one in six young people with STIs would be missed.

The WHO Global Health Sector Strategy on STIs 2016-2021 (Global Strategy) provides goals, targets, and priority actions for curtailing the STI epidemic [16]. A priority action for countries is the implementation and scale-up of services aimed at early diagnosis of STIs to ensure effective medical treatment and prevent further transmission. Early diagnosis of STIs is challenging, given that most STIs are asymptomatic, especially in women [17][18][19]. In the absence of affordable point-of-care tests for STIs, universal screening remains rare in resource-constrained settings. An approach promoted by WHO is to offer STI testing to asymptomatic individuals based upon risk factors or risk prediction tools [16].
Clinical prediction rules for STIs have been successfully developed for high-income settings to allow for a so-called "selective screening" approach [20,21]. This approach is aimed at minimising costs associated with testing low-risk individuals while detecting most infections. Thresholds of 60% efficiency (defined as percentage of individuals that are eligible for diagnostic testing based on predictive criteria) and 90% sensitivity have been proposed as ideal benchmarks for clinical prediction tools in the context of STIs [20,22,23].
Previous STI risk prediction tools administered by healthcare providers in Africa have been developed using an ad-hoc approach and were found to have a poor sensitivity; none have been developed specifically for youth [19,24]. We aimed to develop a clinical prediction tool for STIs specifically targeting youth in Zimbabwe.

Study design and setting
This study was nested within a cluster-randomised trial (CHIEDZA) of an integrated package of HIV and sexual and reproductive health (SRH) services for youth delivered in community-based settings in Zimbabwe (registered in ClinicalTrials.gov: NCT03719521). Individuals aged 16-24 years living within an intervention cluster are eligible to receive an integrated package of SRH services including HIV testing, HIV treatment and adherence support, contraception, pregnancy testing, syndromic management of STIs, menstrual health information and products, condoms and general health counselling. Individuals older than 24 years of age at a repeat visit who were less than 25 years at the first visit are also eligible to access the services. Testing for gonorrhoea and chlamydia was offered to all clients accessing CHIEDZA services over a limited period of time. Treatment of STIs and HIV is provided according to national guidelines. All services are offered free of charge.
The trial is being conducted in three provinces (Harare, Mashonaland East and Bulawayo), with each province containing eight geographically demarcated clusters randomised 1:1 to four intervention and four standard of care (routine, existing services) clusters. The intervention is delivered once weekly (on the same day each week) at a community centre in each intervention cluster by a team of nurses, community health workers, youth workers and a counsellor.
This sub-study assessing STI risk factors was conducted in eight intervention clusters in Harare and Mashonaland East.

STI testing
All individuals accessing CHIEDZA services were non-selectively offered testing for gonorrhoea and chlamydia if they had not tested within the 6 months prior to the visit, regardless of whether they had symptoms or risk factors for STIs. Those who accepted testing were asked to provide a urine sample which was tested using the GeneXpert platform (Cepheid, Sunnyvale, CA, USA) [9]. All individuals were given the option to pick up their result the following week, and individuals with a positive result were actively contacted by phone and asked to visit the centre. Positive test results were not disclosed over the phone, but only provided face to face. Partner notification (PN) slips were given to those who had a positive STI test result and all partners were offered treatment. Individuals who reported STI symptoms were treated according to national guidelines for syndromic management but were also offered testing for gonorrhoea and chlamydia [25].

Risk factors for STIs
All individuals accepting STI testing were approached by the study team and asked if they would like to participate in the risk factor sub-study. Those consenting were asked to answer a short questionnaire (11 items) regarding current relationship status, number of sexual partners, concurrent partners, condom use, use of contraception, previous or current pregnancies and STI risk perception. HIV status was obtained from the CHIEDZA dataset. The questions in the questionnaire were informed by studies developing risk prediction tools for chlamydia and gonorrhoea infection in high-income settings [20,21] and studies investigating risk factors for those infections in sub-Saharan Africa [13,17,19,24,26,27]. The questionnaire was administered by a research assistant not involved in STI testing or delivering other CHIEDZA services.

Data analysis
The outcome was positivity for either C. trachomatis or N. gonorrhoeae, combined as one variable. For variables that only applied to women (past pregnancy and use of hormonal contraception) males were coded 'no' for multivariable analysis. Univariable logistic regression was used to estimate the odds ratios (OR) and 95% confidence intervals (95% CI) for the association between STI infection and each of the 11 risk factors determined by the questionnaire and HIV status, age and sex. Variables that were associated with the outcome at 10% significance level in the univariable analysis were included in a multivariable model and then removed sequentially using backwards stepwise logistic regression until all remaining variables were associated with the outcome at < 5% significance. For ordinal variables p-values were calculated with a Wald test. Coefficients of variables included in the final multivariable model were multiplied by 10 to generate weights, and the weights were added for each individual to create an STI risk score. A maximum likelihood Receiver Operating Characteristic (ROC) model was fitted, with the continuous score variable divided into 15 categories of equal size. All possible cut-points of the risk prediction tool were evaluated for sensitivity, specificity, efficiency (proportion of the sample who screen positive) and number needed to test to obtain one positive result.
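The scoring and cut-point evaluation described above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the coefficient values, the factor names, the rounding of weights to integers, and the rule that a score at or above the cut-point counts as screen-positive are all assumptions made for the example. The published tool also contains multi-level variables (for example number of lifetime sexual partners), which are simplified here to yes/no indicators.

```python
import math


def weights_from_coefficients(coefficients):
    """Turn log-odds coefficients into integer weights (coefficient x 10).

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return {name: round(beta * 10) for name, beta in coefficients.items()}


def risk_score(weights, factors):
    """Sum the weights of every risk factor that is present for one person."""
    return sum(w for name, w in weights.items() if factors.get(name, False))


def evaluate_cut_point(scores, outcomes, cut):
    """Evaluate one cut-point; a score >= cut is treated as screen-positive."""
    tp = fp = fn = tn = 0
    for score, infected in zip(scores, outcomes):
        screened = score >= cut
        if screened and infected:
            tp += 1
        elif screened and not infected:
            fp += 1
        elif not screened and infected:
            fn += 1
        else:
            tn += 1
    n = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else math.nan,
        "specificity": tn / (tn + fp) if (tn + fp) else math.nan,
        # Efficiency: proportion of the whole sample that screens positive.
        "efficiency": (tp + fp) / n if n else math.nan,
        # Number needed to screen to find one infection.
        "number_needed_to_screen": (tp + fp) / tp if tp else math.inf,
    }


# Hypothetical coefficients, for illustration only (not taken from Table 1).
example_coefficients = {"female": 0.52, "hiv_positive": 0.91}
example_weights = weights_from_coefficients(example_coefficients)

# Toy scores and outcomes (1 = CT/NG positive), not study data.
toy_scores = [2, 10, 19, 22, 30, 41]
toy_outcomes = [0, 0, 1, 0, 1, 1]
toy_result = evaluate_cut_point(toy_scores, toy_outcomes, cut=19)
```

Looping `evaluate_cut_point` over every observed score value gives the full table of sensitivity, specificity, efficiency and number needed to screen from which a cut-point can be chosen.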
From a pilot study, the prevalence of the outcome was estimated at 17% [28]. A sample size of 1000 gave 80% power to detect a risk ratio of 1.7 for a risk factor with 10% prevalence, and 80% power to detect a risk ratio of 2.0 for a risk factor with 5% prevalence.

Results
Between 3 December 2019 and 5 February 2020, 1007 individuals opted for STI testing, of whom 1003 gave consent for data on risk factors to be collected. The majority of these were female (78.7%, n = 789) and aged 20-24 years (58.2%, n = 584), similar to the demographic profile of those accessing CHIEDZA services. HIV status was known for 957 participants, among whom HIV prevalence was 5.2% (n = 50). Of these, 36 (72.0%) were previously diagnosed and taking antiretroviral therapy (ART), and 14 (28.0%) were newly diagnosed through CHIEDZA.
In univariable analysis, older age, being female, being in a relationship or widowed/divorced, number of lifetime sexual partners and number of sexual partners in the past three months, having had a new sexual partner in the past three months, history of STI treatment, history of STI treatment of the partner, occasional condom use, perceived high STI risk, positive HIV status and past or current pregnancy were all associated with having an STI (p < 0.1; Table 1). The multivariable analysis including all these variables showed an independent significant association between STI infection and being female, relationship status, number of lifetime sexual partners, HIV status, perceived STI risk and past or current pregnancy. These variables were included in the final multivariable model. Odds ratios for associations between risk factors and having an STI ranged between 1.23 and 3.98 (Table 1). The STI risk prediction tool generated from the final model is shown in Table 2. The score ranged from 2 for single participants with no other risk factors to 46 for those with all six risk factors (Fig. 1). For example, an HIV negative woman with a boyfriend who had 1 lifetime sexual partner, perceived herself at low risk and had never been pregnant would score 10 + 5 + 7 = 22. The maximum possible score for males was 41. A score of 0 would only be possible for a married male with no lifetime sexual partners, which did not occur in the dataset. The sensitivity, specificity and efficiency of all possible cut-points are shown in Fig. 2, and the ROC curve with 95% CI is shown in Fig. 3. The area under the curve was 0.72. Two cut-points for the risk tool were chosen: 23 for optimised sensitivity (75.9%) and specificity (59.3%), and 19 to meet the benchmark of maximising sensitivity (82.4%) while keeping efficiency (59.4%) at less than 60% (Table 3). The number needed to screen to diagnose one STI was 3.5 for a cut-point of 23 and 4.1 for a cut-point of 19.

Discussion
We found a high prevalence of C. trachomatis and/or N. gonorrhoeae infection among young people attending a community-based SRH service in urban and peri-urban Zimbabwe. Notably, only 2.1% (21/1003) of participants were found to be positive for an STI syndrome and treated accordingly. The prevalence of C. trachomatis infection was almost three times higher than the WHO estimates for the African region, but comparable with studies conducted among young women in South Africa [9,26,29]. The prevalence of N. gonorrhoeae was similar to the recent estimates from the Spectrum-STI model for Zimbabwe of 3.8% (95% CI 1.8-6.7%) [30].
With non-selective testing the number needed to be tested to diagnose one STI was 5.7. This decreased to 4.1 using an STI risk prediction tool cut-point of 19 and to 3.5 using a cut-point of 23. While the STI risk prediction tool with a cut-point of 19 had the desired efficiency of < 60%, the sensitivity was suboptimal at 82%. This is because the prevalence of STIs even among clients with the lowest possible STI risk in this population was relatively high. For example, STI prevalence was 5.9% among those who reported that they had never had sex and 13.9% among those who only had one lifetime sexual partner. The high STI prevalence among young people without reported risk factors in this population suggests that non-selective testing may be more appropriate than applying a risk prediction tool.
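The numbers needed to test quoted above can be reproduced approximately from the summary figures reported in this paper (1003 participants, 175 infections, and the sensitivity, specificity and efficiency at each cut-point). The sketch below is a back-of-envelope check under the assumption that the rounded percentages are applied to those totals. The function names are illustrative and this is not the study's analysis code.

```python
N_TOTAL = 1003      # participants with questionnaire data (from the Results)
N_POSITIVE = 175    # CT and/or NG positive (from the abstract)


def nns_from_efficiency(sensitivity, efficiency,
                        n_total=N_TOTAL, n_positive=N_POSITIVE):
    """Number needed to screen = people tested / infections detected."""
    tested = efficiency * n_total
    detected = sensitivity * n_positive
    return tested / detected


def efficiency_from_specificity(sensitivity, specificity,
                                n_total=N_TOTAL, n_positive=N_POSITIVE):
    """Proportion screening positive, derived from sensitivity and specificity."""
    n_negative = n_total - n_positive
    true_pos = sensitivity * n_positive
    false_pos = (1 - specificity) * n_negative
    return (true_pos + false_pos) / n_total


# Non-selective testing: everyone is tested and every infection is found.
nns_all = nns_from_efficiency(1.0, 1.0)                    # about 5.7

# Cut-point 19: sensitivity 82.4%, efficiency 59.4%.
nns_19 = nns_from_efficiency(0.824, 0.594)                 # about 4.1

# Cut-point 23: sensitivity 75.9%, specificity 59.3% (efficiency derived).
eff_23 = efficiency_from_specificity(0.759, 0.593)
nns_23 = nns_from_efficiency(0.759, eff_23)                # about 3.5

# Share of infections missed at cut-point 19, roughly one in six.
missed_19 = 1 - 0.824
```

The same arithmetic shows the trade-off: moving from non-selective testing to a cut-point of 19 saves about 1.6 tests per infection found, at the cost of missing roughly 18% of infections.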
Many recently-published studies show high prevalence of STIs in general African populations but none has developed a risk prediction tool [13,17,26,27,29,31]. To our knowledge, this is the first study globally that has attempted to develop a risk prediction tool in youth, who are a high-risk group for these STIs. A recent Kenyan study among men who have sex with men developed a risk tool for anorectal C. trachomatis and/or N. gonorrhoeae infection reaching a sensitivity of 86% at an efficiency of 61% [32]. The proposed risk tool came close to the ideal benchmarks for clinical prediction rule performance for STIs of > 90% sensitivity and < 60% efficiency [20,22,23]. The only study examining the performance of STI risk prediction tools among African women was conducted in 1994 in Tanzania and reported sensitivities between 10 and 29% which are inadequate [24].
A recent study enrolling Rwandan women in 2016-2017 compared the performance of a new diagnostic algorithm ('WISH' algorithm) with reference standard diagnostic testing [19]. The WISH algorithm was predefined at the start of the study; women were considered positive according to the algorithm if they met one or more of the following criteria: currently pregnant, exchanged sex for money or goods in the past 12 months, new sexual partner in the past 3 months, or vaginal discharge with an offensive smell or pelvic inflammatory disease observed by a physician. The prevalence of C. trachomatis and/or N. gonorrhoeae was 14% using vaginal swabs investigated by GeneXpert, and the sensitivity and efficiency of the WISH algorithm were 75% and 56%, respectively. While the population in the WISH study is not truly comparable to our study, low sensitivities of both the WISH algorithm and our risk prediction tool would result in missing one in four individuals with STIs using the WISH algorithm and one in six using our risk prediction tool.
A previous study conducted in the same population in Zimbabwe found that only 0.5% of youth were treated for an STI syndrome and less than 5% reported symptoms when asked specifically [28]. Given the low prevalence of reported symptoms, questions about symptoms were not included in the questionnaire. However, the questionnaire used in this study included a broad range of possible risk factors and was informed by studies investigating risk factors for STIs in Africa and STI risk prediction tools developed for high-income settings [13, 17, 19-21, 24, 26, 27]. Questions about sexual behaviour are often subject to social desirability bias resulting in under- or mis-reporting [33][34][35]. This may limit the sensitivity of risk prediction tools based on questions about sexual behaviour. While computer-assisted survey instruments rather than self- or interviewer-administered questionnaires may reduce social desirability bias, these may be more difficult to implement in low-resource settings [33][34][35]. Unsurprisingly, STI risk factors in this study were closely associated with each other and with the presence of C. trachomatis and/or N. gonorrhoeae infection in univariable analysis. Due to the collinearity of risk factors, only six independent risk factors were included in the multivariable analysis. The model that was developed was applicable to both sexes even though one risk factor, pregnancy, only applied to women. A sensitivity analysis restricted to women resulted in the same five risk factors for predicting STIs, which is reassuring regarding the robustness of the model. However, we could not conduct a sensitivity analysis for men only given the limited number of men in the study (n = 214). This is a limitation of our study, but reflects the reality that men are less likely to access services [36,37].
The strengths of this study include the large sample size and high participation rate among those opting for STI testing. In addition, the study was embedded within a population-based SRH service offered in eight communities in two provinces and accessed by youth without pre-selection on the basis of STI risk, which makes the findings generalisable to similar settings. We focused on two highly prevalent STIs that cause significant morbidity and mortality. However, we did not include Trichomonas vaginalis, which is highly prevalent, especially in women, in sub-Saharan Africa [13,26,38]. Also, for logistical reasons we used urine instead of vaginal swabs. This may have resulted in some STIs being missed.
Importantly, we decided against a test and train approach, which is usually used when developing risk prediction tools [39]. This is because our analysis demonstrated that even the best-performing version of the risk prediction tool, with a cut-point of 19, had unacceptable sensitivity.
In view of the high prevalence of STIs among youth in sub-Saharan Africa, non-selective diagnostic testing as opposed to selective testing following the application of a risk prediction tool seems the most promising and appropriate approach. This approach will require true point-of-care STI diagnostics, which are currently available only for Trichomonas vaginalis and syphilis [40,41]. While the GeneXpert platform for C. trachomatis and N. gonorrhoeae testing does not require expert skills, is easy to use and widely available in sub-Saharan Africa, the time to result (90 min) is prohibitive for the test to be used as a true point-of-care test [42]. However, other molecular tests for chlamydia and gonorrhoea with rapid time to results (30 min) have been successfully trialled in high-income settings [43]. Universal point-of-care STI testing and immediate single dose treatment for those testing positive should be considered as a strategy for curbing the STI epidemic in youth.

All authors contributed to writing and reviewing of the manuscript. All authors read and approved the final manuscript.

Funding
The study was funded by RAF's Wellcome Trust Senior Fellowship 206316_Z_17_Z. IDO received funding through the Wellcome Trust Clinical PhD Programme awarded to the London School of Hygiene and Tropical Medicine (Grant Number 203905/Z/16/Z). The funder had no role in the study design, data collection, data analysis, data interpretation, or writing of the report. The corresponding author had full access to all the data collected and had final responsibility for the decision to submit for publication.

Availability of data and materials
Individual, anonymised participant data and a data dictionary will be available through the London School of Hygiene and Tropical Medicine repository (Data Compass) 12 months after publication of results. The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.