Introduction

Sexually transmitted infections (STIs) are a major global health concern and cause both symptomatic and asymptomatic infections1,2. Most STIs caused by bacteria and parasites are curable if diagnosed and treated appropriately, but all viral STIs are incurable3,4. The largest portion of STIs occurs in females across the globe2,5. In females, undiagnosed and untreated STIs can result in reproductive health complications that include infertility, stillbirths, cancer development and increased susceptibility to HIV1,2,4,6,7. Considering this, our study focused on STIs in females.

Governments across the globe, particularly in low- and middle-income countries (LMICs), continue to use syndromic management of STIs because of scarce resources and restricted access to diagnostic laboratories8,9. This approach relies on reported signs and symptoms and on physical examination for diagnosis, after which treatment is issued for the most common STIs8,10. As a result, it deters infected individuals who fear invasive genital examinations and the stigmatization associated with STIs11. Additionally, this approach cannot address asymptomatic infections because these individuals may not seek care8,12. Asymptomatic individuals therefore continue to spread infection and remain susceptible to long-term STI complications. Syndromic management also promotes over-diagnosis and over-treatment because treatment targets the most common STI causative pathogens rather than a specific pathogen13,14. Accordingly, Murewanhena et al.14 suggest a shift from syndromic management of STIs to pathogen-specific diagnosis and treatment. The development of innovative alternative interventions, such as self-sampling, is key to improving STI healthcare service provision15,16,17. Self-sampling enables individuals to self-collect specimens for STI diagnosis either at home or at healthcare facilities, providing convenience and accessibility in testing18. This intervention can be used to screen for asymptomatic infections11 and to screen for infections in remote areas where access to quality healthcare is limited19,20. Based on this, self-sampling can address the challenges linked to the syndromic management of STIs19,21. However, self-sampling may jeopardise specimen quality since the collection is performed by inexperienced individuals.

Since the potential of self-sampling interventions for STI diagnosis is evident, it is imperative to determine their diagnostic accuracy and reliability. A scoping review conducted by Jaya et al.22 in 2021 presented evidence that supports self-sampling interventions as appropriate alternatives to physician-collected specimens for STI diagnosis. A meta-analysis conducted in 2005 showed that self-collected swabs were suitable alternatives to clinician-collected specimens for the diagnosis of human papillomavirus (HPV)23. A systematic review and meta-analysis conducted in 2015 on Neisseria gonorrhoeae (NG) and Chlamydia trachomatis (CT) also reported that self-collected specimens were reliable for diagnostic testing15. Considering the potential impact of self-sampling on sexual and reproductive healthcare, there is a need for an updated systematic review and meta-analysis to foster improvements in clinical decision-making pertaining to sexual and reproductive healthcare provision. As such, the current study is an updated systematic review and meta-analysis on the accuracy of self-collected specimens compared to healthcare worker-collected specimens for STI diagnosis. It evaluates the diagnostic accuracy of self-sampling for STI diagnosis in studies conducted from 2015 onwards, because a previous systematic review of a similar nature included studies only up to 2015. The overarching aim of an updated systematic review is to ensure that the best evidence is provided to inform clinical decision-making and healthcare policy development for STI healthcare.

Methods

Protocol and registration

The protocol for this study was submitted to the International Prospective Register of Systematic Reviews (PROSPERO), with the registration number CRD42022341462. This study was guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)24.

Eligibility criteria

The Population, Intervention, Comparison, and Outcome (PICO) framework was followed to determine the eligibility of the research question. Studies were included if they: (a) assessed the accuracy of self-collected specimens against healthcare worker-collected specimens for STI diagnosis in women; (b) used healthcare worker-collected specimens as the reference or gold standard; (c) had a study population comprising specimens that had been tested for STIs including HPV, NG, CT, Treponema pallidum (syphilis), Trichomonas vaginalis (TV), and Mycoplasma genitalium (MG); (d) examined self-collected versus clinician-collected samples using different diagnostic assays, including nucleic-acid-based assays and manual methods such as wet mount, culture, and Gram stain; and (e) were peer-reviewed studies published from 2015 onwards. For studies that included people of other genders, only data on females were extracted. No language restrictions were applied, and studies with different study designs were included. Studies were excluded if: (a) the interval between self-sampling and healthcare worker specimen collection exceeded three weeks, owing to the window period for seroconversion; (b) they presented information on combined specimen results; (c) self-sampling was not conducted in females; or (d) the self-collected and healthcare worker-collected specimens were collected from different individuals.

Index test

The diagnostic accuracy of self-collected specimens to diagnose STIs was evaluated against healthcare worker-collected specimens. Self-collected specimens for STI diagnosis included vaginal swabs, urine, cervical swabs and tampons. The sensitivity and specificity of each diagnostic assay for each STI were evaluated.

Reference standard

Healthcare worker-collected specimens for the diagnosis of STIs were used as the gold reference standard in this study.

Search strategy

A systematic search of data was conducted in the Cochrane, Medline, Scopus, Web of Science, and PubMed electronic databases (see Table 1). The search was limited to studies from 2015 onwards. The Principal Investigator (PI) developed the search strategy with an experienced librarian at the University of Pretoria. Medical Subject Headings (MeSH) terms were used to define our searches with Boolean operators (AND/OR) between search terms. The search terms used included but were not limited to (1) “self-sampling” or “self-collected” or “self-administered” or “self-obtained” (2) “sexually transmitted infections” (3) “diagnostic specimens” or “diagnostic samples” (4) “women” or “females”. A hand search for grey literature was also conducted on the WHO website, the Department of Health South Africa (DoH SA) website, and the Open Grey website.

Table 1 Database search.

Study selection

Studies published between 2015 and 2022 were screened for inclusion in the systematic review and meta-analysis. Because this systematic review stems from the findings of a scoping review conducted in 2021, studies that had been screened for the scoping review from 2015 to 2021 were re-screened using the eligibility criteria for the systematic review. To ensure the inclusion of studies conducted in 2022, the assisting librarian conducted a new search for studies that were published in 2022. An EndNote library was then created for all studies that were eligible for full-text screening. Thereafter, ZNJ and TD performed full-text screening of all studies that fulfilled the eligibility criteria of the systematic review and meta-analysis. NT resolved discrepancies that arose during full-text screening by ZNJ and TD. ZNJ and NT then extracted data from the studies found eligible for inclusion at the full-text screening stage, and any disagreements were resolved by discussion until an agreement was reached. Study selection for the systematic review was guided by the PRISMA flowchart.

Data extraction

ZNJ and NT independently extracted data from eligible studies using a data extraction tool that was designed to extract data from the included primary studies. The tool was piloted using 10% of the included studies and amended accordingly before final use. The extracted data was divided into two sections: one for basic qualitative information and another for the quantitative outcomes of interest. Basic information extracted included author name(s) and year of publication, study title, study aims, study population, study design, sample size, eligibility criteria, reference standard specimen, type of self-collected specimen, type of laboratory assay, main findings, and conclusions. Data extracted for the section on the outcomes of primary studies included true positive, true negative, false positive, and false negative results, sensitivity and specificity, positive predictive value, negative predictive value, and evidence of agreement or concordance between self-collected and healthcare worker-collected specimens. In some instances, the true negative, true positive, false positive and false negative results were not available, and the relevant data was requested from the authors. A 2 × 2 table was produced based on the collected data. Any discrepancies that arose between the reviewers were discussed until a unanimous resolution was reached.
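As an illustration of how the extracted 2 × 2 counts relate to the accuracy measures listed above, the following minimal Python sketch derives sensitivity, specificity, predictive values and overall agreement from a single table. The class name `TwoByTwo` and the example counts are hypothetical and are not taken from any included study; the sketch uses the standard definitions only and is not the extraction tool used in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """2 x 2 counts for the index test (self-collected specimen)
    against the reference standard (healthcare worker-collected specimen)."""
    tp: int  # index positive, reference positive
    fp: int  # index positive, reference negative
    fn: int  # index negative, reference positive
    tn: int  # index negative, reference negative

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def agreement(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total


# Hypothetical counts, for illustration only (not data from this review).
example = TwoByTwo(tp=45, fp=5, fn=5, tn=145)
print(f"sensitivity = {example.sensitivity():.3f}")
print(f"specificity = {example.specificity():.3f}")
print(f"PPV = {example.ppv():.3f}, NPV = {example.npv():.3f}")
print(f"overall agreement = {example.agreement():.3f}")
```

When a primary study reports only sensitivity and specificity, the four cells cannot be recovered without the number of reference-positive and reference-negative participants, which is why missing counts had to be requested from authors.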

Assessment of methodological quality

The Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool for primary diagnostic accuracy studies was utilised to assess the quality of all the included studies25. This tool consists of four main domains, namely patient selection, index test, reference standard, and flow and timing25, which were adapted to the current study accordingly. To determine the risk of bias, signalling questions answered as “yes”, “no” or “unclear” were used in each phase25.

Statistical analysis and data synthesis

For included studies in which sensitivity and specificity had been assessed and reported, a meta-analysis of diagnostic accuracy was performed. Review Manager (RevMan) software was used to conduct the statistical analysis and to calculate the pooled sensitivity, specificity, and diagnostic odds ratio (DOR), each with a 95% confidence interval. Cochran’s Q statistic was used to assess heterogeneity among the included primary studies. A p-value of < 0.05 was taken to indicate statistical significance in all analyses.
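The quantities named above can be sketched in a few lines of Python. The example below computes a diagnostic odds ratio with a 95% confidence interval on the log scale, and Cochran's Q (with the derived I² statistic) across studies using inverse-variance weights on the log odds ratio. This is a minimal sketch under stated assumptions: the function names, the 0.5 continuity correction for zero cells, the fixed-effect weighting and the example counts are all illustrative choices, and the sketch does not claim to reproduce the internal calculations of RevMan.

```python
import math
from typing import List, Sequence, Tuple

Counts = Tuple[int, int, int, int]  # (tp, fp, fn, tn)


def log_dor(counts: Counts) -> Tuple[float, float]:
    """Return (ln DOR, variance of ln DOR) for one study."""
    tp, fp, fn, tn = (float(c) for c in counts)
    if min(tp, fp, fn, tn) == 0:
        # Assumption: add 0.5 to every cell when any cell is zero.
        tp, fp, fn, tn = tp + 0.5, fp + 0.5, fn + 0.5, tn + 0.5
    estimate = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return estimate, variance


def dor_with_ci(counts: Counts, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (DOR, lower, upper) using a normal approximation on ln DOR."""
    estimate, variance = log_dor(counts)
    half_width = z * math.sqrt(variance)
    return (math.exp(estimate),
            math.exp(estimate - half_width),
            math.exp(estimate + half_width))


def cochran_q(studies: Sequence[Counts]) -> Tuple[float, int, float]:
    """Return (Q, degrees of freedom, I-squared in percent) on the ln DOR scale."""
    pairs: List[Tuple[float, float]] = [log_dor(s) for s in studies]
    weights = [1 / variance for _, variance in pairs]
    pooled = sum(w * e for w, (e, _) in zip(weights, pairs)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, (e, _) in zip(weights, pairs))
    df = len(studies) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i_squared


# Hypothetical studies, for illustration only (not data from this review).
studies = [(45, 5, 5, 145), (20, 20, 30, 130), (60, 10, 15, 215)]
for s in studies:
    dor, lower, upper = dor_with_ci(s)
    print(f"DOR = {dor:.1f} (95% CI {lower:.1f} to {upper:.1f})")
q, df, i2 = cochran_q(studies)
print(f"Cochran's Q = {q:.2f} on {df} df, I-squared = {i2:.0f}%")
```

Under the null hypothesis of homogeneity, Q is compared against a chi-squared distribution with k − 1 degrees of freedom, where k is the number of studies; a p-value below 0.05 would then be read as evidence of heterogeneity.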

Ethical approval

Ethical clearance for the study was obtained from the University of Pretoria’s Faculty of Health Sciences Research Ethics Committee. The reference number is 136/2022. Participant consent was not applicable.

Results

Study selection and characteristics of included studies

Sixteen studies conducted in 2015, which were retrieved during a database search for the scoping review, underwent title screening using the relevant eligibility criteria for the systematic review. The new database search conducted by the librarian to ensure the inclusion of studies up to August 2022 retrieved forty-eight results. Nine were duplicates, which left thirty-nine eligible for title screening. Abstract screening was then conducted on fifty-five studies (thirty-nine plus sixteen studies). After abstract screening, thirty-seven studies were excluded and eighteen studies were eligible for full-text screening. Reasons for exclusion were studies presenting data on pooled specimens, studies not presenting data on self-collected and healthcare worker-collected specimens, and studies not about self-sampling for STIs. After full-text screening, fourteen studies were eligible for inclusion in the systematic review. Four studies were excluded because they were conducted before 2015, were not about self-sampling, were not about STIs, or presented data on pooled specimens. Ultimately, data extraction was conducted on a total of fourteen studies (see Fig. 1 below). There was moderate agreement between the reviewers at full-text screening (kappa = 0.5).
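For readers unfamiliar with the kappa statistic reported above, the following minimal Python sketch shows how Cohen's kappa is obtained from two reviewers' include/exclude decisions. The function name and the counts are hypothetical: the counts were chosen only so that the result equals 0.5 and are not the reviewers' actual screening decisions, which are not reported here.

```python
def cohens_kappa(both_include: int, only_first: int,
                 only_second: int, both_exclude: int) -> float:
    """Cohen's kappa for two raters making a binary include/exclude decision."""
    n = both_include + only_first + only_second + both_exclude
    observed = (both_include + both_exclude) / n
    # Marginal proportion of "include" decisions for each reviewer.
    first_include = (both_include + only_first) / n
    second_include = (both_include + only_second) / n
    # Agreement expected by chance from the marginal proportions.
    expected = (first_include * second_include
                + (1 - first_include) * (1 - second_include))
    return (observed - expected) / (1 - expected)


# Hypothetical counts, for illustration only (not the reviewers' decisions).
kappa = cohens_kappa(both_include=6, only_first=2, only_second=2, both_exclude=6)
print(f"kappa = {kappa:.2f}")
```

On the commonly used Landis and Koch scale, values between 0.41 and 0.60 are described as moderate agreement, which matches the wording used for the value reported in this review.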

Figure 1

PRISMA flow diagram of the selection process of relevant studies.

Characteristics of included studies

The characteristics of the included studies are depicted in Table 2. Fourteen studies were included in the systematic review, but not all of them were included in the meta-analysis. Five studies were from the United States of America (USA)26,27,28,29,30, and one study each was from Canada31, Haiti32, France33, Saudi Arabia34, India35, the Republic of Korea36, Kenya37, Chad19, and Ghana38. See Table 3 for the quantitative characteristics of the included studies. Some of the sensitivity and specificity measurements were obtained from the articles as calculated by the authors. Where these measurements were not available, the researchers calculated them using data available in the manuscripts and original data obtained from the authors of some of the included studies. For studies where this information was not available at all, it was not reported.

Table 2 Characteristics of included studies.
Table 3 Quantitative characteristics of included studies.

The characteristics of the included studies were further divided into sub-groups for meta-analysis for each STI as outlined in the following sections:

Chlamydia

A total of six studies compared the diagnostic accuracy of self-collected specimens to healthcare worker-collected specimens in females19,27,30,31,33,37. Five of the studies were conducted in a clinic19,27,31,33,37, and the study location was not reported for one of the studies30. Of these six studies, three compared healthcare worker-collected vaginal swabs to self-collected vaginal swabs30,31,33. In two of the studies, healthcare worker-collected cervical swabs were compared to self-collected cervicovaginal swabs27,37. Only one study compared healthcare worker-collected endocervical swabs to self-collected veil specimens19. STI testing was performed using automated NAAT-based assays. All six studies were cross-sectional studies. In five of the studies, research participants had received instructions on how to self-collect specimens for testing19,27,31,33,37, and in one study the research participants did not receive any instructions30. The number of research participants in the studies ranged from 189 to 3860. Only four of the six studies were included in the subgroup meta-analysis19,30,31,33. Of the two excluded studies, one was excluded because only agreement data was reported and the other parameters were not reported37. Similarly, the other study only reported sensitivity and specificity data27. Figure 2 presents the findings of the subgroup analysis of the four studies, where the summary estimate for sensitivity was 0.85 (95% Confidence Interval 0.77–0.92), while specificity was 0.95 (95% Confidence Interval 0.91–0.98). The SROC plot (Fig. 3) depicts the pooled sensitivity and specificity of the studies.

Figure 2

Forest plot of chlamydia studies that compared self-collected vaginal swabs with healthcare worker collected cervical and vaginal specimens.

Figure 3

SROC depicting diagnostic accuracy of included studies for chlamydia.

The pooled results were statistically significant, but there was moderate evidence of heterogeneity among the studies. The diagnostic tests have a good discriminatory ability to differentiate between individuals with and without chlamydia (Table 4).

Table 4 Heterogeneity and statistical significance for CT.

Human papillomavirus

Five studies compared the diagnostic accuracy of healthcare worker-collected specimens with self-collected specimens to diagnose HPV27,28,32,36,38. Three of the studies compared healthcare worker-collected cervical swabs to self-collected vaginal swabs32,36,38, while one study compared healthcare worker-collected cervical swabs with self-collected tampons and vaginal swabs28, and another study compared healthcare worker-collected cervical swabs with self-collected cervicovaginal swabs27. All the studies were conducted in a research clinic. The sample size of the studies ranged from 151 to 1836 study participants. Study participants received instructions on how to self-collect their specimens for STI diagnosis prior to specimen collection. NAAT-based diagnostic assays were used in all the studies. Four of the studies were cross-sectional studies28,32,36, and only one was a clinical trial27. In one study, the sensitivity and specificity of self-collected specimens were 100 and 88.9% respectively, while the sensitivity and specificity of healthcare worker-collected specimens were 100 and 90% respectively27. In another study, the sensitivity and specificity of self-collected specimens compared to healthcare worker-collected specimens were 92.6 and 95.9% respectively38. One study reported the sensitivity of self-collected specimens as 100%36. Another study reported the sensitivity and specificity of the self-collected swab as 86 and 94% respectively, and of the self-collected tampon as 77 and 100% respectively28. Another study reported the sensitivity of self-collected specimens as 89.1% and the sensitivity of healthcare worker-collected specimens as 87.9%32. However, a sub-group meta-analysis was not performed because the relevant data for TN, FN, TP and FP was not available.

Mycoplasma genitalium

Out of the four studies that investigated MG infection, two studies compared self-collected cervicovaginal swabs with healthcare worker collected cervical swabs27,37; one study compared healthcare worker collected vaginal and cervical swabs with self-collected vaginal swabs33, and another one compared healthcare worker collected endocervical swabs with self-collected veil specimens19. Diagnostic testing was performed using NAAT based assays in all the studies. All the studies were conducted in clinics. In all the studies, research participants received instructions on how to self-collect specimens before collecting their own specimens. The sample size ranged from 193 to 1028 participants. All studies were cross-sectional. Only two of the included studies had sufficient data for a meta-analysis for this subgroup19,33. Figure 4 presents the analysis of the two studies where the summary estimate for sensitivity was 0.49 (95% Confidence Interval 0.39–0.58) and for specificity was 0.88 (95% Confidence Interval 0.81–0.94).

Figure 4

Forest plot of MG studies that compared self-collected vaginal swabs with healthcare worker collected cervical and vaginal specimens.

Presented below in Fig. 5 is the SROC plot depicting the diagnostic accuracy of the studies in this subgroup.

Figure 5

SROC depicting diagnostic accuracy of MG in included studies.

The sub-group meta-analysis suggests that the accuracy of the diagnostic test may vary across studies, with poor sensitivity in one study and poor specificity in the other. However, overall, the test shows a moderate to high diagnostic accuracy, as indicated by the high DOR value (Table 5).

Table 5 Heterogeneity and statistical significance for MG.

Gonorrhoea

Seven studies investigated the diagnostic accuracy of self-collected specimens in comparison to healthcare worker-collected specimens in diagnosing NG. Six of these studies were cross-sectional19,26,30,31,33,37, and only one was a clinical trial27. The sample size of the studies ranged from 89 to 3860. Laboratory diagnosis was performed using automated NAAT-based assays in all the studies, and one of the studies also used manual diagnostic methods26. Six studies reported that specimen collection had occurred at research clinics19,26,27,31,33,37, and one study did not indicate the collection site30. In six of the studies, research participants received instructions before specimen collection19,26,27,31,33,37, but in one study there was no report about whether research participants had been instructed how to self-collect their specimen30. Two studies compared diagnostic accuracy in healthcare worker-collected vaginal swabs to self-collected vaginal swabs30,31. One study compared self-collected vaginal swabs to cervical and vaginal swabs collected by healthcare workers33. Two studies compared diagnostic accuracy in self-collected cervicovaginal swabs and healthcare worker-collected cervical swabs27,37. One study compared diagnostic accuracy between healthcare worker-collected endocervical swabs and self-collected vaginal swabs26. Another study compared diagnostic accuracy in self-collected veil specimens with healthcare worker-collected endocervical swabs19. Figure 6 below presents summary estimates for the sensitivity and specificity of self-collected specimens compared to healthcare worker-collected specimens. The summary estimates for sensitivity and specificity were 0.59 (95% Confidence Interval 0.49–0.68) and 0.84 (95% Confidence Interval 0.76–0.91), respectively.

Figure 6

Forest plot of gonorrhoea studies that compared self-collected vaginal swabs with healthcare worker collected cervical and vaginal specimens.

Presented below in Fig. 7 is the SROC plot depicting the diagnostic accuracy of the studies in this subgroup.

Figure 7

SROC depicting diagnostic accuracy of NG in included studies.

The Cochran's Q test shows significant heterogeneity among the studies (Q = 17.156). The diagnostic odds ratio of 2.579 suggests that the overall accuracy of the diagnostic test is low to moderate. The p-value indicates statistical significance (Table 6).

Table 6 Heterogeneity and statistical significance for NG.

Trichomoniasis

Seven studies investigated the diagnostic accuracy of self-collected specimens in comparison to healthcare worker-collected specimens in diagnosing trichomoniasis. Six of the studies were cross-sectional19,29,33,34,35,37, and one study was a clinical trial27. Four studies utilised automated NAAT-based assays19,27,33,37, one study used manual testing methods35, while two studies used both automated NAAT assays and manual methods for TV diagnosis29,34. Study participants in all the studies collected their specimens at the research clinics. In six of the studies the participants received instructions on how to self-collect specimens before collecting their specimens19,27,33,34,35,37, and in one study this was not reported29. The sample size of research participants ranged from 174 to 1867. One study compared the diagnostic accuracy of healthcare worker-collected vaginal and cervical swabs with self-collected swabs33. Two studies compared healthcare worker-collected cervical swabs with self-collected vaginal swabs29,37. One study compared endocervical swabs collected by healthcare workers with self-collected veil specimens19. Two studies compared diagnostic accuracy between self-collected vaginal swabs and healthcare worker-collected vaginal swabs34,35. Only one study compared healthcare worker-collected endocervical swabs with self-collected vaginal swabs19. Figure 8 below presents summary estimates for the sensitivity and specificity of self-collected specimens compared to healthcare worker-collected specimens.

Figure 8

Forest plot of TV studies that compared self-collected vaginal swabs with healthcare worker collected cervical and vaginal specimens.

The summary estimates for sensitivity and specificity were 0.94 (95% Confidence Interval 0.89–0.98) and 0.91 (95% Confidence Interval 0.85–0.96), respectively. They are depicted on the SROC plot in Fig. 9 below, together with the diagnostic accuracy of the individual studies in this subgroup.

Figure 9

SROC depicting diagnostic accuracy of TV in included studies.

The Cochran's Q test result shows that there is significant heterogeneity among the studies and the diagnostic test is moderately accurate in identifying patients with disease (Table 7).

Table 7 Heterogeneity and statistical significance for TV.

Methodological quality of studies

Table 8 below depicts the risk of bias and applicability assessment of the included studies using the QUADAS-2 tool25. The domains of the QUADAS-2 tool are patient selection, index test, reference standard, and flow and timing. Patient selection outlines the process of selecting study participants in the primary studies, which includes setting, presentation, prior testing, and intended use of the index test; index test describes how the test of interest was conducted and interpreted; reference standard describes how the standard test was conducted and interpreted; and flow and timing describes exclusions and the intervals between the index and reference tests25. For the current study, the index test is designated as the self-collected specimens, while the reference test refers to the healthcare worker-collected specimens.

Table 8 QUADAS-2 summary of methodological assessment.

For the majority of the studies, the sampling approach utilised was convenience sampling and not random or consecutive sampling, which are the options available in the patient selection domain. Although convenience sampling was used for most of the studies and therefore introduced a high risk of bias, this is unlikely to interfere with the diagnostic accuracy of self-collected and healthcare worker-collected specimens. The reference standard and flow and timing domains were found to be mostly at low risk of bias across the studies. Concerning applicability, the studies were generally of low concern, although the applicability of the reference standard was unclear for most studies. The graphical results of the QUADAS-2 quality assessment of the included studies are shown in Fig. 10.

Figure 10

Assessment of included studies using QUADAS-2.

Discussion

This study compared the diagnostic accuracy of self-collected specimens to healthcare worker collected specimens for diagnosing CT, HPV, MG, NG, syphilis, and TV in females. No studies on syphilis fulfilled the eligibility criteria for inclusion in this review. For CT, six studies were included in the analysis, out of which four were included in the subgroup meta-analysis. The summary estimate for sensitivity was 0.85 (0.77–0.92), while specificity was 0.95 (0.91–0.98). For HPV, five studies were included, and there was insufficient data to perform a sub-group meta-analysis. However, the sensitivity and specificity of self-collected specimens of the individual studies compared to healthcare worker collected specimens varied between studies, with sensitivity ranging from 86 to 100%, and specificity ranging from 88.9% to 100%. For MG, four studies investigated diagnostic accuracy, and two studies had sufficient data for a sub-group meta-analysis. The summary estimate for sensitivity was low at 0.49 (0.39–0.58), while specificity was 0.88 (0.81–0.94). For NG, seven studies were included in the analysis, and four studies were included in the sub-group meta-analysis. The pooled sensitivity and specificity estimate was 0.59 (0.49–0.68) and 0.84 (0.76–0.91) respectively.

In the case of CT and NG, it is important to note that the low sensitivity and high specificity are comparable to previous findings15. For TV, seven studies investigated diagnostic accuracy, and four studies were included in the sub-group meta-analysis. The results of the meta-analysis showed that self-collected specimens have high sensitivity and specificity for the diagnosis of trichomoniasis, with a summary estimate for sensitivity and specificity of 0.94 (0.89–0.98) and 0.91 (0.85–0.96), respectively.

The study found that there was significant heterogeneity among the studies. This may be attributed to differences in the methods used to collect and test specimens across the different studies. The DOR results indicated that the diagnostic tests used in the studies had a good ability to differentiate between individuals with and without CT, HPV, NG, MG and TV. The study also presented a SROC curve to visualize the sensitivity and specificity of all included studies, with most points falling between 0.9 and 1.00 on the y-axis (sensitivity), indicating better performance in distinguishing between the presence and absence of infection.

The QUADAS-2 tool was used to assess the quality of the included studies, and it showed that a majority of them used convenience sampling to select patients. Although this sampling method can increase the risk of bias, it did not appear to affect the diagnostic accuracy of self-collected specimens and specimens collected by healthcare workers. Most of the included studies had a low risk of bias in the index test, reference standard, flow, and timing domains. Overall, the included studies introduced minimal bias, which enhances the quality of the research findings. Study screening, selection, and data extraction were conducted systematically to ensure the most suitable studies were included in the review. A comprehensive approach to reviewing existing evidence on the diagnostic accuracy of self-collected specimens versus those collected by healthcare workers was employed. Only peer-reviewed and published studies were included to ensure reliable results. Some of the included studies utilized convenience sampling, which may have introduced bias in the patient selection process.

Since we classified healthcare worker-collected specimens as the gold standard, their diagnostic accuracy was presumed to be 100%. In the included studies, however, the reported values varied. For healthcare worker-collected specimens, CT sensitivity ranged between 50 and 100%, while specificity ranged between 88 and 99.2%; MG sensitivity ranged between 97 and 100%, while specificity ranged between 88 and 100%; NG sensitivity ranged between 40 and 97%, while specificity ranged between 88 and 100%; and TV sensitivity ranged between 96 and 100%, while specificity ranged between 88 and 100%.

The results indicate that self-collected specimens are a comparable alternative to healthcare worker-collected specimens for STI testing. This is in keeping with previous studies that advocate for the use of self-sampling interventions as alternative tools to enable and promote screening of STIs, even in asymptomatic patients and resource-limited settings15,39. These findings have important implications for STI testing, particularly in settings where access to healthcare workers may be limited or where stigma and embarrassment may prevent individuals from seeking testing.

Limitations

The lack of eligible studies for syphilis and insufficient study data for meta-analysis in HPV limit the comprehensiveness of the review. There was significant heterogeneity among included studies, likely due to varying specimen collection and testing methods, which introduced variability and challenges with generalizability of the findings. Despite efforts to minimise bias during data analysis, the use of convenience sampling in most studies introduced potential bias in patient selection. Assuming the accuracy of the gold standard of healthcare worker-collected specimens may not fully capture variability in sensitivity and specificity among these samples. In addition, the wide range of sensitivity and specificity values across individual studies underscores the complexity of interpreting overall diagnostic accuracy. Lastly, it is important to consider that the findings of this study may not be generalizable to resource-limited settings where access to healthcare workers and testing facilities differs.

Conclusion

This study presents evidence of the accuracy of self-collected specimens when used to diagnose STIs in females. The meta-analysis findings highlight that the diagnostic accuracy of self-collected specimens to diagnose STIs in females is comparable with that of healthcare worker-collected specimens. When considering the global burden of STIs on the public health system, such findings are an indication of how self-sampling for STI diagnosis could be used to improve STI management services across the globe. Although much evidence exists on the use of this intervention in high-income countries22, the researchers hope that the findings of this study will capture the attention of governments in LMICs and prompt them to consider adopting it. Furthermore, the potential of self-sampling interventions to improve screening of asymptomatic STIs must be recognized and utilized as a tool to fulfil Goal 3 of the Sustainable Development Goals, which is targeted at treating and improving access to quality healthcare for all people across the globe. The study is limited in that the investigation of diagnostic accuracy of self-collected specimens was only conducted on females. Therefore, the findings are not representative of self-collected specimens among a broader and more diverse population. We therefore recommend a future study to investigate the accuracy of self-collected specimens for diagnosing a wide range of STIs in a more diverse and broader population.