Reliability of ultrasound versus digital vaginal examination in detecting cervical dilatation during labor: a diagnostic test accuracy systematic review
The Ultrasound Journal volume 13, Article number: 37 (2021)
This systematic review aimed to investigate the reliability of the ultrasound method compared with digital vaginal examination in detecting cervical dilation.
We searched Cochrane (CENTRAL), MEDLINE, EMBASE, CINAHL, ISI Web of Science Core Collection, Trip Database, PubMed, DARE and NHS EED, HTA, and PROSPERO. Ten observational studies with a total sample size of 856 were included in the meta-analysis.
The intraclass correlation coefficient (ICC) values ranged between 0.21 and 0.69. The fixed-effect model for the ultrasound test showed an average ICC of r = 0.32 (95% CI 0.26–0.38). The correlation between the two methods was poor (r = 0.359, 95% CI 0.26–0.44). In nulliparous and multigravida participants, the correlation between ultrasound measurements and digital examinations was r = 0.349 (95% CI 0.25–0.43) and r = 0.676 (95% CI 0.419–0.833), respectively.
Trans-perineal ultrasonography seems to be a reliable method for assessing labor progression in multigravida women, but its usage in nulliparous women needs further studies.
Diagnosing the onset of labor is one of the most critical and complex judgments made by care providers in the delivery room. Measurement of cervical dilatation is considered the most crucial parameter of labor progress during childbirth and the main reason for performing digital vaginal examination (DVE) in women with signs of labor onset [2, 3]. Cervical dilatation is also used to study uterine activity, oxytocin use, and the transition from the latent to the active phase of labor, and it is an essential element of the Bishop score. In addition, cervical dilatation is used to predict the mode of delivery, as women with prolonged labor may have more cesarean sections. Correct measurement of cervical dilation is therefore of the utmost importance and helps health providers make timely and proper decisions. Methods used to assess cervical dilatation are classified into three categories: (1) traditional techniques such as mechanical devices, electromagnetic devices, and electronic sensor systems; (2) DVE; and (3) ultrasonic machines. DVE was the most common method for measuring cervical dilatation in the past and has been the gold standard for assessing labor progress [7, 8]. However, it can be subjective, inaccurate, and uncomfortable for women. Also, only 50% of assessments are accurate, and there is an increased risk of infection with frequent DVE [8, 10]. Therefore, vaginal examination cannot be considered an accurate measure of cervical dilatation, particularly when done by different examiners. Because of the poor reliability and pain associated with DVE, the use of intrapartum ultrasound to measure cervical dilation has been suggested as an alternative method.
Abdominal, vaginal, trans-labial, and trans-perineal 2-dimensional (2D) and 3D ultrasounds have been used to measure cervical dilatation during labor. The first use of trans-perineal ultrasonography for measuring the cervical os at various stages of labor was reported in 1996 by Voskresinsky. In 2009, Zimerman et al. compared 3D ultrasound assessments of cervical dilatation with clinical examinations and found a significant correlation. Hassan et al. found excellent agreement between trans-perineal ultrasound (TPUS) and DVE in measuring cervical dilation. Trans-labial 3D ultrasonography has been suggested as an accurate and reproducible method for assessing cervical dilatation. As a result, evidence indicates that the most frequently used imaging technology for measuring cervical dilatation is ultrasound. However, it is unclear which particular ultrasound modality is superior to the others. It is also not clear how reliable ultrasound measurements are compared with vaginal examination. The aim of the current review is to verify the reliability of the ultrasound method compared with DVE in detecting cervical dilation.
Materials and methods
This systematic review followed the methodology consistent with Systematic Reviews and Meta-Analyses of diagnostic testing studies. The protocol of this systematic review was published in 2019.
In this systematic review, we included studies published until April 19, 2019. The search was updated on May 17, 2020. Searches were performed on the Cochrane Central Register of Controlled Trials (CENTRAL) via Cochrane Library, MEDLINE via Ovid, EMBASE via Ovid, CINAHL, ISI Web of Science Core Collection, Trip Database, PubMed Systematic Reviews subset, DARE and NHS EED via the University of York, and HTA and PROSPERO via the University of York (all databases were searched from inception to the current date). Ultrasound Methods, Cervical Dilatation, and Labor were used as keywords.
Inclusion and exclusion criteria
Included in this review were all observational studies with cross-sectional or diagnostic case–control study designs evaluating the accuracy of available methods for cervical dilation measurement during labor. We assessed each measurement method alone or in combination with DVE (when used in a diagnostic algorithm). We did not impose any language restriction in this review. Studies with the following characteristics were included: studies using ultrasound and DVE for detecting cervical dilation, and those recruiting women with a singleton pregnancy, with any type of placenta attachment or type of conception, and at any maternal age and body mass index. Studies that recruited pregnant women with twins, triplets, or quadruplets were excluded. All cited studies must have obtained informed consent from every study participant and received protocol approval from an ethics committee or institutional review board (Fig. 1).
Study selection and data extraction
The search was carried out by MA. Two authors (MA and ZM) independently screened all searched studies and extracted data using Excel from those included in the review. If there was a conflict, it was resolved by discussion or getting advice from a third party (PA or SJ).
We used the Quality Assessment of Diagnostic Test Accuracy Studies-2 (QUADAS-2) to assess the methodological quality of the included studies. The QUADAS-2 tool was applied in four phases: summarizing the review question, tailoring the tool and producing review-specific guidance, constructing a flow diagram for the primary study, and judging bias and applicability. Each paper was judged as having a ‘low’, ‘high’, or ‘unclear’ risk of bias for each of the four domains, and concerns about applicability were assessed in three domains. Two review authors (MA and ZM) independently applied the QUADAS-2 tool to the full text of each study. Disagreements were resolved by discussion or, if needed, by a third review author (SJ or PA). RevMan software was also used to construct methodological quality summary graphs. The summary of the risk of bias assessment is presented in Fig. 2.
The unit of analysis in studies was women in labor, as cervical dilatation is a single calculated measure using various methods. We extracted the intraclass correlation coefficient (ICC), and the Pearson correlation coefficient and P values associated with these measures to estimate the reliability of ultrasound compared with DVE. The data were then transferred into the Comprehensive Meta-Analysis Software to produce plots and estimates. We presented individual study results graphically by plotting the estimates of reliability in forest plots.
To facilitate comparisons across studies, eligibility was restricted to studies measuring reliability via: (1) the intraclass correlation coefficient (ICC); (2) the Pearson correlation coefficient; and (3) the minimal detectable change with a 95% confidence level. The ICC, a measure of inter- and intra-tester reliability, essentially assesses absolute agreement in repeated measurements of an object. The ICC has been commonly used in the functional connectivity literature to assess reliability [18, 19]. Pearson’s correlation is used where variables are scaled and centered separately. It is used to assess the strength of a linear relationship between the results of two tests. To maximize the number of studies included in the forest plot, when an article only reported the minimal detectable change with a 95% confidence level, it was transformed into the standard error of measurement (SEM) using the following formula:
Then the SEM was converted into ICC using Formula (2).
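The two conversions can be sketched with the standard definitions, which are assumed here rather than taken from the text: the minimal detectable change at 95% confidence (MDC95) equals 1.96 × √2 × SEM, and the SEM equals the between-subject standard deviation (SD) of the measurements multiplied by √(1 − ICC). Under those assumptions, Formulas (1) and (2) would read:

```latex
\begin{align}
\mathrm{SEM} &= \frac{\mathrm{MDC}_{95}}{1.96\,\sqrt{2}} \tag{1} \\
\mathrm{ICC} &= 1 - \left(\frac{\mathrm{SEM}}{\mathrm{SD}}\right)^{2} \tag{2}
\end{align}
```

As a purely hypothetical illustration, an MDC95 of 2.0 cm with an SD of 1.5 cm gives an SEM of about 0.72 cm and an ICC of about 0.77.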
The Pearson correlation coefficient was also converted to ICC, and the final meta-analysis was conducted using ICC values. Using raw ICC values, we analyzed the data with the assumption that they were distributed normally. To perform a meta-analysis of pooled ICC using the random-effects model, all coefficients were transformed to Fisher’s Z values and weighted by sample size using inverse variance weights for the analysis. The average reliability coefficients and their confidence limits were back-transformed to the original metric of reliability coefficients to facilitate the interpretation of the results. ICC values of 0.7–0.9 were considered acceptable, and values higher than 0.9 were deemed desirable.
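The transform, weight, and back-transform steps described above can be sketched as follows. This is a minimal illustration, not the procedure implemented in the software used for the review: it assumes the usual large-sample variance of Fisher's Z for a correlation-type coefficient, 1/(n − 3), and a fixed-effect pooling step; the function name and the input values are hypothetical.

```python
import math


def pool_fixed_effect(coeffs, sample_sizes, z_crit=1.96):
    """Pool correlation-type coefficients on Fisher's Z scale (fixed effect).

    Returns the pooled coefficient and its confidence limits,
    back-transformed to the original metric.
    """
    zs = [math.atanh(r) for r in coeffs]       # Fisher's Z transform
    weights = [n - 3 for n in sample_sizes]    # inverse of assumed variance 1/(n - 3)
    total_w = sum(weights)
    z_bar = sum(w * z for w, z in zip(weights, zs)) / total_w
    se = math.sqrt(1.0 / total_w)              # standard error of the pooled Z
    lower, upper = z_bar - z_crit * se, z_bar + z_crit * se
    # Back-transform to the original metric of the coefficients
    return math.tanh(z_bar), math.tanh(lower), math.tanh(upper)


# Hypothetical coefficients and sample sizes, for illustration only.
pooled, ci_low, ci_high = pool_fixed_effect([0.21, 0.45, 0.69], [120, 60, 25])
print(f"pooled = {pooled:.3f}, 95% CI {ci_low:.3f} to {ci_high:.3f}")
```

Because the confidence limits are computed on the Z scale and then back-transformed, the resulting interval is not symmetric around the pooled coefficient.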
Heterogeneity was assessed by calculating the Q statistic and the I² index. An I² above 50% was considered to indicate heterogeneity. The statistical analyses were carried out with Comprehensive Meta-analysis 3.3 software (BioStat, Englewood, NJ, USA). Additionally, a funnel plot was drawn to assess publication bias.
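The two heterogeneity measures can be sketched as below, using the standard definitions of Cochran's Q and the I² index (I² = max(0, (Q − df)/Q) × 100). The function name and inputs are hypothetical; effects are assumed to be on the Fisher's Z scale with inverse-variance weights.

```python
def heterogeneity(effects, weights):
    """Return Cochran's Q, its degrees of freedom, and I-squared (in percent)."""
    total_w = sum(weights)
    mean = sum(w * e for w, e in zip(weights, effects)) / total_w
    # Q: weighted sum of squared deviations from the fixed-effect mean
    q = sum(w * (e - mean) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of observed variance beyond what sampling error explains
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical Z-scale effects and weights, for illustration only.
q, df, i2 = heterogeneity([0.21, 0.48, 0.85], [117, 57, 22])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")
```

Under the threshold stated above, a returned I² value above 50 would be read as heterogeneous.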
In the primary search of databases, 1430 articles were found. After removing the duplicates (n = 1233), the titles and abstracts were screened for potentially relevant studies (n = 1222). Eleven articles were considered eligible for full-text screening. We attempted to contact the authors to obtain the complete text for one article, but received no reply, so we excluded this study. Finally, ten studies were included in the meta-analysis (Fig. 1).
The included studies were published from 2009 to 2019 and had a prospective cohort design. The characteristics of the included studies are shown in Table 1. Three of the studies were conducted in the UK [11, 12, 24], and one in each of the following seven countries: Spain, South Korea, Republic of Ghana, Tunisia, Sweden, Turkey, and Israel. All studies were performed in a hospital setting. TPUS or trans-labial ultrasound (TLUS) versus DVE was used for the assessment of cervical dilatation during labor in all studies. The ultrasound transducer was placed transperineally at the level of the posterior fourchette in a sagittal position. Digital vaginal examinations were performed before or immediately after the ultrasound examination by the responsible birth attendant. Six studies recruited both nulliparous and multigravida women [3, 7, 15, 24, 25, 28], and one study enrolled only nulliparous women. One study used the TLUS method, and the other studies used the TPUS method. The sample sizes of the studies ranged from 25 to 195. The total number of participants enrolled was 856.
We analyzed ten studies with a total sample size of 856. The ICC values ranged between 0.21 and 0.69. The fixed-effect model for the ultrasound test showed an average reliability of ICC r = 0.32 (95% CI 0.26–0.38). Heterogeneity was estimated at I² = 48.72 (Table 2). The effect sizes exhibited moderate heterogeneity (based on the Q statistic and the I² index), supporting the decision to apply the random-effects model. The correlation between the two methods with the random-effects model was r = 0.359 (95% CI 0.26–0.44, P < 0.001). The limits of agreement were 0.267–0.446. Based on the value of the ICC with its 95% confidence interval, the correlation between the two methods for the measurement of cervical dilatation was poor. The forest plot of the intra-tester and inter-method reliability (Fig. 3) was obtained from the studies that applied TPUS or TLUS versus DVE to measure cervical dilatation.
Sub-group analysis by gravidity
Seven studies included 439 nulliparous women, and six studies included 168 multigravida women. Overall, ICC reliability was 0.32 (95% CI 0.26–0.38). The fixed-effect model showed that in nulliparous participants the correlation between ultrasound measurements and digital examinations for measurement of cervical dilatation during labor was r = 0.349 (95% CI 0.25–0.43, P < 0.001) (Fig. 4). The limits of agreement were 0.258–0.434. The I² was 72.905, which means that about 73% of the observed variance between studies is due to real differences in the effect size. Only about 27% of the observed variance would have been expected based on random error. Tau-squared was 0.054. This is the “between studies” variance that was used in computing weights. The random-effects model, which gives more weight to smaller studies, was then used. The correlation between the two methods with the random-effects model was r = 0.497 (95% CI 0.29–0.65, P < 0.001) (Fig. 5). Based on the value of the ICC with its 95% confidence interval, the correlation between the two methods was poor in nulliparous women (ICC lower than 0.7).
In multigravida women, as seen in Fig. 6, because tau-squared was 0.158, the random-effects model for the ultrasound test was used, and the results showed an average ICC of r = 0.676 (95% CI 0.419–0.833, P < 0.001). In this model, I² was 78.007. The correlation between the two methods was moderate in multigravida women (Fig. 7).
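The role of tau-squared in the subgroup results can be sketched as follows. The estimator is not named in the text, so this example assumes the DerSimonian-Laird method-of-moments estimator; the function name and the input values are hypothetical. It shows why adding the between-studies variance to every study's variance shifts relative weight toward smaller studies.

```python
def dersimonian_laird(effects, weights):
    """Return (tau2, random-effects weights, random-effects pooled effect).

    `weights` are the fixed-effect inverse-variance weights.
    The DerSimonian-Laird estimator is an assumption of this sketch.
    """
    total_w = sum(weights)
    mean_fixed = sum(w * e for w, e in zip(weights, effects)) / total_w
    q = sum(w * (e - mean_fixed) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Each study's variance becomes (within-study variance + tau2)
    re_weights = [1.0 / (1.0 / w + tau2) for w in weights]
    mean_random = sum(w * e for w, e in zip(re_weights, effects)) / sum(re_weights)
    return tau2, re_weights, mean_random


# Hypothetical Z-scale effects: one small study (weight 10), one large (weight 100).
tau2, re_w, pooled = dersimonian_laird([0.0, 1.0], [10, 100])
share_fixed = 10 / (10 + 100)
share_random = re_w[0] / sum(re_w)
print(f"tau2 = {tau2:.3f}; small-study weight share: "
      f"fixed {share_fixed:.2f}, random {share_random:.2f}")
```

When tau-squared is zero the random-effects weights reduce to the fixed-effect weights, so the two models give the same pooled estimate.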
Summary of main findings
The primary purpose of the current meta-analysis was to estimate both the inter- and intra-method reliability of ultrasound compared with DVE in measuring cervical dilation. In this systematic review, ten studies were included. Overall, the random-effects model showed poor reliability between the two methods. This may be because the women participating in these studies were not homogeneous in terms of parity.
The subgroup analysis showed that the correlation between DVE and ultrasound in nulliparous women was poor, while this correlation in multiparous women was moderate. This means that in multigravida women, ultrasound measurements and digital examinations for cervical measures during labor are consistent. Overall, the pooled data indicated a low value of ultrasound diagnosis, resulting in conflicts with independent studies.
DVE is still the most commonly utilized method to assess cervical dilatation, fetal presentation, fetal position, and fetal descent during all stages of labor. However, DVE is associated with pain and the risk of infection. Therefore, clinicians tried to replace DVE with other methods, such as trans-perineal ultrasound.
The texture of the cervix changes dramatically after the first birth. Some women undergo rupture of the cervix at birth. Therefore, it is logical that multiparous women would have a differently shaped cervix compared with their nulliparous counterparts. It is also possible that the cervix drastically remodels, reorganizes, and softens during gestation. Thus, the consistency and integrity of the cervix vary at different gestational ages. As the fetus descends into the pelvis, more pressure is placed on the cervix. Hence, the length of the cervix is expected to shorten as a pregnancy progresses.
This is especially true in nulliparous women, as the fetal descent happens during the last 4 weeks of pregnancy, and it is a slow descent rather than the fast one seen in multiparous pregnancies. Moreover, women with an elongated cervix might have more fiber in the cervix, making the mechanics and structure of the cervix different from those of women with a shorter cervix.
In this study, we found a low value of ultrasound diagnosis, which conflicts with the independent studies. This could be due to the high heterogeneity found in our pooled data. Resolving this requires larger sample sizes, which means conducting more high-quality RCTs. Furthermore, some of the included studies had very small sample sizes, and the effect of confounders, such as the timing of membrane rupture, was not apparent. The studies also failed to mention whether the data were collected during the active or latent phase of labor.
The preliminary results of Zimerman et al. showed that ultrasound to detect cervical dilation is considered problematic. However, Hassan et al. showed that the correlation coefficient between ultrasound measurements and DVE is relatively high (r = 0.82, P = 0.05) [11, 12].
Also, Wiafe et al., in a systematic review, showed a high correlation between ultrasound and digital examination of the cervix for detecting cervical dilation. Still, there was no significant difference in terms of success rate. The discrepancy between the present study and that of Wiafe et al. may be related to the fact that they included five studies. The heterogeneity in their meta-analysis was high (I² = 96%), and they did not follow the DTA method.
DVE is the accepted clinical procedure for the detection of cervical dilatation during labor. However, DVE is a manual procedure that depends heavily on the providers’ experience. It is therefore considered an imprecise measurement if conducted by inexperienced clinicians. In addition, examination and manipulation of the cervix might cause discomfort to women. In contrast, in ultrasonographic cervical dilatation measurement, the uterine cervix is left intact, and its natural contour is preserved. Also, cervical dilation changes over time in labor, according to studies that used cervical ultrasound markers (clips). Thus, two examiners may differ and yet both might be accurate. Martorelli et al. also concluded that transvaginal ultrasound before the onset of labor in women with gestational age > 40 weeks might help predict failed labor induction. Still, it should not be used as an indication for performing a cesarean section.
Strengths and limitations
This was the first systematic review to compare the reliability of ultrasound (TPUS or TLUS) versus digital examination in detecting cervical dilation. The quality of the included studies was good, and most studies were free of serious biases.
Several limitations existed in this meta-analysis: (1) three studies failed to report parity, hence we were unable to include these studies in our subgroup analysis; (2) some other confounders, such as the timing of rupture of membranes and the active or latent phase of labor, were not evident; and (3) the sample size of the included studies was very small. These limitations could have contributed substantially to heterogeneity.
According to this systematic review, the digital examination can be replaced by trans-perineal ultrasound in multiparous women, while using this method in nulliparous women needs more thorough studies.
Trans-perineal ultrasonography seems to be a reliable method for assessing labor progression in multigravida women, but its application in nulliparous women needs further studies.
Availability of data and materials
ICC: Intraclass correlation coefficient
DVE: Digital vaginal examination
QUADAS-2: Quality Assessment of Diagnostic Test Accuracy Studies-2
TPUS: Trans-perineal ultrasound method
Hanley GE, Munro S, Greyson D et al (2016) Diagnosing onset of labor: a systematic review of definitions in the research literature. BMC Preg Childbirth 16:71. https://doi.org/10.1186/s12884-016-0857-4
Brancadoro M, Tognarelli S, Fambrini F et al (2018) Devices for measuring cervical dilation during labor: systematic review and meta-analysis. Obstet Gynecol Surv 73:231–241. https://doi.org/10.1097/ogx.0000000000000548
Wiafe YA, Whitehead B, Venables H et al (2018) Intrapartum ultrasound assessment of cervical dilatation and its value in detecting active labor. J Ultrasound 21:233–239. https://doi.org/10.1007/s40477-018-0309-2
Muliira RS, Seshan V, Ramasubramaniam S (2013) Improving vaginal examinations performed by midwives. Sultan Qaboos Univ Med J 13:442–449
Hajati F (2012) Cervical dilatometer: a simple tool for objective measurement of cervix dilatation during labor. QHMS 18:82–86
Letić M (2003) Inaccuracy in cervical dilatation assessment and the progress of labour monitoring. Med Hypotheses 60:199–201. https://doi.org/10.1016/s0306-9877(02)00358-4
Kim J, Kim S, Jeon S, Jung S (2018) A longitudinal study investigating cervical changes during labor using a wireless ultrasound device. J Matern Fetal Neonatal Med 31:1787–1791. https://doi.org/10.1080/14767058.2017.1329292
Rudel D, Vidmar G, Leskosek B, Verdenik I et al (2007) Predictive value of EMG basal activity in the cervix at initiation of delivery in humans. In: Zupanic A (ed) 11th Mediterranean conference on medical and biomedical engineering and computing. Springer, Berlin. https://doi.org/10.1007/978-3-540-73044-6_33
Molina FS, Terra R, Carrillo MP et al (2010) What is the most reliable ultrasound parameter for assessment of fetal head descent? UOG 36:493–499. https://doi.org/10.1002/uog.7709
Seaward PG, Hannah ME, Myhr TL et al (1998) International multicenter term PROM study: evaluation of predictors of neonatal infection in infants born to patients with premature rupture of membranes at term. Premature rupture of the membranes. Am J Obstet Gynecol 179:635–639. https://doi.org/10.1016/s0002-9378(98)70056-0
Hassan WA, Eggebø TM, Ferguson M, Lees C (2013) Simple two-dimensional ultrasound technique to assess intrapartum cervical dilatation: a pilot study. UOG 41:413–418. https://doi.org/10.1016/s0002-9378(98)70056-0
Hassan WA, Eggebø T, Ferguson M et al (2014) The sonopartogram: a novel method for recording progress of labor by ultrasound. UOG 43:189–194. https://doi.org/10.1002/uog.13212
Laughon SK, Branch DW, Beaver J, Zhang J (2012) Changes in labor patterns over 50 years. Am J Obstet Gynecol 206:419.e1-419.e9. https://doi.org/10.1016/j.ajog.2012.03.003
Voskresinsky S (1996) Bio-mechanism in labor the discrete-wave theory. Ltd Polibig, Minsk, p 186 (Original article in Russian)
Zimerman AL, Smolin A, Maymon R et al (2009) Intrapartum measurement of cervical dilatation using translabial 3-dimensional ultrasonography: correlation with digital examination and interobserver and intraobserver agreement assessment. J Ultrasound Med 28:1289–1296. https://doi.org/10.7863/jum.2009.28.10.1289
Balikuddembe M, Tumwesigye N, Wakholi P, Tylleskär T (2017) Computerized childbirth monitoring tools for health care providers managing labor: a scoping review. JMIR Med Inform 5:e14. https://doi.org/10.2196/medinform.6959
Abedi P, Nejad PS, Jahanfar S (2019) Accuracy of ultrasound methods versus other methods for detecting of cervical dilatation during labor, a protocol for systematic review. EC Gynaecol 8(4):263–277
Martin J, Girling A, Nirantharakumar K et al (2016) Intra-cluster and inter-period correlation coefficients for cross-sectional cluster randomised controlled trials for type-2 diabetes in UK primary care. Trials 17:402. https://doi.org/10.1186/s13063-016-1532-9
Chen G, Taylor PA, Haller SP et al (2018) Intraclass correlation: Improved modeling approaches and applications for neuroimaging. Hum Brain Mapp 39:1187–1206. https://doi.org/10.1002/hbm.23909
Morgan CJ, Aban I (2016) Methods for evaluating the agreement between diagnostic tests. J Nucl Cardiol 23:511–513. https://doi.org/10.1007/s12350-015-0175-7
López-López JA, Marín-Martínez F, Sánchez-Meca J et al (2014) Estimation of the predictive power of the model in mixed-effects meta-regression: a simulation study. Br J Math Stat Psychol 67:30–48. https://doi.org/10.1111/bmsp.12002
Borenstein M, Hedges LV, Higgins JPT, Rothstein HR (2009) Introduction to meta-analysis. Wiley, Chichester
Zahalka N, Sadan O, Malinger G, Liberati M, Boaz M, Glezerman M, Rotmensch S (2005) Comparison of transvaginal sonography with digital examination and transabdominal sonography for the determination of fetal head position in the second stage of labor. Am J Obstet Gynecol 193:381–386. https://doi.org/10.1016/j.ajog.2004.12.011
Wilkinson M, Usman S, Barton H, Lees C (2017) OP1909: transperineal ultrasound to assess cervical dilatation in labour: a feasible alternative to digital vaginal examination in early labour? UOG 50:112. https://doi.org/10.1002/uog.17880
Cuerva MJ, García-Casarrubios P, García-Calvo L et al (2019) Use of intrapartum ultrasound in term pregnant women with contractions before hospital admission. Acta Obstet Gynecol Scand 98(2):162–166
Dimassi K, Hammami A, Ben Amor A et al (2016) P17. 04: relevance of ultrasound cervical dilation measurements during labour. UOG 48:222
Benediktsdottir S, Eggebø TM, Salvesen KÅ (2015) Agreement between transperineal ultrasound measurements and digital examinations of cervical dilatation during labor. BMC Preg Childbirth 15:273. https://doi.org/10.1186/s12884-015-0704-z
Yuce T, Kalafat E, Koc A (2015) Transperineal ultrasonography for labor management: accuracy and reliability. Acta Obstet Gynecol Scand 94:760–765. https://doi.org/10.1111/aogs.12649
Jo YS, Jang DG, Kim N et al (2011) Comparison of cervical parameters by three-dimensional ultrasound according to parity and previous delivery mode. Int J Med Sci 8:673–678. https://doi.org/10.7150/ijms.8.673
Jafari-Dehkordi E, Adibi A, Sirus M (2015) Reference range of the weekly uterine cervical length at 8 to 38 weeks of gestation in the center of Iran. Adv Biomed Res 4:115. https://doi.org/10.4103/2277-9175.157839
Inde Y, Nakai A, Sekiguchi A et al (2018) Cervical dilatation curves of spontaneous deliveries in pregnant Japanese females. Int J Med Sci 15:549–556. https://doi.org/10.7150/ijms.23505
Wiafe YA, Whitehead B, Venables H, Nakua EK (2016) The effectiveness of intrapartum ultrasonography in assessing cervical dilatation, head station and position: a systematic review and meta-analysis. Ultrasound 24(4):222–232. https://doi.org/10.1177/1742271X16673124
Phelps JY, Lambrou N, Roshanfekr D (1998) Accuracy and intraobserver variability of simulated cervical dilatation and effacement measurements. Prim Care Update Ob/Gyns 5:185. https://doi.org/10.1016/s1068-607x(98)00102-4
Clement S (1994) Unwanted vaginal examinations. Br J Midwifery 2:368–370
Feltovich H (2017) Cervical evaluation: from ancient medicine to precision medicine. Obstet Gynecol 130:51–63. https://doi.org/10.1097/aog.0000000000002106
Sevrin CE, Martorelli LM, Famá EAB et al (2019) Ultrasound evaluation of the cervix to predict failed labor induction. Rev Bras Ginecol Obstet 41(8):476–484. https://doi.org/10.1055/s-0039-1693679
The authors thank the authors of those original studies who responded to requests for data or clarifications.
No funding was received for this research.
Ethics approval and consent to participate
Consent for publication
There is no conflict of interest to declare.
Cite this article
Mohaghegh, Z., Jahanfar, S., Abedi, P. et al. Reliability of ultrasound versus digital vaginal examination in detecting cervical dilatation during labor: a diagnostic test accuracy systematic review. Ultrasound J 13, 37 (2021). https://doi.org/10.1186/s13089-021-00239-1
- Cervical dilatations
- Vaginal exams
- Diagnostic test