Comparing accuracy of bedside ultrasound examination with physical examination for detection of pleural effusion

Background In detecting pleural effusion, bedside ultrasound (US) has been shown to be more accurate than auscultation. However, US has not previously been compared with a comprehensive physical examination. This study compares the accuracy of physical examination (PE) with that of bedside US in detecting pleural effusion. Methods This study included a convenience sample of 34 medical inpatients from Calgary, Canada and Spokane, USA, with chest imaging performed within 24 h of recruitment. Imaging results served as the reference standard for pleural effusion. All patients underwent a comprehensive lung physical examination and a bedside US examination by two researchers blinded to the imaging results. Results Physical examination was less accurate than US. For physical examination, sensitivity was 44.0% (95% confidence interval [CI] 30.0–58.8%), specificity 88.9% (95% CI 65.3–98.6%), positive likelihood ratio (LR) 3.96 (95% CI 1.03–15.18), and negative LR 0.63 (95% CI 0.47–0.85). For US, sensitivity was 98% (95% CI 89.4–100%), specificity 94.4% (95% CI 72.7–99.9%), positive LR 17.6 (95% CI 2.6–118.6), and negative LR 0.02 (95% CI 0.00–0.15). The percentage of examinations rated with a confidence level of 4 or higher (out of 5) was higher for US (85% of the seated US examinations and 94% of the supine US examinations, compared with 35% of the PEs; P < 0.001), and US examinations took less time to perform (P < 0.0001). Conclusions US examination for pleural effusion was more accurate than the physical examination, conferred higher confidence, and required less time to complete.


Background
Pleural effusions are common in general medical patients and may be caused by pathological states such as congestive heart failure, infections, cirrhosis, and malignancy [1,2]. Detection of pleural effusions is important because their presence may signal a need for diagnostic and/or therapeutic interventions [2,3]. Traditionally in internal medicine, bedside identification of pleural effusions involves performing a physical examination (PE), followed by imaging studies [4]. PE includes a number of maneuvers, with dullness to percussion and asymmetric chest expansion considered the most accurate signs [5,6].
One study found that with minimal training, novice residents were able to reliably detect pleural effusions using bedside ultrasound (US) [7]. In contrast, only 60% of medical residents accurately detected bronchial breath sounds [8], such as those that might occur above the level of a pleural effusion [9], with no appreciable increase in auscultation accuracy between 1st year and 3rd year residents [8]. While studies suggest that US outperforms PE for detecting pleural effusion [10][11][12][13][14], many studies used only auscultation as the comparator, rather than a comprehensive PE. Thus, the superior performance of US in these studies may be exaggerated. To address this gap, our study seeks to compare the diagnostic accuracy of a multi-component PE with that of bedside US in detecting pleural effusions in medical patients.

Methods
The Conjoint Health Research Ethics Board at the University of Calgary and the Providence St. Joseph Health Institutional Review Board approved this study. This study is reported in accordance with the STARD guidelines for reporting studies of diagnostic accuracy [15].

Aim
This study seeks to compare the diagnostic accuracy of a PE with that of bedside US in medical inpatients.

Design and setting
A convenience sample of consenting patients was prospectively recruited from the Foothills Medical Centre (Calgary, AB, Canada) between August 2019 and March 2020 and from Providence Sacred Heart Medical Center (Spokane, WA, USA) between September 2019 and June 2020. Patients admitted to the general medical ward who had chest computed tomography (CT) or chest radiography (CXR) performed within 24 h of recruitment were eligible. Exclusion criteria were hemodynamic instability, inability to comply with the study protocol, or presence of pain/dressings that would preclude a US scan. Recruitment occurred when the admitting team gave permission to approach the patient and when at least two researchers were available.
Consenting patients underwent both a PE and a bedside US examination independently by two researchers. The order of the examinations and the researcher performing each examination were randomized.

Physical examination (PE) protocol
With the patient in a sitting position, the researcher inspected the posterior thorax for asymmetry, and performed percussion, tactile fremitus, auscultation, and egophony. Based on these findings, the researcher recorded the PE diagnosis and rated their confidence in the diagnosis using a 5-point Likert scale, where 1 = not at all confident and 5 = very confident. This procedure was then repeated on the other side by the same researcher, and time to complete the examination was recorded.

Ultrasound examination protocol
A different researcher, blinded to the PE results, performed the US exam. With the patient in the seated position, a low-frequency transducer (2-5 MHz curvilinear array, Edge II, in Calgary; 1-5 MHz phased array, M-Turbo, Sonosite Inc., in Spokane) was used to longitudinally scan the posterior thorax inferiorly from the lung apices (Fig. 1A). The diagnosis was noted and the procedure was repeated on the other side.
Next, with the patient in a supine position, the same researcher performed coronal views of the lower lung zones (Fig. 1B). The presence of the spine sign and free fluid was used to determine the presence of pleural effusion. Time taken to complete the US examination in the seated and supine position and self-reported confidence were recorded.

Reference standard
Chest imaging findings reported by radiologists at each institution on chart review served as the reference standard for the diagnosis of pleural effusion. All researchers were blinded to the imaging study results at the time of the PE and US examinations.

Researcher training
Three researchers performed the study protocol in Calgary. Two were certified in Internal Medicine by the Royal College of Physicians and Surgeons of Canada and were completing their point-of-care ultrasound fellowships (MW, LS). The third (CO) was a post-graduate year 3 (PGY-3) medical resident who underwent a 1-h didactic session on US for pleural effusion and 6 h of supervised lung US scanning prior to patient enrollment. Of the PEs, five were completed by researcher MW, eight by CO, and four by LS. Of the US examinations, 11 were performed by MW, three by CO, and three by LS.
At the Spokane site, two researchers conducted the study protocol: one was a PGY-3 medical resident (NC), who completed eight PEs and nine US examinations for this study protocol, and the other was a PGY-4 clinical teaching fellow (JC), who completed nine PEs and eight US examinations for this study. Both completed a 4-h didactic session on bedside US and 12 h of supervised lung US scanning during a 2-week bedside US elective prior to patient enrollment.

Study outcomes
The primary outcomes were the diagnostic accuracies of PE and US. Secondary outcomes included time required to perform the examinations and overall confidence.

Statistical analyses
Based on the previously reported pooled sensitivity of 0.93 for US [16], assuming an alpha of 0.05 and a prevalence of 0.94, a sample size of 28 would be required for our study [17,18]. We used Wilcoxon rank-sum tests, Fisher exact tests, analysis of variance, and post hoc Tukey tests (where appropriate) to compare differences between groups. The diagnostic accuracy of the examinations was compared to the reference standard using sensitivity, specificity, positive likelihood ratio (LR) and negative LR. LRs greater than 10 or less than 0.1 were considered large effects; LRs of 5 to 10 and 0.1 to 0.2 moderate; 2 to 5 and 0.2 to 0.5 small; 1 to 2 and 0.5 to 1 negligible [19]. Confidence between groups was compared using Kruskal-Wallis tests and Fisher's exact tests adjusted for multiple comparisons using Bonferroni corrections. All analyses were performed using SAS version 9.4 (SAS Institute Inc., Cary, NC).
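The LR magnitude categories above can be expressed as a small lookup. The following Python sketch is illustrative only and is not the SAS code used for the analysis; the function name `lr_magnitude` is hypothetical, and the handling of values that fall exactly on a threshold is an assumption, because the text does not specify it.

```python
def lr_magnitude(lr: float) -> str:
    """Classify a likelihood ratio using the thresholds quoted in the text.

    Assumption: a value exactly on a boundary (for example 5 or 0.2) is
    placed in the stronger category; the source does not say which applies.
    """
    if lr <= 0:
        raise ValueError("likelihood ratio must be positive")
    # Fold LRs below 1 onto the same scale: 0.1 mirrors 10, 0.2 mirrors 5,
    # and 0.5 mirrors 2.
    strength = lr if lr >= 1 else 1 / lr
    if strength > 10:
        return "large"
    if strength >= 5:
        return "moderate"
    if strength >= 2:
        return "small"
    return "negligible"


# LRs reported in the abstract.
reported = {
    "PE positive LR": 3.96,
    "PE negative LR": 0.63,
    "US positive LR": 17.6,
    "US negative LR": 0.02,
}
for label, value in reported.items():
    print(f"{label} = {value}: {lr_magnitude(value)}")
```

Folding negative LRs onto their reciprocals keeps a single set of thresholds for both directions, which matches the paired cut-offs given above.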

Results
A convenience sample of 34 patients (n = 17 from Calgary and n = 17 from Spokane) consented to participate (Table 1). Based on chest imaging, 22 (65%) had bilateral pleural effusions, six (18%) had unilateral effusions, and six (18%) had no effusions. Twenty patients had a CXR performed, two had a CT performed, and 12 had both CT and CXR performed. For those with both imaging studies performed, there was complete concordance in the pleural effusion diagnosis. There were no missing data and no known adverse events from the reference imaging studies, the physical examination, or the bedside US studies.
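Because each side of the chest was examined separately, the patient counts above translate into counts of examined sides. The short Python sketch below works through that arithmetic; treating each hemithorax as one unit of analysis is an assumption inferred from the 68 examinations reported under the secondary outcomes, and the variable names are illustrative.

```python
# Patient-level counts reported in the Results.
patients = 34
bilateral, unilateral, no_effusion = 22, 6, 6
assert bilateral + unilateral + no_effusion == patients

# Assumption: each hemithorax is one unit of analysis (two per patient).
hemithoraces = 2 * patients
sides_with_effusion = 2 * bilateral + unilateral
sides_without_effusion = hemithoraces - sides_with_effusion

print(f"hemithoraces examined: {hemithoraces}")
print(f"sides with effusion: {sides_with_effusion}")
print(f"sides without effusion: {sides_without_effusion}")
```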

Physical exam
Decreased tactile fremitus and egophony were more commonly noted in cases of pleural effusion (Table 2). The overall PE for pleural effusion had a sensitivity of 44.0% and specificity of 88.9% (Table 3). Chest asymmetry had the highest specificity (94.4%) while dullness to percussion had the highest sensitivity (94.0%).

US exam
Both the seated and supine US examinations had high diagnostic accuracy (sensitivity 92% and 98%, specificity 94.4% and 94.4% for the seated and supine US examinations, respectively; Table 3).
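The accuracy measures reported for PE and US follow from a 2 × 2 table against the reference standard. The Python sketch below shows the calculation; the cell counts are not reported in this text and are back-calculated here as an assumption, taking 50 hemithoraces with effusion and 18 without, so they should be read as illustrative rather than as the study data.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of an index test against the reference standard."""
    tp: int  # test positive, effusion present
    fn: int  # test negative, effusion present
    tn: int  # test negative, effusion absent
    fp: int  # test positive, effusion absent

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def lr_positive(self) -> float:
        return self.sensitivity / (1 - self.specificity)

    @property
    def lr_negative(self) -> float:
        return (1 - self.sensitivity) / self.specificity

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)


# Assumed counts, inferred from the reported percentages (not from Table 3).
tables = {
    "PE": TwoByTwo(tp=22, fn=28, tn=16, fp=2),
    "US seated": TwoByTwo(tp=46, fn=4, tn=17, fp=1),
    "US supine": TwoByTwo(tp=49, fn=1, tn=17, fp=1),
}
for name, t in tables.items():
    print(
        f"{name}: sens {t.sensitivity:.1%}, spec {t.specificity:.1%}, "
        f"LR+ {t.lr_positive:.2f}, LR- {t.lr_negative:.2f}, "
        f"accuracy {t.accuracy:.1%}"
    )
```

Under these assumed counts, the output reproduces the reported sensitivities, specificities, and LRs, as well as the 97.1% accuracy quoted for the supine US examination in the Discussion.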

Secondary outcome measures
Mean duration of the PE [2 min 53 s ± standard deviation (SD) 39 s] was longer than that of the US examination in the seated and supine positions (2 min 6 s ± SD 38 s and 1 min 37 s ± SD 43 s, respectively, P < 0.05 for each comparison, Table 2). For confidence in the PE diagnosis, 24 (35%) of the 68 examinations were rated with a score of four or higher (out of five), while 58 (85%) of the seated US examinations and 64 (94%) of the supine US examinations were rated as four or higher (Table 2). After accounting for multiple comparisons, where a P < 0.017 is needed for statistical significance, post hoc analyses revealed that confidence in the PE was significantly lower than in either of the US examinations (P < 0.001 for both), while the confidence between the two US examinations was not significantly different (P = 0.27).
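The adjusted significance threshold and the confidence percentages above can be checked with simple arithmetic. The Python sketch below assumes three pairwise comparisons (PE vs. seated US, PE vs. supine US, and seated vs. supine US) and a family-wise alpha of 0.05; the P values entered for the two PE comparisons are the reported upper bounds (P < 0.001), not exact values.

```python
family_alpha = 0.05
comparisons = 3  # assumed: PE vs seated US, PE vs supine US, seated vs supine US
bonferroni_threshold = family_alpha / comparisons
print(f"Bonferroni threshold: {bonferroni_threshold:.3f}")

# Reported P values; 0.001 stands in for "P < 0.001" (an upper bound).
p_values = {
    "PE vs seated US": 0.001,
    "PE vs supine US": 0.001,
    "seated US vs supine US": 0.27,
}
significant = {name: p < bonferroni_threshold for name, p in p_values.items()}
for name, flag in significant.items():
    print(f"{name}: {'significant' if flag else 'not significant'}")

# Share of the 68 examinations rated 4 or higher for confidence.
examinations = 68
high_confidence = {"PE": 24, "US seated": 58, "US supine": 64}
percent_high = {k: round(100 * v / examinations) for k, v in high_confidence.items()}
print(percent_high)
```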

Discussion
In our study of general medical inpatients, bedside US examination demonstrated higher accuracy than the PE. Sitting and supine US examinations resulted in diagnostic LRs that are considered large in magnitude, while LRs associated with PE were small [19]. Additionally, US examinations resulted in higher confidence and took less time to perform. While asymmetric chest wall expansion and dullness to percussion demonstrated high specificity and high sensitivity, respectively, the associated LRs were negligible to small [19]. Our supine US examination results (sensitivity of 98%, specificity of 94.4%, and diagnostic accuracy of 97.1%) are consistent with existing literature. A pooled sensitivity of 93% and specificity of 96% were noted in a prior systematic review [16], while a diagnostic accuracy of 95.1% was previously reported [20]. In contrast, for dullness to percussion, our positive LR of 1.13 is lower than the previously reported LR of 8.7, although a wide 95% CI (2.2-33.8) was noted [5]. The pooled negative LR of 0.31 in that review was similar to ours (0.36) [5]. Also consistent with existing literature was the finding that asymmetric chest expansion demonstrated a favorable specificity of over 90% [5,21]. However, our positive LR was 3.24, while theirs was 8.1 [5], which may be a function of protocol differences, such as having trainees perform PE in our study, rather than experienced clinicians [5,21]. Even with experienced clinicians, however, their positive LR of 8.1 was only moderate in strength.
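A high sensitivity or a high specificity on its own does not guarantee a useful LR, because each LR depends on both quantities. The Python sketch below illustrates this by back-calculating the value that is not quoted in this text for each maneuver from the standard relation LR+ = sensitivity / (1 − specificity); the implied values are derived for illustration only and are not reported study results.

```python
# Dullness to percussion: sensitivity and positive LR as quoted in the text.
dull_sens = 0.94
dull_lr_pos = 1.13
# Rearranging LR+ = sens / (1 - spec) gives the implied specificity.
dull_spec_implied = 1 - dull_sens / dull_lr_pos
# The negative LR then follows from LR- = (1 - sens) / spec.
dull_lr_neg = (1 - dull_sens) / dull_spec_implied
print(f"dullness: implied specificity {dull_spec_implied:.1%}, "
      f"negative LR {dull_lr_neg:.2f}")

# Asymmetric chest expansion: specificity and positive LR as quoted.
asym_spec = 0.944
asym_lr_pos = 3.24
# Rearranging the same relation gives the implied sensitivity.
asym_sens_implied = asym_lr_pos * (1 - asym_spec)
print(f"asymmetry: implied sensitivity {asym_sens_implied:.1%}")
```

Under these relations, the negative LR for dullness comes out at about 0.36, in line with the value quoted above, and the implied specificity is low, which is why a highly sensitive sign can still carry a negligible positive LR.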
Few studies directly compared PE with bedside US for pleural effusion. Five prior studies reported that US was superior [10][11][12][13][14]. However, in at least four of these studies, only auscultation was performed [11][12][13][14], and auscultation is known to be less accurate than other PE maneuvers [5,6]. In another study, added to PE, US resulted in significantly higher odds of identifying pleural effusion compared to PE alone [22]. Our present study adds to existing literature by directly comparing US with a comprehensive PE. Our results demonstrate that, despite the high specificity associated with asymmetric chest wall expansion and high sensitivity for dullness to percussion, the small likelihood ratios associated with these PE maneuvers suggest that US should be the preferred approach. Further, the associated large likelihood ratios for US suggest that, in the hands of a trained POCUS practitioner, the presence of pleural effusion can likely be ruled in or ruled out, with reasonably high accuracy, especially if the supine examination was used, evaluating for the presence or absence of the spine sign.
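To make the ruling-in and ruling-out statement concrete, an LR can be converted into a post-test probability through odds. The Python sketch below uses the LRs reported in this study together with a pretest probability of 50%, which is a purely hypothetical value chosen for illustration and not an estimate from the study; the function name `post_test_probability` is also illustrative.

```python
def post_test_probability(pretest: float, lr: float) -> float:
    """Convert a pretest probability to a post-test probability via odds."""
    if not 0 < pretest < 1:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    if lr < 0:
        raise ValueError("likelihood ratio cannot be negative")
    pretest_odds = pretest / (1 - pretest)
    post_odds = pretest_odds * lr
    return post_odds / (1 + post_odds)


pretest = 0.50  # hypothetical pretest probability, for illustration only
scenarios = {
    "US positive (LR 17.6)": 17.6,
    "US negative (LR 0.02)": 0.02,
    "PE positive (LR 3.96)": 3.96,
    "PE negative (LR 0.63)": 0.63,
}
for label, lr in scenarios.items():
    print(f"{label}: post-test probability {post_test_probability(pretest, lr):.1%}")
```

With this hypothetical starting point, a positive or negative US result moves the probability much closer to certainty than the corresponding PE result does.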
Our study has some limitations. First, our study sample had very few controls; only six patients (18%) did not have a pleural effusion. Although patients were not selected based upon the results of chest imaging studies, we cannot rule out the possibility that permission by the admitting team to approach patients for the study was more likely when the patient had pleural effusion, thereby introducing a potential selection bias. The resultant high prevalence of pleural effusions in our study sample is important to note for two reasons: (a) our sample may not be representative, which limits the generalizability of our conclusions [19]; and (b) a high prevalence may influence our resultant diagnostic accuracies, given the restriction in range seen as well as high reader expectations [23]. Notably, prior studies have reported that lower specificity may be reported in studies with higher prevalence [23]. Second, because of the multiple examinations required of our volunteer patients for this study protocol and the need for two clinician researchers present per patient, we were not able to perform an additional examination by an independent researcher. Thus, we do not have inter-rater reliability data for our study. Prior studies suggest that inter-rater reliability for US examination is at least moderate for pleural effusions [7,24]. Third, our reported US duration did not include the time required for the machine to be located, transported to the bedside, and subsequently turned on. Thus, the actual time required may vary in the real-world setting. Fourth, resident-performed examinations may be less accurate than those performed by board-certified internists. However, senior trainees at both institutions commonly perform PEs on admission. Fifth, we did not collect information on subsequent patient management decisions, any ensuing procedures, or patient preferences regarding PE vs. US.
In addition, we did not collect information on the admission diagnoses of study participants. For context, at the Calgary site, the most common medical admission diagnoses were: congestive heart failure, alcohol withdrawal, type 1 diabetes mellitus with ketoacidosis, pneumonia, and fluid/electrolyte/acid-base disorders. At the Spokane site, the most common admission diagnoses were: sepsis, acute respiratory failure, cerebral infarction, hypertensive heart and kidney disease, and alcoholic liver disease. Finally, the majority of our reference standards were based on CXR results. CT would be a preferred gold standard. However, sensitivity analysis of our diagnostic parameters did not result in significant changes to our conclusions when limiting our analyses to only those with CT results (data not shown). In detecting pleural effusion, US may be more sensitive than CXR [16,25,26], and could be identifying clinically insignificant effusions. However, US can directly visualize septations and complex effusions, and in that regard, has a theoretical advantage over CXR and CT [27]. Neither of these outcomes was examined in our study, but both should be considered in future studies.

Conclusions
In conclusion, US examination for pleural effusion is more accurate, confers greater confidence, and may be quicker to complete than PE. Thus, in situations where the device is readily accessible and the practitioner is trained to scan and interpret, US examination for pleural effusion would be preferred.