Searched for: in-biosketch:true person:oermae01
Total Results: 119

Revised Cardiac Risk Index versus ASA Status as a Predictor for Noncardiac Events After Posterior Lumbar Decompression

Bronheim, Rachel S; Oermann, Eric K; Bronheim, David S; Caridi, John M
BACKGROUND:The Revised Cardiac Risk Index (RCRI) was designed to predict risk for cardiac events after noncardiac surgery. However, there is a paucity of literature that directly addresses the relationship between RCRI and noncardiac outcomes after posterior lumbar decompression (PLD). The objective of this study is to determine the ability of RCRI to predict noncardiac adverse events after PLD. METHODS:The American College of Surgeons National Surgical Quality Improvement Program was used to identify patients undergoing PLD from 2006 to 2014. Multivariate and receiver operating characteristic analyses were used to identify associations between RCRI and postoperative complications. RESULTS:A total of 52,066 patients met the inclusion criteria. Membership in the RCRI=1 cohort independently predicted unplanned intubation, ventilation >48 hours, progressive renal insufficiency, acute renal failure, urinary tract infection (UTI), sepsis, septic shock, and readmission. Membership in the RCRI=2 cohort independently predicted superficial surgical site infection, pneumonia, unplanned intubation, ventilation >48 hours, bleeding transfusion, progressive renal insufficiency, acute renal failure, UTI, sepsis, septic shock, and readmission. Membership in the RCRI=3 cohort independently predicted unplanned intubation (odds ratio [OR], 11.8), ventilation >48 hours (OR, 23.0), acute renal failure (OR, 84.5), and UTI (OR, 3.6). RCRI had a poor discriminative ability (DA) (area under the curve = 0.623), and American Society of Anesthesiologists status had a fair DA (area under the curve = 0.770) for predicting a composite of noncardiac complications. CONCLUSIONS:RCRI was predictive of a wide range of noncardiac complications after PLD but had a lower DA for predicting a composite of any noncardiac complication than did the American Society of Anesthesiologists score. Consideration of the RCRI as a component of preoperative surgical risk stratification can minimize patient morbidity and mortality after lumbar decompression.
PMID: 30218801
ISSN: 1878-8769
CID: 4491412
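
The analysis above rests on two ingredients: the discriminative ability (AUC) of an ordinal score for a composite outcome, and logistic models that yield per-stratum odds ratios. The following is a minimal sketch of that kind of comparison; the CSV file, DataFrame, and column names (rcri, asa_class, any_complication, unplanned_intubation, age) are hypothetical placeholders, not the authors' NSQIP extraction.

```python
# Illustrative sketch only: file and column names are hypothetical placeholders.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

df = pd.read_csv("pld_cohort.csv")  # hypothetical: one row per patient

# Discriminative ability of each ordinal score for a composite complication.
print("RCRI AUC:", roc_auc_score(df["any_complication"], df["rcri"]))
print("ASA AUC: ", roc_auc_score(df["any_complication"], df["asa_class"]))

# Odds ratios for RCRI strata (RCRI=0 as reference), adjusting for an example covariate.
X = pd.get_dummies(df["rcri"].clip(upper=3), prefix="rcri", drop_first=True)
X["age"] = df["age"]
X = sm.add_constant(X.astype(float))
fit = sm.Logit(df["unplanned_intubation"], X).fit(disp=0)
print(np.exp(fit.params).round(2))  # exponentiated coefficients = odds ratios
```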

Predicting Surgical Complications in Adult Patients Undergoing Anterior Cervical Discectomy and Fusion Using Machine Learning

Arvind, Varun; Kim, Jun S; Oermann, Eric K; Kaji, Deepak; Cho, Samuel K
OBJECTIVE:Machine learning algorithms excel at leveraging big data to identify complex patterns that can be used to aid in clinical decision-making. The objective of this study is to demonstrate the performance of machine learning models in predicting postoperative complications following anterior cervical discectomy and fusion (ACDF). METHODS:Artificial neural network (ANN), logistic regression (LR), support vector machine (SVM), and random forest decision tree (RF) models were trained on a multicenter data set of patients undergoing ACDF to predict surgical complications based on readily available patient data. Following training, these models were compared to the predictive capability of American Society of Anesthesiologists (ASA) physical status classification. RESULTS:A total of 20,879 patients were identified as having undergone ACDF. After applying exclusion criteria, patients were divided into a training set of 14,615 patients and a testing set of 6,264 patients. ANN and LR consistently outperformed ASA physical status classification in predicting every complication (p < 0.05). The ANN outperformed LR in predicting venous thromboembolism, wound complication, and mortality (p < 0.05). The SVM and RF models were no better than random chance at predicting any of the postoperative complications (p < 0.05). CONCLUSIONS:ANN and LR algorithms outperform ASA physical status classification for predicting individual postoperative complications. Additionally, neural networks have greater sensitivity than LR when predicting mortality and wound complications. With the growing size of medical data, training machine learning models on these large datasets promises to improve risk prognostication, and the ability of these models to learn continuously makes them excellent tools in complex clinical scenarios.
PMCID:6347343
PMID: 30554505
ISSN: 2586-6583
CID: 4491452
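
The abstract describes training four model families on tabular preoperative data and benchmarking them against ASA class by AUC. A minimal scikit-learn sketch of that workflow follows; the CSV file, feature list, and outcome column are invented placeholders rather than the study's multicenter data set.

```python
# Illustrative sketch of the model comparison; data and column names are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

df = pd.read_csv("acdf_cohort.csv")  # hypothetical: one row per patient
features = ["age", "sex", "bmi", "asa_class", "diabetes", "smoker"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["any_complication"], test_size=0.3, random_state=0)

models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "ANN": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# ASA class alone serves as the benchmark "score" for comparison.
print("ASA AUC:", roc_auc_score(y_test, X_test["asa_class"]))
```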

Predicting Surgical Complications in Patients Undergoing Elective Adult Spinal Deformity Procedures Using Machine Learning

Kim, Jun S; Arvind, Varun; Oermann, Eric K; Kaji, Deepak; Ranson, Will; Ukogu, Chierika; Hussain, Awais K; Caridi, John; Cho, Samuel K
STUDY DESIGN:Cross-sectional database study. OBJECTIVE:To train and validate machine learning models to identify risk factors for complications following surgery for adult spinal deformity (ASD). SUMMARY OF BACKGROUND DATA:Machine learning models such as logistic regression (LR) and artificial neural networks (ANNs) are valuable tools for analyzing and interpreting large and complex data sets. ANNs have yet to be used for risk factor analysis in orthopedic surgery. METHODS:The American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) database was queried for patients who underwent surgery for ASD. This query returned 4,073 patients, whose data were used to train and evaluate our models. The predictive variables used included sex, age, ethnicity, diabetes, smoking, steroid use, coagulopathy, functional status, American Society of Anesthesiologists (ASA) class >3, body mass index (BMI), pulmonary comorbidities, and cardiac comorbidities. The models were used to predict cardiac complications, wound complications, venous thromboembolism (VTE), and mortality. Using ASA class as a benchmark for prediction, the area under the receiver operating characteristic curve (AUC) was used to determine the accuracy of our machine learning models. RESULTS:The mean age of patients was 59.5 years. Forty-one percent of patients were male and 59.0% were female. ANN and LR outperformed ASA scoring in predicting every complication (p<.05). The ANN outperformed LR in predicting cardiac complications, wound complications, and mortality (p<.05). CONCLUSIONS:Machine learning algorithms outperform ASA scoring for predicting individual risk prognosis. The ANN also outperformed LR in predicting individual risk for all complications except VTE. With the growing size of medical data, training machine learning models on these large data sets promises to improve risk prognostication, and the ability of these models to learn continuously makes them excellent tools in complex clinical scenarios. LEVEL OF EVIDENCE:Level III.
PMID: 30348356
ISSN: 2212-1358
CID: 4491432
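
Both this study and the ACDF study report that differences in AUC between models and the ASA benchmark reach statistical significance. One common, library-free way to test such a difference (used here in place of DeLong's test, which has no standard scikit-learn implementation) is a paired bootstrap over the held-out test set. The sketch below assumes hypothetical arrays y_test, ann_prob, and asa_score from an already fitted model.

```python
# Paired bootstrap comparison of two AUCs on the same test set (illustrative only).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def bootstrap_auc_diff(y, p_model, p_benchmark, n_boot=2000):
    """Return mean and 95% CI of AUC(model) - AUC(benchmark) under resampling."""
    y, p_model, p_benchmark = map(np.asarray, (y, p_model, p_benchmark))
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))   # resample patients with replacement
        if len(np.unique(y[idx])) < 2:          # skip resamples with only one class
            continue
        diffs.append(roc_auc_score(y[idx], p_model[idx]) -
                     roc_auc_score(y[idx], p_benchmark[idx]))
    diffs = np.array(diffs)
    return diffs.mean(), np.percentile(diffs, [2.5, 97.5])

# Example usage (hypothetical arrays from a fitted model):
# mean_diff, ci = bootstrap_auc_diff(y_test, ann_prob, asa_score)
```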

Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study

Zech, John R; Badgeley, Marcus A; Liu, Manway; Costa, Anthony B; Titano, Joseph J; Oermann, Eric Karl
BACKGROUND:There is interest in using convolutional neural networks (CNNs) to analyze medical imaging to provide computer-aided diagnosis (CAD). Recent work has suggested that image classification CNNs may not generalize to new data as well as previously believed. We assessed how well CNNs generalized across three hospital systems for a simulated pneumonia screening task. METHODS AND FINDINGS:A cross-sectional design with multiple model training cohorts was used to evaluate model generalizability to external sites using split-sample validation. A total of 158,323 chest radiographs were drawn from three institutions: National Institutes of Health Clinical Center (NIH; 112,120 from 30,805 patients), Mount Sinai Hospital (MSH; 42,396 from 12,904 patients), and Indiana University Network for Patient Care (IU; 3,807 from 3,683 patients). These patient populations had a mean (SD) age of 46.9 years (16.6), 63.2 years (16.5), and 49.6 years (17) and a female percentage of 43.5%, 44.8%, and 57.3%, respectively. We assessed individual models using the area under the receiver operating characteristic curve (AUC) for radiographic findings consistent with pneumonia and compared performance on different test sets with DeLong's test. The prevalence of pneumonia was high enough at MSH (34.2%) relative to NIH and IU (1.2% and 1.0%) that merely sorting by hospital system achieved an AUC of 0.861 (95% CI 0.855-0.866) on the joint MSH-NIH dataset. Models trained on data from either NIH or MSH had equivalent performance on IU (P values 0.580 and 0.273, respectively) and inferior performance on data from each other relative to an internal test set (i.e., new data from within the hospital system used for training data; P values both <0.001). The highest internal performance was achieved by combining training and test data from MSH and NIH (AUC 0.931, 95% CI 0.927-0.936), but this model demonstrated significantly lower external performance at IU (AUC 0.815, 95% CI 0.745-0.885, P = 0.001). To test the effect of pooling data from sites with disparate pneumonia prevalence, we used stratified subsampling to generate MSH-NIH cohorts that only differed in disease prevalence between training data sites. When both training data sites had the same pneumonia prevalence, the model performed consistently on external IU data (P = 0.88). When a 10-fold difference in pneumonia rate was introduced between sites, internal test performance improved compared to the balanced model (10× MSH risk P < 0.001; 10× NIH P = 0.002), but this outperformance failed to generalize to IU (MSH 10× P < 0.001; NIH 10× P = 0.027). CNNs were able to directly detect the hospital system of a radiograph for 99.95% of NIH (22,050/22,062) and 99.98% of MSH (8,386/8,388) radiographs. The primary limitation of our approach and the available public data is that we cannot fully assess what other factors might be contributing to hospital system-specific biases. CONCLUSION:Pneumonia-screening CNNs achieved better internal than external performance in 3 out of 5 natural comparisons. When models were trained on pooled data from sites with different pneumonia prevalence, they performed better on new pooled data from these sites but not on external data. CNNs robustly identified hospital system and department within a hospital, which can have large differences in disease burden and may confound predictions.
PMCID:6219764
PMID: 30399157
ISSN: 1549-1676
CID: 4491442
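
A key methodological step above is stratified subsampling, so that each training site contributes a cohort with a chosen pneumonia prevalence. A small pandas sketch of that idea is below; the DataFrame radiographs and its site and pneumonia columns are hypothetical stand-ins for the study data, not the released cohorts.

```python
# Illustrative prevalence-matched subsampling; data and column names are hypothetical.
import pandas as pd

def subsample_to_prevalence(df, target, random_state=0):
    """Keep all positives and draw enough negatives to hit the target prevalence."""
    pos = df[df["pneumonia"] == 1]
    neg = df[df["pneumonia"] == 0]
    n_neg = int(len(pos) * (1 - target) / target)
    n_neg = min(n_neg, len(neg))
    return pd.concat([pos, neg.sample(n=n_neg, random_state=random_state)])

# Example usage (hypothetical DataFrame with one row per radiograph):
# cohorts = {site: subsample_to_prevalence(group, target=0.10)
#            for site, group in radiographs.groupby("site")}
```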

Lumbar Lordosis Correction with Interbody Fusion: Systematic Literature Review and Analysis

Rothrock, Robert J; McNeill, Ian T; Yaeger, Kurt; Oermann, Eric K; Cho, Samuel K; Caridi, John M
OBJECTIVE:The goal of this study was to conduct an evidence-based quantitative assessment of the correction of lumbar lordosis achieved by each of the 3 principal lumbar interbody fusion techniques: anterior lumbar interbody fusion (ALIF), lateral lumbar interbody fusion (L-LIF), and transforaminal lumbar interbody fusion (TLIF). METHODS:A systematic review of the literature was conducted to identify studies containing degrees of correction of lumbar lordosis achieved by ALIF, L-LIF, and TLIF as shown on standing lumbar radiography at least 6 weeks after surgical intervention. Pooled and Forest plot analyses were performed for the studies that met inclusion criteria. RESULTS:For ALIF, 21 studies were identified with mean correction 4.67° (standard deviation [SD] ± 4.24) and median correction 5.20°. Fifteen studies were identified that met criteria for Forest plot analysis with mean correction 4.90° (standard error of the mean [SEM] ± 0.40). For L-LIF, 17 studies were identified with mean correction 4.47° (SD ± 4.80) and median correction 4.00°. Nine studies were identified that met criteria for Forest plot analysis with mean correction 2.91° (SEM ± 0.56). For TLIF, 31 studies were identified with mean correction 3.89° (SD ± 4.33) and median correction 3.50°. Twenty-five studies were identified that met criteria for Forest plot analysis with mean correction 5.33° (SEM ± 0.27). CONCLUSIONS:We present the current evidence-based mean correction for each of the 3 principal lumbar interbody fusion techniques based on standing radiographic data.
PMID: 29981462
ISSN: 1878-8769
CID: 4491382
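
The Forest-plot analyses above pool per-study mean corrections, and a standard way to do this is inverse-variance (fixed-effect) weighting, where each study's mean is weighted by the inverse of its squared standard error. A minimal sketch with invented study values (not the reviewed ALIF, L-LIF, or TLIF data) follows.

```python
# Fixed-effect (inverse-variance) pooling of per-study mean corrections.
# The study values below are invented placeholders for illustration only.
import numpy as np

means = np.array([4.2, 5.1, 6.0])  # per-study mean lordosis correction (degrees)
sems = np.array([0.8, 0.5, 1.1])   # per-study standard error of the mean

weights = 1.0 / sems**2
pooled_mean = np.sum(weights * means) / np.sum(weights)
pooled_sem = np.sqrt(1.0 / np.sum(weights))
print(f"pooled correction = {pooled_mean:.2f} degrees (SEM {pooled_sem:.2f})")
```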

Automated deep-neural-network surveillance of cranial images for acute neurologic events

Titano, Joseph J; Badgeley, Marcus; Schefflein, Javin; Pain, Margaret; Su, Andres; Cai, Michael; Swinburne, Nathaniel; Zech, John; Kim, Jun; Bederson, Joshua; Mocco, J; Drayer, Burton; Lehar, Joseph; Cho, Samuel; Costa, Anthony; Oermann, Eric K
Rapid diagnosis and treatment of acute neurological illnesses such as stroke, hemorrhage, and hydrocephalus are critical to achieving positive outcomes and preserving neurologic function: 'time is brain' [1-5]. Although these disorders are often recognizable by their symptoms, the critical means of their diagnosis is rapid imaging [6-10]. Computer-aided surveillance of acute neurologic events in cranial imaging has the potential to triage radiology workflow, thus decreasing time to treatment and improving outcomes. Substantial clinical work has focused on computer-assisted diagnosis (CAD), whereas technical work in volumetric image analysis has focused primarily on segmentation. 3D convolutional neural networks (3D-CNNs) have primarily been used for supervised classification on 3D modeling and light detection and ranging (LiDAR) data [11-15]. Here, we demonstrate a 3D-CNN architecture that performs weakly supervised classification to screen head CT images for acute neurologic events. Features were automatically learned from a clinical radiology dataset comprising 37,236 head CTs and were annotated with a semisupervised natural-language processing (NLP) framework [16]. We demonstrate the effectiveness of our approach to triage radiology workflow and accelerate the time to diagnosis from minutes to seconds through a randomized, double-blinded, prospective trial in a simulated clinical environment.
PMID: 30104767
ISSN: 1546-170x
CID: 4491402
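
For readers unfamiliar with the building blocks, a toy 3D-CNN that maps a CT volume to a single volume-level logit (the weak-label setting described above) can be written in a few lines of PyTorch. This is an illustrative architecture only, not the network reported in the paper.

```python
# Toy 3D-CNN for volume-level binary screening; architecture and input size are
# illustrative placeholders, not the published model.
import torch
import torch.nn as nn

class Tiny3DCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.AdaptiveAvgPool3d(1),        # one pooled feature vector per volume
        )
        self.classifier = nn.Linear(32, 1)  # volume-level "critical finding" logit

    def forward(self, x):                   # x: (batch, 1, depth, height, width)
        return self.classifier(self.features(x).flatten(1))

logits = Tiny3DCNN()(torch.randn(2, 1, 32, 64, 64))  # e.g., two toy CT volumes
print(logits.shape)  # torch.Size([2, 1])
```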

Examining the Ability of Artificial Neural Networks Machine Learning Models to Accurately Predict Complications Following Posterior Lumbar Spine Fusion

Kim, Jun S; Merrill, Robert K; Arvind, Varun; Kaji, Deepak; Pasik, Sara D; Nwachukwu, Chuma C; Vargas, Luilly; Osman, Nebiyu S; Oermann, Eric K; Caridi, John M; Cho, Samuel K
STUDY DESIGN:A cross-sectional database study. OBJECTIVE:The aim of this study was to train and validate machine learning models to identify risk factors for complications following posterior lumbar spine fusion. SUMMARY OF BACKGROUND DATA:Machine learning models such as artificial neural networks (ANNs) are valuable tools for analyzing and interpreting large and complex datasets. ANNs have yet to be used for risk factor analysis in orthopedic surgery. METHODS:The American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) database was queried for patients who underwent posterior lumbar spine fusion. This query returned 22,629 patients, 70% of whom were used to train our models, and 30% were used to evaluate the models. The predictive variables used included sex, age, ethnicity, diabetes, smoking, steroid use, coagulopathy, functional status, American Society of Anesthesiologists (ASA) class ≥3, body mass index (BMI), pulmonary comorbidities, and cardiac comorbidities. The models were used to predict cardiac complications, wound complications, venous thromboembolism (VTE), and mortality. Using ASA class as a benchmark for prediction, the area under the receiver operating characteristic curve (AUC) was used to determine the accuracy of our machine learning models. RESULTS:On the basis of AUC values, ANN and LR both outperformed ASA class for predicting all four types of complications. ANN was the most accurate for predicting cardiac complications, and LR was most accurate for predicting wound complications, VTE, and mortality, though ANN and LR had comparable AUC values for predicting all types of complications. ANN had greater sensitivity than LR for detecting wound complications and mortality. CONCLUSION:Machine learning models in the form of logistic regression and ANNs were more accurate than benchmark ASA scores for identifying risk factors for developing complications following posterior lumbar spine fusion, suggesting that they are potentially valuable tools for risk factor analysis in spine surgery. LEVEL OF EVIDENCE:3.
PMCID:6252089
PMID: 29016439
ISSN: 1528-1159
CID: 4491362
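
The comparison above also turns on sensitivity, which requires choosing a probability threshold for each model's predictions. A minimal sketch of computing sensitivity and specificity from a confusion matrix is below; the label and probability arrays are toy placeholders rather than the study's held-out predictions.

```python
# Sensitivity and specificity at a fixed probability threshold (toy data only).
import numpy as np
from sklearn.metrics import confusion_matrix

y_test = np.array([0, 1, 0, 1, 1, 0, 1, 0])                    # placeholder labels
ann_prob = np.array([0.2, 0.7, 0.4, 0.9, 0.3, 0.1, 0.8, 0.6])  # placeholder probabilities

y_pred = (ann_prob >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print(f"sensitivity={tp / (tp + fn):.3f}  specificity={tn / (tn + fp):.3f}")
```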

Trends and Disparities in Cervical Spine Fusion Procedures Utilization in the New York State

Feng, Rui; Finkelstein, Mark; Bilal, Khawaja; Oermann, Eric K; Palese, Michael; Caridi, John
STUDY DESIGN:A retrospective review of the Statewide Planning and Research Cooperative System database of New York State. OBJECTIVE:This study examined the rate of increase of cervical spine fusion procedures at low-, medium-, and high-volume hospitals, and analyzed racial and socioeconomic characteristics of the patient population treated at these three volume categories. SUMMARY OF BACKGROUND DATA:There has been a steady increase in spinal fusion procedures performed each year in the United States, especially cervical and lumbar fusion. Our study aims to analyze the rate of increase at low-, medium-, and high-volume hospitals, and socioeconomic characteristics of the patient populations at these three volume categories. METHODS:The New York State, Statewide Planning and Research Cooperative System (SPARCS) database was searched from 2005 to 2014 for the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) Procedure Codes 81.01 (Fusion, atlas-axis), 81.02 (Fusion, anterior column, other cervical, anterior technique), and 81.03 (Fusion, posterior column, other cervical, posterior technique). Patients' primary diagnosis (ICD-9-CM), age, race/ethnicity, primary payment method, severity of illness, length of stay, and hospital of operation were included. All 122 hospitals were categorized into high-, medium-, and low-volume groups. Trends in the annual number of cervical spine fusion surgeries in each of the three hospital volume groups were reported using descriptive statistics. RESULTS:Low-volume centers were more likely to be rural and non-teaching hospitals. African American patients comprised a greater proportion of patients at low-volume hospitals than at high-volume hospitals (15.1% versus 11.6%). Medicaid and self-pay patients were also overrepresented at low-volume centers (6.7% and 3.9% versus 2.6% and 1.7%, respectively). Compared with Caucasian patients, African American patients had higher rates of postoperative infection (P = 0.0020) and postoperative bleeding (P = 0.0044). Compared with privately insured patients, Medicaid patients had a higher rate of postoperative bleeding (P = 0.0266) and in-hospital mortality (P = 0.0031). CONCLUSION:Our results showed significant differences in hospital characteristics, racial distribution, and primary payment methods between the low- and high-volume categories. African American and Medicaid patients had higher rates of postoperative bleeding, despite similar rates between the three volume categories. This suggests that racial and socioeconomic disparities remain problematic for disadvantaged populations, some of which may be attributed to differences in access to care at high-volume centers. LEVEL OF EVIDENCE:3.
PMID: 29016436
ISSN: 1528-1159
CID: 4491352
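
The volume-group construction and categorical comparisons described above are straightforward to express with pandas and SciPy. The sketch below is purely illustrative; the CSV file and column names (hospital_id, race, postop_bleeding) are hypothetical, not the SPARCS extraction.

```python
# Illustrative volume-tertile grouping and chi-squared comparison (hypothetical data).
import pandas as pd
from scipy.stats import chi2_contingency

procedures = pd.read_csv("sparcs_cervical_fusion.csv")  # hypothetical: one row per case

# Rank hospitals by case volume and cut into low/medium/high tertiles.
volume = procedures.groupby("hospital_id").size()
tertile = pd.qcut(volume, 3, labels=["low", "medium", "high"])
procedures["volume_group"] = procedures["hospital_id"].map(tertile)

# Chi-squared test: does a complication rate differ across race groups?
table = pd.crosstab(procedures["race"], procedures["postop_bleeding"])
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p:.4f}")
```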

Survival of Patients With Multiple Intracranial Metastases Treated With Stereotactic Radiosurgery: Does the Number of Tumors Matter?

Knoll, Miriam A; Oermann, Eric K; Yang, Andrew I; Paydar, Ima; Steinberger, Jeremy; Collins, Brian; Collins, Sean; Ewend, Matthew; Kondziolka, Douglas
BACKGROUND: Defining prognostic factors is a crucial initial step for determining the management of patients with brain metastases. Randomized trials assessing radiosurgery have commonly limited inclusion criteria to 1 to 4 brain metastases, in part due to multiple retrospective studies reporting on the number of brain metastases as a prognostic indicator. The present study reports on the survival of patients with 1 to 4 versus ≥5 brain metastases treated with radiosurgery. METHODS: We evaluated a retrospective multi-institutional database of 1523 brain metastases in 507 patients who were treated with radiosurgery (Gamma Knife or CyberKnife) between 2001 and 2014. A total of 243 patients were included in the analysis. Patients with 1 to 4 brain metastases were compared with patients with ≥5 brain metastases using a standard statistical analysis. Cox hazard regression was used to construct a multivariable model of overall survival (OS). To find the covariates that best separate the data at each split, a machine learning technique, the Chi-squared Automated Interaction Detection (CHAID) tree, was used. RESULTS: On Pearson correlation, systemic disease status, number of intracranial metastases, and overall burden of disease (number of major involved organ systems) were found to be highly correlated (P<0.001). Patients with 1 to 4 metastases had a median OS of 10.8 months (95% confidence interval, 6.1-15.6 mo), compared with a median OS of 8.5 months (95% confidence interval, 4.4-12.6 mo) for patients with ≥5 metastases (P=0.143). The actuarial 6-month local failure rate was 5% for patients with 1 to 4 metastases versus 3.2% for patients with ≥5 metastases (P=0.404). There was a significant difference in systemic disease status between the 2 groups; 30% of patients in the <5 lesions group had controlled systemic disease, versus 8% in the ≥5 lesions group (P=0.005). Patients with 1 to 4 metastases did not have significantly improved OS in a multivariable model adjusting for systemic disease status, systemic extracranial metastases, and other key variables. The CHAID algorithm consistently identified performance status and systemic disease status as key to disease classification, but not the number of intracranial metastases. CONCLUSIONS: Although the number of brain metastases has previously been accepted as an independent prognostic indicator, our multicenter analysis demonstrates that the number of intracranial metastases is highly correlated with overall disease burden and clinical status. Proper matching and controlling for these other determinants of survival demonstrates that the number of intracranial metastases alone is not an independent predictive factor, but rather a surrogate for other clinical factors.
PMID: 27258677
ISSN: 1537-453x
CID: 2125292
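
The survival analysis above combines group-wise median overall survival with a multivariable Cox model. A minimal lifelines sketch of that pattern is shown here; the DataFrame and column names (os_months, death, n_mets, systemic_controlled, kps) are hypothetical, not the multi-institutional registry.

```python
# Illustrative Kaplan-Meier and Cox analysis; file and column names are hypothetical.
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter

df = pd.read_csv("srs_cohort.csv")  # hypothetical: one row per patient

# Median overall survival for 1-4 versus >=5 brain metastases.
for is_high, group in df.groupby(df["n_mets"] >= 5):
    km = KaplanMeierFitter().fit(group["os_months"], group["death"])
    print(">=5 mets" if is_high else "1-4 mets", "median OS:", km.median_survival_time_)

# Multivariable Cox model adjusting for systemic disease and performance status.
cph = CoxPHFitter()
cph.fit(df[["os_months", "death", "n_mets", "systemic_controlled", "kps"]],
        duration_col="os_months", event_col="death")
cph.print_summary()
```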

Natural Language-based Machine Learning Models for the Annotation of Clinical Radiology Reports

Zech, John; Pain, Margaret; Titano, Joseph; Badgeley, Marcus; Schefflein, Javin; Su, Andres; Costa, Anthony; Bederson, Joshua; Lehar, Joseph; Oermann, Eric Karl
Purpose: To compare different methods for generating features from radiology reports and to develop a method to automatically identify findings in these reports. Materials and Methods: In this study, 96,303 head computed tomography (CT) reports were obtained. The linguistic complexity of these reports was compared with that of alternative corpora. Head CT reports were preprocessed, and machine-analyzable features were constructed by using bag-of-words (BOW), word embedding, and latent Dirichlet allocation-based approaches. Ultimately, 1004 head CT reports were manually labeled for findings of interest by physicians, and a subset of these were deemed critical findings. Lasso logistic regression was used to train models for physician-assigned labels on 602 of 1004 head CT reports (60%) using the constructed features, and the performance of these models was validated on a held-out 402 of 1004 reports (40%). Models were scored by area under the receiver operating characteristic curve (AUC), and aggregate AUC statistics were reported for (a) all labels, (b) critical labels, and (c) the presence of any critical finding in a report. Sensitivity, specificity, accuracy, and F1 score were reported for the best-performing model's (a) predictions of all labels and (b) identification of reports containing critical findings. Results: The best-performing model (BOW with unigrams, bigrams, and trigrams plus average word embeddings vector) had a held-out AUC of 0.966 for identifying the presence of any critical head CT finding and an average 0.957 AUC across all head CT findings. Sensitivity and specificity for identifying the presence of any critical finding were 92.59% (175 of 189) and 89.67% (191 of 213), respectively. Average sensitivity and specificity across all findings were 90.25% (1898 of 2103) and 91.72% (18,351 of 20,007), respectively. Simpler BOW methods achieved results competitive with those of more sophisticated approaches, with an average AUC for presence of any critical finding of 0.951 for unigram BOW versus 0.966 for the best-performing model. The Yule I of the head CT corpus was 34, markedly lower than that of the Reuters corpus (at 103) or I2B2 discharge summaries (at 271), indicating lower linguistic complexity. Conclusion: Automated methods can be used to identify findings in radiology reports. The success of this approach benefits from the standardized language of these reports. With this method, a large labeled corpus can be generated for applications such as deep learning.
PMID: 29381109
ISSN: 1527-1315
CID: 4491372
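
The core pipeline above (n-gram bag-of-words features fed to an L1-regularized logistic regression, evaluated by held-out AUC) is easy to sketch with scikit-learn. The reports and labels below are toy placeholders, not the labeled head CT corpus, and the exact preprocessing and embedding steps of the study are not reproduced.

```python
# Illustrative BOW (unigrams-trigrams) + lasso logistic regression pipeline on toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

reports = ["no acute intracranial hemorrhage", "large right mca territory infarct",
           "acute subdural hematoma with midline shift", "unremarkable head ct"]
labels = [0, 1, 1, 0]  # toy "critical finding" labels

X_train, X_test, y_train, y_test = train_test_split(
    reports, labels, test_size=0.5, random_state=0, stratify=labels)

vectorizer = CountVectorizer(ngram_range=(1, 3))          # unigrams through trigrams
clf = LogisticRegression(penalty="l1", solver="liblinear")  # lasso-style regularization
clf.fit(vectorizer.fit_transform(X_train), y_train)

probs = clf.predict_proba(vectorizer.transform(X_test))[:, 1]
print("held-out AUC:", roc_auc_score(y_test, probs))
```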