NYUHSL Faculty Bibliography

Searched for:

in-biosketch:yes

person:aphiny01

Total Results:

104

JAMIA open. 2020:3(2):243-251.DOI: 10.1093/jamiaopen/ooaa008

Estimating real-world performance of a predictive model: a case-study in predicting mortality

Major, Vincent J; Jethani, Neil; Aphinyanaphongs, Yindalon

Objective:One primary consideration when developing predictive models is downstream effects on future model performance. We conduct experiments to quantify the effects of experimental design choices, namely cohort selection and internal validation methods, on (estimated) real-world model performance. Materials and Methods:Four years of hospitalizations are used to develop a 1-year mortality prediction model (composite of death or initiation of hospice care). Two common methods to select appropriate patient visits from their encounter history (backwards-from-outcome and forwards-from-admission) are combined with 2 testing cohorts (random and temporal validation). Two models are trained under otherwise identical conditions, and their performances compared. Operating thresholds are selected in each test set and applied to a "real-world" cohort of labeled admissions from another, unused year. Results:[...] = 92 148). Both selection methods produce similar performances when applied to a random test set. However, when applied to the temporally defined "real-world" set, forwards-from-admission yields higher areas under the ROC and precision recall curves (88.3% and 56.5% vs. 83.2% and 41.6%). Discussion:A backwards-from-outcome experiment manipulates raw training data, simplifying the experiment. This manipulated data no longer resembles real-world data, resulting in optimistic estimates of test set performance, especially at high precision. In contrast, a forwards-from-admission experiment with a temporally separated test set consistently and conservatively estimates real-world performance. Conclusion:Experimental design choices impose bias upon selected cohorts. A forwards-from-admission experiment, validated temporally, can conservatively estimate real-world performance. LAY SUMMARY:The routine care of patients stands to benefit greatly from assistive technologies, including data-driven risk assessment.
Already, many different machine learning and artificial intelligence applications are being developed from complex electronic health record data. To overcome challenges that arise from such data, researchers often start with simple experimental approaches to test their work. One key component is how patients (and their healthcare visits) are selected for the study from the pool of all patients seen. Another is how the group of patients used to create the risk estimator differs from the group used to evaluate how well it works. These choices complicate how the experimental setting compares to the real-world application to patients. For example, different selection approaches that depend on each patient's future outcome can simplify the experiment but are impractical upon implementation as these data are unavailable. We show that this kind of "backwards" experiment optimistically estimates how well the model performs. Instead, our results advocate for experiments that select patients in a "forwards" manner and "temporal" validation that approximates training on past data and implementing on future data. More robust results help gauge the clinical utility of recent works and aid decision-making before implementation into practice.

PMCID:7382635

PMID: 32734165

ISSN: 2574-2531

CID: 4540712

JAMA. 2020:324(8):799-801.DOI: 10.1001/jama.2020.13372

Thrombosis in Hospitalized Patients With COVID-19 in a New York City Health System

Bilaloglu, Seda; Aphinyanaphongs, Yin; Jones, Simon; Iturrate, Eduardo; Hochman, Judith; Berger, Jeffrey S

PMCID:7372509

PMID: 32702090

ISSN: 1538-3598

CID: 4532682

Journal of medical Internet research. 2020:22(4).DOI: 10.2196/16848

Development, Implementation, and Evaluation of a Personalized Machine Learning Algorithm for Clinical Decision Support: Case Study With Shingles Vaccination

Chen, Ji; Chokshi, Sara; Hegde, Roshini; Gonzalez, Javier; Iturrate, Eduardo; Aphinyanaphongs, Yin; Mann, Devin

BACKGROUND:Although clinical decision support (CDS) alerts are effective reminders of best practices, their effectiveness is blunted by clinicians who fail to respond to an overabundance of inappropriate alerts. An electronic health record (EHR)-integrated machine learning (ML) algorithm is a potentially powerful tool to increase the signal-to-noise ratio of CDS alerts and positively impact the clinician's interaction with these alerts in general. OBJECTIVE:This study aimed to describe the development and implementation of an ML-based signal-to-noise optimization system (SmartCDS) to increase the signal of alerts by decreasing the volume of low-value herpes zoster (shingles) vaccination alerts. METHODS:We built and deployed SmartCDS, which builds personalized user activity profiles to suppress shingles vaccination alerts unlikely to yield a clinician's interaction. We extracted all records of shingles alerts from January 2017 to March 2019 from our EHR system, including 327,737 encounters, 780 providers, and 144,438 patients. RESULTS:During the 6 weeks of pilot deployment, the SmartCDS system suppressed an average of 43.67% (15,425/35,315) potential shingles alerts (appointments) and maintained stable counts of weekly shingles vaccination orders (326.3 with system active vs 331.3 in the control group; P=.38) and weekly user-alert interactions (1118.3 with system active vs 1166.3 in the control group; P=.20). CONCLUSIONS:All key statistics remained stable while the system was turned on. Although the results are promising, the characteristics of the system can be subject to future data shifts, which require automated logging and monitoring. We demonstrated that an automated, ML-based method and data architecture to suppress alerts are feasible without detriment to overall order rates. This work is the first alert suppression ML-based model deployed in practice and serves as foundational work in encounter-level customization of alert display to maximize effectiveness.

PMID: 32347813

ISSN: 1438-8871

CID: 4412352

Drug & alcohol review. 2020:39(3):205-208.DOI: 10.1111/dar.13048

Detecting illicit opioid content on Twitter

Tofighi, Babak; Aphinyanaphongs, Yindalon; Marini, Christina; Ghassemlou, Shouron; Nayebvali, Peyman; Metzger, Isabel; Raghunath, Ananditha; Thomas, Shailin

INTRODUCTION AND AIMS/OBJECTIVE:This article examines the feasibility of leveraging Twitter to detect posts authored by people who use opioids (PWUO) or content related to opioid use disorder (OUD), and manually develop a multidimensional taxonomy of relevant tweets. DESIGN AND METHODS/METHODS:Twitter messages were collected between June and October 2017 (n = 23 827) and evaluated using an inductive coding approach. Content was then manually classified into two axes (n = 17 420): (i) user experience regarding accessing, using, or recovery from illicit opioids; and (ii) content categories (e.g. policies, medical information, jokes/sarcasm). RESULTS:The most prevalent categories consisted of jokes or sarcastic comments pertaining to OUD, PWUOs or hypothetically using illicit opioids (63%), informational content about treatments for OUD, overdose prevention or accessing self-help groups (20%), and commentary about government opioid policy or news related to opioids (17%). Posts by PWUOs centered on identifying illicit sources for procuring opioids (i.e. online, drug dealers; 49%), symptoms and/or strategies to quell opioid withdrawal symptoms (21%), and combining illicit opioid use with other substances, such as cocaine or benzodiazepines (17%). State and public health experts infrequently posted content pertaining to OUD (1%). DISCUSSION AND CONCLUSIONS/CONCLUSIONS:Twitter offers a feasible approach to identify PWUO. Further research is needed to evaluate the efficacy of Twitter to disseminate evidence-based content and facilitate linkage to treatment and harm reduction services.

PMID: 32202005

ISSN: 1465-3362

CID: 4357472

iScience. 2020.DOI: 10.1016/j.isci.2020.100884

Electronic Cigarette Aerosol Modulates the Oral Microbiome and Increases Risk of Infection

Pushalkar, Smruti; Paul, Bidisha; Li, Qianhao; Yang, Jian; Vasconcelos, Rebeca; Makwana, Shreya; González, Juan Muñoz; Shah, Shivm; Xie, Chengzhi; Janal, Malvin N; Queiroz, Erica; Bederoff, Maria; Leinwand, Joshua; Solarewicz, Julia; Xu, Fangxi; Aboseria, Eman; Guo, Yuqi; Aguallo, Deanna; Gomez, Claudia; Kamer, Angela; Shelley, Donna; Aphinyanaphongs, Yindalon; Barber, Cheryl; Gordon, Terry; Corby, Patricia; Li, Xin; Saxena, Deepak

The trend of e-cigarette use among teens is ever increasing. Here we show the dysbiotic oral microbial ecology in e-cigarette users influencing the local host immune environment compared with non-smoker controls and cigarette smokers. Using 16S rRNA high-throughput sequencing, we evaluated 119 human participants, 40 in each of the three cohorts, and found significantly altered beta-diversity in e-cigarette users (p = 0.006) when compared with never smokers or tobacco cigarette smokers. The abundance of Porphyromonas and Veillonella (p = 0.008) was higher among vapers. Interleukin (IL)-6 and IL-1β were highly elevated in e-cigarette users when compared with non-users. Epithelial cell-exposed e-cigarette aerosols were more susceptible for infection. In vitro infection model of premalignant Leuk-1 and malignant cell lines exposed to e-cigarette aerosol and challenged by Porphyromonas gingivalis and Fusobacterium nucleatum resulted in elevated inflammatory response. Our findings for the first time demonstrate that e-cigarette users are more prone to infection.

PMID: 32105635

ISSN: 2589-0042

CID: 4323572

PLoS one. 2019:14(10).DOI: 10.1371/journal.pone.0223796

Correction: Predicting childhood obesity using electronic health records and publicly available data

Hammond, Robert; Athanasiadou, Rodoniki; Curado, Silvia; Aphinyanaphongs, Yindalon; Abrams, Courtney; Messito, Mary Jo; Gross, Rachel; Katzow, Michelle; Jay, Melanie; Razavian, Narges; Elbel, Brian

[This corrects the article DOI: 10.1371/journal.pone.0215571.].

PMID: 31589654

ISSN: 1932-6203

CID: 4129312

BMJ quality & safety. 2019:28(12):959-962.DOI: 10.1136/bmjqs-2019-009858

Challenges in translating mortality risk to the point of care [Editorial]

Major, Vincent J; Aphinyanaphongs, Yindalon

PMID: 31481481

ISSN: 2044-5423

CID: 4067212

PLoS one. 2019:14(4).DOI: 10.1371/journal.pone.0215571

Predicting childhood obesity using electronic health records and publicly available data

Hammond, Robert; Athanasiadou, Rodoniki; Curado, Silvia; Aphinyanaphongs, Yindalon; Abrams, Courtney; Messito, Mary Jo; Gross, Rachel; Katzow, Michelle; Jay, Melanie; Razavian, Narges; Elbel, Brian

BACKGROUND:Because of the strong link between childhood obesity and adulthood obesity comorbidities, and the difficulty in decreasing body mass index (BMI) later in life, effective strategies are needed to address this condition in early childhood. The ability to predict obesity before age five could be a useful tool, allowing prevention strategies to focus on high risk children. The few existing prediction models for obesity in childhood have primarily employed data from longitudinal cohort studies, relying on difficult to collect data that are not readily available to all practitioners. Instead, we utilized real-world unaugmented electronic health record (EHR) data from the first two years of life to predict obesity status at age five, an approach not yet taken in pediatric obesity research. METHODS AND FINDINGS/RESULTS:We trained a variety of machine learning algorithms to perform both binary classification and regression. Following previous studies demonstrating different obesity determinants for boys and girls, we similarly developed separate models for both groups. In each of the separate models for boys and girls we found that weight for length z-score, BMI between 19 and 24 months, and the last BMI measure recorded before age two were the most important features for prediction. The best performing models were able to predict obesity with an Area Under the Receiver Operator Characteristic Curve (AUC) of 81.7% for girls and 76.1% for boys. CONCLUSIONS:We were able to predict obesity at age five using EHR data with an AUC comparable to cohort-based studies, reducing the need for investment in additional data collection. Our results suggest that machine learning approaches for predicting future childhood obesity using EHR data could improve the ability of clinicians and researchers to drive future policy, intervention design, and the decision-making process in a clinical setting.

PMID: 31009509

ISSN: 1932-6203

CID: 3821342

A Workflow for Visual Diagnostics of Binary Classifiers using Instance-Level Explanations

Chapter by: Krause, Josua; Dasgupta, Aritra; Swartz, Jordan; Aphinyanaphongs, Yindalon; Bertini, Enrico

in: 2017 IEEE Conference on Visual Analytics Science and Technology, VAST 2017 - Proceedings

[S.l.] : Institute of Electrical and Electronics Engineers Inc., 2018

pp. 162-172

ISBN: 9781538631638

CID: 3996622

AMIA ... Annual Symposium proceedings. 2018:2018:1405-1414.

Utility of General and Specific Word Embeddings for Classifying Translational Stages of Research

Major, Vincent; Surkis, Alisa; Aphinyanaphongs, Yindalon

Conventional text classification models make a bag-of-words assumption reducing text into word occurrence counts per document. Recent algorithms such as word2vec are capable of learning semantic meaning and similarity between words in an entirely unsupervised manner using a contextual window and doing so much faster than previous methods. Each word is projected into vector space such that similar meaning words such as "strong" and "powerful" are projected into the same general Euclidean space. Open questions about these embeddings include their utility across classification tasks and the optimal properties and source of documents to construct broadly functional embeddings. In this work, we demonstrate the usefulness of pre-trained embeddings for classification in our task and demonstrate that custom word embeddings, built in the domain and for the tasks, can improve performance over word embeddings learnt on more general data including news articles or Wikipedia.

PMCID:6371342

PMID: 30815185

ISSN: 1942-597X

CID: 3698512