From Sour Grapes to Low-Hanging Fruit: A Case Study Demonstrating a Practical Strategy for Natural Language Processing Portability
Johnson, Stephen B; Adekkanattu, Prakash; Campion, Thomas R; Flory, James; Pathak, Jyotishman; Patterson, Olga V; DuVall, Scott L; Major, Vincent; Aphinyanaphongs, Yindalon
Natural Language Processing (NLP) holds potential for patient care and clinical research, but a gap exists between promise and reality. While some studies have demonstrated portability of NLP systems across multiple sites, challenges remain. Strategies to mitigate these challenges can strive for complex NLP problems using advanced methods (hard-to-reach fruit), or focus on simple NLP problems using practical methods (low-hanging fruit). This paper investigates a practical strategy for NLP portability using extraction of left ventricular ejection fraction (LVEF) as a use case. We used a tool developed at the Department of Veterans Affairs (VA) to extract the LVEF values from free-text echocardiograms in the MIMIC-III database. The approach showed an accuracy of 98.4%, sensitivity of 99.4%, a positive predictive value of 98.7%, and F-score of 99.0%. This experience, in which a simple NLP solution proved highly portable with excellent performance, illustrates the point that simple NLP applications may be easier to disseminate and adapt, and in the short term may prove more useful, than complex applications.
PMCID:5961788
PMID: 29888051
ISSN: 2153-4063
CID: 3154942
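As a quick consistency check on the figures reported in the abstract above, the F-score can be recomputed from the stated positive predictive value and sensitivity. This is a minimal sketch that assumes the abstract's "F-score" is the balanced F1 (harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name `f_score` is illustrative only.

```python
def f_score(precision, recall):
    """Balanced F1: harmonic mean of precision (PPV) and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)

# Values reported in the abstract: PPV 98.7%, sensitivity 99.4%.
f1 = f_score(0.987, 0.994)
print(f"{f1:.3f}")  # 0.990, in line with the reported F-score of 99.0%
```

Under that assumption, the recomputed value agrees with the reported one to the precision given.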
Text message reminders for improving patient appointment adherence in an office-based buprenorphine program: A feasibility study
Tofighi, Babak; Grazioli, Frank; Bereket, Sewit; Grossman, Ellie; Aphinyanaphongs, Yindalon; Lee, Joshua David
BACKGROUND AND OBJECTIVES: Missed visits are common in office-based buprenorphine treatment (OBOT). The feasibility of text message (TM) appointment reminders among OBOT patients is unknown. METHODS: This 6-month prospective cohort study provided TM reminders to OBOT program patients (N = 93). A feasibility survey was completed following delivery of TM reminders and at 6 months. RESULTS: Respondents reported that the reminders should be provided to all OBOT patients (100%) and helped them to adhere to their scheduled appointment (97%). At 6 months, there were no reports of intrusion to their privacy or disruption of daily activities due to the TM reminders. Most participants reported that the TM reminders were helpful in adhering to scheduled appointments (95%), that the reminders should be offered to all clinic patients (95%), and favored receiving only TM reminders rather than telephone reminders (95%). Barriers to adhering to scheduled appointment times included transportation difficulties (34%), not being able to take time off from school or work (31%), long clinic wait-times (9%), being hospitalized or sick (8%), feeling sad or depressed (6%), and child care (6%). CONCLUSIONS: This study demonstrated the acceptability and feasibility of TM appointment reminders in OBOT. Older age and longer duration in buprenorphine treatment did not diminish interest in receiving the TM intervention. Although OBOT patients expressed concern regarding the privacy of TM content sent from their providers, privacy issues were uncommon among this cohort. SCIENTIFIC SIGNIFICANCE: Findings from this study highlighted patient barriers to adherence to scheduled appointments. These barriers included transportation difficulties (34%), not being able to take time off from school or work (31%), long clinic lines (9%), and other factors that may confound the effect of future TM appointment reminder interventions.
Further research is also required to assess 1) the level of system changes required to integrate TM appointment reminder tools with already existing electronic medical records and appointment records software; 2) acceptability among clinicians and administrators; and 3) financial and resource constraints to healthcare systems. (Am J Addict 2017;XX:1-6).
PMID: 28799677
ISSN: 1521-0391
CID: 2664212
Using natural language processing to automate grading of student's patient notes: A pilot study of machine learning text classification [Meeting Abstract]
Kalet, A; Oh, S -Y; Marin, M; Yu, Y; Dumorne, H; Aphinyanaphongs, Y
BACKGROUND: At NYU, as part of a comprehensive objective structured clinical skills exam, experienced medical educators judge clinical knowledge, decision-making, and clinical reasoning skills of trainees based on their patient notes. Despite being rubric-driven, this task requires tremendous time and effort to establish consistent scoring, delaying and limiting individualized feedback. We conducted pilot machine learning text classification studies to establish whether accurate automated scoring of clinical notes is possible. METHODS: As a use case, we tested 100 student-written clinical notes from 7 standardized patient cases (Vision Loss, Tel Diarrhea, Difficulty Sleeping, Shoulder Pain, Failure To Thrive, Abdominal Pain, Palpitations) that had been scored for quality of clinical reasoning by faculty on a 1-4 scale. To assess the performance of NLP strategies in categorizing students into meaningful groups, we dichotomized students based on their faculty-given scores by case into "failing" (score of 1, 5-18 students per case) and "passing" (score of 2, 3, or 4). We treated each task as a binary classification task in a text classification pipeline. First, we treated each note as a bag of tokens and weighted each token with term frequency-inverse document frequency (TFIDF), a numerical statistic that reflects how important a word is to a document. We then applied 3 different classification algorithms (random forests, support vector machines, and Bayesian logistic regression) and measured discriminatory performance using Area Under Curve (AUC) in a cross validation evaluation design. RESULTS: TFIDF performed with AUCs between 0.669 and 0.905. Logistic regression provided the highest AUC in four cases: Difficulty Sleeping (0.905), Shoulder Pain (0.618), Failure To Thrive (0.717) and Abdominal Pain (0.892).
As we observed the highest AUCs in the Difficulty Sleeping and Abdominal Pain cases, we have begun to refine the algorithm for these two cases by identifying the important features that lead faculty to give students a higher grade, with the aim of improving the accuracy of NLP-based scoring. Promising features include the presence and sequence of certain words in the problem representation, sentence length in the management section, ranking of the differential diagnosis, sequence between key words (e.g. rule out appendicitis), and evidence of "thinkingness" or what many call semantic qualifiers. CONCLUSIONS: With additional effort to build targeted case-specific classifiers for clinical content and reasoning, a validated machine-learning model may achieve partial or full automation of grading of the notes. This work, which builds on decades of clinical decision-making and critical reasoning research, may provide medical trainees with more and potentially better feedback, facilitating learning of clinical reasoning, freeing faculty to coach this process, and in the long run impacting healthcare quality and patient safety.
EMBASE:615581953
ISSN: 0884-8734
CID: 2553842
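The bag-of-tokens TF-IDF weighting mentioned in the abstract above can be illustrated with a small standard-library sketch. The abstract does not specify which TF-IDF variant was used, so the formula here (length-normalised term frequency times an unsmoothed log inverse document frequency) is an assumption, and the three short "notes" are hypothetical toy strings, not study data.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {token: weight} dict per document (assumed TF-IDF variant)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        weights.append({
            tok: (count / len(toks)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        })
    return weights

# Hypothetical toy notes, for illustration only.
notes = [
    "rule out appendicitis",
    "abdominal pain rule out appendicitis",
    "abdominal pain likely viral",
]
weights = tfidf(notes)
```

In this toy set, a token that appears in only one note (such as "likely") receives a higher weight than one shared across notes (such as "abdominal"), which is the property that lets a downstream classifier focus on distinguishing words.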
Big Data Analyses in Health and Opportunities for Research in Radiology
Aphinyanaphongs, Yindalon
This article reviews examples of big data analyses in health care with a focus on radiology. We review the defining characteristics of big data, the use of natural language processing, traditional and novel data sources, and large clinical data repositories available for research. This article aims to invoke novel research ideas through a combination of examples of analyses and domain knowledge.
PMID: 28253531
ISSN: 1098-898x
CID: 2471542
Use of a Machine-learning Method for Predicting Highly Cited Articles Within General Radiology Journals
Rosenkrantz, Andrew B; Doshi, Ankur M; Ginocchio, Luke A; Aphinyanaphongs, Yindalon
RATIONALE AND OBJECTIVES: This study aimed to assess the performance of a text classification machine-learning model in predicting highly cited articles within the recent radiological literature and to identify the model's most influential article features. MATERIALS AND METHODS: We downloaded from PubMed the title, abstract, and medical subject heading terms for 10,065 articles published in 25 general radiology journals in 2012 and 2013. Three machine-learning models were applied to predict the top 10% of included articles in terms of the number of citations to the article in 2014 (reflecting the 2-year time window in conventional impact factor calculations). The model having the highest area under the curve was selected to derive a list of article features (words) predicting high citation volume, which was iteratively reduced to identify the smallest possible core feature list maintaining predictive power. Overall themes were qualitatively assigned to the core features. RESULTS: The regularized logistic regression (Bayesian binary regression) model had highest performance, achieving an area under the curve of 0.814 in predicting articles in the top 10% of citation volume. We reduced the initial 14,083 features to 210 features that maintain predictivity. These features corresponded with topics relating to various imaging techniques (eg, diffusion-weighted magnetic resonance imaging, hyperpolarized magnetic resonance imaging, dual-energy computed tomography, computed tomography reconstruction algorithms, tomosynthesis, elastography, and computer-aided diagnosis), particular pathologies (prostate cancer; thyroid nodules; hepatic adenoma, hepatocellular carcinoma, non-alcoholic fatty liver disease), and other topics (radiation dose, electroporation, education, general oncology, gadolinium, statistics). 
CONCLUSIONS: Machine learning can be successfully applied to create specific feature-based models for predicting articles likely to achieve high influence within the radiological literature.
PMID: 27692588
ISSN: 1878-4046
CID: 2273812
The safety of same-day discharge after laparoscopic hysterectomy for endometrial cancer
Lee, Jessica; Aphinyanaphongs, Yindalon; Curtin, John P; Chern, Jing-Yi; Frey, Melissa K; Boyd, Leslie R
OBJECTIVE: To determine factors influencing discharge patterns after laparoscopic hysterectomy for endometrial cancer and to evaluate the safety of same-day discharge during the 30-day postoperative period. METHODS: Using the American College of Surgeons' National Surgical Quality Improvement Program's database, patients who underwent hysterectomy for endometrial cancer from 2010 to 2014 were identified and categorized by their hospital length of stay. Statistical analyses were performed to assess the relationship between hospital stay and demographics, medical comorbidities, intraoperative surgical factors and postoperative outcomes. RESULTS: A total of 9020 patients had laparoscopic hysterectomies for endometrial cancer and of these, 729 patients (8.1%) were successfully discharged on the day of surgery. These patients were younger and had lower body mass indexes and fewer medical comorbidities than patients who were admitted after their procedure. The same-day discharge group underwent surgical procedures of less complexity than the hospital admission group based on shorter operative times and fewer relative value units (RVUs). There was a lower rate of surgical site infections in the same-day discharge group, and no difference in rates of other postoperative complications including hospital readmissions and reoperations. CONCLUSIONS: Rates of laparoscopic hysterectomy for endometrial cancer are gradually increasing but the rates of same-day discharge have increased at a much slower rate. Same-day discharge has been successful despite differences in preoperative demographics, medical comorbidities and intraoperative surgical complexity. Overall postoperative complication rates were equivalent despite length of hospital stay, demonstrating the safety and feasibility of same-day discharge after laparoscopic hysterectomy for endometrial cancer.
PMID: 27288543
ISSN: 1095-6859
CID: 2136712
Classifying publications from the clinical and translational science award program along the translational research spectrum: a machine learning approach
Surkis, Alisa; Hogle, Janice A; DiazGranados, Deborah; Hunt, Joe D; Mazmanian, Paul E; Connors, Emily; Westaby, Kate; Whipple, Elizabeth C; Adamus, Trisha; Mueller, Meridith; Aphinyanaphongs, Yindalon
BACKGROUND: Translational research is a key area of focus of the National Institutes of Health (NIH), as demonstrated by the substantial investment in the Clinical and Translational Science Award (CTSA) program. The goal of the CTSA program is to accelerate the translation of discoveries from the bench to the bedside and into communities. Different classification systems have been used to capture the spectrum of basic to clinical to population health research, with substantial differences in the number of categories and their definitions. Evaluation of the effectiveness of the CTSA program and of translational research in general is hampered by the lack of rigor in these definitions and their application. This study adds rigor to the classification process by creating a checklist to evaluate publications across the translational spectrum and operationalizes these classifications by building machine learning-based text classifiers to categorize these publications. METHODS: Based on collaboratively developed definitions, we created a detailed checklist for categories along the translational spectrum from T0 to T4. We applied the checklist to CTSA-linked publications to construct a set of coded publications for use in training machine learning-based text classifiers to classify publications within these categories. The training sets combined T1/T2 and T3/T4 categories due to low frequency of these publication types compared to the frequency of T0 publications. We then compared classifier performance across different algorithms and feature sets and applied the classifiers to all publications in PubMed indexed to CTSA grants. To validate the algorithm, we manually classified the articles with the top 100 scores from each classifier. RESULTS: The definitions and checklist facilitated classification and resulted in good inter-rater reliability for coding publications for the training set. 
Very good performance was achieved for the classifiers as represented by the area under the receiver operating curves (AUC), with an AUC of 0.94 for the T0 classifier, 0.84 for T1/T2, and 0.92 for T3/T4. CONCLUSIONS: The combination of definitions agreed upon by five CTSA hubs, a checklist that facilitates more uniform definition interpretation, and algorithms that perform well in classifying publications along the translational spectrum provide a basis for establishing and applying uniform definitions of translational research categories. The classification algorithms allow publication analyses that would not be feasible with manual classification, such as assessing the distribution and trends of publications across the CTSA network and comparing the categories of publications and their citations to assess knowledge transfer across the translational research spectrum.
PMCID:4974725
PMID: 27492440
ISSN: 1479-5876
CID: 2199242
Factors associated with successful outpatient laparoscopic hysterectomy for women with endometrial cancer [Meeting Abstract]
Lee, J; Aphinyanaphongs, Y; Boyd, L R
Objectives: Minimally invasive surgery is the preferred surgical method to treat women with endometrial cancer. Several single-institution reports have described the feasibility and safety of outpatient laparoscopic hysterectomies (LH) for both benign and malignant indications. The objective of this study is to identify patient and surgical factors associated with outpatient laparoscopic hysterectomies (OLH) and to compare outcomes between OLH and inpatient laparoscopic hysterectomies (ILH) in women with endometrial cancer. Methods: Data were obtained from the American College of Surgeons' National Surgical Quality Improvement Program (NSQIP) database. All patients who underwent hysterectomies for endometrial cancer from 2007 to 2013 were identified by ICD-9 and CPT codes. These patients were then filtered for LH. Comparative analyses were performed and stratified by admission status to evaluate demographics, preoperative and intraoperative variables, and surgical outcomes. Statistical tests were performed with R Studio version 0.99.442. Results: LH rates have been steadily increasing. (See Table 1.) Between 2010 and 2013, 5,851 patients underwent LH for endometrial cancer; of these, 3,428 (58.6%) were ILH and 2,423 (41.4%) were OLH. OLH rates increased each year from 30.0% in 2010 to 50.0% in 2013. OLH patients were on average 61.81 years old compared with 63.03 years for ILH patients (P <.001). Medical comorbidities were not different between the 2 groups. Total operating time and anesthesia time were both significantly shorter in the OLH group: average times were 161.3 and 187.0 minutes (P <.001) and 245.2 versus 256.3 minutes (P =.002), respectively. More lymph node dissections were performed in the ILH group than the OLH group: 2,074 (60.5%) versus 1,390 (57.4%, P =.016). There were more radical hysterectomies in the ILH group (n = 803; 23.4%) compared with the OLH group (n = 315; 13.1%) (P <.001).
OLHs were assigned fewer relative value units than ILHs (mean 28.5 vs 30.6, respectively, P <.001). Postoperative complications were not different between the groups. Conclusions: Younger age, fewer RVUs, shorter operating and anesthesia times were associated with successful OLH in patients with endometrial cancer. Lymph node dissection and radical surgery were associated with an increased rate of ILH. There were no differences in postoperative complications between OLH and ILH. (table present)
EMBASE:72341428
ISSN: 1095-6859
CID: 2204972
USING NATURAL LANGUAGE PROCESSING TO AUTOMATE GRADING OF STUDENTS' PATIENT NOTES: PROOF OF CONCEPT [Meeting Abstract]
Gershgorin, Irina; Marin, Marina; Xu, Junchuan; Oh, So-Young; Zabar, Sondra; Crowe, Ruth; Tewksbury, Linda; Ogilvie, Jennifer; Gillespie, Colleen; Cantor, Michael; Aphinyanaphongs, Yindalon; Kalet, Adina
ISI:000392201601297
ISSN: 1525-1497
CID: 2481862
Models to predict hospital admission from the emergency department through the sole use of the medication administration record [Meeting Abstract]
Aphinyanaphongs, Y; Liang, Y; Theobald, J; Grover, H; Swartz, J L
Background: Multiple models have been developed to predict hospital admission for patients presenting to the ED. However, these tools suffer from multiple limitations including reliance on manual data entry (e.g. ED arrival mechanism), multiple types of data, and data that are not completely generalizable across institutions (e.g. triage score). An ideal solution would produce a disposition score that requires no data entry, employs variables already captured by all EDs, and provides a score far enough in advance to expedite admission processes. Objectives: Evaluate the discriminatory power of machine learning algorithms for predicting hospital admission at two hours of ED arrival through the sole use of the medication administration record (MAR). Methods: Our dataset included 27,757 encounters (26% admitted) from January 2013 to September 2014 and 2,109 medications encoded to RxNorm CUI numbers using MedEx. We included all medications in the MAR, including those given during prior ED visits. We employed classic and state-of-the-art classifiers including logistic regression, naive bayes, regularized logistic regression, classification and regression trees (CART), and linear support vector machine (SVM) with penalty parameter C. In all cases, we split the dataset into a training, validation, and test set. We used the validation set to optimize any parameters of the learning algorithm and used the test set to calculate performances. We employed 5-fold cross validation and reported AUC performances averaged across 5 folds. Results: The models performed with AUCs of 0.85 for linear SVM with penalty parameter C (95%CI 0.84-0.86), 0.83 for CART (95%CI 0.82-0.84), 0.79 for regularized logistic regression (95%CI 0.78-0.80), 0.70 for Naive Bayes (95%CI 0.69-0.72), and 0.68 for logistic regression (95%CI 0.67-0.69).
Conclusion: MAR data is sufficient to reliably predict hospital admission two hours into the ED stay. Our models perform similarly to those from prior studies, but with the advantages of only requiring a single type of data and being highly generalizable to other institutions; MAR data is objective, does not require manual data entry, and is universally available across EDs.
EMBASE:72280952
ISSN: 1553-2712
CID: 2151612
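Several abstracts in this list report discriminatory performance as an area under the curve (AUC). As a rough illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case, with ties counted as one half. This is a generic pairwise formulation and is not taken from any of the studies; the labels and scores are hypothetical toy values.

```python
def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Each pair scores 1 if the positive outranks the negative, 0.5 on a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical toy example: 1 = admitted, 0 = discharged.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)  # 5 of 6 pairs ranked correctly
```

In a cross-validation design like the one described, a value of this kind would be computed on each held-out fold and then averaged; an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.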