Closed-loop network of skin-interfaced wireless devices for quantifying vocal fatigue and providing user feedback
Jeong, Hyoyoung; Yoo, Jae-Young; Ouyang, Wei; Greane, Aurora Lee Jean Xue; Wiebe, Alexandra Jane; Huang, Ivy; Lee, Young Joong; Lee, Jong Yoon; Kim, Joohee; Ni, Xinchen; Kim, Suyeon; Huynh, Huong Le-Thien; Zhong, Isabel; Chin, Yu Xuan; Gu, Jianyu; Johnson, Aaron M; Brancaccio, Theresa; Rogers, John A
Vocal fatigue is a measurable form of performance fatigue resulting from overuse of the voice and is characterized by negative vocal adaptation. Vocal dose refers to cumulative exposure of the vocal fold tissue to vibration. Professionals with high vocal demands, such as singers and teachers, are especially prone to vocal fatigue. Failure to adjust habits can lead to compensatory lapses in vocal technique and an increased risk of vocal fold injury. Quantifying and recording vocal dose to inform individuals about potential overuse is an important step toward mitigating vocal fatigue. Previous work establishes vocal dosimetry methods, that is, processes to quantify vocal fold vibration dose, but relies on bulky, wired devices that are not amenable to continuous use during natural daily activities; these previously reported systems also provide limited mechanisms for real-time user feedback. This study introduces a soft, wireless, skin-conformal technology that gently mounts on the upper chest to capture vibratory responses associated with vocalization in a manner that is immune to ambient noise. Pairing with a separate, wirelessly linked device supports haptic feedback to the user based on quantitative thresholds in vocal usage. A machine learning-based approach enables precise vocal dosimetry from the recorded data, to support personalized, real-time quantitation and feedback. These systems have strong potential to guide healthy behaviors in vocal use.
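As a rough illustration of what "vocal dose" and threshold-based feedback can mean in practice, the sketch below accumulates two commonly used dose measures (time dose and cycle dose) from a frame-wise fundamental-frequency track. It is not the authors' machine learning pipeline: the function names, the frame format (`None` for unvoiced frames), and the alert limit are all assumptions made for this example.

```python
def vocal_doses(f0_frames, frame_s):
    """Return (time_dose_s, cycle_dose) for a frame-wise f0 track.

    f0_frames: list of f0 values in Hz, with None marking unvoiced frames
               (hypothetical input format).
    frame_s:   duration of one frame in seconds.
    """
    voiced = [f0 for f0 in f0_frames if f0 is not None]
    time_dose = len(voiced) * frame_s                 # seconds spent phonating
    cycle_dose = sum(f0 * frame_s for f0 in voiced)   # approximate number of vocal fold oscillations
    return time_dose, cycle_dose


def should_alert(time_dose_s, limit_s):
    """Illustrative feedback rule: alert once the time dose reaches a user-set limit."""
    return time_dose_s >= limit_s


# Example: five half-second frames, three of them voiced.
frames = [None, 200.0, 200.0, None, 100.0]
time_dose, cycle_dose = vocal_doses(frames, 0.5)
```

In a closed-loop setting, a rule like `should_alert` would be evaluated on the running totals and used to trigger the haptic device; the actual thresholds in the study are personalized and are not reproduced here.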
Perilaryngeal-Cranial Functional Muscle Network Differentiates Vocal Tasks: A Multi-Channel sEMG Approach
O'Keeffe, Rory; Shirazi, Seyed Yahya; Mehrdad, Sarmad; Crosby, Tyler; Johnson, Aaron M; Atashzar, S Farokh
OBJECTIVE: Objective evaluation of physiological responses using non-invasive methods for the assessment of vocal performance and voice disorders has attracted great interest. This paper, for the first time, aims to implement and evaluate perilaryngeal-cranial functional muscle networks. The study investigates the variations in topographical characteristics of the network and the corresponding ability to differentiate vocal tasks. METHODS: Twelve surface electromyography (sEMG) signals were collected bilaterally from six perilaryngeal and cranial muscles. Data were collected from eight subjects (four females) without a known history of voice disorders. The proposed muscle network is composed of pairwise coherence between sEMG recordings. The network metrics include (a) network degree and (b) weighted clustering coefficient (WCC). RESULTS: |=0.12) in differentiating the vocal tasks. CONCLUSIONS: A perilaryngeal-cranial functional muscle network was proposed in this paper. The study showed that the functional muscle network could robustly differentiate the vocal tasks, while the classic assessment of muscle activation failed to differentiate them. SIGNIFICANCE: For the first time, we demonstrate the power of a perilaryngeal-cranial muscle network as a neurophysiological window to vocal performance. In addition, the study identifies the tasks with the highest network involvement, which may be utilized in the future to monitor voice disorders and rehabilitation.
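The two network metrics named above can be sketched as follows, given a symmetric matrix of pairwise coherence values between channels. The abstract does not state the exact definitions used, so this example makes two assumptions: "network degree" is taken as the sum of a node's edge weights (node strength), and the weighted clustering coefficient follows the Onnela geometric-mean formulation with weights normalized by the largest edge weight. The function names are hypothetical.

```python
def network_degree(w):
    """Weighted degree (node strength): sum of edge weights per node."""
    n = len(w)
    return [sum(w[i][j] for j in range(n) if j != i) for i in range(n)]


def weighted_clustering(w):
    """Onnela-style weighted clustering coefficient for each node.

    w is a symmetric matrix of non-negative weights (e.g. coherence in [0, 1]);
    the diagonal is ignored.
    """
    n = len(w)
    w_max = max(w[i][j] for i in range(n) for j in range(n) if i != j)
    coeffs = []
    for i in range(n):
        neighbours = [j for j in range(n) if j != i and w[i][j] > 0]
        k = len(neighbours)
        if k < 2:
            coeffs.append(0.0)  # no triangles possible around this node
            continue
        total = 0.0
        for j in neighbours:
            for m in neighbours:
                if j != m and w[j][m] > 0:
                    # geometric mean of the three normalized triangle weights
                    total += (w[i][j] * w[j][m] * w[m][i] / w_max ** 3) ** (1.0 / 3.0)
        coeffs.append(total / (k * (k - 1)))
    return coeffs


# Example: three channels with identical pairwise coherence of 0.5.
coherence = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
degree = network_degree(coherence)
wcc = weighted_clustering(coherence)
```

For the twelve-channel recordings described in the abstract, `w` would be a 12 x 12 coherence matrix per task (and possibly per frequency band); how coherence was estimated and thresholded in the study is not specified here.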
Flow Patterns and Particle Residence Times in the Oral Cavity during Inhaled Drug Delivery
Vara Almirall, Brenda; Inthavong, Kiao; Bradshaw, Kimberley; Singh, Narinder; Johnson, Aaron; Storey, Pippa; Salati, Hana
Pulmonary drug delivery aims to deliver particles deep into the lungs, bypassing the mouth-throat airway geometry. However, micron particles under high flow rates are susceptible to inertial impaction on anatomical sites that serve as a defense system to filter and prevent foreign particles from entering the lungs. The aim of this study was to understand particle aerodynamics and the possible deposition in the mouth-throat airway that inhibits pulmonary drug delivery. In this study, we present an analysis of the aerodynamics of inhaled particles inside a patient-specific mouth-throat model generated from MRI scans. Computational Fluid Dynamics with a Discrete Phase Model for tracking particles was used to characterize the airflow patterns for a constant inhalation flow rate of 30 L/min. Monodisperse particles with diameters of 7 μm to 26 μm were introduced to the domain within a 3 cm-diameter sphere in front of the oral cavity. The main outcomes of this study showed that the time taken for particle deposition to occur was 0.5 s; a narrow stream of particles (medially and superiorly) was transported by the flow field; larger particles > 20 μm deposited onto the oropharynx, while smaller particles < 12 μm were more dispersed throughout the oral cavity and navigated the curved geometry and laryngeal jet to escape through the tracheal outlet. It was concluded that at a flow rate of 30 L/min the particle diameters depositing on the larynx and trachea in this specific patient model are likely to be in the range of 7 μm to 16 μm. Particles larger than 16 μm primarily deposited on the oropharynx.
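The size dependence of inertial impaction described above can be illustrated with a back-of-the-envelope Stokes number estimate. This is not the CFD and Discrete Phase Model analysis used in the study; it is a minimal sketch that assumes unit-density spherical particles, room-temperature air viscosity, no slip correction, and a hypothetical characteristic airway diameter of 2 cm.

```python
import math

AIR_VISCOSITY = 1.8e-5      # Pa*s, approximate value for air (assumption)
PARTICLE_DENSITY = 1000.0   # kg/m^3, unit-density particles (assumption)


def mean_velocity(flow_m3_s, diameter_m):
    """Mean air speed through a circular cross-section of the given diameter."""
    return flow_m3_s / (math.pi * diameter_m ** 2 / 4.0)


def stokes_number(particle_d_m, velocity_m_s, length_m,
                  rho=PARTICLE_DENSITY, mu=AIR_VISCOSITY):
    """Stk = rho * d^2 * U / (18 * mu * L); larger values mean more inertial impaction."""
    return rho * particle_d_m ** 2 * velocity_m_s / (18.0 * mu * length_m)


# 30 L/min expressed in m^3/s, through a hypothetical 2 cm airway diameter.
flow = 30.0 / 1000.0 / 60.0
airway_d = 0.02
u = mean_velocity(flow, airway_d)

for d_um in (7, 12, 16, 20, 26):
    stk = stokes_number(d_um * 1e-6, u, airway_d)
    print(f"{d_um:>2} um: Stk = {stk:.4f}")
```

Because the Stokes number scales with the square of particle diameter, doubling the diameter quadruples it, which is consistent with the larger particles depositing earlier (on the oropharynx) while smaller ones follow the flow further downstream.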
Re-Training of Convolutional Neural Networks for Glottis Segmentation in Endoscopic High-Speed Videos
Dollinger, Michael; Schraut, Tobias; Henrich, Lea A.; Chhetri, Dinesh; Echternach, Matthias; Johnson, Aaron M.; Kunduk, Melda; Maryn, Youri; Patel, Rita R.; Samlan, Robin; Semmler, Marion; Schützenberger, Anne
Endoscopic high-speed video (HSV) systems for visualization and assessment of vocal fold dynamics in the larynx are diverse and technically advancing. To account for the resulting "concept shifts" in neural network (NN)-based image processing, re-training of already trained and deployed NNs is necessary to allow for sufficiently accurate image processing for new recording modalities. We propose and discuss several re-training approaches for convolutional neural networks (CNN) being used for HSV image segmentation. Our baseline CNN was trained on the BAGLS data set (58,750 images). The new BAGLS-RT data set consists of an additional 21,050 images from previously unused HSV systems, light sources, and different spatial resolutions. Results showed that increasing data diversity by means of preprocessing already improves the segmentation accuracy (mIoU + 6.35%). Subsequent re-training further increases segmentation performance (mIoU + 2.81%). For re-training, finetuning with dynamic knowledge distillation showed the most promising results. Data variety for training and additional re-training is a helpful tool to boost HSV image segmentation quality. However, when performing re-training, the phenomenon of catastrophic forgetting should be kept in mind, i.e., adaptation to new data while forgetting already learned knowledge.
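The mIoU figures quoted above refer to mean intersection over union between predicted and reference glottis masks. A minimal sketch of that metric for binary masks is shown below; the function names are hypothetical, masks are plain nested lists for simplicity, and scoring two empty masks as a perfect match is a convention chosen for this example, not necessarily the one used in the study.

```python
def iou(pred, truth):
    """Intersection over union of two equally sized binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks (e.g. a closed glottis) count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def mean_iou(mask_pairs):
    """Average IoU over an iterable of (predicted_mask, reference_mask) pairs."""
    scores = [iou(pred, truth) for pred, truth in mask_pairs]
    return sum(scores) / len(scores)


# Example: one partially overlapping frame and one frame with a closed glottis.
frame_a = ([[1, 1], [0, 0]], [[1, 0], [0, 0]])
frame_b = ([[0, 0], [0, 0]], [[0, 0], [0, 0]])
score = mean_iou([frame_a, frame_b])
```

Tracking this score separately on the original and the newly added recordings is one simple way to notice catastrophic forgetting: the score on the new data rises while the score on the old data drops.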
Longitudinal Effects of Base of Tongue Concurrent Chemoradiation Therapy in a Pre-Clinical Model
Benedict, Peter A; Kravietz, Adam; Yang, Jackie; Achlatis, Efstratios; Doyle, Carina; Johnson, Aaron M; Dion, Gregory R; Amin, Milan R
BACKGROUND/OBJECTIVES: Base of tongue (BOT) dysfunction is common following oropharyngeal concurrent chemoradiation therapy (CCRT). We present a clinically relevant animal model quantifying the effects of CCRT on tongue strength and elasticity over time. METHODS: Fifty-three male and 53 female Sprague-Dawley rats were randomized to control or experimental groups. Experimental animals received cisplatin, 5-fluorouracil, and 5 fractions of 7 Gy directed to the BOT. Controls received no intervention. At 2 weeks, 5 months, or 10 months after CCRT, animals underwent non-survival surgery to measure twitch and tetanic tongue strength, which were analyzed using multivariate linear mixed effects models. Tongue displacement, a surrogate for tongue elasticity, was also determined via stress-strain testing and analyzed via a multivariate linear mixed effects model. RESULTS: Reporting the combined results of both sexes, the estimated experimental group mean peak twitch forces became more divergent over time compared to controls, being 8.3% lower than controls at 2 weeks post-CCRT, 15.7% lower at 5 months, and 31.6% lower at 10 months. Estimated experimental group mean peak tetanic forces followed a similar course and were 2.9% lower than controls at 2 weeks post-CCRT, 20.7% lower at 5 months, and 27.0% lower at 10 months. Stress-strain testing did not find CCRT to have a significant effect on tongue displacement across experimental timepoints. CONCLUSIONS: This study demonstrates an increasing difference in tongue strength over time between controls and animals exposed to CCRT. Tongue elasticity was not significantly affected by CCRT, suggesting that changes in strength may not be caused by fibrosis. LEVEL OF EVIDENCE: NA. Laryngoscope, 2022.
Feasibility and Preliminary Efficacy of Two Technology-assisted Vocal Interventions for Older Adults Living in a Residential Facility
Johnson, Aaron M; Pukin, Farrah; Krishna, Vaishnavi; Phansikar, Madhura; Mullen, Sean P
OBJECTIVES/HYPOTHESIS: An increasing number of older adults are seeking behavioral voice therapy to manage their voice problems. Poor adherence to voice therapy is a known problem across all treatment-seeking populations. Given age-related physical and cognitive impairments and multiple chronic conditions, older adults are more susceptible to low adherence to behavioral therapies. The purpose of this study was to test the feasibility of an at-home, vocal training intervention for older adults without a known voice disorder living in a senior living community, as well as compare the effects of two modes of mobile health (mHealth) technology-assisted vocal training targeting vocal function and adherence in older adults. STUDY DESIGN: Cohort Study (Prospective Observational Study). METHODS: Twenty-three individuals were recruited from a single residential retirement community and randomly allocated into two experimental groups. Both groups were asked to practice the Vocal Function Exercises with increasing frequency over an 8-week period. Tablets with instructions for performing the exercises were provided to all participants. The feedback group's tablets also contained an application providing real-time feedback on pitch, loudness, and duration. Acoustic and aerodynamic measures of vocal function and cognitive measures were obtained before and after the intervention. Self-reported measures of practice frequency, perceived vocal progress and changes, and motivation were obtained weekly. RESULTS: The feedback control group adhered to the requested practice sessions more in the latter half of the intervention (weeks 5 and 8). Vocal function measures remained stable. Overall, a pattern reflecting self-reported vocal progress and a general improvement in working memory and global cognitive functioning was observed in the feedback group. CONCLUSIONS: This study demonstrated that an 8-week mHealth intervention is viable to facilitate vocal practice in older adults. Although vocal ability did not improve with training, results indicated that vocal performance remained stable and age-related vocal changes did not progress. Future research on implementation of mHealth applications in conjunction with behavioral voice therapy is warranted to assess adherence and improvements in vocal function in individuals with age-related voice problems.
Effects of Historical Recording Technology on Vibrato in Modern-Day Opera Singers
Glasner, Joshua D; Johnson, Aaron M
OBJECTIVE: Past literature indicates that vibrato measurements of singers objectively changed (i.e., vibrato rate decreased and vibrato extent increased) from 1900 to the present day; however, historical audio recording technology may distort acoustic measurements of the voice output signal, including vibrato. As such, the listener's perception of historical singing may be influenced by the limitations of historical technology. This study attempts to show how the wax cylinder phonograph system (the oldest form of mass-produced audio recording technology) alters the recorded voice output signal of modern-day singers and, thus, provides an objective lens through which to study the effect(s) of historical audio recording technology on vibrato measurements. METHODS: for female singers, three times into a flat-response omnidirectional microphone and onto an Edison Home Phonograph simultaneously. The middle 1-3 seconds (6-10 vibrato cycles) of each sample was analyzed for vibrato rate, vibrato extent, jitter (ddp), shimmer (dda), and fundamental frequency for each recording condition (wax cylinder phonograph or microphone). Steady-state and frequency-modulating sinewave test signals were also recorded under the multiple recording conditions. RESULTS: Results indicated no significant effect of recording condition on vibrato rate (mean [standard deviation], cylinder: 5.3 Hz [0.5], microphone: 5.3 Hz [0.5]) and no significant difference was found for mean fundamental frequency (cylinder: 389 Hz, microphone: 390 Hz). A significant main effect of recording condition was found for vibrato extent (cylinder: ±103 cents, microphone: ±100 cents). Additionally, mean jitter (ddp) (cylinder: 1.22% [1.09], microphone: 0.24% [0.12]) and mean shimmer (dda) (cylinder: 9.40% [4.90], microphone: 1.92% [0.94]) were significantly higher for the cylinder recording condition, indicating more cycle-to-cycle variability in the wax cylinder recorded signal. Analysis of test signals revealed similar patterns based on recording condition. DISCUSSION/CONCLUSIONS: This study validates past scholarly inquiry about vibrato measurements as extracted from digitized wax cylinder phonograph recordings by demonstrating that measured vibrato rate remains constant across both recording conditions. In other words, vibrato rate as measured from historical recordings can be viewed as an accurate representation of the historical singer being studied. Furthermore, it suggests that prior vibrato extent measurements from these acoustic recordings may be slightly overestimated relative to the original voice output signals produced by singers near the beginning of the 20th century (i.e., a narrow vibrato extent might have been numerically smaller on average). Increased jitter and shimmer in the wax cylinder recording conditions may be indicative of nonlinearities in the phonograph recording or playback systems.
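To make the vibrato measures concrete, the sketch below estimates vibrato rate (Hz) and extent (plus or minus cents) from a fundamental-frequency contour. It is only an illustration under simple assumptions: the contour is evenly sampled, cents are computed relative to the mean f0, extent is half the peak-to-trough excursion, and rate is derived from upward crossings of the mean. The study's actual analysis software and settings are not described in the abstract, and the function name is hypothetical.

```python
import math


def vibrato_rate_extent(f0_hz, dt):
    """Estimate (rate_hz, extent_cents) from an evenly sampled f0 contour.

    f0_hz: list of fundamental-frequency values in Hz.
    dt:    sampling interval of the contour in seconds.
    """
    mean_f0 = sum(f0_hz) / len(f0_hz)
    # Express the contour in cents relative to the mean f0 (100 cents = 1 semitone).
    cents = [1200.0 * math.log2(f / mean_f0) for f in f0_hz]
    extent = (max(cents) - min(cents)) / 2.0
    # Times at which the contour crosses its mean in the upward direction.
    ups = [i * dt for i in range(1, len(cents)) if cents[i - 1] < 0.0 <= cents[i]]
    if len(ups) < 2:
        return float("nan"), extent
    rate = (len(ups) - 1) / (ups[-1] - ups[0])
    return rate, extent


# Synthetic test contour: 390 Hz carrier, 5.3 Hz vibrato, +/-100 cents, 2 s at 1 kHz.
dt = 0.001
contour = [
    390.0 * 2.0 ** (100.0 / 1200.0 * math.sin(2.0 * math.pi * 5.3 * i * dt))
    for i in range(2000)
]
rate, extent = vibrato_rate_extent(contour, dt)
```

Running the same function on contours extracted from the cylinder and microphone recordings of one sample would give the paired rate and extent values that the abstract compares.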
Proteomic Characterization of Senescent Laryngeal Adductor and Plantaris Hindlimb Muscles
Shembel, Adrianna C; Kanshin, Evgeny; Ueberheide, Beatrix; Johnson, Aaron M
OBJECTIVES: The goals of this study were to 1) compare global protein expression in muscles of the larynx and hindlimb and 2) investigate differences in protein expression between aged and nonaged muscle using label-free global proteomic profiling methods. METHODS: Liquid chromatography-mass spectrometry (LC-MS/MS) analysis was performed on thyroarytenoid intrinsic laryngeal muscle and plantaris hindlimb muscle from 10 F344xBN F1 male rats (5 old and 5 young). Protein expression was compared and pathway enrichment analysis performed for each muscle type (larynx and limb) and age group (old and young muscle). RESULTS: Over 1,000 proteins were identified in common across both muscle types and age groups using LC-MS/MS analysis. Significant age-related differences were seen across 107 proteins in plantaris hindlimb and in 19 proteins in thyroarytenoid laryngeal muscle. Bioinformatic and enrichment analysis demonstrated protein differences between the hindlimb and larynx may relate to immune and stress redox responses and RNA repair. CONCLUSIONS: There are clear differences in protein expression between the laryngeal and hindlimb skeletal muscles. Initial analysis suggests differences between the two muscle groups may relate to stress responses and repair mechanisms. Age-related changes in the thyroarytenoid appear to be less obvious than in the plantaris. Further in-depth study is needed to elucidate how aging affects protein expression in the laryngeal muscles. LEVEL OF EVIDENCE: NA. Laryngoscope, 2021.
Semi-Automated Training of Rat Ultrasonic Vocalizations
Johnson, Aaron M; Lenell, Charles; Severa, Elizabeth; Rudisch, Denis Michael; Morrison, Robert A; Shembel, Adrianna C
Rats produce ultrasonic vocalizations (USVs) for conspecific communication. These USVs are valuable biomarkers for studying behavioral and mechanistic changes in a variety of diseases and disorders. Previous work has demonstrated operant conditioning can progressively increase the number of USVs produced by rats over multiple weeks. This operant conditioning paradigm is a useful model for investigating the effects of increased laryngeal muscle use on USV acoustic characteristics and underlying central and peripheral laryngeal sensorimotor mechanisms. Previous USV operant conditioning studies relied on manual training to elicit USV productions, which is both time- and labor-intensive and can introduce human variability. This manuscript introduces a semi-automated method for training rats to increase their rate of USV production by pairing commercially available operant conditioning equipment with an ultrasonic detection system. USV training requires three basic components: elicitation cue, detection of the behavior, and a reward to reinforce the desired behavior. With the semi-automated training paradigm, indirect exposure to the opposite sex or an olfactory cue can be used to elicit USV production. The elicited USV is then automatically detected by the ultrasonic acoustic system, which consequently triggers the release of a sucrose pellet reward. Our results demonstrate this semi-automated procedure produces a similar increase in USV production as the manual training method. Through automation of USV detection and reward administration, staffing requirements, human error, and subject behavioral variability may be minimized while scalability and reproducibility are increased. This automation may also result in greater experimental flexibility, allowing USV training paradigms to become more customizable for a wider array of applications. This semi-automated USV behavioral training paradigm improves upon manual training techniques by increasing the ease, speed, and quality of data collection.
Relationships Across Clinical Measures of Vocal Quality and Functioning and Their Relationship With Patient Perception
Houle, Nichole; Johnson, Aaron M
PURPOSE: The purpose of this study was to investigate the relationships among subjective auditory-perceptual ratings of vocal quality, objective acoustic and aerodynamic measures of vocal function, and patient-perceived severity of their vocal complaint. METHOD: This study was a retrospective chart review of adult patients evaluated at a single outpatient center over a 1.5-year time period. Twenty-two clinical objective and subjective measures of voice were extracted from 676 charts (310 males, 366 females). To identify the underlying concepts addressed in an initial voice assessment, principal component analyses were conducted separately for males and females to account for sex differences. Linear regression models were used to examine the relationship between the principal components and patient-perceived severity. RESULTS: Seven principal components were identified for both sexes and accounted for 75% and 71% of the variance in the clinical measures, respectively. Of these seven principal components, only two predicted male patient-perceived severity, which accounted for 22% of the variance. In contrast, four principal components predicted female patient-perceived severity of their voice disorder and accounted for 19% of the variance. CONCLUSIONS: The results highlight the underlying aspects of vocal quality and functioning that are evaluated during an initial assessment. Male and female patients differ in which of these components may contribute to self-perceived severity of a voice disorder. Identifying these underlying components may support clinical decision making when developing a clinical protocol and highlights the overlap between patient concerns and clinical measures. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.16879603.