A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations
Goldstein, Ariel; Wang, Haocheng; Niekerken, Leonard; Schain, Mariano; Zada, Zaid; Aubrey, Bobbi; Sheffer, Tom; Nastase, Samuel A; Gazula, Harshvardhan; Singh, Aditi; Rao, Aditi; Choe, Gina; Kim, Catherine; Doyle, Werner; Friedman, Daniel; Devore, Sasha; Dugan, Patricia; Hassidim, Avinatan; Brenner, Michael; Matias, Yossi; Devinsky, Orrin; Flinker, Adeen; Hasson, Uri
This study introduces a unified computational framework connecting acoustic, speech and word-level linguistic structures to study the neural basis of everyday conversations in the human brain. We used electrocorticography to record neural signals across 100 h of speech production and comprehension as participants engaged in open-ended real-life conversations. We extracted low-level acoustic, mid-level speech and contextual word embeddings from a multimodal speech-to-text model (Whisper). We developed encoding models that linearly map these embeddings onto brain activity during speech production and comprehension. Remarkably, this model accurately predicts neural activity at each level of the language processing hierarchy across hours of new conversations not used in training the model. The internal processing hierarchy in the model is aligned with the cortical hierarchy for speech and language processing, where sensory and motor regions better align with the model's speech embeddings, and higher-level language areas better align with the model's language embeddings. The Whisper model captures the temporal sequence of language-to-speech encoding before word articulation (speech production) and speech-to-language encoding post articulation (speech comprehension). The embeddings learned by this model outperform symbolic models in capturing neural activity supporting natural speech and language. These findings support a paradigm shift towards unified computational models that capture the entire processing hierarchy for speech comprehension and production in real-world conversations.
PMID: 40055549
ISSN: 2397-3374
CID: 5807992
A left-lateralized dorsolateral prefrontal network for naming
Yu, Leyao; Dugan, Patricia; Doyle, Werner; Devinsky, Orrin; Friedman, Daniel; Flinker, Adeen
The ability to connect the form and meaning of a concept, known as word retrieval, is fundamental to human communication. While various input modalities could lead to identical word retrieval, the exact neural dynamics supporting this convergence relevant to daily auditory discourse remain poorly understood. Here, we leveraged neurosurgical electrocorticographic (ECoG) recordings from 48 patients and dissociated two key language networks that highly overlap in time and space integral to word retrieval. Using unsupervised temporal clustering techniques, we found a semantic processing network located in the middle and inferior frontal gyri. This network was distinct from an articulatory planning network in the inferior frontal and precentral gyri, which was agnostic to input modalities. Functionally, we confirmed that the semantic processing network encodes word surprisal during sentence perception. Our findings characterize how humans integrate ongoing auditory semantic information over time, a critical linguistic function from passive comprehension to daily discourse.
PMCID: 11118423
PMID: 38798614
ISSN: 2692-8205
CID: 5676322
Transformer-based neural speech decoding from surface and depth electrode signals
Chen, Junbo; Chen, Xupeng; Wang, Ran; Le, Chenqian; Khalilian-Gourtani, Amirhossein; Jensen, Erika; Dugan, Patricia; Doyle, Werner; Devinsky, Orrin; Friedman, Daniel; Flinker, Adeen; Wang, Yao
PMID: 39819752
ISSN: 1741-2552
CID: 5777232
From Single Words to Sentence Production: Shared Cortical Representations but Distinct Temporal Dynamics
Morgan, Adam M; Devinsky, Orrin; Doyle, Werner K; Dugan, Patricia; Friedman, Daniel; Flinker, Adeen
Sentence production is the uniquely human ability to transform complex thoughts into strings of words. Despite the importance of this process, language production research has primarily focused on single words. It remains an untested assumption that insights from this literature generalize to more naturalistic utterances like sentences. Here, we investigate this using high-resolution neurosurgical recordings (ECoG) and an overt production experiment where patients produce six words in isolation (picture naming) and in sentences (scene description). We trained machine learning models to identify the unique brain activity pattern for each word during picture naming, and used these patterns to decode which words patients were processing while they produced sentences. Our findings reveal that words share cortical representations across tasks. In sensorimotor cortex, words were consistently activated in the order in which they were said in the sentence. However, in inferior and middle frontal gyri (IFG and MFG), the order in which words were processed depended on the syntactic structure of the sentence. This dynamic interplay between sentence structure and word processing reveals that sentence production is not simply a sequence of single word production tasks, and highlights a regional division of labor within the language network. Finally, we argue that the dynamics of word processing in prefrontal cortex may impose a subtle pressure on language evolution, explaining why nearly all the world's languages position subjects before objects.
PMCID: 11565881
PMID: 39554006
ISSN: 2692-8205
CID: 5766162
A low-activity cortical network selectively encodes syntax
Morgan, Adam M; Devinsky, Orrin; Doyle, Werner K; Dugan, Patricia; Friedman, Daniel; Flinker, Adeen
Syntax, the abstract structure of language, is a hallmark of human cognition. Despite its importance, its neural underpinnings remain obscured by inherent limitations of non-invasive brain measures and a near total focus on comprehension paradigms. Here, we address these limitations with high-resolution neurosurgical recordings (electrocorticography) and a controlled sentence production experiment. We uncover three syntactic networks that are broadly distributed across traditional language regions, but with focal concentrations in middle and inferior frontal gyri. In contrast to previous findings from comprehension studies, these networks process syntax mostly to the exclusion of words and meaning, supporting a cognitive architecture with a distinct syntactic system. Most strikingly, our data reveal an unexpected property of syntax: it is encoded independent of neural activity levels. We propose that this "low-activity coding" scheme represents a novel mechanism for encoding information, reserved for higher-order cognition more broadly.
PMCID: 11212956
PMID: 38948730
ISSN: 2692-8205
CID: 5676332
Redefining diagnostic lesional status in temporal lobe epilepsy with artificial intelligence
Gleichgerrcht, Ezequiel; Kaestner, Erik; Hassanzadeh, Reihaneh; Roth, Rebecca W; Parashos, Alexandra; Davis, Kathryn A; Bagić, Anto; Keller, Simon S; Rüber, Theodor; Stoub, Travis; Pardoe, Heath R; Dugan, Patricia; Drane, Daniel L; Abrol, Anees; Calhoun, Vince; Kuzniecky, Ruben I; McDonald, Carrie R; Bonilha, Leonardo
Despite decades of advancements in diagnostic MRI, 30-50% of temporal lobe epilepsy (TLE) patients remain categorized as "non-lesional" (i.e., MRI negative or MRI-) based on visual assessment by human experts. MRI- patients face diagnostic uncertainty and significant delays in treatment planning. Quantitative MRI studies have demonstrated that MRI- patients often exhibit a TLE-specific pattern of temporal and limbic atrophy that may be too subtle for the human eye to detect. This signature pattern could be successfully translated into clinical use via artificial intelligence (AI) advances in computer-aided MRI interpretation, thereby improving the detection of brain "lesional" patterns associated with TLE. Here, we tested this hypothesis by employing a three-dimensional convolutional neural network (3D CNN) applied to a dataset of 1,178 scans from 12 different centers. 3D CNN was able to differentiate TLE from healthy controls with high accuracy (85.9% ± 2.8), significantly outperforming support vector machines based on hippocampal (74.4% ± 2.6) and whole-brain (78.3% ± 3.3) volumes. Our analysis subsequently focused on a subset of patients who achieved sustained seizure freedom post-surgery as a gold standard for confirming TLE. Importantly, MRI- patients from this cohort were accurately identified as TLE 82.7% ± 0.9 of the time, an encouraging finding since clinically these were all patients considered to be MRI- (i.e., not radiographically different than controls). The saliency maps from the CNN revealed that limbic structures, particularly medial temporal, cingulate, and orbitofrontal areas, were most influential in classification, confirming the importance of the well-established TLE signature atrophy pattern for diagnosis. Indeed, the saliency maps were similar in MRI+ and MRI- TLE groups, suggesting that even when humans cannot distinguish more subtle levels of atrophy, these MRI- patients are on the same continuum common across all TLE patients. As such, AI can identify TLE lesional patterns and AI-aided diagnosis has the potential to greatly enhance the neuroimaging diagnosis of TLE and redefine the concept of "lesional" TLE.
PMID: 39842945
ISSN: 1460-2156
CID: 5802322
A corollary discharge circuit in human speech
Khalilian-Gourtani, Amirhossein; Wang, Ran; Chen, Xupeng; Yu, Leyao; Dugan, Patricia; Friedman, Daniel; Doyle, Werner; Devinsky, Orrin; Wang, Yao; Flinker, Adeen
When we vocalize, our brain distinguishes self-generated sounds from external ones. A corollary discharge signal supports this function in animals; however, in humans, its exact origin and temporal dynamics remain unknown. We report electrocorticographic recordings in neurosurgical patients and a connectivity analysis framework based on Granger causality that reveals major neural communications. We find a reproducible source for corollary discharge across multiple speech production paradigms localized to the ventral speech motor cortex before speech articulation. The uncovered discharge predicts the degree of auditory cortex suppression during speech, its well-documented consequence. These results reveal the human corollary discharge source and timing with far-reaching implication for speech motor-control as well as auditory hallucinations in human psychosis.
PMCID: 11648673
PMID: 39625978
ISSN: 1091-6490
CID: 5780132
Scale matters: Large language models with billions (rather than millions) of parameters better match neural representations of natural language
Hong, Zhuoqiao; Wang, Haocheng; Zada, Zaid; Gazula, Harshvardhan; Turner, David; Aubrey, Bobbi; Niekerken, Leonard; Doyle, Werner; Devore, Sasha; Dugan, Patricia; Friedman, Daniel; Devinsky, Orrin; Flinker, Adeen; Hasson, Uri; Nastase, Samuel A; Goldstein, Ariel
Recent research has used large language models (LLMs) to study the neural basis of naturalistic language processing in the human brain. LLMs have rapidly grown in complexity, leading to improved language processing capabilities. However, neuroscience researchers haven't kept up with the quick progress in LLM development. Here, we utilized several families of transformer-based LLMs to investigate the relationship between model size and their ability to capture linguistic information in the human brain. Crucially, a subset of LLMs were trained on a fixed training set, enabling us to dissociate model size from architecture and training set size. We used electrocorticography (ECoG) to measure neural activity in epilepsy patients while they listened to a 30-minute naturalistic audio story. We fit electrode-wise encoding models using contextual embeddings extracted from each hidden layer of the LLMs to predict word-level neural signals. In line with prior work, we found that larger LLMs better capture the structure of natural language and better predict neural activity. We also found a log-linear relationship where the encoding performance peaks in relatively earlier layers as model size increases. We also observed variations in the best-performing layer across different brain regions, corresponding to an organized language processing hierarchy.
PMCID: 11244877
PMID: 39005394
ISSN: 2692-8205
CID: 5676342
Author Correction: Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns
Goldstein, Ariel; Grinstein-Dabush, Avigail; Schain, Mariano; Wang, Haocheng; Hong, Zhuoqiao; Aubrey, Bobbi; Nastase, Samuel A; Zada, Zaid; Ham, Eric; Feder, Amir; Gazula, Harshvardhan; Buchnik, Eliav; Doyle, Werner; Devore, Sasha; Dugan, Patricia; Reichart, Roi; Friedman, Daniel; Brenner, Michael; Hassidim, Avinatan; Devinsky, Orrin; Flinker, Adeen; Hasson, Uri
PMID: 39353920
ISSN: 2041-1723
CID: 5739352
The impact of COVID-19 on people with epilepsy: Global results from the coronavirus and epilepsy study
Vasey, Michael J; Tai, Xin You; Thorpe, Jennifer; Jones, Gabriel Davis; Ashby, Samantha; Hallab, Asma; Ding, Ding; Andraus, Maria; Dugan, Patricia; Perucca, Piero; Costello, Daniel J; French, Jacqueline A; O'Brien, Terence J; Depondt, Chantal; Andrade, Danielle M; Sengupta, Robin; Datta, Ashis; Delanty, Norman; Jette, Nathalie; Newton, Charles R; Brodie, Martin J; Devinsky, Orrin; Cross, J Helen; Sander, Josemir W; Hanna, Jane; Besag, Frank M C; Sen, Arjune
OBJECTIVE: To characterize the experience of people with epilepsy and aligned healthcare workers (HCWs) during the first 18 months of the COVID-19 pandemic and compare experiences in high-income countries (HICs) with non-HICs. METHODS: Separate surveys for people with epilepsy and HCWs were distributed online in April 2020. Responses were collected to September 2021. Data were collected for COVID-19 infections, the effect of COVID-related restrictions, access to specialist help for epilepsy (people with epilepsy), and the impact of the pandemic on work productivity (HCWs). The frequency of responses for non-HICs and HICs were compared using non-parametric Chi-square tests. RESULTS: Two thousand one hundred and five individuals with epilepsy from 53 countries and 392 HCWs from 26 countries provided data. The same proportion of people with epilepsy in non-HICs and HICs reported COVID-19 infection (7%). Those in HICs were more likely to report that COVID-19 measures had affected their health (32% vs. 23%; p < 0.001). There was no difference between non-HICs and HICs in the proportion who reported difficulty in obtaining help for epilepsy. HCWs in non-HICs were more likely to report COVID-19 infection than those in HICs (18% vs. 6%; p = 0.001) and that their clinical work had been affected by concerns about contracting COVID-19, lack of personal protective equipment, and the impact of the pandemic on mental health (all p < 0.001). Compared to pre-pandemic practices, there was a significant shift to remote consultations in both non-HICs and HICs (p < 0.001). SIGNIFICANCE/CONCLUSIONS: While the frequency of COVID-19 infection was relatively low in these data from early in the pandemic, our findings suggest broader health consequences and an increased psychosocial burden, particularly among HCWs in non-HICs. Planning for future pandemics should prioritize mental healthcare alongside ensuring access to essential epilepsy services and expanding and enhancing access to remote consultations. PLAIN LANGUAGE SUMMARY/CONCLUSIONS: We asked people with epilepsy about the effects of COVID-19 on their health and healthcare. We wanted to compare responses from people in high-income countries and other countries. We found that people in high-income countries and other countries had similar levels of difficulty in getting help for their epilepsy. People in high-income countries were more likely to say that their general health had been affected. Healthcare workers in non-high-income settings were more likely to have contracted COVID-19 and have the care they deliver affected by the pandemic. Across all settings, COVID-19 was associated with a large shift to remote consultations.
PMID: 39225433
ISSN: 2470-9239
CID: 5687772