From Single Words to Sentence Production: Shared Cortical Representations but Distinct Temporal Dynamics
Morgan, Adam M; Devinsky, Orrin; Doyle, Werner K; Dugan, Patricia; Friedman, Daniel; Flinker, Adeen
Sentence production is the uniquely human ability to transform complex thoughts into strings of words. Despite the importance of this process, language production research has primarily focused on single words. It remains an untested assumption that insights from this literature generalize to more naturalistic utterances like sentences. Here, we investigate this using high-resolution neurosurgical recordings (ECoG) and an overt production experiment where patients produce six words in isolation (picture naming) and in sentences (scene description). We trained machine learning models to identify the unique brain activity pattern for each word during picture naming, and used these patterns to decode which words patients were processing while they produced sentences. Our findings reveal that words share cortical representations across tasks. In sensorimotor cortex, words were consistently activated in the order in which they were said in the sentence. However, in inferior and middle frontal gyri (IFG and MFG), the order in which words were processed depended on the syntactic structure of the sentence. This dynamic interplay between sentence structure and word processing reveals that sentence production is not simply a sequence of single word production tasks, and highlights a regional division of labor within the language network. Finally, we argue that the dynamics of word processing in prefrontal cortex may impose a subtle pressure on language evolution, explaining why nearly all the world's languages position subjects before objects.
PMCID:11565881
PMID: 39554006
ISSN: 2692-8205
CID: 5766162
A low-activity cortical network selectively encodes syntax
Morgan, Adam M; Devinsky, Orrin; Doyle, Werner K; Dugan, Patricia; Friedman, Daniel; Flinker, Adeen
Syntax, the abstract structure of language, is a hallmark of human cognition. Despite its importance, its neural underpinnings remain obscured by inherent limitations of non-invasive brain measures and a near total focus on comprehension paradigms. Here, we address these limitations with high-resolution neurosurgical recordings (electrocorticography) and a controlled sentence production experiment. We uncover three syntactic networks that are broadly distributed across traditional language regions, but with focal concentrations in middle and inferior frontal gyri. In contrast to previous findings from comprehension studies, these networks process syntax mostly to the exclusion of words and meaning, supporting a cognitive architecture with a distinct syntactic system. Most strikingly, our data reveal an unexpected property of syntax: it is encoded independent of neural activity levels. We propose that this "low-activity coding" scheme represents a novel mechanism for encoding information, reserved for higher-order cognition more broadly.
PMCID:11212956
PMID: 38948730
ISSN: 2692-8205
CID: 5676332
Scale matters: Large language models with billions (rather than millions) of parameters better match neural representations of natural language
Hong, Zhuoqiao; Wang, Haocheng; Zada, Zaid; Gazula, Harshvardhan; Turner, David; Aubrey, Bobbi; Niekerken, Leonard; Doyle, Werner; Devore, Sasha; Dugan, Patricia; Friedman, Daniel; Devinsky, Orrin; Flinker, Adeen; Hasson, Uri; Nastase, Samuel A; Goldstein, Ariel
Recent research has used large language models (LLMs) to study the neural basis of naturalistic language processing in the human brain. LLMs have rapidly grown in complexity, leading to improved language processing capabilities. However, neuroscience researchers haven't kept up with the quick progress in LLM development. Here, we utilized several families of transformer-based LLMs to investigate the relationship between model size and their ability to capture linguistic information in the human brain. Crucially, a subset of LLMs were trained on a fixed training set, enabling us to dissociate model size from architecture and training set size. We used electrocorticography (ECoG) to measure neural activity in epilepsy patients while they listened to a 30-minute naturalistic audio story. We fit electrode-wise encoding models using contextual embeddings extracted from each hidden layer of the LLMs to predict word-level neural signals. In line with prior work, we found that larger LLMs better capture the structure of natural language and better predict neural activity. We also found a log-linear relationship where the encoding performance peaks in relatively earlier layers as model size increases. We also observed variations in the best-performing layer across different brain regions, corresponding to an organized language processing hierarchy.
PMCID:11244877
PMID: 39005394
ISSN: 2692-8205
CID: 5676342
Author Correction: Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns
Goldstein, Ariel; Grinstein-Dabush, Avigail; Schain, Mariano; Wang, Haocheng; Hong, Zhuoqiao; Aubrey, Bobbi; Nastase, Samuel A; Zada, Zaid; Ham, Eric; Feder, Amir; Gazula, Harshvardhan; Buchnik, Eliav; Doyle, Werner; Devore, Sasha; Dugan, Patricia; Reichart, Roi; Friedman, Daniel; Brenner, Michael; Hassidim, Avinatan; Devinsky, Orrin; Flinker, Adeen; Hasson, Uri
PMID: 39353920
ISSN: 2041-1723
CID: 5739352
Binding of cortical functional modules by synchronous high-frequency oscillations
Garrett, Jacob C; Verzhbinsky, Ilya A; Kaestner, Erik; Carlson, Chad; Doyle, Werner K; Devinsky, Orrin; Thesen, Thomas; Halgren, Eric
Whether high-frequency phase-locked oscillations facilitate integration ('binding') of information across widespread cortical areas is controversial. Here we show with intracranial electroencephalography that cortico-cortical co-ripples (~100-ms-long ~90 Hz oscillations) increase during reading and semantic decisions, at the times and co-locations when and where binding should occur. Fusiform wordform areas co-ripple with virtually all language areas, maximally from 200 to 400 ms post-word-onset. Semantically specified target words evoke strong co-rippling between wordform, semantic, executive and response areas from 400 to 800 ms, with increased co-rippling between semantic, executive and response areas prior to correct responses. Co-ripples were phase-locked at zero lag over long distances (>12 cm), especially when many areas were co-rippling. General co-activation, indexed by non-oscillatory high gamma, was mainly confined to early latencies in fusiform and earlier visual areas, preceding co-ripples. These findings suggest that widespread synchronous co-ripples may assist the integration of multiple cortical areas for sustained periods during cognition.
PMID: 39134741
ISSN: 2397-3374
CID: 5726782
Subject-Agnostic Transformer-Based Neural Speech Decoding from Surface and Depth Electrode Signals
Chen, Junbo; Chen, Xupeng; Wang, Ran; Le, Chenqian; Khalilian-Gourtani, Amirhossein; Jensen, Erika; Dugan, Patricia; Doyle, Werner; Devinsky, Orrin; Friedman, Daniel; Flinker, Adeen; Wang, Yao
OBJECTIVE: This study investigates speech decoding from neural signals captured by intracranial electrodes. Most prior works can only work with electrodes on a 2D grid (i.e., Electrocorticographic or ECoG array) and data from a single patient. We aim to design a deep-learning model architecture that can accommodate both surface (ECoG) and depth (stereotactic EEG or sEEG) electrodes. The architecture should allow training on data from multiple participants with large variability in electrode placements and the trained model should perform well on participants unseen during training. APPROACH: We propose a novel transformer-based model architecture named SwinTW that can work with arbitrarily positioned electrodes, by leveraging their 3D locations on the cortex rather than their positions on a 2D grid. We train both subject-specific models using data from a single participant as well as multi-patient models exploiting data from multiple participants. MAIN RESULTS: The subject-specific models using only low-density 8x8 ECoG data achieved high decoding Pearson Correlation Coefficient with ground truth spectrogram (PCC=0.817), over N=43 participants, outperforming our prior convolutional ResNet model and the 3D Swin transformer model. Incorporating additional strip, depth, and grid electrodes available in each participant (N=39) led to further improvement (PCC=0.838). For participants with only sEEG electrodes (N=9), subject-specific models still enjoy comparable performance with an average PCC=0.798. The multi-subject models achieved high performance on unseen participants, with an average PCC=0.765 in leave-one-out cross-validation. SIGNIFICANCE: The proposed SwinTW decoder enables future speech neuroprostheses to utilize any electrode placement that is clinically optimal or feasible for a particular participant, including using only depth electrodes, which are more routinely implanted in chronic neurosurgical procedures. Importantly, the generalizability of the multi-patient models suggests the exciting possibility of developing speech neuroprostheses for people with speech disability without relying on their own neural data for training, which is not always feasible.
PMCID:10980022
PMID: 38559163
ISSN: 2692-8205
CID: 5676302
A shared model-based linguistic space for transmitting our thoughts from brain to brain in natural conversations
Zada, Zaid; Goldstein, Ariel; Michelmann, Sebastian; Simony, Erez; Price, Amy; Hasenfratz, Liat; Barham, Emily; Zadbood, Asieh; Doyle, Werner; Friedman, Daniel; Dugan, Patricia; Melloni, Lucia; Devore, Sasha; Flinker, Adeen; Devinsky, Orrin; Nastase, Samuel A; Hasson, Uri
Effective communication hinges on a mutual understanding of word meaning in different contexts. We recorded brain activity using electrocorticography during spontaneous, face-to-face conversations in five pairs of epilepsy patients. We developed a model-based coupling framework that aligns brain activity in both speaker and listener to a shared embedding space from a large language model (LLM). The context-sensitive LLM embeddings allow us to track the exchange of linguistic information, word by word, from one brain to another in natural conversations. Linguistic content emerges in the speaker's brain before word articulation and rapidly re-emerges in the listener's brain after word articulation. The contextual embeddings better capture word-by-word neural alignment between speaker and listener than syntactic and articulatory models. Our findings indicate that the contextual embeddings learned by LLMs can serve as an explicit numerical model of the shared, context-rich meaning space humans use to communicate their thoughts to one another.
PMID: 39096896
ISSN: 1097-4199
CID: 5696672
Temporal integration in human auditory cortex is predominantly yoked to absolute time, not structure duration
Norman-Haignere, Sam V; Keshishian, Menoua K; Devinsky, Orrin; Doyle, Werner; McKhann, Guy M; Schevon, Catherine A; Flinker, Adeen; Mesgarani, Nima
Sound structures such as phonemes and words have highly variable durations. Thus, there is a fundamental difference between integrating across absolute time (e.g., 100 ms) vs. sound structure (e.g., phonemes). Auditory and cognitive models have traditionally cast neural integration in terms of time and structure, respectively, but the extent to which cortical computations reflect time or structure remains unknown. To answer this question, we rescaled the duration of all speech structures using time stretching/compression and measured integration windows in the human auditory cortex using a new experimental/computational method applied to spatiotemporally precise intracranial recordings. We observed significantly longer integration windows for stretched speech, but this lengthening was very small (~5%) relative to the change in structure durations, even in non-primary regions strongly implicated in speech-specific processing. These findings demonstrate that time-yoked computations dominate throughout the human auditory cortex, placing important constraints on neurocomputational models of structure processing.
PMCID:11463558
PMID: 39386565
ISSN: 2692-8205
CID: 5751762
Simulated resections and RNS placement can optimize post-operative seizure outcomes when guided by fast ripple networks
Weiss, Shennan Aibel; Sperling, Michael R; Engel, Jerome; Liu, Anli; Fried, Itzhak; Wu, Chengyuan; Doyle, Werner; Mikell, Charles; Mofakham, Sima; Salamon, Noriko; Sim, Myung Shin; Bragin, Anatol; Staba, Richard
In medication-resistant epilepsy, the goal of epilepsy surgery is to make a patient seizure free with a resection/ablation that is as small as possible to minimize morbidity. The standard of care in planning the margins of epilepsy surgery involves electroclinical delineation of the seizure onset zone (SOZ) and incorporation of neuroimaging findings from MRI, PET, SPECT, and MEG modalities. Resecting cortical tissue generating high-frequency oscillations (HFOs) has been investigated as a more efficacious alternative to targeting the SOZ. In this study, we used a support vector machine (SVM), with four distinct fast ripple (FR: 350-600 Hz on oscillations, 200-600 Hz on spikes) metrics as factors. These metrics included the FR resection ratio (RR), a spatial FR network measure, and two temporal FR network measures. The SVM was trained by the value of these four factors with respect to the actual resection boundaries and actual seizure free labels of 18 patients with medically refractory focal epilepsy. Leave one out cross-validation of the trained SVM in this training set had an accuracy of 0.78. We next used a simulated iterative virtual resection targeting the FR sites that were highest rate and showed most temporal autonomy. The trained SVM utilized the four virtual FR metrics to predict virtual seizure freedom. In all but one of the nine patients seizure free after surgery, we found that the virtual resections sufficient for virtual seizure freedom were larger in volume (p<0.05). In nine patients who were not seizure free, a larger virtual resection made five virtually seizure free. We also examined 10 medically refractory focal epilepsy patients implanted with the responsive neurostimulator system (RNS) and virtually targeted the RNS stimulation contacts proximal to sites generating FR at highest rates to determine if the simulated value of the stimulated SOZ and stimulated FR metrics would trend toward those patients with a better seizure outcome. Our results suggest: 1) FR measures can accurately predict whether a resection, defined by the standard of care, will result in seizure freedom; 2) utilizing FR alone for planning an efficacious surgery can be associated with larger resections; 3) when FR metrics predict the standard of care resection will fail, amending the boundaries of the planned resection with certain FR generating sites may improve outcome; and 4) more work is required to determine if targeting RNS stimulation contact proximal to FR generating sites will improve seizure outcome.
PMCID:10996761
PMID: 38585730
CID: 5725562
Temporal dynamics of short-term neural adaptation across human visual cortex
Brands, Amber Marijn; Devore, Sasha; Devinsky, Orrin; Doyle, Werner; Flinker, Adeen; Friedman, Daniel; Dugan, Patricia; Winawer, Jonathan; Groen, Iris Isabelle Anna
Neural responses in visual cortex adapt to prolonged and repeated stimuli. While adaptation occurs across the visual cortex, it is unclear how adaptation patterns and computational mechanisms differ across the visual hierarchy. Here we characterize two signatures of short-term neural adaptation in time-varying intracranial electroencephalography (iEEG) data collected while participants viewed naturalistic image categories varying in duration and repetition interval. Ventral- and lateral-occipitotemporal cortex exhibit slower and prolonged adaptation to single stimuli and slower recovery from adaptation to repeated stimuli compared to V1-V3. For category-selective electrodes, recovery from adaptation is slower for preferred than non-preferred stimuli. To model neural adaptation we augment our delayed divisive normalization (DN) model by scaling the input strength as a function of stimulus category, enabling the model to accurately predict neural responses across multiple image categories. The model fits suggest that differences in adaptation patterns arise from slower normalization dynamics in higher visual areas interacting with differences in input strength resulting from category selectivity. Our results reveal systematic differences in temporal adaptation of neural population responses between lower and higher visual brain areas and show that a single computational model of history-dependent normalization dynamics, fit with area-specific parameters, accounts for these differences.
PMID: 38815000
ISSN: 1553-7358
CID: 5663772