Redefining the role of Broca's area in speech
For over a century neuroscientists have debated the dynamics by which human cortical language networks allow words to be spoken. Although it is widely accepted that Broca's area in the left inferior frontal gyrus plays an important role in this process, it was not possible, until recently, to detail the timing of its recruitment relative to other language areas, nor how it interacts with these areas during word production. Using direct cortical surface recordings in neurosurgical patients, we studied the evolution of activity in cortical neuronal populations, as well as the Granger causal interactions between them. We found that, during the cued production of words, a temporal cascade of neural activity proceeds from sensory representations of words in temporal cortex to their corresponding articulatory gestures in motor cortex. Broca's area mediates this cascade through reciprocal interactions with temporal and frontal motor regions. Contrary to classic notions of the role of Broca's area in speech, while motor cortex is activated during spoken responses, Broca's area is surprisingly silent. Moreover, when novel strings of articulatory gestures must be produced in response to nonword stimuli, neural activity is enhanced in Broca's area, but not in motor cortex. These unique data provide evidence that Broca's area coordinates the transformation of information across large-scale cortical networks involved in spoken word production. In this role, Broca's area formulates an appropriate articulatory code to be implemented by motor cortex.
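As a rough illustration of what a Granger causal interaction between two recorded signals means, the following is a minimal sketch using NumPy on synthetic data. It is not the analysis pipeline of the study: the function names (`lagged_design`, `granger_strength`), the model order, and the simulated signals are all assumptions made for this example. A source is said to Granger-cause a target when adding the source's past to an autoregressive model of the target reduces the prediction error.

```python
import numpy as np

def lagged_design(series_list, order):
    """Design matrix holding an intercept and `order` past samples of each series."""
    n = len(series_list[0])
    cols = [s[order - k : n - k] for s in series_list for k in range(1, order + 1)]
    return np.column_stack([np.ones(n - order)] + cols)

def granger_strength(source, target, order=5):
    """Log ratio of residual variances: target-past-only model vs. model with source past."""
    y = target[order:]
    restricted = lagged_design([target], order)
    full = lagged_design([target, source], order)
    res_r = y - restricted @ np.linalg.lstsq(restricted, y, rcond=None)[0]
    res_f = y - full @ np.linalg.lstsq(full, y, rcond=None)[0]
    return float(np.log(np.var(res_r) / np.var(res_f)))

# Synthetic signals (hypothetical): x drives y with a two-sample delay.
rng = np.random.default_rng(0)
n = 2000
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.8 * x[t - 2] + 0.2 * y[t - 1] + 0.1 * rng.standard_normal()

x_to_y = granger_strength(x, y)  # large: the past of x improves prediction of y
y_to_x = granger_strength(y, x)  # near zero: the past of y does not help predict x
```

Computing the measure in both directions is what allows reciprocal interactions to be distinguished from one-way influence. A real analysis would also need significance testing and a principled choice of model order, both omitted here.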
Spectrotemporal modulation provides a unifying framework for auditory cortical asymmetries
The principles underlying functional asymmetries in cortex remain debated. For example, it is accepted that speech is processed bilaterally in auditory cortex, but a left hemisphere dominance emerges when the input is interpreted linguistically. The mechanisms, however, are contested, such as what sound features or processing principles underlie laterality. Recent findings across species (humans, canines and bats) provide converging evidence that spectrotemporal sound features drive asymmetrical responses. Typically, accounts invoke models wherein the hemispheres differ in time-frequency resolution or integration window size. We develop a framework that builds on and unifies prevailing models, using spectrotemporal modulation space. Using signal processing techniques motivated by neural responses, we test this approach, employing behavioural and neurophysiological measures. We show how psychophysical judgements align with spectrotemporal modulations and then characterize the neural sensitivities to temporal and spectral modulations. We demonstrate differential contributions from both hemispheres, with a left lateralization for temporal modulations and a weaker right lateralization for spectral modulations. We argue that representations in the modulation domain provide a more mechanistic basis to account for lateralization in auditory cortex.
Neural correlates of sign language production revealed by electrocorticography
OBJECTIVE: The combined spatiotemporal dynamics underlying sign language production remain largely unknown. To investigate these dynamics as compared to speech production, we utilized intracranial electrocorticography during a battery of language tasks. METHODS: We report a unique case of direct cortical surface recordings obtained from a neurosurgical patient with intact hearing who was bilingual in English and American Sign Language. We designed a battery of cognitive tasks to capture multiple modalities of language processing and production. RESULTS: We identified two spatially distinct cortical networks: ventral for speech and dorsal for sign production. Sign production recruited peri-rolandic, parietal and posterior temporal regions, while speech production recruited frontal, peri-sylvian and peri-rolandic regions. Electrical cortical stimulation confirmed this spatial segregation, identifying mouth areas for speech production and limb areas for sign production. The temporal dynamics revealed superior parietal cortex activity immediately before sign production, suggesting its role in planning and producing sign language. CONCLUSIONS: Our findings reveal a distinct network for sign language and detail the temporal propagation supporting sign production.
Single-trial speech suppression of auditory cortex activity in humans
The human auditory cortex is engaged in monitoring the speech of interlocutors as well as self-generated speech. During vocalization, auditory cortex activity is reported to be suppressed, an effect often attributed to the influence of an efference copy from motor cortex. Single-unit studies in non-human primates have demonstrated a rich dynamic range of single-trial auditory responses to self-speech consisting of suppressed, nonsuppressed and excited auditory neurons. However, human research using noninvasive methods has only reported suppression of averaged auditory cortex responses to self-generated speech. We addressed this discrepancy by recording electrocorticographic activity from neurosurgical subjects performing auditory repetition tasks. We observed that the degree of suppression varied across different regions of auditory cortex, revealing a variety of suppressed and nonsuppressed responses during vocalization. Importantly, single-trial high-gamma power (γHigh, 70-150 Hz) robustly tracked individual auditory events and exhibited stable responses across trials for suppressed and nonsuppressed regions.
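To make the single-trial high-gamma measure concrete, here is a minimal sketch of extracting 70-150 Hz band power from one trial. It assumes a simple FFT-domain band selection followed by an analytic-signal envelope; the study's actual filtering method is not stated in the abstract, and the function name `band_power`, the sampling rate, and the synthetic trial are hypothetical.

```python
import numpy as np

def band_power(signal, fs, low=70.0, high=150.0):
    """Instantaneous power in [low, high] Hz from a band-limited analytic signal."""
    n = len(signal)
    spectrum = np.fft.fft(signal)
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    # Keep only positive frequencies inside the band and double them,
    # which yields the analytic (complex) version of the band-passed signal.
    keep = (freqs >= low) & (freqs <= high)
    analytic = np.fft.ifft(2.0 * spectrum * keep)
    return np.abs(analytic) ** 2

# Synthetic single trial (hypothetical): a slow 10 Hz rhythm throughout,
# plus a 100 Hz burst between 0.4 s and 0.6 s standing in for an auditory event.
fs = 1000
t = np.arange(fs) / fs
burst = (t >= 0.4) & (t < 0.6)
trial = np.sin(2 * np.pi * 10 * t) + 2.0 * burst * np.sin(2 * np.pi * 100 * t)

power = band_power(trial, fs)
inside = power[(t >= 0.45) & (t < 0.55)].mean()   # power during the event
outside = power[t < 0.3].mean()                   # power well before the event
```

Because the envelope is computed sample by sample on a single trial, no averaging across trials is needed to see the event, which is the property the abstract highlights.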
Human Screams Occupy a Privileged Niche in the Communication Soundscape
Screaming is arguably one of the most relevant communication signals for survival in humans. Despite their practical relevance and their theoretical significance as innate and virtually universal [2, 3] vocalizations, what makes screams a unique signal and how they are processed is not known. Here, we use acoustic analyses, psychophysical experiments, and neuroimaging to isolate those features that confer to screams their alarming nature, and we track their processing in the human brain. Using the modulation power spectrum (MPS [4, 5]), a recently developed, neurally informed characterization of sounds, we demonstrate that human screams cluster within a restricted portion of the acoustic space (between approximately 30 and 150 Hz modulation rates) that corresponds to a well-known perceptual attribute, roughness. In contrast to the received view that roughness is irrelevant for communication, our data reveal that the acoustic space occupied by the rough vocal regime is segregated from other signals, including speech, a prerequisite to avoid false alarms in normal vocal communication. We show that roughness is present in natural alarm signals as well as in artificial alarms and that the presence of roughness in sounds boosts their detection in various tasks. Using fMRI, we show that acoustic roughness engages subcortical structures critical to rapidly appraise danger. Altogether, these data demonstrate that screams occupy a privileged acoustic niche that, being separated from other communication signals, ensures their biological and ultimately social efficiency.
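The idea of locating a sound in modulation space can be sketched as follows: take a spectrogram, then take its two-dimensional Fourier transform, so that one axis indexes temporal modulation rate (Hz) and the other spectral modulation. This is a simplified illustration, not the MPS procedure of the cited references: it uses a plain amplitude spectrogram (published analyses may apply a logarithm or other preprocessing first), and the function name, window settings, and test tone are assumptions.

```python
import numpy as np

def modulation_power_spectrum(signal, fs, win=128, hop=16):
    """2D Fourier power of an amplitude spectrogram (time x frequency)."""
    window = np.hanning(win)
    n_frames = 1 + (len(signal) - win) // hop
    frames = np.stack([signal[i * hop : i * hop + win] * window
                       for i in range(n_frames)])
    spectrogram = np.abs(np.fft.rfft(frames, axis=1))       # shape: (time, frequency)
    spectrogram = spectrogram - spectrogram.mean()           # remove the overall mean
    mps = np.abs(np.fft.fft2(spectrogram)) ** 2
    temporal_rates = np.fft.fftfreq(n_frames, d=hop / fs)    # Hz
    spectral_rates = np.fft.fftfreq(spectrogram.shape[1], d=fs / win)  # cycles per Hz
    return mps, temporal_rates, spectral_rates

# Hypothetical "rough" sound: a 1 kHz tone whose amplitude is modulated at 70 Hz,
# a rate inside the approximately 30-150 Hz roughness range described above.
fs = 16000
t = np.arange(fs) / fs
rough_tone = (1 + 0.8 * np.cos(2 * np.pi * 70 * t)) * np.sin(2 * np.pi * 1000 * t)

mps, temporal_rates, _ = modulation_power_spectrum(rough_tone, fs)
profile = mps.sum(axis=1)                 # collapse across spectral modulation
positive = temporal_rates > 0             # ignore the zero-rate (static) component
peak_rate = temporal_rates[positive][np.argmax(profile[positive])]
```

The short analysis window (8 ms here) is a deliberate choice: it trades frequency resolution for the time resolution needed to see fast amplitude fluctuations as temporal modulations.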
Reconstructing speech from human auditory cortex
How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.
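The linear reconstruction described above can be sketched as a regularized regression from time-lagged neural activity back to the stimulus spectrogram. The sketch below assumes ridge regression and fully synthetic data; the function names, lag count, regularization strength, and simulated "electrodes" are hypothetical and are not taken from the study.

```python
import numpy as np

def lag_matrix(responses, n_lags):
    """Concatenate responses at t, t+1, ..., t+n_lags-1 (neural activity follows the stimulus)."""
    n_time, n_chan = responses.shape
    padded = np.vstack([responses, np.zeros((n_lags - 1, n_chan))])
    return np.hstack([padded[k : k + n_time] for k in range(n_lags)])

def fit_decoder(responses, spectrogram, n_lags=5, ridge=1.0):
    """Ridge-regression weights mapping lagged responses to spectrogram channels."""
    X = lag_matrix(responses, n_lags)
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ spectrogram)

def reconstruct(responses, weights, n_lags=5):
    return lag_matrix(responses, n_lags) @ weights

# Synthetic data (hypothetical): each channel is a noisy, delayed mixture
# of the spectrogram's frequency bands.
rng = np.random.default_rng(0)
n_time, n_freq, n_chan, delay = 3000, 8, 16, 2
spec = rng.standard_normal((n_time, n_freq))      # stand-in for an auditory spectrogram
mixing = rng.standard_normal((n_freq, n_chan))
neural = np.roll(spec, delay, axis=0) @ mixing + 0.5 * rng.standard_normal((n_time, n_chan))

# Fit on the first part of the recording and evaluate on held-out data.
train, test = slice(0, 2000), slice(2000, n_time)
weights = fit_decoder(neural[train], spec[train])
predicted = reconstruct(neural[test], weights)
accuracy = np.mean([np.corrcoef(predicted[:, f], spec[test][:, f])[0, 1]
                    for f in range(n_freq)])
```

Reconstruction accuracy is scored here as the correlation between predicted and actual spectrogram channels on held-out data. The nonlinear, modulation-based representation mentioned in the abstract would replace `spec` with a different target representation and is not shown.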
Reconstructing Speech Stimuli From Human Auditory Cortex Activity Using a WaveNet Approach
[S.l.] : Institute of Electrical and Electronics Engineers Inc., 2019
Sub-centimeter language organization in the human temporal lobe
The human temporal lobe is well known to be critical for language comprehension. Previous physiological research has focused mainly on non-invasive neuroimaging and electrophysiological techniques, with each approach requiring averaging across many trials and subjects. The results of these studies have implicated extended anatomical regions in peri-sylvian cortex in speech perception. These non-invasive studies typically report a spatially homogeneous functional pattern of activity across several centimeters of cortex. We examined the spatiotemporal dynamics of word processing using electrophysiological signals acquired from high-density electrode arrays (4 mm spacing) placed directly on the human temporal lobe. Electrocorticographic (ECoG) activity revealed a rich mosaic of language activity, which was functionally distinct at 4 mm separation. Cortical sites responding specifically to word and not phoneme stimuli were surrounded by sites that responded to both words and phonemes. Other sub-regions of the temporal lobe responded robustly to self-produced speech and minimally to external stimuli, while surrounding sites at 4 mm distance exhibited an inverse pattern of activation. These data provide evidence for temporal lobe specificity to words as well as self-produced speech. Furthermore, the results provide evidence that cortical processing in the temporal lobe is not spatially homogeneous over centimeters of cortex. Rather, language processing is supported by independent and spatially distinct functional sub-regions of cortex at a resolution of at least 4 mm.
Multiscale temporal integration organizes hierarchical computation in human auditory cortex
To derive meaning from sound, the brain must integrate information across many timescales. What computations underlie multiscale integration in human auditory cortex? Evidence suggests that auditory cortex analyses sound using both generic acoustic representations (for example, spectrotemporal modulation tuning) and category-specific computations, but the timescales over which these putatively distinct computations integrate remain unclear. To answer this question, we developed a general method to estimate sensory integration windows (the time window when stimuli alter the neural response) and applied our method to intracranial recordings from neurosurgical patients. We show that human auditory cortex integrates hierarchically across diverse timescales spanning from ~50 to 400 ms. Moreover, we find that neural populations with short and long integration windows exhibit distinct functional properties: short-integration electrodes (less than ~200 ms) show prominent spectrotemporal modulation selectivity, while long-integration electrodes (greater than ~200 ms) show prominent category selectivity. These findings reveal how multiscale integration organizes auditory computation in the human brain.
Shared computational principles for language processing in humans and deep language models
Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language model (DLM). Using a self-supervised next-word prediction task, these models generate appropriate linguistic responses in a given context. In the current study, nine participants listened to a 30-min podcast while their brain responses were recorded using electrocorticography (ECoG). We provide empirical evidence that the human brain and autoregressive DLMs share three fundamental computational principles as they process the same natural narrative: (1) both are engaged in continuous next-word prediction before word onset; (2) both match their pre-onset predictions to the incoming word to calculate post-onset surprise; (3) both rely on contextual embeddings to represent words in natural contexts. Together, our findings suggest that autoregressive DLMs provide a new and biologically feasible computational framework for studying the neural basis of language.
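The first two principles, predicting the next word before onset and scoring surprise after onset, can be illustrated with a toy example. The sketch below deliberately swaps the deep language model for a smoothed bigram counter so that it stays self-contained; the corpus, function names, and smoothing constant are assumptions made for illustration only. Surprise is computed as surprisal, the negative log probability the model assigned to the word that actually arrived.

```python
import math
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict(counts, prev, vocab, alpha=1.0):
    """Pre-onset prediction: add-alpha smoothed distribution over the next word."""
    total = sum(counts[prev].values()) + alpha * len(vocab)
    return {w: (counts[prev][w] + alpha) / total for w in vocab}

def surprisal(dist, word):
    """Post-onset surprise in bits: low when the incoming word was predicted."""
    return -math.log2(dist[word])

# Toy corpus (hypothetical), standing in for a natural narrative.
tokens = "the cat sat on the mat the cat ate the fish".split()
vocab = sorted(set(tokens))
counts = train_bigram(tokens)

dist = predict(counts, "the", vocab)          # prediction formed before word onset
top_prediction = max(dist, key=dist.get)
expected = surprisal(dist, "cat")             # frequent continuation of "the"
unexpected = surprisal(dist, "on")            # never follows "the" in this corpus
```

A deep language model differs from this toy mainly in the third principle: it conditions on a contextual embedding of the whole preceding passage rather than on the single previous word.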