WO2011127483A1 - Decoding words using neural signals - Google Patents

Decoding words using neural signals

Info

Publication number
WO2011127483A1
WO2011127483A1 PCT/US2011/031995 US2011031995W WO2011127483A1 WO 2011127483 A1 WO2011127483 A1 WO 2011127483A1 US 2011031995 W US2011031995 W US 2011031995W WO 2011127483 A1 WO2011127483 A1 WO 2011127483A1
Authority
WO
WIPO (PCT)
Prior art keywords
electrodes
word
frequency
classifier
domain information
Prior art date
Application number
PCT/US2011/031995
Other languages
French (fr)
Inventor
Bradley Greger
Paul House
Spencer Kellis
Kyle Thomson
Original Assignee
University Of Utah Research Foundation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University Of Utah Research Foundation
Publication of WO2011127483A1

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/48 Other medical applications
    • A61B5/4803 Speech analysis specially adapted for diagnostic purposes
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/24 Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316 Modalities, i.e. specific diagnostic methods
    • A61B5/369 Electroencephalography [EEG]
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/24 Speech recognition using non-acoustical features
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Definitions

  • the present disclosure relates to methods and systems for decoding neural signals, e.g., local field potentials recorded from a brain cortical surface.
  • Pathological conditions such as amyotrophic lateral sclerosis or damage to the brainstem can leave patients paralyzed but fully aware, in a condition known as locked-in syndrome. Communication in this state is laborious, often reduced to selecting individual letters or words by arduous residual movement.
  • Certain embodiments of the present technology provide systems and methods for decoding words using neural signals, e.g., local field potentials, recorded from a cortical surface.
  • a system for decoding words using neural signals comprises a receiver configured to receive a neural signal from each of a plurality of electrodes implanted in a patient when the patient speaks or attempts to speak a word.
  • the system further comprises a processor configured to convert the neural signal for each of the plurality of electrodes into frequency-domain information, and to apply a classifier to the frequency-domain information for the plurality of electrodes to decode the word.
  • the plurality of electrodes are placed over a cortical surface.
  • the plurality of electrodes are placed over a face motor cortex.
  • each neural signal comprises a local field potential from the cortical surface.
  • the frequency-domain information for each of the plurality of electrodes comprises a power spectra.
  • the plurality of electrodes comprise micro-electrodes.
  • the classifier comprises a principal component analysis classifier.
  • the processor is configured to decode the word by finding a centroid from a plurality of centroids that is nearest to an output of the principal component analysis classifier.
  • a method of decoding words using neural signals comprises receiving a neural signal from each of a plurality of electrodes implanted in a patient when the patient speaks or attempts to speak a word, and converting the neural signal for each of the plurality of electrodes into frequency-domain information.
  • the method further comprises applying a classifier to the frequency-domain information for the plurality of electrodes to decode the word.
  • the plurality of electrodes are placed over a cortical surface.
  • the plurality of electrodes are placed over a face motor cortex.
  • each neural signal comprises a local field potential from the cortical surface.
  • the frequency-domain information for each of the plurality of electrodes comprises a power spectra.
  • the plurality of electrodes comprise micro-electrodes.
  • the classifier is a principal component analysis classifier.
  • the method further comprising finding a centroid from a plurality of centroids that is nearest to an output of the principal component analysis classifier.
  • a method of training a classifier to decode words using neural signals comprises receiving a neural signal from each of a plurality of electrodes implanted in a patient for each one of a plurality of trials, wherein the patient speaks or attempts to speak the word for each trial, and for each trial, converting the neural signal for each of the plurality of electrodes into frequency-domain information.
  • the method further comprises training the classifier to decode the word based on the frequency-domain information for each of the plurality of electrodes for each trial.
  • the plurality of electrodes are placed over a cortical surface.
  • the frequency-domain information for each of the plurality of electrodes and each trial comprises a power spectra.
  • the training the classifier comprises performing a principal component analysis on the frequency-domain information for the plurality of electrodes for each trial.
  • Fig. 1a shows an example of a 16-channel 4x4 micro-electrode array.
  • Fig. 1b shows placement of two micro-electrode arrays over a cortical surface, in which one of the micro-electrode arrays is placed over the face motor cortex and the other micro-electrode array is placed over Wernicke's area.
  • Fig. 1c shows an audio waveform (top) of a verbal task and a corresponding spectrogram (bottom) of neural data recorded from a single channel over the face motor cortex.
  • Fig. 1d shows an audio waveform (top) of conversation, verbal task and verbal reward and a corresponding spectrogram (bottom) of neural data recorded from a single channel over Wernicke's area.
  • Fig. 2a shows windows temporally aligned to spoken words that contain a frequency-domain structure in a spectrogram of neural data recorded from a micro-electrode over the face motor cortex.
  • Fig. 2b shows power spectra calculated for multiple trials and multiple electrodes.
  • Fig. 2c shows a two-dimensional matrix of micro-electrode power spectra and trial information.
  • Fig. 2d shows a principal component analysis performed on micro-electrode power spectra and trial information for two words.
  • Fig. 3a shows a distribution of performance results for each unique combination of two-word through ten-word combinations.
  • Fig. 3b shows a topography of channel performance for micro-electrodes resting over the face motor cortex.
  • Fig. 3c shows a topography of channel performance for micro-electrodes resting over Wernicke's area.
  • Fig. 4 shows a block diagram of a system for recording and analyzing data from a micro-electrode array according to some embodiments of the present invention.
  • Pathological conditions such as amyotrophic lateral sclerosis or damage to the brainstem can leave patients severely paralyzed but fully aware, in a condition known as locked-in syndrome. Communication in this state is laborious, often reduced to selecting individual letters or words by arduous residual movement. More intuitive communication may be possible by directly interfacing with language areas of the cerebral cortex. Many studies of neural interfaces for communication have focused on the challenging problem of reconstructing continuous, dynamic speech. Described herein is a more tractable approach of classifying a set of words.
  • a grid or array of subdural, nonpenetrating, high-impedance micro-electrodes are used to record local field potentials (LFPs) from the cortical surface over the face motor cortex and Wernicke's area.
  • an LFP may be an electric field potential from a group of neurons located near the corresponding electrode.
  • Neural data from many regions of the brain may be used to decode speech; however, data from electrodes over the face motor cortex were found to be the most accurately decodable.
  • Embodiments of the present invention provide a trial-by-trial decoding of spoken words from cortical surface LFPs in the human neocortex, as discussed further below.
  • Penetrating electrodes have been used to perform rapid decoding of continuous motor movements from neuronal activity in the primary motor area of human neocortex; however, because of the risks associated with implantation in language centers, few studies have explored their use in speech BCIs.
  • the neurotrophic electrode is a penetrating electrode designed to mitigate the risks of chronic implantation that has been used to decode the formant frequencies of speech from neuronal activity in the left ventral premotor cortex.
  • Embodiments of the present invention provide a novel recording device and method for decoding speech.
  • a micro-electrode array may comprise a plurality of nonpenetrating, 40-μm microwires with 1-mm inter-electrode spacing.
  • Such micro-electrode grids or arrays have been shown to support high temporal- and spatial-resolution recordings.
  • embodiments of the present invention decode speech by classifying finite sets of words from cortical surface LFPs, thereby reducing the complexity of the problem to determining a limited number of classes.
  • Fig. 1a shows an example of a single 16-channel 4x4 micro-electrode grid or array that may be used to record LFPs on the cortical surface.
  • the micro-electrode array is shown next to a U.S. quarter-dollar coin for size comparison.
  • Fig. 1b shows two 16-channel 4x4 micro-electrode arrays placed beneath the dura closely approximated to the cortical surface over the face motor cortex and Wernicke's area.
  • the wire bundle 112a leads to the array 110a over Wernicke's area and the wire bundle 112b leads to the array 110b over the face motor cortex.
  • Fig. 1c shows an audio waveform (top) of a verbal task, in which a patient repeated the word "yes."
  • Fig. 1c also shows a corresponding spectrogram (bottom) of neural data recorded from a single channel or micro-electrode over the face motor cortex.
  • Fig. 1c includes a normalized power scale indicating the power levels in the spectrogram. As shown in Fig. 1c, the spectrogram reveals frequency-domain structure aligned to the individual words during the verbal task.
  • Fig. 1d shows an audio waveform (top) of conversation, verbal task and verbal reward and a corresponding spectrogram (bottom) of neural data recorded from a single channel over Wernicke's area.
  • Wernicke's area is predominantly active when the patient converses and receives verbal rewards after completing an experiment, and was less active during the verbal task.
  • Fig. 2a shows an example of spectrograms 210a-210d of neural data for four different electrodes of a micro-electrode array placed over the face motor cortex.
  • a particular word is repeated three times during a verbal task with each repetition of the word corresponding to a trial.
  • the subject may speak the word or attempt to speak the word for the case where the subject is unable to intelligibly vocalize the word.
  • Fig. 2a shows three 500-msec windows 220a-220c where each window is temporally aligned to one instance of the spoken word. As shown in Fig.
  • the windows 220a-220c contain frequency-domain structure in each spectrogram 210a-210d corresponding to the spoken word at the three trials.
  • Fig. 2b shows a power spectra for each electrode 210a-210d and each trial.
  • Fig. 2c shows a two-dimensional matrix of micro-electrode power spectra and trial information for a word.
  • power spectra information is collected for each of N electrodes of the array and each of M trials.
  • Fig. 2d shows a principal component analysis performed on micro-electrode power spectra and trial information for the words "hungry" and "thirsty."
  • principal component analysis performed on micro-electrode power spectra and trial information for the word "hungry" generates a cluster 250 in the principal component space, where each point in the cluster 250 represents one trial.
  • principal component analysis performed on micro-electrode power spectra and trial information for the word "thirsty" generates a cluster 255 in the principal component space.
  • three dimensions of the principal component space are shown for ease of illustration, although it is to be understood that the principal component space may comprise any number of dimensions.
  • a center of mass or centroid may be computed for each cluster corresponding to a particular word.
  • the word may be classified by performing principal component analysis on micro-electrode spectra information from the patient to project the spectra
  • classification examples include maximum likelihood, support vector machine and Bayesian classification.
  • FIG. 3b,c shows performance results for individual electrodes over the face motor cortex for different words
  • Fig. 3c shows performance results for individual electrodes over Wernicke's area for different words. Examining the mean performance of each word against all other words, it was found that electrode 14 ranged from 51.5% accuracy for the word "cold" to 81.5% accuracy for the word "yes." The standard deviation of performance across all 16 motor-sensory electrodes was measured as 6.6 ± 1.5 percentage points, suggesting that surface LFPs recorded from some electrodes corresponded to aspects of speech production present in some words but not others.
  • micro-electrode that provided the highest accuracy for any single word varied. Selecting the five electrodes of the array with best overall accuracy from the face motor cortex improved classification accuracy to 89.6 ± 10.8% of two-word combinations (median 90.0%; Fig. 3a). However, selecting the five highest-performing electrodes over Wernicke's area did not improve performance (73.5 ± 16.4% of two-word combinations correctly classified; median 73.3%) when compared with using all 16 electrodes over that region of cortex. Some micro-electrodes over the face motor cortex may not have recorded neural signals useful in decoding the specific set of words presented, indicating a more concrete mapping of the neural signal onto patterns of speech articulation. Conversely, most of the 16 micro-electrodes over Wernicke's area appear to have recorded neural signal related to language processing, supporting a more distributed and abstract encoding of speech.
  • micro-electrode grids or array could be reduced with epidural placement, as shown for similar recording devices. Furthermore, a wireless
  • Training and decoding used subsets of channels and combinations of two through ten words. Mean, median, and standard deviation were computed for results of each
  • Topographical performance: The algorithm was run using data from each electrode individually and for all combinations of two words. Classification accuracies from all combinations involving the selected word and channel were averaged.
  • FIG. 4 is a block diagram showing an example of a system 450 for recording and processing LFPs from a micro-electrode array 410 that may be used for various embodiments of the invention.
  • the system 450 may include a receiver 452, a processor 455, and a memory 460.
  • the receiver 452 may be used to condition the electrical signals from the micro-electrode array 410 for processing by the processor 455.
  • the receiver 452 may include one or more of the following components: amplifiers (e.g., low-noise amplifiers) for amplifying the electrical signals, a filter for isolating electrical signals within a desired frequency bandwidth, and an analog-to-digital converter for digitizing the electrical signals for processing by the processor 455.
  • the processor 455 may comprise a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), discrete hardware
  • the memory 460 may comprise any computer-readable media known in the art including volatile memory, nonvolatile memory, a Random Access Memory (RAM), a flash memory, a Read Only Memory (ROM), a removable disk, a CD-ROM, a DVD, any other suitable storage device, or a combination thereof.
  • the processor 455 may also output raw electrical signals, processed electrical signals, and/or results of analysis to an output device 465, including, but not limited to, a display for viewing by a neurologist, a printer for generating a computer readout, a computer-readable media, and/or to another computer via a computer network connection.
  • the output device 465 may also include an audio output device that outputs the decoded word as an audio output, e.g., a synthetic voice vocalizing the decoded word.
  • the processor 455 may decode a word by receiving neural signals, e.g., local field potentials, from the micro-electrode array 410 when the patient speaks the word or attempts to speak the word.
  • the processor 455 may then convert the neural signals into frequency-domain information, e.g., power spectra, for one or more electrodes of the array.
  • the processor 455 may then classify the frequency-domain information for the one or more electrodes into one of a set of words.
  • the processor 455 may perform principal component analysis on the frequency-domain information to project the frequency-domain information into the principal component space and determine its nearest centroid in the principal component space, as described above.
  • the processor 455 may display the decoded word on a display and/or vocalize the decoded word from an audio output device.
  • the processor 455 may be trained to classify a particular word using the methods described above with reference to Figs. 2a-2d.
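
The conversion of per-electrode neural signals into power spectra, as described for the processor 455, could be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation disclosed in the application: the 1 kHz sampling rate, the Hann taper, the synthetic 40 Hz test signal, and the function names `power_spectrum` and `trial_features` are all hypothetical. Only the 16-electrode array and the 500-msec window come from the text.

```python
import numpy as np

def power_spectrum(window, fs):
    """Return (freqs, power) for one electrode's time-domain window."""
    window = np.asarray(window, dtype=float)
    window = window - window.mean()              # remove the DC offset
    tapered = window * np.hanning(window.size)   # Hann taper (assumed) to limit leakage
    spectrum = np.fft.rfft(tapered)
    power = np.abs(spectrum) ** 2 / window.size  # squared magnitude, scaled by length
    freqs = np.fft.rfftfreq(window.size, d=1.0 / fs)
    return freqs, power

def trial_features(trial, fs):
    """Concatenate the power spectra of all electrodes for one trial.

    trial: array of shape (n_electrodes, n_samples).
    """
    return np.concatenate([power_spectrum(ch, fs)[1] for ch in trial])

# Synthetic stand-in for one trial: 16 electrodes, 500-msec window,
# sampled at an assumed 1 kHz, with a 40 Hz component plus noise.
fs = 1000.0
t = np.arange(500) / fs
rng = np.random.default_rng(0)
trial = np.sin(2 * np.pi * 40.0 * t) + 0.1 * rng.standard_normal((16, t.size))

features = trial_features(trial, fs)
freqs, power = power_spectrum(trial[0], fs)
peak_hz = freqs[np.argmax(power)]
```

Stacking one such feature vector per trial gives a two-dimensional matrix of trials by electrode power spectra, comparable in layout to the matrix described for Fig. 2c.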

Abstract

Methods and systems are described for decoding words using neural signals, e.g., local field potentials, recorded from, e.g., a brain cortical surface. Some methods include receiving neural signals from a plurality of electrodes contacting a patient when the patient speaks, or attempts to speak, a word; converting the neural signal for each electrode into frequency-domain information; and applying a classifier to the frequency-domain information for the plurality of electrodes so as to determine the word.

Description

DECODING WORDS USING NEURAL SIGNALS
Related Application
[0001] The present application claims the benefit of priority under 35 U.S.C. §119 from U.S. Provisional Patent Application Serial No. 61/322,797, filed April 9, 2010, which is hereby incorporated by reference in its entirety for all purposes.
Statement Regarding Federally Sponsored Research or Development
[0002] This invention was made with government support under Grant #EY019363 awarded by the National Institutes of Health. The government has certain rights in the invention.
Field
[0003] The present disclosure relates to methods and systems for decoding neural signals, e.g., local field potentials recorded from a brain cortical surface.
Background
[0004] Pathological conditions such as amyotrophic lateral sclerosis or damage to the brainstem can leave patients paralyzed but fully aware, in a condition known as locked-in syndrome. Communication in this state is laborious, often reduced to selecting individual letters or words by arduous residual movement.
Summary
[0005] Certain embodiments of the present technology provide systems and methods for decoding words using neural signals, e.g., local field potentials, recorded from a cortical surface.
[0006] In certain embodiments, a system for decoding words using neural signals is provided. The system comprises a receiver configured to receive a neural signal from each of a plurality of electrodes implanted in a patient when the patient speaks or attempts to speak a word. The system further comprises a processor configured to convert the neural signal for each of the plurality of electrodes into frequency-domain information, and to apply a classifier to the frequency-domain information for the plurality of electrodes to decode the word.
[0007] In certain embodiments, the plurality of electrodes are placed over a cortical surface.
[0008] In certain embodiments, the plurality of electrodes are placed over a face motor cortex.
[0009] In certain embodiments, each neural signal comprises a local field potential from the cortical surface.
[0010] In certain embodiments, the frequency-domain information for each of the plurality of electrodes comprises a power spectra.
[0011] In certain embodiments, the plurality of electrodes comprise micro-electrodes.
[0012] In certain embodiments, the classifier comprises a principal component analysis classifier.
[0013] In certain embodiments, the processor is configured to decode the word by finding a centroid from a plurality of centroids that is nearest to an output of the principal component analysis classifier.
[0014] In certain embodiments, a method of decoding words using neural signals is provided. The method comprises receiving a neural signal from each of a plurality of electrodes implanted in a patient when the patient speaks or attempts to speak a word, and converting the neural signal for each of the plurality of electrodes into frequency-domain information. The method further comprises applying a classifier to the frequency-domain information for the plurality of electrodes to decode the word.
[0015] In certain embodiments, the plurality of electrodes are placed over a cortical surface.
[0016] In certain embodiments, the plurality of electrodes are placed over a face motor cortex.
[0017] In certain embodiments, each neural signal comprises a local field potential from the cortical surface.
[0018] In certain embodiments, the frequency-domain information for each of the plurality of electrodes comprises a power spectra.
[0019] In certain embodiments, the plurality of electrodes comprise micro-electrodes.
[0020] In certain embodiments, the classifier is a principal component analysis classifier.
[0021] In certain embodiments, the method further comprising finding a centroid from a plurality of centroids that is nearest to an output of the principal component analysis classifier.
[0022] In certain embodiments, a method of training a classifier to decode words using neural signals is provided. For each one of a plurality of words, the method comprises receiving a neural signal from each of a plurality of electrodes implanted in a patient for each one of a plurality of trials, wherein the patient speaks or attempts to speak the word for each trial, and for each trial, converting the neural signal for each of the plurality of electrodes into frequency-domain information. For each of the plurality of words, the method further comprises training the classifier to decode the word based on the frequency-domain information for each of the plurality of electrodes for each trial.
[0023] In certain embodiments, the plurality of electrodes are placed over a cortical surface.
[0024] In certain embodiments, the frequency-domain information for each of the plurality of electrodes and each trial comprises a power spectra.
[0025] In certain embodiments, the training the classifier comprises performing a principal component analysis on the frequency-domain information for the plurality of electrodes for each trial.
[0026] For purposes of summarizing the disclosure, certain aspects, advantages, embodiments and novel features of the disclosure have been described herein. It is to be understood that not necessarily all such advantages may be achieved in accordance with any particular embodiment of the disclosure. Thus, the disclosure may be embodied or carried out in a manner that achieves or optimizes one advantage or group of advantages as taught herein without necessarily achieving other advantages as may be taught or suggested herein.
Brief Description of the Drawings
[0027] Fig. 1a shows an example of a 16-channel 4x4 micro-electrode array.
[0028] Fig. 1b shows placement of two micro-electrode arrays over a cortical surface, in which one of the micro-electrode arrays is placed over the face motor cortex and the other micro-electrode array is placed over Wernicke's area.
[0029] Fig. 1c shows an audio waveform (top) of a verbal task and a corresponding spectrogram (bottom) of neural data recorded from a single channel over the face motor cortex.
[0030] Fig. 1d shows an audio waveform (top) of conversation, verbal task and verbal reward and a corresponding spectrogram (bottom) of neural data recorded from a single channel over Wernicke's area.
[0031] Fig. 2a shows windows temporally aligned to spoken words that contain a frequency-domain structure in a spectrogram of neural data recorded from a micro-electrode over the face motor cortex.
[0032] Fig. 2b shows power spectra calculated for multiple trials and multiple electrodes.
[0033] Fig. 2c shows a two-dimensional matrix of micro-electrode power spectra and trial information.
[0034] Fig. 2d shows a principal component analysis performed on micro-electrode power spectra and trial information for two words.
[0035] Fig. 3a shows a distribution of performance results for each unique combination of two-word through ten-word combinations.
[0036] Fig. 3b shows a topography of channel performance for micro-electrodes resting over the face motor cortex.
[0037] Fig. 3c shows a topography of channel performance for micro-electrodes resting over Wernicke's area.
[0038] Fig. 4 shows a block diagram of a system for recording and analyzing data from a micro-electrode array according to some embodiments of the present invention.
Detailed Description
[0039] Pathological conditions such as amyotrophic lateral sclerosis or damage to the brainstem can leave patients severely paralyzed but fully aware, in a condition known as locked-in syndrome. Communication in this state is laborious, often reduced to selecting individual letters or words by arduous residual movement. More intuitive communication may be possible by directly interfacing with language areas of the cerebral cortex. Many studies of neural interfaces for communication have focused on the challenging problem of reconstructing continuous, dynamic speech. Described herein is a more tractable approach of classifying a set of words. In some embodiments, a grid or array of subdural, nonpenetrating, high-impedance micro-electrodes are used to record local field potentials (LFPs) from the cortical surface over the face motor cortex and Wernicke's area. An LFP may be an electric field potential from a group of neurons located near the corresponding electrode. Neural data from many regions of the brain may be used to decode speech; however, data from electrodes over the face motor cortex were found to be the most accurately decodable. Embodiments of the present invention provide a trial-by-trial decoding of spoken words from cortical surface LFPs in the human neocortex, as discussed further below.
[0040] Early studies of brain computer interfaces (BCIs) for speech trained patients to use slow cortical potentials to interact with a computer for communication. More recently noninvasive BCIs have demonstrated improvements but can require extensive training to achieve moderate accuracy and rates of communication. Penetrating electrodes have been used to perform rapid decoding of continuous motor movements from neuronal activity in the primary motor area of human neocortex; however, because of the risks associated with implantation in language centers, few studies have explored their use in speech BCIs. The neurotrophic electrode is a penetrating electrode designed to mitigate the risks of chronic implantation that has been used to decode the formant frequencies of speech from neuronal activity in the left ventral premotor cortex. Studies investigating less invasive measures have shown that cortical surface potentials recorded by electrocorticographic (ECoG) electrodes can discriminate between motor and speech tasks and discriminate phonemes. Regardless of the recording paradigm used, most studies of speech BCIs have focused on the challenging task of decoding continuous, dynamic speech from the neural representations of formant frequencies in either action potentials or field potentials.
[0041] Embodiments of the present invention provide a novel recording device and method for decoding speech. In some embodiments, local field potentials (LFPs) on a cortical surface of the brain are recorded from one or more micro-electrode arrays. For example, a micro-electrode array may comprise a plurality of nonpenetrating, 40-μm microwires with 1-mm inter-electrode spacing. Such micro-electrode grids or arrays have been shown to support high temporal- and spatial-resolution recordings. Also, rather than decoding continuous speech, embodiments of the present invention decode speech by classifying finite sets of words from cortical surface LFPs, thereby reducing the complexity of the problem to determining a limited number of classes.
[0042] Fig. 1a shows an example of a single 16-channel 4x4 micro-electrode grid or array that may be used to record LFPs on the cortical surface. In Fig. 1a, the micro-electrode array is shown next to a U.S. quarter-dollar coin for size comparison. Fig. 1b shows two 16-channel 4x4 micro-electrode arrays placed beneath the dura closely approximated to the cortical surface over the face motor cortex and Wernicke's area. In Fig. 1b, the wire bundle 112a leads to the array 110a over Wernicke's area and the wire bundle 112b leads to the array 110b over the face motor cortex. Fig. 1b also shows electrocorticographic (ECoG) electrodes, which are much larger than the micro-electrodes of the arrays. The wide range of muscles required to articulate vocalizations suggests that unique neural activity in the face motor cortex may correspond to unique word formulations. Wernicke's area is known to play an important role in high-level language processing.
[0043] Fig. 1c shows an audio waveform (top) of a verbal task, in which a patient repeated the word "yes." Fig. 1c also shows a corresponding spectrogram (bottom) of neural data recorded from a single channel or micro-electrode over the face motor cortex. Fig. 1c includes a normalized power scale indicating the power levels in the spectrogram. As shown in Fig. 1c, the spectrogram reveals frequency-domain structure aligned to the individual words during the verbal task.
[0044] Fig. 1d shows an audio waveform (top) of conversation, verbal task and verbal reward and a corresponding spectrogram (bottom) of neural data recorded from a single channel over Wernicke's area. As shown in Fig. 1d, Wernicke's area is predominantly active when the patient converses and receives verbal rewards after completing an experiment, and was less active during the verbal task.
[0045] Previous studies have used principal component analysis (PCA) to separate frequency-domain features in neural signals. In one embodiment, PCA is used to classify a finite set of words. For each word, PCA is performed on power spectra from each electrode and each trial simultaneously. During the training phase, a center of mass, or centroid, is calculated as the average of the coordinates of all projected trials belonging to a particular word. During the classification phase, trials are projected into the principal component space and classified as specific words by their proximity to a centroid. An example of this is illustrated in Figs. 2a-2d.
[0046] Fig. 2a shows an example of spectrograms 210a-210d of neural data for four different electrodes of a micro-electrode array placed over the face motor cortex. In this example, a particular word is repeated three times during a verbal task with each repetition of the word corresponding to a trial. For each trial, the subject may speak the word or, where the subject is unable to intelligibly vocalize the word, attempt to speak it. For each spectrogram, Fig. 2a shows three 500-msec windows 220a-220c where each window is temporally aligned to one instance of the spoken word. As shown in Fig. 2a, the windows 220a-220c contain frequency-domain structure in each spectrogram 210a-210d corresponding to the spoken word at the three trials. Fig. 2b shows power spectra for each electrode 210a-210d and each trial.
[0047] Fig. 2c shows a two-dimensional matrix of micro-electrode power spectra and trial information for a word. In this example, power spectra information is collected for each of N electrodes of the array and each of M trials.
[0048] Fig. 2d shows a principal component analysis performed on micro-electrode power spectra and trial information for the words "hungry" and "thirsty." In this example, principal component analysis performed on micro-electrode power spectra and trial information for the word "hungry" generates a cluster 250 in the principal component space, where each point in the cluster 250 represents one trial. Similarly, principal component analysis performed on micro-electrode power spectra and trial information for the word "thirsty" generates a cluster 255 in the principal component space. In the example in Fig. 2d, three dimensions of the principal component space are shown for ease of illustration, although it is to be understood that the principal component space may comprise any number of dimensions.
[0049] In Fig. 2d, a center of mass or centroid may be computed for each cluster corresponding to a particular word. During the classification phase, when a patient speaks a word or attempts to speak a word, the word may be classified by performing principal component analysis on micro-electrode spectra information from the patient to project the spectra information into the principal component space and then determining its nearest centroid. The word that the patient spoke or attempted to speak is then classified based on the word corresponding to the nearest centroid. Those skilled in the art will appreciate that other types of classification may also be used to decode a word based on the micro-electrode spectra information. Examples of other types of classification include maximum likelihood, support vector machine and Bayesian classification.
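The training and classification phases described above can be illustrated with a minimal sketch. It is not the implementation used to obtain the results reported herein. It assumes NumPy, obtains the principal axes from a singular value decomposition of the mean-centered training trials, and reuses those axes to project new trials. The function names, the three retained components, and the synthetic 40-element trial vectors are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_pca_centroids(X, labels, n_components=3):
    """Fit PCA on training trials (rows of X) and compute one centroid per word."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes of the mean-centered training data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:n_components]
    scores = (X - mean) @ components.T
    # Centroid = average coordinates of all projected trials of one word.
    centroids = {str(w): scores[labels == w].mean(axis=0) for w in np.unique(labels)}
    return mean, components, centroids

def classify(x, mean, components, centroids):
    """Project one trial into the principal component space; return the nearest word."""
    score = (np.asarray(x, dtype=float) - mean) @ components.T
    return min(centroids, key=lambda w: np.linalg.norm(score - centroids[w]))

# Synthetic stand-in for power-spectral trial vectors (not patient data).
rng = np.random.default_rng(0)
prototypes = {"hungry": np.zeros(40), "thirsty": np.full(40, 2.0)}
train_X, train_y = [], []
for word, proto in prototypes.items():
    for _ in range(15):  # 15 synthetic trials per word
        train_X.append(proto + 0.3 * rng.normal(size=40))
        train_y.append(word)

model = fit_pca_centroids(train_X, train_y)
new_trial = prototypes["thirsty"] + 0.3 * rng.normal(size=40)
decoded = classify(new_trial, *model)
```

Euclidean distance to the centroid is used here as the proximity measure, which is an assumption; the text specifies only proximity to a centroid.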
[0050] Classification was performed both separately (Fig. 3a) and jointly for cortical surface LFP data recorded over the face motor cortex and cortical surface LFP data recorded over Wernicke's area. Electrodes over the face motor cortex offered the best classification performance. Out of 45 unique two-word combinations, 85.0 ± 13.1% (mean ± standard deviation) were correctly classified using data from all 16 array electrodes (median performance was 83.3%). Data recorded over Wernicke's area were less classifiable with 76.2 ± 15.0% of two-word combinations correctly classified (median 76.7%). Joint classification did not improve performance over the level achieved by the face motor electrodes alone (0.40 ± 0.43% difference in the percentage of two- through ten-word combinations classified correctly). Vocal dynamics such as varied pitch or inflection could contribute to lower-than-expected performance in discriminating some word combinations. Regardless, decoding accuracies that were well above chance and the timing of the increased spectral power suggest that the micro-electrode array over the face motor cortex recorded signals involved in speech production. Similarly, activity recorded over Wernicke's area appears to be involved in speech processing but likely represents language at a more abstract level.
[0051] Surface LFPs recorded from individual micro-electrodes were better able to decode some words than others (Fig. 3b,c). Fig. 3b shows performance results for individual electrodes over the face motor cortex for different words, and Fig. 3c shows performance results for individual electrodes over Wernicke's area for different words. Examining the mean performance of each word against all other words, it was found that electrode 14 ranged from 51.5% accuracy for the word "cold" to 81.5% accuracy for the word "yes." The standard deviation of performance across all 16 motor-sensory electrodes was measured as 6.6 ± 1.5 percentage points, suggesting that surface LFPs recorded from some electrodes corresponded to aspects of speech production present in some words but not others.
[0052] The micro-electrode that provided the highest accuracy for any single word varied. Selecting the five electrodes of the array with best overall accuracy from the face motor cortex improved classification accuracy to 89.6 ± 10.8% of two-word combinations (median 90.0%; Fig. 3a). However, selecting the five highest-performing electrodes over Wernicke's area did not improve performance (73.5 ± 16.4% of two-word combinations correctly classified; median 73.3%) when compared with using all 16 electrodes over that region of cortex. Some micro-electrodes over the face motor cortex may not have recorded neural signals useful in decoding the specific set of words presented, indicating a more concrete mapping of the neural signal onto patterns of speech articulation. Conversely, most of the 16 micro-electrodes over Wernicke's area appear to have recorded neural signals related to language processing, supporting a more distributed and abstract encoding of speech.
[0053] Decoding surface LFPs from the best five micro-electrodes simultaneously gave better results than decoding data from the same micro-electrodes individually. A difference of as much as 20.0 percentage points (vs. electrode 15 alone; ten-word combination) was found. On average, the collective accuracy of these five electrodes was 16.2 ± 2.8 percentage points higher than their independently measured accuracy. Neural activity recorded by these five micro-electrodes likely corresponded to multiple aspects of speech articulation that varied across the set of words used in the experiments.
[0054] The tight inter-electrode spacing and small number of electrodes limited the spatial coverage of the micro-electrode grid or array. An optimized grid design with larger spacing and more electrodes would likely cover a larger number of relevant neural signals and allow better decoding accuracy. Performance could likely be further improved with patient training to stereotype word articulation.
[0055] The invasiveness of the micro-electrode grids or arrays could be reduced with epidural placement, as shown for similar recording devices. Furthermore, a wireless implementation of the system might be practical given the relatively low bandwidth required to capture cortical surface LFPs. A wireless system to decode speech, with a balance of invasiveness and performance, could improve the quality of life for locked-in patients and others unable to communicate on their own.
[0056] The above results show that spoken words can be decoded from surface LFPs recorded over neocortical speech areas by arrays of closely spaced micro-electrodes. Therefore, classification of words using surface LFPs is a viable approach to restoring limited but useful communication to those suffering from locked-in syndrome.
[0057] Methods used to obtain the above results are discussed below.
[0058] Subject and Experiment
[0059] One male patient who required extraoperative electrocorticographic monitoring for medically refractory epilepsy gave informed consent to participate in an institutional review board-approved study. Two nonpenetrating micro-electrode arrays (PMT Neurosurgical, Chanhassen, MN) were implanted over face motor cortex and Wernicke's area. Each array comprised 16 channels of 40-μm wire terminating in a 4x4 grid with 1-millimeter spacing. For each of 10 words, the patient repeated the word up to 25 times over four consecutive days. Audio data and 32 channels of neural data from the two micro-electrode arrays were recorded at 30,000 samples per second by a Neuroport system (Blackrock Microsystems, Salt Lake City, UT). A subset of trials containing stereotypical articulation was selected for each word (Supplementary Table 1).
[0060] Data Analysis
[0061] Data were filtered to discard frequencies above 500 Hz and re-referenced to the common average. Power spectra were computed for 0.5-second windows aligned to vocalization. Log-normalized power spectra for each trial and micro-electrode were concatenated to form a large row vector. All such trial-vectors for each word being classified (two to ten words) were stacked vertically to form a two-dimensional matrix of power spectral data comprising all available channels and trials. Principal component analysis on this data set resulted in clustering, which allowed nearest-centroid classification. Fifteen trials were used for both training and decoding. To keep these trials as temporally proximal as possible, trials from as few adjacent days as possible were used.
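The construction of the trial vectors and the two-dimensional matrix may be sketched as follows, under several stated assumptions. The sketch assumes NumPy, approximates the 500 Hz filtering step by simply dropping FFT bins above 500 Hz, takes "log-normalized" to mean the natural logarithm of spectral power, and adds a small constant before the logarithm to avoid log(0). The function names and the random input are hypothetical; the sampling rate and window length are those stated above.

```python
import numpy as np

FS = 30_000      # samples per second, as stated in the text
F_MAX = 500.0    # frequencies above this are discarded
WINDOW_S = 0.5   # window length in seconds

def trial_vector(window, fs=FS, f_max=F_MAX):
    """Turn one (n_channels, n_samples) window into a single row vector."""
    window = np.asarray(window, dtype=float)
    n_samples = window.shape[1]
    # Common-average re-reference: subtract the across-channel mean per sample.
    car = window - window.mean(axis=0, keepdims=True)
    spectrum = np.fft.rfft(car, axis=1)
    freqs = np.arange(spectrum.shape[1]) * (fs / n_samples)
    keep = freqs <= f_max  # assumption: band-limit by dropping FFT bins
    power = np.abs(spectrum[:, keep]) ** 2
    log_power = np.log(power + 1e-12)  # assumption: natural log of power
    # Concatenate the spectra of all channels into one row vector.
    return log_power.ravel()

def feature_matrix(windows):
    """Stack trial vectors vertically: one row per trial."""
    return np.vstack([trial_vector(w) for w in windows])

# Synthetic example: 3 trials, 16 channels, 0.5 s at 30,000 samples per second.
rng = np.random.default_rng(1)
windows = rng.normal(size=(3, 16, int(FS * WINDOW_S)))
X = feature_matrix(windows)
```

With a 0.5-second window the frequency resolution is 2 Hz, so each channel contributes 251 bins (0 to 500 Hz) and each row of the matrix has 16 × 251 = 4016 entries in this sketch.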
[0062] Multi-word performance
[0063] Training and decoding used subsets of channels and combinations of two through ten words. Mean, median, and standard deviation were computed for results of each combination. Combinations were selected using the n-choose-k method (n=10 and k=2-10).
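As an illustration of the n-choose-k selection, the following sketch enumerates the word combinations with the Python standard library. The placeholder word labels are hypothetical, since the full word list is not given here, and the use of the sample standard deviation is an assumption. For n=10, k=2 yields the 45 two-word combinations referred to above, and k=2 through 10 yields 1013 combinations in total.

```python
from itertools import combinations
from math import comb
from statistics import mean, median, stdev

# Placeholder labels; the actual ten words are not listed in this section.
WORDS = [f"word{i}" for i in range(10)]

def word_combinations(words, k_min=2, k_max=None):
    """Return {k: list of all k-word combinations} for k_min <= k <= k_max."""
    k_max = len(words) if k_max is None else k_max
    return {k: list(combinations(words, k)) for k in range(k_min, k_max + 1)}

def summarize(accuracies):
    """Mean, median, and (sample) standard deviation of a list of accuracies."""
    return mean(accuracies), median(accuracies), stdev(accuracies)

combos = word_combinations(WORDS)
n_pairs = len(combos[2])                         # equals comb(10, 2)
n_total = sum(len(c) for c in combos.values())   # all k from 2 through 10
```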
[0064] Topographical performance
[0065] The algorithm was run using data from each electrode individually and for all combinations of two words. Classification accuracies from all combinations involving the selected word and channel were averaged.
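The per-word, per-channel averaging can be sketched as below. The data layout (a dictionary keyed by channel and word pair), the function name, and the accuracy values are hypothetical and serve only to show the averaging step; they are not measured results.

```python
from itertools import combinations
from statistics import mean

def word_channel_accuracy(pair_accuracy, words, word, channel):
    """Average accuracy over all two-word combinations that include `word`.

    pair_accuracy maps (channel, frozenset({w1, w2})) to an accuracy in percent.
    """
    pairs = [frozenset(p) for p in combinations(words, 2) if word in p]
    return mean(pair_accuracy[(channel, p)] for p in pairs)

# Made-up accuracies for three placeholder words on one channel.
words = ["a", "b", "c"]
pair_accuracy = {
    (0, frozenset({"a", "b"})): 80.0,
    (0, frozenset({"a", "c"})): 60.0,
    (0, frozenset({"b", "c"})): 70.0,
}
acc_a = word_channel_accuracy(pair_accuracy, words, "a", 0)
```

Here the word "a" appears in two of the three pairs, so its value for channel 0 is the average of 80.0 and 60.0.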
[0066] Fig. 4 is a block diagram showing an example of a system 450 for recording and processing LFPs from a micro-electrode array 410 that may be used for various embodiments of the invention. The system 450 may include a receiver 452, a processor 455, and a memory 460. The receiver 452 may be used to condition the electrical signals from the micro-electrode array 410 for processing by the processor 455. The receiver 452 may include one or more of the following components: amplifiers (e.g., low-noise amplifiers) for amplifying the electrical signals, a filter for isolating electrical signals within a desired frequency bandwidth, and an analog-to-digital converter for digitizing the electrical signals for processing by the processor 455. Some or all of the above components may also be implanted in the patient with the micro-electrode array 410.
[0067] The processor 455 may comprise a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), discrete hardware components, or any combination thereof. Methods for decoding speech using neural signals from the array 410 according to various embodiments of the invention discussed above may be embodied in software code that is stored in the memory 460 and executed by the processor 455. The memory 460 may comprise any computer-readable media known in the art including volatile memory, nonvolatile memory, a Random Access Memory (RAM), a flash memory, a Read Only Memory (ROM), a removable disk, a CD-ROM, a DVD, any other suitable storage device, or a combination thereof.
[0068] The processor 455 may also output raw electrical signals, processed electrical signals, and/or results of analysis to an output device 465, including, but not limited to, a display for viewing by a neurologist, a printer for generating a computer readout, a computer-readable medium, and/or to another computer via a computer network connection. The output device 465 may also include an audio output device that outputs the decoded word as an audio output, e.g., a synthetic voice vocalizing the decoded word.
[0069] In one embodiment, the processor 455 may decode a word by receiving neural signals, e.g., local field potentials, from the micro-electrode array 410 when the patient speaks the word or attempts to speak the word. The processor 455 may then convert the neural signals into frequency-domain information, e.g., power spectra, for one or more electrodes of the array. The processor 455 may then classify the frequency-domain information for the one or more electrodes into one of a set of words. For example, the processor 455 may perform principal component analysis on the frequency-domain information to project the frequency-domain information into the principal component space and determine its nearest centroid in the principal component space, as described above. After decoding the word that the patient spoke or attempted to speak, the processor 455 may display the decoded word on a display and/or vocalize the decoded word from an audio output device. The processor 455 may be trained to classify a particular word using the methods described above with reference to Figs. 2a-2d.
[0070] It will also be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the specific embodiments disclosed herein, without departing from the scope or spirit of the disclosure as broadly described. The present embodiments are, therefore, to be considered in all respects illustrative and not restrictive of the present invention.

Claims

What is claimed is:
1. A system for decoding words using neural signals, comprising:
a receiver configured to receive a neural signal from each of a plurality of electrodes, each of the neural signals emanating from the brain of a patient when the patient speaks, or attempts to speak, a word; and
a processor configured to convert the neural signals into frequency-domain information and to apply a classifier to the frequency-domain information so as to determine the word.
2. The system of claim 1, wherein the electrodes contact a brain cortical surface.
3. The system of claim 2, wherein the electrodes contact a face motor cortex.
4. The system of claim 2, wherein each of the neural signals comprises a local field potential from the cortical surface.
5. The system of claim 1, wherein the frequency-domain information for each of the plurality of electrodes comprises a power spectrum.
6. The system of claim 1, wherein the electrodes comprise micro-electrodes.
7. The system of claim 1, wherein the classifier comprises a principal component analysis classifier.
8. The system of claim 7, wherein the processor is programmed to determine the word by finding a centroid, from a plurality of centroids, that is nearest to an output of the principal component analysis classifier.
9. A method of identifying words using neural signals, comprising:
receiving a neural signal from each of a plurality of electrodes, each of the neural signals emanating from the brain of a patient when the patient speaks, or attempts to speak, a word;
converting the neural signal for each of the plurality of electrodes into frequency-domain information; and
applying a classifier to the frequency-domain information for the plurality of electrodes so as to determine the word.
10. The method of claim 9, wherein the electrodes contact a brain cortical surface.
11. The method of claim 10, wherein the electrodes contact a face motor cortex.
12. The method of claim 10, wherein each of the neural signals comprises a local field potential from the cortical surface.
13. The method of claim 9, wherein the frequency-domain information for each of the plurality of electrodes comprises a power spectrum.
14. The method of claim 9, wherein the electrodes comprise micro-electrodes.
15. The method of claim 9, wherein the classifier is a principal component analysis classifier.
16. The method of claim 15, wherein the applying comprises finding a centroid, from a plurality of centroids, that is nearest to an output of the principal component analysis classifier.
17. A method of training a classifier to determine words using neural signals, comprising:
for each one of a plurality of words, performing the steps of:
for each one of a plurality of trials, receiving a neural signal from each of a plurality of electrodes, each of the neural signals emanating from the brain of a patient when the patient speaks, or attempts to speak, a word;
for each trial, converting the neural signals into frequency-domain information; and
training the classifier to determine the word based on the frequency-domain information for each trial.
18. The method of claim 17, wherein the plurality of electrodes contact a cortical surface.
19. The method of claim 17, wherein the frequency-domain information for each trial comprises a power spectrum.
20. The method of claim 17, wherein the training the classifier comprises performing a principal component analysis on the frequency-domain information for each trial.
PCT/US2011/031995 2010-04-09 2011-04-11 Decoding words using neural signals WO2011127483A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US32279710P 2010-04-09 2010-04-09
US61/322,797 2010-04-09

Publications (1)

Publication Number Publication Date
WO2011127483A1 true WO2011127483A1 (en) 2011-10-13

Family

ID=44763316

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2011/031995 WO2011127483A1 (en) 2010-04-09 2011-04-11 Decoding words using neural signals

Country Status (1)

Country Link
WO (1) WO2011127483A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6020110A (en) * 1994-06-24 2000-02-01 Cambridge Sensors Ltd. Production of electrodes for electrochemical sensing
US6233480B1 (en) * 1990-08-10 2001-05-15 University Of Washington Methods and apparatus for optically imaging neuronal tissue and activity
US20050228515A1 (en) * 2004-03-22 2005-10-13 California Institute Of Technology Cognitive control signals for neural prosthetics
US20060049957A1 (en) * 2004-08-13 2006-03-09 Surgenor Timothy R Biological interface systems with controlled device selector and related methods
US20060217782A1 (en) * 1998-10-26 2006-09-28 Boveja Birinder R Method and system for cortical stimulation to provide adjunct (ADD-ON) therapy for stroke, tinnitus and other medical disorders using implantable and external components
US20080253626A1 (en) * 2006-10-10 2008-10-16 Schuckers Stephanie Regional Fingerprint Liveness Detection Systems and Methods
US20090221896A1 (en) * 2006-02-23 2009-09-03 Rickert Joern Probe For Data Transmission Between A Brain And A Data Processing Device
US20100046799A1 (en) * 2003-07-03 2010-02-25 Videoiq, Inc. Methods and systems for detecting objects of interest in spatio-temporal signals

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014066855A1 (en) * 2012-10-26 2014-05-01 The Regents Of The University Of California Methods of decoding speech from brain activity data and devices for practicing the same
US10264990B2 (en) 2012-10-26 2019-04-23 The Regents Of The University Of California Methods of decoding speech from brain activity data and devices for practicing the same
US10653330B2 (en) 2016-08-25 2020-05-19 Paradromics, Inc. System and methods for processing neural signals

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11766869

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11766869

Country of ref document: EP

Kind code of ref document: A1