WO2011127483A1 - Decoding words using neural signals - Google Patents
- Publication number
- WO2011127483A1 (PCT/US2011/031995)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- electrodes
- word
- frequency
- classifier
- domain information
- Prior art date
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/48—Other medical applications
- A61B5/4803—Speech analysis specially adapted for diagnostic purposes
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/24—Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
- A61B5/316—Modalities, i.e. specific diagnostic methods
- A61B5/369—Electroencephalography [EEG]
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/24—Speech recognition using non-acoustical features
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
- A61B5/7267—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
Definitions
- the present disclosure relates to methods and systems for decoding neural signals, e.g., local field potentials recorded from a brain cortical surface.
- Pathological conditions such as amyotrophic lateral sclerosis or damage to the brainstem can leave patients paralyzed but fully aware, in a condition known as locked-in syndrome. Communication in this state is laborious, often reduced to selecting individual letters or words by arduous residual movement.
- Certain embodiments of the present technology provide systems and methods for decoding words using neural signals, e.g., local field potentials, recorded from a cortical surface.
- a system for decoding words using neural signals comprises a receiver configured to receive a neural signal from each of a plurality of electrodes implanted in a patient when the patient speaks or attempts to speak a word.
- the system further comprises a processor configured to convert the neural signal for each of the plurality of electrodes into frequency-domain information, and to apply a classifier to the frequency-domain information for the plurality of electrodes to decode the word.
- the plurality of electrodes are placed over a cortical surface.
- the plurality of electrodes are placed over a face motor cortex.
- each neural signal comprises a local field potential from the cortical surface.
- the frequency-domain information for each of the plurality of electrodes comprises a power spectrum.
- the plurality of electrodes comprise micro-electrodes.
- the classifier comprises a principal component analysis classifier.
- the processor is configured to decode the word by finding a centroid from a plurality of centroids that is nearest to an output of the principal component analysis classifier.
- a method of decoding words using neural signals comprises receiving a neural signal from each of a plurality of electrodes implanted in a patient when the patient speaks or attempts to speak a word, and converting the neural signal for each of the plurality of electrodes into frequency-domain information.
- the method further comprises applying a classifier to the frequency-domain information for the plurality of electrodes to decode the word.
- the plurality of electrodes are placed over a cortical surface.
- the plurality of electrodes are placed over a face motor cortex.
- each neural signal comprises a local field potential from the cortical surface.
- the frequency-domain information for each of the plurality of electrodes comprises a power spectrum.
- the plurality of electrodes comprise micro-electrodes.
- the classifier is a principal component analysis classifier.
- the method further comprises finding a centroid from a plurality of centroids that is nearest to an output of the principal component analysis classifier.
- a method of training a classifier to decode words using neural signals comprises receiving a neural signal from each of a plurality of electrodes implanted in a patient for each one of a plurality of trials, wherein the patient speaks or attempts to speak the word for each trial, and for each trial, converting the neural signal for each of the plurality of electrodes into frequency-domain information.
- the method further comprises training the classifier to decode the word based on the frequency-domain information for each of the plurality of electrodes for each trial.
- the plurality of electrodes are placed over a cortical surface.
- the frequency-domain information for each of the plurality of electrodes and each trial comprises a power spectrum.
- the training the classifier comprises performing a principal component analysis on the frequency-domain information for the plurality of electrodes for each trial.
- Fig. 1a shows an example of a 16-channel 4x4 micro-electrode array.
- Fig. 1b shows placement of two micro-electrode arrays over a cortical surface, in which one of the micro-electrode arrays is placed over the face motor cortex and the other micro-electrode array is placed over Wernicke's area.
- Fig. 1c shows an audio waveform (top) of a verbal task and a corresponding spectrogram (bottom) of neural data recorded from a single channel over the face motor cortex.
- Fig. 1d shows an audio waveform (top) of conversation, verbal task and verbal reward and a corresponding spectrogram (bottom) of neural data recorded from a single channel over Wernicke's area.
- Fig. 2a shows windows temporally aligned to spoken words that contain a frequency-domain structure in a spectrogram of neural data recorded from a micro-electrode over the face motor cortex.
- Fig. 2b shows power spectra calculated for multiple trials and multiple electrodes.
- Fig. 2c shows a two-dimensional matrix of micro-electrode power spectra and trial information.
- Fig. 2d shows a principal component analysis performed on micro-electrode power spectra and trial information for two words.
- Fig. 3a shows a distribution of performance results for each unique combination of two-word through ten-word combinations.
- Fig. 3b shows a topography of channel performance for micro-electrodes resting over the face motor cortex.
- Fig. 3c shows a topography of channel performance for micro-electrodes resting over Wernicke's area.
- Fig. 4 shows a block diagram of a system for recording and analyzing data from a micro-electrode array according to some embodiments of the present invention.
- Pathological conditions such as amyotrophic lateral sclerosis or damage to the brainstem can leave patients severely paralyzed but fully aware, in a condition known as locked-in syndrome. Communication in this state is laborious, often reduced to selecting individual letters or words by arduous residual movement. More intuitive communication may be possible by directly interfacing with language areas of the cerebral cortex. Many studies of neural interfaces for communication have focused on the challenging problem of reconstructing continuous, dynamic speech. Described herein is a more tractable approach of classifying a set of words.
- a grid or array of subdural, nonpenetrating, high-impedance micro-electrodes are used to record local field potentials (LFPs) from the cortical surface over the face motor cortex and Wernicke's area.
- LFP local field potentials
- an LFP may be an electric field potential from a group of neurons located near the corresponding electrode.
- Neural data from many regions of the brain may be used to decode speech; however, data from electrodes over the face motor cortex were found to be the most accurately decodable.
- Embodiments of the present invention provide a trial-by-trial decoding of spoken words from cortical surface LFPs in the human neocortex, as discussed further below.
- BCIs brain computer interfaces
- Penetrating electrodes have been used to perform rapid decoding of continuous motor movements from neuronal activity in the primary motor area of human neocortex; however, because of the risks associated with implantation in language centers, few studies have explored their use in speech BCIs.
- the neurotrophic electrode, a penetrating electrode designed to mitigate the risks of chronic implantation, has been used to decode the formant frequencies of speech from neuronal activity in the left ventral premotor cortex.
- Embodiments of the present invention provide a novel recording device and method for decoding speech.
- a micro-electrode array may comprise a plurality of nonpenetrating, 40-μm microwires with 1-mm inter-electrode spacing.
- Such micro-electrode grids or arrays have been shown to support high temporal- and spatial-resolution recordings.
- embodiments of the present invention decode speech by classifying finite sets of words from cortical surface LFPs, thereby reducing the complexity of the problem to determining a limited number of classes.
- Fig. 1a shows an example of a single 16-channel 4x4 micro-electrode grid or array that may be used to record LFPs on the cortical surface.
- the micro-electrode array is shown next to a U.S. quarter-dollar coin for size comparison.
- Fig. 1b shows two 16-channel 4x4 micro-electrode arrays placed beneath the dura closely approximated to the cortical surface over the face motor cortex and Wernicke's area.
- the wire bundle 112a leads to the array 110a over Wernicke's area and the wire bundle 112b leads to the array 110b over the face motor cortex.
- ECoG electrocorticographic
- Fig. 1c shows an audio waveform (top) of a verbal task, in which a patient repeated the word "yes."
- Fig. 1c also shows a corresponding spectrogram (bottom) of neural data recorded from a single channel or micro-electrode over the face motor cortex.
- Fig. 1c includes a normalized power scale indicating the power levels in the spectrogram. As shown in Fig. 1c, the spectrogram reveals frequency-domain structure aligned to the individual words during the verbal task.
- Fig. 1d shows an audio waveform (top) of conversation, verbal task and verbal reward and a corresponding spectrogram (bottom) of neural data recorded from a single channel over Wernicke's area.
- Wernicke's area is predominantly active when the patient converses and receives verbal rewards after completing an experiment, and is less active during the verbal task.
- PCA principal component analysis
- Fig. 2a shows an example of spectrograms 210a-210d of neural data for four different electrodes of a micro-electrode array placed over the face motor cortex.
- a particular word is repeated three times during a verbal task with each repetition of the word corresponding to a trial.
- the subject may speak the word or attempt to speak the word for the case where the subject is unable to intelligibly vocalize the word.
- Fig. 2a shows three 500-msec windows 220a-220c where each window is temporally aligned to one instance of the spoken word.
- As shown in Fig. 2a, the windows 220a-220c contain frequency-domain structure in each spectrogram 210a-210d corresponding to the spoken word at the three trials.
- Fig. 2b shows a power spectrum for each electrode 210a-210d and each trial.
- Fig. 2c shows a two-dimensional matrix of micro-electrode power spectra and trial information for a word.
- power spectra information is collected for each of N electrodes of the array and each of M trials.
- Fig. 2d shows a principal component analysis performed on micro-electrode power spectra and trial information for the words "hungry" and "thirsty."
- principal component analysis performed on micro-electrode power spectra and trial information for the word "hungry" generates a cluster 250 in the principal component space, where each point in the cluster 250 represents one trial.
- principal component analysis performed on micro-electrode power spectra and trial information for the word "thirsty" generates a cluster 255 in the principal component space.
- three dimensions of the principal component space are shown for ease of illustration, although it is to be understood that the principal component space may comprise any number of dimensions.
- a center of mass or centroid may be computed for each cluster corresponding to a particular word.
- the word may be classified by performing principal component analysis on micro-electrode spectra information from the patient to project the spectra
- classification examples include maximum likelihood, support vector machine and Bayesian classification.
- Fig. 3b shows performance results for individual electrodes over the face motor cortex for different words.
- Fig. 3c shows performance results for individual electrodes over Wernicke's area for different words. Examining the mean performance of each word against all other words, it was found that electrode 14 ranged from 51.5% accuracy for the word "cold" to 81.5% accuracy for the word "yes." The standard deviation of performance across all 16 motor-sensory electrodes was measured as 6.6 ± 1.5 percentage points, suggesting that surface LFPs recorded from some electrodes corresponded to aspects of speech production present in some words but not others.
- the micro-electrode that provided the highest accuracy for any single word varied. Selecting the five electrodes of the array with best overall accuracy from the face motor cortex improved classification accuracy to 89.6 ± 10.8% of two-word combinations (median 90.0%; Fig. 3a). However, selecting the five highest-performing electrodes over Wernicke's area did not improve performance (73.5 ± 16.4% of two-word combinations correctly classified; median 73.3%) when compared with using all 16 electrodes over that region of cortex. Some micro-electrodes over the face motor cortex may not have recorded neural signals useful in decoding the specific set of words presented, indicating a more concrete mapping of the neural signal onto patterns of speech articulation. Conversely, most of the 16 micro-electrodes over Wernicke's area appear to have recorded neural signals related to language processing, supporting a more distributed and abstract encoding of speech.
- micro-electrode grids or array could be reduced with epidural placement, as shown for similar recording devices. Furthermore, a wireless
- Training and decoding used subsets of channels and combinations of two through ten words. Mean, median, and standard deviation were computed for results of each
- Topographical performance [0065] The algorithm was run using data from each electrode individually and for all combinations of two words. Classification accuracies from all combinations involving the selected word and channel were averaged.
- Fig. 4 is a block diagram showing an example of a system 450 for recording and processing LFPs from a micro-electrode array 410 that may be used for various embodiments of the invention.
- the system 450 may include a receiver 452, a processor 455, and a memory 460.
- the receiver 452 may be used to condition the electrical signals from the micro-electrode array 410 for processing by the processor 455.
- the receiver 452 may include one or more of the following components: amplifiers (e.g., low-noise amplifiers) for amplifying the electrical signals, a filter for isolating electrical signals within a desired frequency bandwidth, and an analog-to-digital converter for digitizing the electrical signals for processing by the processor 455.
- the processor 455 may comprise a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), discrete hardware
- the memory 460 may comprise any computer-readable media known in the art including volatile memory, nonvolatile memory, a Random Access Memory (RAM), a flash memory, a Read Only Memory (ROM), a removable disk, a CD-ROM, a DVD, any other suitable storage device, or a combination thereof.
- RAM Random Access Memory
- ROM Read Only Memory
- the processor 455 may also output raw electrical signals, processed electrical signals, and/or results of analysis to an output device 465, including, but not limited to, a display for viewing by a neurologist, a printer for generating a computer readout, a computer-readable medium, and/or to another computer via a computer network connection.
- the output device 465 may also include an audio output device that outputs the decoded word as an audio output, e.g., a synthetic voice vocalizing the decoded word.
- the processor 455 may decode a word by receiving neural signals, e.g., local field potentials, from the micro-electrode array 410 when the patient speaks the word or attempts to speak the word.
- the processor 455 may then convert the neural signals into frequency-domain information, e.g., power spectra, for one or more electrodes of the array.
- the processor 455 may then classify the frequency-domain information for the one or more electrodes into one of a set of words.
- the processor 455 may perform principal component analysis on the frequency-domain information to project the frequency-domain information into the principal component space and determine its nearest centroid in the principal component space, as described above.
- the processor 455 may display the decoded word on a display and/or vocalize the decoded word from an audio output device.
- the processor 455 may be trained to classify a particular word using the methods described above with reference to Figs. 2a-2d.
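The per-electrode evaluation described earlier in this list (the algorithm run on each electrode individually, for all combinations of two words, with accuracies averaged over the combinations involving a given word) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: the leave-one-out scheme, the use of a plain nearest-centroid rule without a preceding principal component projection, and the function names `nearest_centroid_loo_accuracy` and `per_word_pairwise_accuracy` are all hypothetical choices made for brevity.

```python
import itertools

import numpy as np


def nearest_centroid_loo_accuracy(X, labels):
    """Leave-one-out accuracy of a nearest-centroid rule.

    X is (n_trials, n_features) for ONE electrode; labels holds one word per trial.
    Leave-one-out is an assumption; the source does not state the validation scheme.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    correct = 0
    for i in range(len(labels)):
        keep = np.arange(len(labels)) != i  # hold out trial i
        words = sorted(set(labels[keep].tolist()))
        centroids = np.stack(
            [X[keep & (labels == w)].mean(axis=0) for w in words]
        )
        distances = np.linalg.norm(centroids - X[i], axis=1)
        guess = words[int(np.argmin(distances))]
        correct += int(guess == labels[i])
    return correct / len(labels)


def per_word_pairwise_accuracy(X, labels):
    """Average, for each word, the accuracy of every two-word combination containing it."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    words = sorted(set(labels.tolist()))
    scores = {w: [] for w in words}
    for a, b in itertools.combinations(words, 2):
        mask = (labels == a) | (labels == b)
        acc = nearest_centroid_loo_accuracy(X[mask], labels[mask])
        scores[a].append(acc)
        scores[b].append(acc)
    return {w: float(np.mean(v)) for w, v in scores.items()}
```

Calling `per_word_pairwise_accuracy` once per electrode would give one accuracy per word and channel, which is the kind of quantity a channel-performance topography such as Figs. 3b and 3c could display.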
Abstract
Methods and systems are described for decoding words using neural signals, e.g., local field potentials, recorded from, e.g., a brain cortical surface. Some methods include receiving neural signals from a plurality of electrodes contacting a patient when the patient speaks, or attempts to speak, a word; converting the neural signal for each electrode into frequency-domain information; and applying a classifier to the frequency-domain information for the plurality of electrodes so as to determine the word.
Description
DECODING WORDS USING NEURAL SIGNALS
Related Application
[0001] The present application claims the benefit of priority under 35 U.S.C. §119 from U.S. Provisional Patent Application Serial No. 61/322,797, filed April 9, 2010, which is hereby incorporated by reference in its entirety for all purposes.
Statement Regarding Federally Sponsored Research or Development
[0002] This invention was made with government support under Grant #EY019363 awarded by the National Institutes of Health. The government has certain rights in the invention.
Field
[0003] The present disclosure relates to methods and systems for decoding neural signals, e.g., local field potentials recorded from a brain cortical surface.
Background
[0004] Pathological conditions such as amyotrophic lateral sclerosis or damage to the brainstem can leave patients paralyzed but fully aware, in a condition known as locked-in syndrome. Communication in this state is laborious, often reduced to selecting individual letters or words by arduous residual movement.
Summary
[0005] Certain embodiments of the present technology provide systems and methods for decoding words using neural signals, e.g., local field potentials, recorded from a cortical surface.
[0006] In certain embodiments, a system for decoding words using neural signals is provided. The system comprises a receiver configured to receive a neural signal from each of a plurality of electrodes implanted in a patient when the patient speaks or attempts to speak a word. The system further comprises a processor configured to convert the neural signal for each of the plurality of electrodes into frequency-domain information, and to apply a classifier to the frequency-domain information for the plurality of electrodes to decode the word.
[0007] In certain embodiments, the plurality of electrodes are placed over a cortical surface.
[0008] In certain embodiments, the plurality of electrodes are placed over a face motor cortex.
[0009] In certain embodiments, each neural signal comprises a local field potential from the cortical surface.
[0010] In certain embodiments, the frequency-domain information for each of the plurality of electrodes comprises a power spectrum.
[0011] In certain embodiments, the plurality of electrodes comprise micro-electrodes.
[0012] In certain embodiments, the classifier comprises a principal component analysis classifier.
[0013] In certain embodiments, the processor is configured to decode the word by finding a centroid from a plurality of centroids that is nearest to an output of the principal component analysis classifier.
[0014] In certain embodiments, a method of decoding words using neural signals is provided. The method comprises receiving a neural signal from each of a plurality of electrodes implanted in a patient when the patient speaks or attempts to speak a word, and converting the neural signal for each of the plurality of electrodes into frequency-domain information. The method further comprises applying a classifier to the frequency-domain information for the plurality of electrodes to decode the word.
[0015] In certain embodiments, the plurality of electrodes are placed over a cortical surface.
[0016] In certain embodiments, the plurality of electrodes are placed over a face motor cortex.
[0017] In certain embodiments, each neural signal comprises a local field potential from the cortical surface.
[0018] In certain embodiments, the frequency-domain information for each of the plurality of electrodes comprises a power spectrum.
[0019] In certain embodiments, the plurality of electrodes comprise micro-electrodes.
[0020] In certain embodiments, the classifier is a principal component analysis classifier.
[0021] In certain embodiments, the method further comprises finding a centroid from a plurality of centroids that is nearest to an output of the principal component analysis classifier.
[0022] In certain embodiments, a method of training a classifier to decode words using neural signals is provided. For each one of a plurality of words, the method comprises receiving a neural signal from each of a plurality of electrodes implanted in a patient for each one of a plurality of trials, wherein the patient speaks or attempts to speak the word for each trial, and for each trial, converting the neural signal for each of the plurality of electrodes into frequency-domain information. For each of the plurality of words, the method further comprises training the classifier to decode the word based on the frequency-domain information for each of the plurality of electrodes for each trial.
[0023] In certain embodiments, the plurality of electrodes are placed over a cortical surface.
[0024] In certain embodiments, the frequency-domain information for each of the plurality of electrodes and each trial comprises a power spectrum.
[0025] In certain embodiments, the training the classifier comprises performing a principal component analysis on the frequency-domain information for the plurality of electrodes for each trial.
[0026] For purposes of summarizing the disclosure, certain aspects, advantages, embodiments and novel features of the disclosure have been described herein. It is to be understood that not necessarily all such advantages may be achieved in accordance with any particular embodiment of the disclosure. Thus, the disclosure may be embodied or carried out in a manner that achieves or optimizes one advantage or group of advantages as taught herein without necessarily achieving other advantages as may be taught or suggested herein.
Brief Description of the Drawings
[0027] Fig. 1a shows an example of a 16-channel 4x4 micro-electrode array.
[0028] Fig. 1b shows placement of two micro-electrode arrays over a cortical surface, in which one of the micro-electrode arrays is placed over the face motor cortex and the other micro-electrode array is placed over Wernicke's area.
[0029] Fig. 1c shows an audio waveform (top) of a verbal task and a corresponding spectrogram (bottom) of neural data recorded from a single channel over the face motor cortex.
[0030] Fig. 1d shows an audio waveform (top) of conversation, verbal task and verbal reward and a corresponding spectrogram (bottom) of neural data recorded from a single channel over Wernicke's area.
[0031] Fig. 2a shows windows temporally aligned to spoken words that contain a frequency-domain structure in a spectrogram of neural data recorded from a micro-electrode over the face motor cortex.
[0032] Fig. 2b shows power spectra calculated for multiple trials and multiple electrodes.
[0033] Fig. 2c shows a two-dimensional matrix of micro-electrode power spectra and trial information.
[0034] Fig. 2d shows a principal component analysis performed on micro-electrode power spectra and trial information for two words.
[0035] Fig. 3a shows a distribution of performance results for each unique combination of two-word through ten-word combinations.
[0036] Fig. 3b shows a topography of channel performance for micro-electrodes resting over the face motor cortex.
[0037] Fig. 3c shows a topography of channel performance for micro-electrodes resting over Wernicke's area.
[0038] Fig. 4 shows a block diagram of a system for recording and analyzing data from a micro-electrode array according to some embodiments of the present invention.
Detailed Description
[0039] Pathological conditions such as amyotrophic lateral sclerosis or damage to the brainstem can leave patients severely paralyzed but fully aware, in a condition known as locked-in syndrome. Communication in this state is laborious, often reduced to selecting individual letters or words by arduous residual movement. More intuitive communication may be possible by directly interfacing with language areas of the cerebral cortex. Many studies of neural interfaces for communication have focused on the challenging problem of reconstructing continuous, dynamic speech. Described herein is a more tractable approach of classifying a set of words. In some embodiments, a grid or array of subdural, nonpenetrating, high-impedance micro-electrodes is used to record local field potentials (LFPs) from the cortical surface over the face motor cortex and Wernicke's area. An LFP may be an electric field potential from a group of neurons located near the corresponding electrode. Neural data from many regions of the brain may be used to decode speech; however, data from electrodes over the face motor cortex were found to be the most accurately decodable. Embodiments of the present invention provide a trial-by-trial decoding of spoken words from cortical surface LFPs in the human neocortex, as discussed further below.
[0040] Early studies of brain-computer interfaces (BCIs) for speech trained patients to use slow cortical potentials to interact with a computer for communication. More recently, noninvasive BCIs have demonstrated improvements but can require extensive training to achieve moderate accuracy and rates of communication. Penetrating electrodes have been used to perform rapid decoding of continuous motor movements from neuronal activity in the primary motor area of human neocortex; however, because of the risks associated with implantation in language centers, few studies have explored their use in speech BCIs. The neurotrophic electrode, a penetrating electrode designed to mitigate the risks of chronic implantation, has been used to decode the formant frequencies of speech from neuronal activity in the left ventral premotor cortex. Studies investigating less invasive measures have shown that cortical surface potentials recorded by electrocorticographic (ECoG) electrodes can discriminate between motor and speech tasks and discriminate phonemes. Regardless of the recording paradigm used, most studies of speech BCIs have focused on the challenging task of decoding continuous, dynamic speech from the neural representations of formant frequencies in either action potentials or field potentials.
[0041] Embodiments of the present invention provide a novel recording device and method for decoding speech. In some embodiments, local field potentials (LFPs) on a cortical surface of the brain are recorded from one or more micro-electrode arrays. For example, a micro-electrode array may comprise a plurality of nonpenetrating, 40-μm microwires with 1-mm inter-electrode spacing. Such micro-electrode grids or arrays have been shown to support high temporal- and spatial-resolution recordings. Also, rather than decoding continuous speech, embodiments of the present invention decode speech by classifying finite sets of words from cortical surface LFPs, thereby reducing the complexity of the problem to determining a limited number of classes.
[0042] Fig. 1a shows an example of a single 16-channel 4x4 micro-electrode grid or array that may be used to record LFPs on the cortical surface. In Fig. 1a, the micro-electrode array is shown next to a U.S. quarter-dollar coin for size comparison. Fig. 1b shows two 16-channel 4x4 micro-electrode arrays placed beneath the dura closely approximated to the cortical surface over the face motor cortex and Wernicke's area. In Fig. 1b, the wire bundle 112a leads to the array 110a over Wernicke's area and the wire bundle 112b leads to the array 110b over the face motor cortex. Fig. 1b also shows electrocorticographic (ECoG) electrodes, which are much larger than the micro-electrodes of the arrays. The wide range of muscles required to articulate vocalizations suggests that unique neural activity in the face motor cortex may correspond to unique word formulations. Wernicke's area is known to play an important role in high-level language processing.
[0043] Fig. lc shows an audio waveform (top) of a verbal task, in which a patient repeated the word "yes." Fig. lc also shows a corresponding spectrogram (bottom) of neural data recorded from a single channel or micro-electrode over the face motor cortex. Fig. lc includes a normalized power scale indicating the power levels in the spectrogram. As shown in Fig. lc, the spectrogram reveals frequency-domain structure aligned to the individual words during the verbal task.
[0044] Fig. 1d shows an audio waveform (top) of conversation, verbal task and verbal reward and a corresponding spectrogram (bottom) of neural data recorded from a single channel over Wernicke's area. As shown in Fig. 1d, Wernicke's area was predominantly active when the patient conversed and received verbal rewards after completing an experiment, and was less active during the verbal task.
[0045] Previous studies have used principal component analysis (PCA) to separate frequency-domain features in neural signals. In one embodiment, PCA is used to classify a finite set of words. For each word, PCA is performed on power spectra from each electrode and each trial simultaneously. During the training phase, a center of mass, or centroid, is calculated as the average of the coordinates of all projected trials belonging to a particular word. During the classification phase, trials are projected into the principal component space and classified as specific words by their proximity to a centroid. An example of this is illustrated in Figs. 2a-2d.
[0046] Fig. 2a shows an example of spectrograms 210a-210d of neural data for four different electrodes of a micro-electrode array placed over the face motor cortex. In this example, a particular word is repeated three times during a verbal task, with each repetition of the word corresponding to a trial. For each trial, the subject may speak the word, or attempt to speak the word in the case where the subject is unable to intelligibly vocalize it. For each spectrogram, Fig. 2a shows three 500-msec windows 220a-220c, where each window is temporally aligned to one instance of the spoken word. As shown in Fig. 2a, the windows 220a-220c contain frequency-domain structure in each spectrogram 210a-210d corresponding to the spoken word at the three trials. Fig. 2b shows the power spectra for each electrode 210a-210d and each trial.
[0047] Fig. 2c shows a two-dimensional matrix of micro-electrode power spectra and trial information for a word. In this example, power spectra information is collected for each of N electrodes of the array and each of M trials.
[0048] Fig. 2d shows a principal component analysis performed on micro-electrode power spectra and trial information for the words "hungry" and "thirsty." In this example, principal component analysis performed on micro-electrode power spectra and trial information for the word "hungry" generates a cluster 250 in the principal component space, where each point in the cluster 250 represents one trial. Similarly, principal component analysis performed on micro-electrode power spectra and trial information for the word "thirsty" generates a cluster 255 in the principal component space. In the example in Fig. 2d, three dimensions of the principal component space are shown for ease of illustration, although it is to be understood that the principal component space may comprise any number of dimensions.
[0049] In Fig. 2d, a center of mass or centroid may be computed for each cluster corresponding to a particular word. During the classification phase, when a patient speaks a word or attempts to speak a word, the word may be classified by performing principal component analysis on micro-electrode spectra information from the patient to project the spectra information into the principal component space and then determining its nearest centroid. The word that the patient spoke or attempted to speak is then classified based on the word corresponding to the nearest centroid. Those skilled in the art will appreciate that other types of classification may also be used to decode a word based on the micro-electrode spectra information. Examples of other types of classification include maximum likelihood, support vector machine and Bayesian classification.
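By way of illustration only, the training and classification phases described with reference to Figs. 2a-2d may be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions and not the implementation used to obtain the results reported herein: the function names (`fit_pca`, `project`, `fit_centroids`, `classify`), the use of three principal components, and the synthetic stand-in data are all hypothetical choices made for the example.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the training mean and the leading principal axes of X.

    X has one row per trial and one column per spectral feature.
    """
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, axes):
    """Project trials into the principal component space."""
    return (X - mean) @ axes.T

def fit_centroids(Z, labels):
    """Training phase: average the projected trials belonging to each word."""
    labels = np.asarray(labels)
    words = sorted(set(labels.tolist()))
    centroids = np.stack([Z[labels == w].mean(axis=0) for w in words])
    return words, centroids

def classify(z, words, centroids):
    """Classification phase: pick the word whose centroid is nearest to z."""
    distances = np.linalg.norm(centroids - z, axis=1)
    return words[int(np.argmin(distances))]

if __name__ == "__main__":
    # Synthetic stand-in for log power spectra (assumption, not patient data).
    rng = np.random.default_rng(0)
    hungry = rng.normal(0.0, 1.0, size=(15, 32))
    thirsty = rng.normal(3.0, 1.0, size=(15, 32))
    X = np.vstack([hungry, thirsty])
    labels = ["hungry"] * 15 + ["thirsty"] * 15

    mean, axes = fit_pca(X, n_components=3)
    words, centroids = fit_centroids(project(X, mean, axes), labels)

    new_trial = rng.normal(3.0, 1.0, size=(1, 32))
    print(classify(project(new_trial, mean, axes)[0], words, centroids))
```

In this sketch the principal axes are estimated once from the stacked training trials of all words being compared, and new trials are projected with the stored mean and axes. Any of the other classifiers mentioned above could replace the nearest-centroid step without changing the projection.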
[0050] Classification was performed both separately (Fig. 3a) and jointly for cortical surface LFP data recorded over the face motor cortex and cortical surface LFP data recorded over Wernicke's area. Electrodes over the face motor cortex offered the best classification performance. Out of 45 unique two-word combinations, 85.0 ± 13.1% (mean ± standard deviation) were correctly classified using data from all 16 array electrodes (median performance was 83.3%). Data recorded over Wernicke's area were less classifiable with 76.2 ± 15.0% of two-word combinations correctly classified (median 76.7%). Joint classification did not improve performance over the level achieved by the face motor electrodes alone (0.40 ± 0.43% difference in the percentage of two- through ten-word combinations classified correctly). Vocal dynamics such as varied pitch or inflection could contribute to lower-than-expected performance in discriminating some word combinations. Regardless, decoding accuracies that were well above chance and the timing of the increased spectral power suggest that the micro-electrode array over the face motor cortex recorded signals involved in speech production. Similarly, activity recorded over Wernicke's area appears to be involved in speech processing but likely represents language at a more abstract level.
[0051] Surface LFPs recorded from individual micro-electrodes were better able to decode some words than others (Fig. 3b,c). Fig. 3b shows performance results for individual electrodes over the face motor cortex for different words, and Fig. 3c shows performance results for individual electrodes over Wernicke's area for different words. Examining the mean performance of each word against all other words, it was found that electrode 14 ranged from 51.5% accuracy for the word "cold" to 81.5% accuracy for the word "yes." The standard deviation of performance across all 16 motor-sensory electrodes was measured as 6.6 ± 1.5 percentage points, suggesting that surface LFPs recorded from some electrodes corresponded to aspects of speech production present in some words but not others.
[0052] The micro-electrode that provided the highest accuracy for any single word varied. Selecting the five electrodes of the array with best overall accuracy from the face motor cortex improved classification accuracy to 89.6 ± 10.8% of two-word combinations (median 90.0%; Fig. 3a). However, selecting the five highest-performing electrodes over Wernicke's area did not improve performance (73.5 ± 16.4% of two-word combinations correctly classified; median 73.3%) when compared with using all 16 electrodes over that region of cortex. Some micro-electrodes over the face motor cortex may not have recorded neural signals useful in decoding the specific set of words presented, indicating a more concrete mapping of the neural signal onto patterns of speech articulation. Conversely, most of the 16 micro-electrodes over Wernicke's area appear to have recorded neural signals related to language processing, supporting a more distributed and abstract encoding of speech.
[0053] Decoding surface LFPs from the best five micro-electrodes simultaneously gave better results than decoding data from the same micro-electrodes individually. A difference of as much as 20.0 percentage points (vs. electrode 15 alone; ten-word combination) was found. On average, the collective accuracy of these five electrodes was 16.2 ± 2.8 percentage points higher than their independently measured accuracy. Neural activity recorded by these five micro-electrodes likely corresponded to multiple aspects of speech articulation that varied across the set of words used in the experiments.
[0054] The tight inter-electrode spacing and small number of electrodes limited the spatial coverage of the micro-electrode grid or array. An optimized grid design with larger spacing and more electrodes would likely cover a larger number of relevant neural signals and allow better decoding accuracy. Performance could likely be further improved with patient training to stereotype word articulation.
[0055] The invasiveness of the micro-electrode grids or arrays could be reduced with epidural placement, as shown for similar recording devices. Furthermore, a wireless implementation of the system might be practical given the relatively low bandwidth required to capture cortical surface LFPs. A wireless system to decode speech, with a balance of invasiveness and performance, could improve the quality of life for locked-in patients and others unable to communicate on their own.
[0056] The above results show that spoken words can be decoded from surface LFPs recorded over neocortical speech areas by arrays of closely spaced micro-electrodes. Therefore, classification of words using surface LFPs is a viable approach to restoring limited but useful communication to those suffering from locked-in syndrome.
[0057] Methods used to obtain the above results are discussed below.
[0058] Subject and Experiment
[0059] One male patient who required extraoperative electrocorticographic monitoring for medically refractory epilepsy gave informed consent to participate in an institutional review board-approved study. Two nonpenetrating micro-electrode arrays (PMT Neurosurgical, Chanhassen, MN) were implanted over face motor cortex and Wernicke's area. Each array comprised 16 channels of 40-μm wire terminating in a 4x4 grid with 1-millimeter spacing. For each of 10 words, the patient repeated the word up to 25 times over four consecutive days. Audio data and 32 channels of neural data from the two micro-electrode arrays were recorded at 30,000 samples per second by a Neuroport system (Blackrock Microsystems, Salt Lake City, UT). A subset of trials containing stereotypical articulation was selected for each word (Supplementary Table 1).
[0060] Data Analysis
[0061] Data were filtered to discard frequencies above 500 Hz and re-referenced to the common average. Power spectra were computed for 0.5-second windows aligned to vocalization. Log-normalized power spectra for each trial and micro-electrode were concatenated to form a large row vector. All such trial-vectors for each word being classified (two to ten words) were stacked vertically to form a two-dimensional matrix of power spectral data comprising all available channels and trials. Principal component analysis on this data set resulted in clustering, which allowed nearest-centroid classification. Fifteen trials were used for both training and decoding. To keep these trials as temporally proximal as possible, trials from as few adjacent days as possible were used.
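The construction of the trial matrix described above may be sketched, for illustration only, in Python with NumPy. Several details are assumptions made for the example and are not taken from the study: frequencies above 500 Hz are discarded by simply dropping FFT bins rather than by a separate filter, "log-normalized" is taken to mean the natural logarithm of the power, each 0.5-second window is assumed to start at the vocalization onset, and the function names are hypothetical.

```python
import numpy as np

FS = 30_000      # samples per second, as recorded
WINDOW_S = 0.5   # window length in seconds
F_MAX = 500.0    # frequencies above this are discarded

def common_average_reference(data):
    """Subtract the across-channel mean at every sample (data: channels x samples)."""
    return data - data.mean(axis=0, keepdims=True)

def trial_vector(data, onset_sample, fs=FS, window_s=WINDOW_S, f_max=F_MAX):
    """Return one row vector of log power spectra for all channels of one trial."""
    n = int(round(window_s * fs))
    segment = common_average_reference(data[:, onset_sample:onset_sample + n])
    power = np.abs(np.fft.rfft(segment, axis=1)) ** 2
    # Center frequency of each real-FFT bin, in Hz.
    freqs = np.arange(n // 2 + 1) * fs / n
    keep = freqs <= f_max
    # Small constant avoids log(0); this offset is an assumption of the sketch.
    log_power = np.log(power[:, keep] + 1e-12)
    # Concatenate the spectra of all channels into a single row vector.
    return log_power.ravel()

def trial_matrix(data, onsets, **kwargs):
    """Stack one row vector per trial into a two-dimensional matrix."""
    return np.vstack([trial_vector(data, onset, **kwargs) for onset in onsets])
```

With the values above, each window contains 15,000 samples, the frequency resolution is 2 Hz, and 251 bins (0 to 500 Hz) are kept per channel, so a 16-channel array yields 16 × 251 = 4016 features per trial.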
[0062] Multi-word performance
[0063] Training and decoding used subsets of channels and combinations of two through ten words. Mean, median, and standard deviation were computed for results of each combination. Combinations were selected using the n-choose-k method (n=10 and k=2-10).
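As a worked check on the n-choose-k selection, the number of word combinations for each k can be enumerated in Python. The word labels in this sketch are placeholders, since the full ten-word list is not reproduced in this section.

```python
from itertools import combinations
from math import comb

# Placeholder labels (assumption): the full ten-word list is not given here.
words = [f"word_{i}" for i in range(10)]

# Number of k-word combinations for k = 2 through 10.
counts = {k: comb(len(words), k) for k in range(2, 11)}

# The explicit two-word combinations, e.g. for pairwise classification.
two_word_sets = list(combinations(words, 2))

print(counts[2])             # 45
print(sum(counts.values()))  # 1013
```

The count for k=2 agrees with the 45 unique two-word combinations reported above; across k=2 through 10 there are 1013 combinations in total.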
[0064] Topographical performance
[0065] The algorithm was run using data from each electrode individually and for all combinations of two words. Classification accuracies from all combinations involving the selected word and channel were averaged.
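The per-word averaging described above may be sketched as follows for a single electrode. This is an illustrative sketch only: the function name, the dictionary layout keyed by word pairs, and the three-word example values are hypothetical and are not data from the study.

```python
from itertools import combinations
from statistics import mean

def per_word_accuracy(pair_accuracy, words):
    """Average, for each word, the accuracies of all two-word combinations containing it.

    pair_accuracy maps frozenset({word_a, word_b}) to the classification
    accuracy obtained for that pair using data from one electrode.
    """
    result = {}
    for word in words:
        values = [acc for pair, acc in pair_accuracy.items() if word in pair]
        result[word] = mean(values)
    return result

# Made-up accuracies for three placeholder words on one electrode.
words = ["a", "b", "c"]
pair_accuracy = {
    frozenset(pair): acc
    for pair, acc in zip(combinations(words, 2), [0.8, 0.6, 1.0])
}
print(per_word_accuracy(pair_accuracy, words))
```

With ten words, each word takes part in nine of the 45 two-word combinations, so each per-word, per-electrode value is an average over nine pairwise accuracies.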
[0066] Fig. 4 is a block diagram showing an example of a system 450 for recording and processing LFPs from a micro-electrode array 410 that may be used for various embodiments of the invention. The system 450 may include a receiver 452, a processor 455, and a memory 460. The receiver 452 may be used to condition the electrical signals from the micro-electrode array 410 for processing by the processor 455. The receiver 452 may include one or more of the following components: amplifiers (e.g., low-noise amplifiers) for amplifying the electrical signals, a filter for isolating electrical signals within a desired frequency bandwidth, and an analog-to-digital converter for digitizing the electrical signals for processing by the processor 455. Some or all of the above components may also be implanted in the patient with the micro-electrode array 410.
[0067] The processor 455 may comprise a general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), discrete hardware components, or any combination thereof. Methods for decoding speech using neural signals from the array 410 according to various embodiments of the invention discussed above may be embodied in software code that is stored in the memory 460 and executed by the processor 455. The memory 460 may comprise any computer-readable media known in the art including volatile memory, nonvolatile memory, a Random Access Memory (RAM), a flash memory, a Read Only Memory (ROM), a removable disk, a CD-ROM, a DVD, any other suitable storage device, or a combination thereof.
[0068] The processor 455 may also output raw electrical signals, processed electrical signals, and/or results of analysis to an output device 465, including, but not limited to, a display for viewing by a neurologist, a printer for generating a computer readout, a computer-readable medium, and/or another computer via a computer network connection. The output device 465 may also include an audio output device that outputs the decoded word as an audio output, e.g., a synthetic voice vocalizing the decoded word.
[0069] In one embodiment, the processor 455 may decode a word by receiving neural signals, e.g., local field potentials, from the micro-electrode array 410 when the patient speaks the word or attempts to speak the word. The processor 455 may then convert the neural signals into frequency-domain information, e.g., power spectra, for one or more electrodes of the array. The processor 455 may then classify the frequency-domain information for the one or more electrodes into one of a set of words. For example, the processor 455 may perform principal component analysis on the frequency-domain information to project the frequency-domain information into the principal component space and determine its nearest centroid in the principal component space, as described above. After decoding the word that the patient spoke or attempted to speak, the processor 455 may display the decoded word on a display and/or vocalize the decoded word from an audio output device. The processor 455 may be trained to classify a particular word using the methods described above with reference to Figs. 2a-2d.
[0070] It will also be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the specific embodiments disclosed herein, without departing from the scope or spirit of the disclosure as broadly described. The present embodiments are, therefore, to be considered in all respects illustrative and not restrictive of the present invention.
Claims
1. A system for decoding words using neural signals, comprising:
a receiver configured to receive a neural signal from each of a plurality of electrodes, each of the neural signals emanating from the brain of a patient when the patient speaks, or attempts to speak, a word; and
a processor configured to convert the neural signals into frequency-domain information and to apply a classifier to the frequency-domain information so as to determine the word.
2. The system of claim 1, wherein the electrodes contact a brain cortical surface.
3. The system of claim 2, wherein the electrodes contact a face motor cortex.
4. The system of claim 2, wherein each of the neural signals comprises a local field potential from the cortical surface.
5. The system of claim 1, wherein the frequency-domain information for each of the plurality of electrodes comprises a power spectrum.
6. The system of claim 1, wherein the electrodes comprise micro-electrodes.
7. The system of claim 1, wherein the classifier comprises a principal component analysis classifier.
8. The system of claim 7, wherein the processor is programmed to determine the word by finding a centroid, from a plurality of centroids, that is nearest to an output of the principal component analysis classifier.
9. A method of identifying words using neural signals, comprising:
receiving a neural signal from each of a plurality of electrodes, each of the neural signals emanating from the brain of a patient when the patient speaks, or attempts to speak, a word;
converting the neural signal for each of the plurality of electrodes into frequency-domain information; and
applying a classifier to the frequency-domain information for the plurality of electrodes so as to determine the word.
10. The method of claim 9, wherein the electrodes contact a brain cortical surface.
11. The method of claim 10, wherein the electrodes contact a face motor cortex.
12. The method of claim 10, wherein each of the neural signals comprises a local field potential from the cortical surface.
13. The method of claim 9, wherein the frequency-domain information for each of the plurality of electrodes comprises a power spectrum.
14. The method of claim 9, wherein the electrodes comprise micro-electrodes.
15. The method of claim 9, wherein the classifier is a principal component analysis classifier.
16. The method of claim 15, wherein the applying comprises finding a centroid, from a plurality of centroids, that is nearest to an output of the principal component analysis classifier.
17. A method of training a classifier to determine words using neural signals, comprising:
for each one of a plurality of words, performing the steps of:
for each one of a plurality of trials, receiving a neural signal from each of a plurality of electrodes, each of the neural signals emanating from the brain of a patient when the patient speaks, or attempts to speak, a word;
for each trial, converting the neural signals into frequency-domain information; and
training the classifier to determine the word based on the frequency-domain information for each trial.
18. The method of claim 17, wherein the plurality of electrodes contact a cortical surface.
19. The method of claim 17, wherein the frequency-domain information for each trial comprises a power spectrum.
20. The method of claim 17, wherein the training the classifier comprises performing a principal component analysis on the frequency-domain information for each trial.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US32279710P | 2010-04-09 | 2010-04-09 | |
US61/322,797 | 2010-04-09 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011127483A1 (en) | 2011-10-13 |
Family
ID=44763316
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2011/031995 WO2011127483A1 (en) | 2010-04-09 | 2011-04-11 | Decoding words using neural signals |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2011127483A1 (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6020110A (en) * | 1994-06-24 | 2000-02-01 | Cambridge Sensors Ltd. | Production of electrodes for electrochemical sensing |
US6233480B1 (en) * | 1990-08-10 | 2001-05-15 | University Of Washington | Methods and apparatus for optically imaging neuronal tissue and activity |
US20050228515A1 (en) * | 2004-03-22 | 2005-10-13 | California Institute Of Technology | Cognitive control signals for neural prosthetics |
US20060049957A1 (en) * | 2004-08-13 | 2006-03-09 | Surgenor Timothy R | Biological interface systems with controlled device selector and related methods |
US20060217782A1 (en) * | 1998-10-26 | 2006-09-28 | Boveja Birinder R | Method and system for cortical stimulation to provide adjunct (ADD-ON) therapy for stroke, tinnitus and other medical disorders using implantable and external components |
US20080253626A1 (en) * | 2006-10-10 | 2008-10-16 | Schuckers Stephanie | Regional Fingerprint Liveness Detection Systems and Methods |
US20090221896A1 (en) * | 2006-02-23 | 2009-09-03 | Rickert Joern | Probe For Data Transmission Between A Brain And A Data Processing Device |
US20100046799A1 (en) * | 2003-07-03 | 2010-02-25 | Videoiq, Inc. | Methods and systems for detecting objects of interest in spatio-temporal signals |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014066855A1 (en) * | 2012-10-26 | 2014-05-01 | The Regents Of The University Of California | Methods of decoding speech from brain activity data and devices for practicing the same |
US10264990B2 (en) | 2012-10-26 | 2019-04-23 | The Regents Of The University Of California | Methods of decoding speech from brain activity data and devices for practicing the same |
US10653330B2 (en) | 2016-08-25 | 2020-05-19 | Paradromics, Inc. | System and methods for processing neural signals |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 11766869; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 11766869; Country of ref document: EP; Kind code of ref document: A1 |