WO2007076279A3 - Method for classifying speech data - Google Patents
Method for classifying speech data
- Publication number
- WO2007076279A3 (PCT/US2006/062032; application US2006062032W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- speech data
- vowel
- classifying speech
- amplitude spectrum
- classifying
- Prior art date
Links
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L2015/025—Phonemes, fenemes or fenones being the recognition units
Abstract
A computationally non-intensive method for classifying real-time speech data is useful for improved animations of avatars. The method includes identifying a voiced speech segment of the speech data (step 410). A high-amplitude spectrum is then determined by performing a spectral analysis on a high-amplitude component of the voiced speech segment (step 415). The high-amplitude spectrum is then classified as a vowel phoneme, where the vowel phoneme is selected from a reduced vowel set (step 440).
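The abstract's three steps (identify a voiced segment, take the spectrum of its high-amplitude component, classify against a reduced vowel set) can be sketched roughly as follows. This is a minimal illustration, not the patent's actual method: the energy/zero-crossing voicing test, the 50% amplitude threshold, the use of the dominant spectral peak as a stand-in for formant analysis, and the vowel frequency ranges are all assumptions made for the example.

```python
import numpy as np

# Reduced vowel set keyed by an approximate first-formant-like frequency
# range in Hz. Illustrative values only, not taken from the patent.
REDUCED_VOWELS = {
    "a": (600, 1000),   # open vowel: high F1
    "e": (400, 600),
    "i": (200, 400),    # close vowels: low F1
}

def is_voiced(frame, energy_thresh=0.01, zcr_thresh=0.25):
    """Crude voiced/unvoiced test (step 410): voiced speech tends to have
    high energy and a low zero-crossing rate."""
    energy = np.mean(frame ** 2)
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2
    return energy > energy_thresh and zcr < zcr_thresh

def high_amplitude_spectrum(frame, sample_rate, keep_fraction=0.5):
    """Spectrum of only the highest-amplitude samples (step 415):
    samples below a fraction of the peak amplitude are zeroed out."""
    thresh = keep_fraction * np.max(np.abs(frame))
    component = np.where(np.abs(frame) >= thresh, frame, 0.0)
    spectrum = np.abs(np.fft.rfft(component * np.hanning(len(component))))
    freqs = np.fft.rfftfreq(len(component), d=1.0 / sample_rate)
    return freqs, spectrum

def classify_vowel(frame, sample_rate):
    """Map a frame to a vowel in the reduced set, or None if the frame
    is unvoiced or matches no range (step 440)."""
    if not is_voiced(frame):
        return None
    freqs, spectrum = high_amplitude_spectrum(frame, sample_rate)
    peak = freqs[np.argmax(spectrum)]  # dominant peak as formant proxy
    for vowel, (lo, hi) in REDUCED_VOWELS.items():
        if lo <= peak <= hi:
            return vowel
    return None
```

Because only a handful of coarse features are computed per frame (one FFT, an energy and a zero-crossing count), a classifier of this shape stays cheap enough for the real-time avatar lip-sync use the abstract describes.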
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200510121718.7 | 2005-12-29 | ||
CNA2005101217187A CN1991981A (en) | 2005-12-29 | 2005-12-29 | Method for voice data classification |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2007076279A2 (en) | 2007-07-05 |
WO2007076279A3 (en) | 2008-04-24 |
Family
ID=38214193
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2006/062032 WO2007076279A2 (en) | 2005-12-29 | 2006-12-13 | Method for classifying speech data |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN1991981A (en) |
WO (1) | WO2007076279A2 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2468140A (en) * | 2009-02-26 | 2010-09-01 | Dublin Inst Of Technology | A character animation tool which associates stress values with the locations of vowels |
CN107431635B (en) * | 2015-03-27 | 2021-10-08 | 英特尔公司 | Avatar facial expression and/or speech driven animation |
US11176960B2 (en) * | 2018-06-18 | 2021-11-16 | University Of Florida Research Foundation, Incorporated | Method and apparatus for differentiating between human and electronic speaker for voice interface security |
CN109087629A (en) * | 2018-08-24 | 2018-12-25 | 苏州玩友时代科技股份有限公司 | A kind of mouth shape cartoon implementation method and device based on speech recognition |
CN111326143B (en) * | 2020-02-28 | 2022-09-06 | 科大讯飞股份有限公司 | Voice processing method, device, equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5884267A (en) * | 1997-02-24 | 1999-03-16 | Digital Equipment Corporation | Automated speech alignment for image synthesis |
US20030117485A1 (en) * | 2001-12-20 | 2003-06-26 | Yoshiyuki Mochizuki | Virtual television phone apparatus |
- 2005-12-29: CN application CNA2005101217187A filed; published as CN1991981A (status: Pending)
- 2006-12-13: PCT application PCT/US2006/062032 filed; published as WO2007076279A2 (Application Filing)
Non-Patent Citations (1)
Title |
---|
BALABKO P.: "Speech and Music Discrimination Based on Signal Modulation Spectrum", 24 June 1999 (1999-06-24), Retrieved from the Internet <URL:http://www.lamspeople.epfl.ch/balabko/Projects/IDIAP/report.pdf> * |
Also Published As
Publication number | Publication date |
---|---|
WO2007076279A2 (en) | 2007-07-05 |
CN1991981A (en) | 2007-07-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2006107839A3 (en) | Method and apparatus for anti-sparseness filtering of a bandwidth extended speech prediction excitation signal | |
JP5581377B2 (en) | Speech synthesis and coding method | |
WO2005115014A3 (en) | Method, system, and program product for measuring audio video synchronization | |
WO2007076278A3 (en) | Method for animating a facial image using speech data | |
WO2005083677A3 (en) | Method and system for generating training data for an automatic speech recogniser | |
US20080215321A1 (en) | Pitch model for noise estimation | |
EP1675102A3 (en) | Method for extracting feature vectors for speech recognition | |
EP1696421A3 (en) | Learning in automatic speech recognition | |
WO2008084575A1 (en) | Vehicle-mounted voice recognition apparatus | |
WO2005077024A3 (en) | Methods and apparatus for data analysis | |
WO2004061750A3 (en) | Method and apparatus for displaying speech recognition results | |
EP1349145A3 (en) | System and method for providing information using spoken dialogue interface | |
UA94041C2 (en) | Method and device for anti-sparseness filtering | |
Jokinen et al. | Vocal effort compensation for MFCC feature extraction in a shouted versus normal speaker recognition task | |
WO2007076279A3 (en) | Method for classifying speech data | |
BRPI0503959A (en) | method and apparatus for improving speech reproduction quality | |
SG140445A1 (en) | Method and apparatus for automatically recognizing audio data | |
Urbain et al. | Evaluation of HMM-based laughter synthesis | |
Barker et al. | Speech fragment decoding techniques for simultaneous speaker identification and speech recognition | |
EP2507794B1 (en) | Obfuscated speech synthesis | |
EP1533791A3 (en) | Voice/unvoice determination and dialogue enhancement | |
JP2009003008A (en) | Noise-suppressing device, speech recognition device, noise-suppressing method and program | |
CA2483607A1 (en) | Syllabic nuclei extracting apparatus and program product thereof | |
JP2006349723A (en) | Acoustic model creating device, method, and program, speech recognition device, method, and program, and recording medium | |
JP5382780B2 (en) | Utterance intention information detection apparatus and computer program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | ||
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 06846602 Country of ref document: EP Kind code of ref document: A2 |