WO2007076279A3 - Method for classifying speech data

Method for classifying speech data

Info

Publication number
WO2007076279A3
Authority
WO
WIPO (PCT)
Prior art keywords
speech data
vowel
classifying speech
amplitude spectrum
classifying
Application number
PCT/US2006/062032
Other languages
French (fr)
Other versions
WO2007076279A2 (en)
Inventor
Yi-Qing Zu
Jian-Cheng Huang
Kai-Zhi Wang
Original Assignee
Motorola Inc
Yi-Qing Zu
Jian-Cheng Huang
Kai-Zhi Wang
Priority date
2005-12-29
Filing date
2006-12-13
Publication date
2008-04-24
Application filed by Motorola Inc, Yi-Qing Zu, Jian-Cheng Huang, Kai-Zhi Wang
Publication of WO2007076279A2
Publication of WO2007076279A3

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93: Discriminating between voiced and unvoiced parts of speech signals
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025: Phonemes, fenemes or fenones being the recognition units

Abstract

A computationally non-intensive method for classifying real-time speech data is useful for improved animations of avatars. The method includes identifying a voiced speech segment of the speech data (step 410). A high-amplitude spectrum is then determined by performing a spectral analysis on a high-amplitude component of the voiced speech segment (step 415). The high-amplitude spectrum is then classified as a vowel phoneme, where the vowel phoneme is selected from a reduced vowel set (step 440).
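
The three steps in the abstract can be illustrated with a rough Python sketch. The abstract does not say how voicing is detected, how the high-amplitude component is chosen, or how the spectrum is matched to a vowel, so everything below is an assumption made for illustration: the energy and zero-crossing voicing test, taking the loudest voiced frame as the high-amplitude component, the formant-band scoring, the three-vowel set with its approximate formant values, and all names and thresholds. It is not the method claimed in the patent.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz (assumed)
FRAME_LEN = 256     # samples per analysis frame (assumed)

# Hypothetical reduced vowel set: rough (F1, F2) formant frequencies in Hz.
REDUCED_VOWELS = {"a": (800, 1200), "i": (300, 2300), "u": (350, 800)}


def split_frames(signal):
    """Cut the signal into non-overlapping frames, dropping any short tail."""
    return [signal[i:i + FRAME_LEN]
            for i in range(0, len(signal) - FRAME_LEN + 1, FRAME_LEN)]


def energy(frame):
    """Mean squared amplitude of a frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def is_voiced(frame, min_energy=1e-3, max_zcr=0.4):
    """Assumed voicing test: enough energy and few zero crossings."""
    return energy(frame) > min_energy and zero_crossing_rate(frame) < max_zcr


def amplitude_spectrum(frame):
    """Amplitude spectrum of a Hann-windowed frame via a direct DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2)]


def band_amplitude(spectrum, centre_hz, half_width_hz=100):
    """Sum of spectral amplitudes in a band around centre_hz."""
    bin_hz = SAMPLE_RATE / FRAME_LEN
    lo = int((centre_hz - half_width_hz) / bin_hz)
    hi = int((centre_hz + half_width_hz) / bin_hz)
    return sum(spectrum[lo:hi + 1])


def classify_vowel(signal):
    """Return the best-matching vowel label, or None if nothing is voiced."""
    voiced = [f for f in split_frames(signal) if is_voiced(f)]   # cf. step 410
    if not voiced:
        return None
    loudest = max(voiced, key=energy)        # assumed high-amplitude component
    spectrum = amplitude_spectrum(loudest)   # cf. step 415
    return max(REDUCED_VOWELS,               # cf. step 440
               key=lambda v: sum(band_amplitude(spectrum, f)
                                 for f in REDUCED_VOWELS[v]))


def synth_vowel(f1, f2, seconds=0.1):
    """Toy test signal: two sinusoids standing in for two formants."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * f1 * t / SAMPLE_RATE)
            + 0.3 * math.sin(2 * math.pi * f2 * t / SAMPLE_RATE)
            for t in range(n)]
```

Calling `classify_vowel(synth_vowel(300, 2300))` runs the whole chain on a synthetic tone pair. Only one frame per call goes through the spectral analysis, which is one plausible reading of why such a method could be cheap enough for real-time avatar animation; a practical version would likely use an FFT and recorded speech rather than the direct DFT and toy signal used here.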

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200510121718.7 2005-12-29
CNA2005101217187A CN1991981A (en) 2005-12-29 2005-12-29 Method for voice data classification

Publications (2)

Publication Number Publication Date
WO2007076279A2 (en) 2007-07-05
WO2007076279A3 (en) 2008-04-24

Family

ID=38214193

Family Applications (1)

Application Number Publication Priority Date Filing Date Title
PCT/US2006/062032 WO2007076279A2 (en) 2005-12-29 2006-12-13 Method for classifying speech data

Country Status (2)

Country Link
CN (1) CN1991981A (en)
WO (1) WO2007076279A2 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2468140A (en) * 2009-02-26 2010-09-01 Dublin Inst Of Technology A character animation tool which associates stress values with the locations of vowels
CN107431635B (en) * 2015-03-27 2021-10-08 英特尔公司 Avatar facial expression and/or speech driven animation
US11176960B2 (en) * 2018-06-18 2021-11-16 University Of Florida Research Foundation, Incorporated Method and apparatus for differentiating between human and electronic speaker for voice interface security
CN109087629A (en) * 2018-08-24 2018-12-25 苏州玩友时代科技股份有限公司 A kind of mouth shape cartoon implementation method and device based on speech recognition
CN111326143B (en) * 2020-02-28 2022-09-06 科大讯飞股份有限公司 Voice processing method, device, equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5884267A (en) * 1997-02-24 1999-03-16 Digital Equipment Corporation Automated speech alignment for image synthesis
US20030117485A1 (en) * 2001-12-20 2003-06-26 Yoshiyuki Mochizuki Virtual television phone apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BALABKO P.: "Speech and Music Discrimination Based on Signal Modulation Spectrum", 24 June 1999 (1999-06-24), Retrieved from the Internet <URL:http://www.lamspeople.epfl.ch/balabko/Projects/IDIAP/report.pdf> *

Also Published As

Publication number Publication date
WO2007076279A2 (en) 2007-07-05
CN1991981A (en) 2007-07-04

Similar Documents

Publication Publication Date Title
WO2006107839A3 (en) Method and apparatus for anti-sparseness filtering of a bandwidth extended speech prediction excitation signal
JP5581377B2 (en) Speech synthesis and coding method
WO2005115014A3 (en) Method, system, and program product for measuring audio video synchronization
WO2007076278A3 (en) Method for animating a facial image using speech data
WO2005083677A3 (en) Method and system for generating training data for an automatic speech recogniser
US20080215321A1 (en) Pitch model for noise estimation
EP1675102A3 (en) Method for extracting feature vectors for speech recognition
EP1696421A3 (en) Learning in automatic speech recognition
WO2008084575A1 (en) Vehicle-mounted voice recognition apparatus
WO2005077024A3 (en) Methods and apparatus for data analysis
WO2004061750A3 (en) Method and apparatus for displaying speech recognition results
EP1349145A3 (en) System and method for providing information using spoken dialogue interface
UA94041C2 (en) Method and device for anti-sparseness filtering
Jokinen et al. Vocal effort compensation for MFCC feature extraction in a shouted versus normal speaker recognition task
WO2007076279A3 (en) Method for classifying speech data
BRPI0503959A (en) method and apparatus for improving speech reproduction quality
SG140445A1 (en) Method and apparatus for automatically recognizing audio data
Urbain et al. Evaluation of HMM-based laughter synthesis
Barker et al. Speech fragment decoding techniques for simultaneous speaker identification and speech recognition
EP2507794B1 (en) Obfuscated speech synthesis
EP1533791A3 (en) Voice/unvoice determination and dialogue enhancement
JP2009003008A (en) Noise-suppressing device, speech recognition device, noise-suppressing method and program
CA2483607A1 (en) Syllabic nuclei extracting apparatus and program product thereof
JP2006349723A (en) Acoustic model creating device, method, and program, speech recognition device, method, and program, and recording medium
JP5382780B2 (en) Utterance intention information detection apparatus and computer program

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase
    Ref country code: DE
122 EP: PCT application non-entry in European phase
    Ref document number: 06846602
    Country of ref document: EP
    Kind code of ref document: A2