CA2165546A1 - Method of encoding a signal containing speech - Google Patents

Method of encoding a signal containing speech

Info

Publication number
CA2165546A1
Authority
CA
Canada
Prior art keywords
frame
mode
pitch
thq
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002165546A
Other languages
French (fr)
Inventor
Kumar Swaminathan
Kalyan Ganesan
Prabhat K. Gupta
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DirecTV Group Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date


Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/012Comfort noise or silence coding
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/26Pre-filtering or post-filtering
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/90Pitch determination of speech signals
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001Codebooks
    • G10L2019/0002Codebook adaptations
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001Codebooks
    • G10L2019/0003Backward prediction of gain
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/09Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being zero crossing rates
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93Discriminating between voiced and unvoiced parts of speech signals

Abstract

A method of encoding a signal containing speech is employed in a bit rate Codebook Excited Linear Predictor (CELP) communication system. The system includes a transmitter that organizes a signal containing speech into frames of 40 millisecond duration, and classifies each frame as one of three modes: voiced and stationary, unvoiced or transient, and background noise.

Description

WO 95/28824  PCT/US95/04577
METHOD OF ENCODING A SIGNAL CONTAINING SPEECH
BACKGROUND OF THE INVENTION
Field of the Invention
The present invention generally relates to a method of encoding a signal containing speech and more particularly to a method employing a linear predictor to encode a signal.
Description of the Related Art
A modern communication technique employs a Codebook Excited Linear Prediction (CELP) coder. The codebook is essentially a table containing excitation vectors for processing by a linear predictive filter. The technique involves partitioning an input signal into multiple portions and, for each portion, searching the codebook for the vector that produces a filter output signal that is closest to the input signal.

The typical CELP technique may distort portions of the input signal dominated by noise because the codebook and the linear predictive filter that may be optimum for speech may be inappropriate for noise.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a method of encoding a signal containing both speech and noise while avoiding some of the distortions introduced by typical CELP encoding techniques.
Additional objectives and advantages of the invention will be set forth in the description that follows and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.
To achieve the objects and in accordance with the purpose of the invention, as embodied and broadly described herein, a method of processing a signal having a speech component, the signal being organized as a plurality of frames, is used. The method comprises the steps, performed for each frame, of determining whether the frame corresponds to a first mode, depending on whether the speech component is substantially absent from the frame; generating an encoded frame in accordance with one of a first coding scheme, when the frame corresponds to the first mode, and a second coding scheme when the frame does not correspond to the first mode; and decoding the encoded frame in accordance with one of the first coding scheme, when the frame corresponds to the first mode, and the second coding scheme when the frame does not correspond to the first mode.
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing and other objects, aspects and advantages will be better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:
FIG. 1 is a block diagram of a transmitter in a wireless communication system according to a preferred embodiment of the invention;
FIG. 2 is a block diagram of a receiver in a wireless communication system according to the preferred embodiment of the invention;
FIG. 3 is a block diagram of the encoder in the transmitter shown in FIG. 1;
FIG. 4 is a block diagram of the decoder in the receiver shown in FIG. 2;
FIG. 5A is a timing diagram showing the alignment of linear prediction analysis windows in the encoder shown in FIG. 3;
FIG. 5B is a timing diagram showing the alignment of pitch prediction analysis windows for open loop pitch prediction in the encoder shown in FIG. 3;
FIGS. 6A and 6B are a flowchart illustrating the 26-bit line spectral frequency vector quantization process performed by the encoder of FIG. 3;
FIG. 7 is a flowchart illustrating the operation of a pitch tracking algorithm;
FIG. 8 is a block diagram showing in more detail the open loop pitch estimation of the encoder shown in FIG. 3;
FIG. 9 is a flowchart illustrating the operation of the modified pitch tracking algorithm implemented by the open loop pitch estimation shown in FIG. 8;
FIG. 10 is a flowchart showing the processing performed by the mode determination module shown in FIG. 3;
FIG. 11 is a dataflow diagram showing a part of the processing of a step of determining spectral stationarity values shown in FIG. 10;
FIG. 12 is a dataflow diagram showing another part of the processing of the step of determining spectral stationarity values;
FIG. 13 is a dataflow diagram showing another part of the processing of the step of determining spectral stationarity values;
FIG. 14 is a dataflow diagram showing the processing of the step of determining pitch stationarity values shown in FIG. 10;
FIG. 15 is a dataflow diagram showing the processing of the step of generating zero crossing rate values shown in FIG. 10;
FIG. 16 is a dataflow diagram showing the processing of the step of determining level gradient values in FIG. 10;
FIG. 17 is a dataflow diagram showing the processing of the step of determining short-term energy values shown in FIG. 10;
FIGS. 18A, 18B and 18C are a flowchart of determining the mode based on the generated values as shown in FIG. 10;
FIG. 19 is a block diagram showing in more detail the implementation of the excitation modeling circuitry of the encoder shown in FIG. 3;
FIG. 20 is a diagram illustrating a processing of the encoder shown in FIG. 3;
FIGS. 21A and 21B are a chart of speech coder parameters for mode A;
FIG. 22 is a chart of speech coder parameters for mode A;
FIG. 23 is a chart of speech coder parameters for mode A;
FIG. 24 is a block diagram illustrating a processing of the speech decoder shown in FIG. 4; and
FIG. 25 is a timing diagram showing an alternative alignment of linear prediction analysis windows.
DESCRIPTION OF A PREFERRED EMBODIMENT OF THE INVENTION
FIG. 1 shows the transmitter of the preferred communication system. Analog-to-digital (A/D) converter 11 samples analog speech from a telephone handset at an 8 kHz rate, converts the samples to digital values, and supplies the digital values to speech encoder 12. Channel encoder 13 further encodes the signal, as may be required in a digital cellular communications system, and supplies the resulting encoded bit stream to modulator 14. Digital-to-analog (D/A) converter 15 converts the output of modulator 14 to Phase Shift Keying (PSK) signals. Radio frequency (RF) up converter 16 amplifies and frequency multiplies the PSK signals and supplies the amplified signals to antenna 17.
A low-pass, antialiasing filter (not shown) filters the analog speech signal input to A/D converter 11. A high-pass, second order biquad filter (not shown) filters the digitized samples from A/D converter 11. The transfer function is
$$H_{HP}(z) = \frac{1 - 2z^{-1} + z^{-2}}{1 - 1.8891z^{-1} + 0.89503z^{-2}}$$
The high-pass filter attenuates D.C. or hum contamination that may occur in the incoming speech signal.
FIG. 2 shows the receiver of the preferred communication system. RF down converter 22 receives a signal from antenna 21 and heterodynes the signal to an intermediate frequency (IF). A/D converter 23 converts the IF signal to a digital bit stream, and demodulator 24 demodulates the resulting bit stream. At this point the reverse of the encoding process in the transmitter takes place. Channel decoder 25 and speech decoder 26 perform decoding. D/A converter 27 synthesizes analog speech from the output of the speech decoder.
Much of the processing described in this specification is performed by a general purpose signal processor executing program statements. To facilitate a description of the preferred communication system, however, the preferred communication system is illustrated in terms of block and circuit diagrams. One of ordinary skill in the art could readily transcribe these diagrams into program statements for a processor.
FIG. 3 shows the encoder 12 of FIG. 1 in more detail, including an audio preprocessor 31, linear predictive (LP) analysis and quantization module 32, and open loop pitch estimation module 33. Module 34 analyzes each frame of the signal to determine whether the frame is mode A, mode B, or mode C, as described in more detail below. Module 35 performs excitation modeling depending on the mode determined by module 34. Processor 36 compacts compressed speech bits.
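As an illustration of the second-order high-pass filter applied to the digitized samples from A/D converter 11, the following minimal sketch implements the stated transfer function as a direct-form difference equation. The function name `highpass_biquad` and the zero initial state are assumptions made for this example; the text gives only the coefficients.

```python
def highpass_biquad(samples):
    """Apply H(z) = (1 - 2z^-1 + z^-2) / (1 - 1.8891z^-1 + 0.89503z^-2).

    Hypothetical helper: only the coefficients come from the text.
    """
    b0, b1, b2 = 1.0, -2.0, 1.0      # numerator: double zero at z = 1 (D.C.)
    a1, a2 = -1.8891, 0.89503        # denominator coefficients from the text
    x1 = x2 = y1 = y2 = 0.0          # assumed zero initial filter state
    out = []
    for x in samples:
        # Direct-form I difference equation of the biquad
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


# A constant (D.C.) input should be driven toward zero at the output.
dc_response = highpass_biquad([1000.0] * 800)
# An input alternating at half the sampling rate should pass almost unchanged.
nyquist_response = highpass_biquad([(-1.0) ** n for n in range(800)])
```

Because both zeros sit at z = 1, the steady-state gain at D.C. is exactly zero, which is consistent with the stated purpose of removing D.C. or hum contamination.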
FIG. 4 shows the decoder 26 of FIG. 2, including a processor 41 for unpacking of compressed speech bits, module 42 for excitation signal reconstruction, filter 43, speech synthesis filter 44, and global post filter 45.
FIG. 5A shows linear prediction analysis windows. The preferred communication system employs 40 ms speech frames. For each frame, module 32 performs LP (linear prediction) analysis on two 30 ms windows that are spaced apart by 20 ms. The first LP window is centered at the middle, and the second LP window is centered at the leading edge of the speech frame such that the second LP window extends 15 ms into the next frame. In other words, module 32 analyzes a first part of the frame (LP window 1) to generate a first set of filter coefficients and analyzes a second part of the frame and a part of a next frame (LP window 2) to generate a second set of filter coefficients.
FIG. 5B shows pitch analysis windows. For each frame, module 32 performs pitch analysis on two 37.625 ms windows. The first pitch analysis window is centered at the middle, and the second pitch analysis window is centered at the leading edge of the speech frame such that the second pitch analysis window extends 18.8125 ms into the next frame. In other words, module 32 analyzes a third part of the frame (pitch analysis window 1) to generate a first pitch estimate and analyzes a fourth part of the frame and a part of the next frame (pitch analysis window 2) to generate a second pitch estimate.
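The window timing described for FIGS. 5A and 5B can be checked with a short calculation. This is a sketch under one stated assumption: the "leading edge" is read as the boundary between the current frame and the next one (40 ms after the frame start), since that is the only placement consistent with the stated 15 ms and 18.8125 ms extensions. The helper names are hypothetical.

```python
FS_HZ = 8000        # sampling rate stated in the text
FRAME_MS = 40.0     # frame length stated in the text


def centered_window(center_ms, length_ms):
    """Return (start_ms, end_ms) of a window centered at center_ms."""
    half = length_ms / 2.0
    return center_ms - half, center_ms + half


def overhang_ms(window):
    """Milliseconds by which a window extends past the end of the frame."""
    return max(0.0, window[1] - FRAME_MS)


# Window 1 is centered mid-frame; window 2 is centered at the frame boundary
# (an interpretation of "leading edge", see the lead-in).
lp_windows = [centered_window(20.0, 30.0), centered_window(40.0, 30.0)]
pitch_windows = [centered_window(20.0, 37.625), centered_window(40.0, 37.625)]

lp_samples = int(30.0 * FS_HZ / 1000)         # samples per LP window
pitch_samples = int(37.625 * FS_HZ / 1000)    # samples per pitch window
```

Under this reading the two LP windows cover 5 to 35 ms and 25 to 55 ms, so their centers are 20 ms apart as the text states.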
Module 32 employs multiplication by a Hamming window followed by a tenth order autocorrelation method of LP analysis. With this method of LP analysis, module 32 obtains optimal filter coefficients and optimal reflection coefficients. In addition, the residual energy after LP analysis is also readily obtained and, when expressed as a fraction of the speech energy of the windowed LP analysis buffer, is denoted as α1 for the first LP window and α2 for the second LP window. These outputs of the LP analysis are used subsequently in the mode selection algorithm as measures of spectral stationarity, as described in more detail below.
After LP analysis, module 32 bandwidth broadens the filter coefficients for the first LP window, and for the second LP window, by 25 Hz, converts the coefficients to ten line spectral frequencies (LSF), and quantizes these ten line spectral frequencies with a 26-bit LSF vector quantization (VQ), as described below.
Module 32 employs a 26-bit vector quantization (VQ) for each set of ten LSFs. This VQ provides good and robust performance across a wide range of handsets and speakers. Separate VQ codebooks are designed for "IRS filtered" and "flat unfiltered" ("non-IRS-filtered") speech material. The unquantized LSF vector is quantized by the "IRS filtered" VQ tables as well as the "flat unfiltered" VQ tables. The optimum classification is selected on the basis of the cepstral distortion measure. Within each classification, the vector quantization is carried out. Multiple candidates for each split vector are chosen on the basis of energy weighted mean square error, and an overall optimal selection is made within each classification on the basis of the cepstral distortion measure among all combinations of candidates. After the optimum classification is chosen, the quantized line spectral frequencies are converted to filter coefficients.
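The LP analysis step (Hamming window, tenth order autocorrelation method, residual energy fraction) can be sketched as follows. The Levinson-Durbin recursion is the standard way to solve the autocorrelation equations and is assumed here rather than quoted from the text. The function names are hypothetical, and `bandwidth_expand` uses a common exponential weighting that is only one plausible reading of "bandwidth broadens by 25 Hz". The sketch assumes a non-silent window, so that the zero-lag autocorrelation is nonzero.

```python
import math


def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def autocorrelation(x, order):
    """Autocorrelation lags 0..order of the windowed buffer."""
    return [sum(x[i] * x[i - k] for i in range(k, len(x))) for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a1 z^-1 + ... ; returns (a, reflection, residual)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    refl = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        refl.append(k)
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k        # residual energy shrinks at every order
    return a, refl, err


def lp_analysis(window, order=10):
    """Return filter coefficients, reflection coefficients, and alpha."""
    w = hamming(len(window))
    x = [s * wi for s, wi in zip(window, w)]
    r = autocorrelation(x, order)
    a, refl, err = levinson_durbin(r, order)
    alpha = err / r[0]            # residual energy as a fraction of buffer energy
    return a, refl, alpha


def bandwidth_expand(a, bw_hz=25.0, fs_hz=8000.0):
    """Assumed form of bandwidth broadening: scale a_i by exp(-pi*bw/fs)**i."""
    g = math.exp(-math.pi * bw_hz / fs_hz)
    return [c * g ** i for i, c in enumerate(a)]


# Deterministic 30 ms (240 sample) test buffer: two tones plus a small dither.
demo = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n)
        + 0.1 * (((n * 7919) % 13) - 6) / 6.0 for n in range(240)]
demo_a, demo_refl, demo_alpha = lp_analysis(demo)
```

A value of alpha near zero means the predictor captures almost all of the windowed energy, which is why the text can reuse α1 and α2 later as indicators in mode selection.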
More specifically, module 32 quantizes the ten line spectral frequencies for both sets with a 26-bit multi-codebook split vector quantizer that classifies the unquantized line spectral frequency vector as a "voiced IRS-filtered," "unvoiced IRS-filtered," "voiced non-IRS-filtered," and "unvoiced non-IRS-filtered" vector, where "IRS" refers to intermediate reference system filter as specified by CCITT, Blue Book, Rec. P.48.
FIG. 6 shows an outline of the LSF vector quantization process. Module 32 employs a split vector quantizer for each classification, including a 3-4-3 split vector quantizer for the "voiced IRS-filtered" and the "voiced non-IRS-filtered" categories 51 and 53. The first three LSFs use an 8-bit codebook in function modules 55 and 57, the next four LSFs use a 10-bit codebook in function modules 59 and 61, and the last three LSFs use a 6-bit codebook in function modules 63 and 65. For the "unvoiced IRS-filtered" and the "unvoiced non-IRS-filtered" categories 52 and 54, a 3-3-4 split vector quantizer is used. The first three LSFs use a 7-bit codebook in function modules 56 and 58, the next three LSFs use an 8-bit vector codebook in function modules 60 and 62, and the last four LSFs use a 9-bit codebook in function modules 64 and 66.
From each split vector codebook, the three best candidates are selected in function modules 67, 68, 69, and 70 using the energy weighted mean square error criteria. The energy weighting reflects the power level of the spectral envelope at each line spectral frequency. The three best candidates for each of the three split vectors result in a total of twenty-seven combinations for each category. The search is constrained so that at least one combination would result in an ordered set of LSFs. This is usually a very mild constraint imposed on the search. The optimum combination of these twenty-seven combinations is selected in function module 71 depending on the cepstral distortion measure. Finally, the optimal category or classification is determined also on the basis of the cepstral distortion measure. The quantized LSFs are converted to filter coefficients and then to autocorrelation lags for interpolation purposes.
The resulting LSF vector quantizer scheme is not only effective across speakers but also across varying degrees of IRS filtering, which models the influence of the handset transducer. The codebooks of the vector quantizers are trained from a sixty talker speech database using flat as well as IRS frequency shaping. This is designed to provide consistent and good performance across several speakers and across various handsets. The average log spectral distortion across the entire TIA half rate database is approximately 1.2 dB for IRS filtered speech data and approximately 1.3 dB for non-IRS filtered speech data.
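The split vector quantizer search described above (three best candidates per split, twenty-seven combinations, ordered LSFs) can be sketched as follows. Several things here are assumptions for illustration: the codebooks are random toy tables rather than trained ones, LSFs are expressed on a normalized 0 to 1 scale, unordered combinations are simply skipped, and the final choice uses the same weighted squared error as the shortlist step instead of the cepstral distortion measure named in the text. The two classification bits are inferred from the four categories.

```python
import itertools
import random

SPLITS = {"voiced": (3, 4, 3), "unvoiced": (3, 3, 4)}   # LSFs per split
BITS = {"voiced": (8, 10, 6), "unvoiced": (7, 8, 9)}    # codebook bits per split
CLASS_BITS = 2                                          # four classifications


def weighted_error(target, candidate, weights):
    """Energy weighted squared error between two LSF (sub)vectors."""
    return sum(w * (t - c) ** 2 for t, c, w in zip(target, candidate, weights))


def split_vq(lsf, weights, codebooks, splits, n_best=3):
    """Shortlist n_best codevectors per split, then search all combinations."""
    shortlists, pos = [], 0
    for size, book in zip(splits, codebooks):
        t, w = lsf[pos:pos + size], weights[pos:pos + size]
        ranked = sorted(book, key=lambda cv: weighted_error(t, cv, w))
        shortlists.append(ranked[:n_best])
        pos += size
    best, best_err = None, float("inf")
    for combo in itertools.product(*shortlists):        # 3 * 3 * 3 = 27 combos
        cand = [v for part in combo for v in part]
        if any(b <= a for a, b in zip(cand, cand[1:])):
            continue                                    # keep ordered LSF sets only
        err = weighted_error(lsf, cand, weights)        # stand-in for cepstral distortion
        if err < best_err:
            best, best_err = cand, err
    return best


def toy_codebooks(splits, bits, seed=0):
    """Hypothetical random codebooks with one disjoint frequency band per split."""
    rng = random.Random(seed)
    edges = [0.0, 0.3, 0.7, 1.0]
    return [[sorted(rng.uniform(edges[i], edges[i + 1]) for _ in range(size))
             for _ in range(2 ** nbits)]
            for i, (size, nbits) in enumerate(zip(splits, bits))]


books = toy_codebooks(SPLITS["voiced"], BITS["voiced"])
target = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
quantized = split_vq(target, [1.0] * 10, books, SPLITS["voiced"])
```

The bit budget works out for both layouts: 8 + 10 + 6 and 7 + 8 + 9 each give 24 bits, and two further bits for the four-way classification reach the stated 26.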

Two estimates of the pitch are determined per frame at intervals of 20 msec. These open loop pitch estimates are used in mode selection and to encode the closed loop pitch analysis if the selected mode is a predominantly voiced mode. Module 33 determines the two pitch estimates from the two pitch analysis windows described above in connection with FIG. 5B, using a modified form of the pitch tracking algorithm shown in FIG. 7. This pitch estimation algorithm makes an initial pitch estimate in function module 73 using an error function calculated for all values in the set {22.0, 22.5, ..., 114.5}, followed by pitch tracking to yield an overall optimum pitch value. Function module 74 employs look-back pitch tracking using the error functions and pitch estimates of the previous two pitch analysis windows. Function module 75 employs look-ahead pitch tracking using the error functions of the two future pitch analysis windows. Decision module 76 compares pitch estimates depending on look-back and look-ahead pitch tracking to yield an overall optimum pitch value at its output. The pitch estimation algorithm shown in FIG. 7 requires the error functions of two future pitch analysis windows for its look-ahead pitch tracking and thus introduces a delay of 40 ms. In order to avoid this penalty, the preferred communication system employs a modification of the pitch estimation algorithm of FIG. 7.
FIG. 8 shows the open loop pitch estimation 33 of FIG. 3 in more detail. Pitch analysis windows one and two are input to respective compute error functions 331 and 332. The outputs of these error function computations are input to a refinement of past pitch estimates 333, and the refined pitch estimates are sent to both look back and look ahead pitch tracking 334 and 335 for pitch window one. The outputs of the pitch tracking circuits are input to selector 336, which selects the open loop pitch one as the first output. The selected open loop pitch one is also input to a look back pitch tracking circuit for pitch window two, which outputs the open loop pitch two.
FIG. 9 shows the modified pitch tracking algorithm implemented by the pitch estimation circuitry of FIG. 8. The modified pitch estimation algorithm employs the same error function as in the FIG. 7 algorithm in each pitch analysis window, but the pitch tracking scheme is altered. Prior to pitch tracking for either the first or second pitch analysis window, the previous two pitch estimates of the two previous pitch analysis windows are refined in function modules 81 and 82, respectively, with both look-back pitch tracking and look-ahead pitch tracking using the error functions of the current two pitch analysis windows. This is followed by look-back pitch tracking in function module 83 for the first pitch analysis window using the refined pitch estimates and error functions of the two previous pitch analysis windows. Look-ahead pitch tracking for the first pitch analysis window in function module 84 is limited to using the error function of the second pitch analysis window. The two estimates are compared in decision module 85 to yield an overall best pitch estimate for the first pitch analysis window. For the second pitch analysis window, look-back pitch tracking is carried out in function module 86, drawing as well on the pitch estimate of the first pitch analysis window and its error function. No look-ahead pitch tracking is used for this second pitch analysis window, with the result that the look-back pitch estimate is taken to be the overall best pitch estimate at output 87.
FIG. 10 shows the mode determination processing performed by mode selector 34. Depending on spectral stationarity, pitch stationarity, short term energy, short term level gradient, and zero crossing rate of each 40 ms frame, mode selector 34 classifies each frame into one of three modes: voiced and stationary mode (Mode A), unvoiced or transient mode (Mode B), and background noise mode (Mode C). More specifically, mode selector 34 generates two logical values, each indicating spectral stationarity or similarity of spectral content between the currently processed frame and the previous frame (Step 1010). Mode selector 34 generates two logical values indicating pitch stationarity, similarity of fundamental frequencies, between the currently processed frame and the previous frame (Step 1020). Mode selector 34 generates two logical values indicating the zero crossing rate of the currently processed frame (Step 1030), a rate influenced by the higher frequency components of the frame relative to the lower frequency components of the frame. Mode selector 34 generates two logical values indicating level gradients within the currently processed frame (Step 1040). Mode selector 34 generates five logical values indicating short-term energy of the currently processed frame (Step 1050). Subsequently, mode selector 34 determines the mode of the frame to be mode A, mode B, or mode C, depending on the values generated in Steps 1010-1050 (Step 1060).
FIG. 11 is a block diagram showing a processing of Step 1010 of FIG. 10 in more detail. The processing of FIG. 11 determines a cepstral distortion in dB. Module 1110 converts the quantized filter coefficients of window 2 of the current frame into the lag domain, and module 1120 converts the quantized filter coefficients of window 2 of the previous frame into the lag domain. Module 1130 interpolates the outputs of modules 1110 and 1120, and module 1140 converts the output of module 1130 back into filter coefficients. Module 1150 converts the output from module 1140 into the cepstral domain, and module 1160 converts the unquantized filter coefficients from window 1 of the current frame into the cepstral domain. Module 1170 generates the cepstral distortion dc from the outputs of 1150 and 1160.
FIG. 12 shows generation of spectral stationarity value LPCFLAG1, which is a relatively strong indicator of spectral stationarity for the frame. Mode selector 34 generates LPCFLAG1 using a combination of two techniques for measuring spectral stationarity. The first technique compares the cepstral distortion dc using comparators 1210 and 1220. In FIG. 12, the dt1 threshold input to comparator 1210 is -8.0 and the dt2 threshold input to comparator 1220 is -6.0.
The second technique is based on the residual energy after LPC analysis, expressed as a fraction of the LPC analysis speech buffer spectral energy. This energy is a byproduct of LPC analysis, as described above. The α1 input to comparator 1230 is the residual energy for the filter coefficients of window 1 and the α2 input to comparator 1240 is the residual energy of the filter coefficients of window 2. The αt1 input to comparators 1230 and 1240 is a threshold equal to 0.25.
FIG. 13 shows dataflow within mode selector 34 for a generation of spectral stationarity value flag LPCFLAG2, which is a relatively weak indicator of spectral stationarity. The processing shown in FIG. 13 is similar to that shown in FIG. 12, except that LPCFLAG2 is based on a relatively relaxed set of thresholds. The dt2 input to comparator 1310 is -6.0, the dt3 input to comparator 1320 is -4.0, the dt4 input to comparator 1350 is -2.0, the αt1 input to comparators 1330 and 1340 is a threshold 0.25, and the αt2 input to comparators 1360 and 1370 is 0.15.
Mode selector 34 measures pitch stationarity using both the open loop pitch values of the current frame, denoted as $P_1$ for pitch window 1 and $P_2$ for pitch window 2, and the open loop pitch value of window 2 of the previous frame, denoted by $P_{-1}$. A lower range of pitch values $(P_{L1}, P_{U1})$ and an upper range of pitch values $(P_{L2}, P_{U2})$ are:
$$P_{L1} = \min(P_{-1}, P_2) - P_t$$
$$P_{U1} = \min(P_{-1}, P_2) + P_t$$
$$P_{L2} = \max(P_{-1}, P_2) - P_t$$
$$P_{U2} = \max(P_{-1}, P_2) + P_t$$
where $P_t$ is 8.0. If the two ranges are non-overlapping, i.e., $P_{L2} > P_{U1}$, then only a weak indicator of pitch stationarity, denoted by PITCHFLAG2, is possible and PITCHFLAG2 is set if $P_1$ lies within either the lower range $(P_{L1}, P_{U1})$ or the upper range $(P_{L2}, P_{U2})$. If the two ranges are overlapping, i.e., $P_{L2} \le P_{U1}$, a strong indicator of pitch stationarity, denoted by PITCHFLAG1, is possible and is set if $P_1$ lies within the range $(P_L, P_U)$, where
$$P_L = \frac{P_{-1} + P_2}{2} - 2P_t$$
$$P_U = \frac{P_{-1} + P_2}{2} + 2P_t$$
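The range tests above translate directly into code. The following sketch is illustrative only: the function name `pitch_flags` is hypothetical, range endpoints are treated as inclusive (the text does not say), and PITCHFLAG2 is left unset in the overlapping case because the text defines it only for non-overlapping ranges.

```python
def pitch_flags(p_prev, p1, p2, pt=8.0):
    """Return (PITCHFLAG1, PITCHFLAG2) from three open loop pitch values.

    p_prev: pitch of window 2 of the previous frame
    p1, p2: pitch of windows 1 and 2 of the current frame
    """
    low, high = min(p_prev, p2), max(p_prev, p2)
    pl1, pu1 = low - pt, low + pt      # lower range
    pl2, pu2 = high - pt, high + pt    # upper range
    flag1 = flag2 = False
    if pl2 > pu1:
        # Non-overlapping ranges: only the weak indicator is possible.
        flag2 = (pl1 <= p1 <= pu1) or (pl2 <= p1 <= pu2)
    else:
        # Overlapping ranges: the strong indicator is possible.
        mid = (p_prev + p2) / 2.0
        flag1 = (mid - 2.0 * pt) <= p1 <= (mid + 2.0 * pt)
    return flag1, flag2
```

For example, with a previous pitch of 40, a window 2 pitch of 44, and a window 1 pitch of 42, the ranges overlap and the strong flag is set; with 30 and 80 as the outer pitch values the ranges are disjoint and at most the weak flag can be set.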
FIG. 14 shows a dataflow for generating PITCHFLAG1 and PITCHFLAG2 within mode selector 34. Module 14005 generates an output equal to the input having the largest value, and module 14010 generates an output equal to the input having the smallest value. Module 1420 generates an output that is an average of the values of the two inputs. Modules 14030, 14035, 14040, 14045, 14050 and 14055 are adders. Modules 14080, 14025 and 14090 are AND gates. Module 1408? is an inverter. Modules 14065, 14070, and 14075 are each logic blocks generating a true output when (C ≥ B) & (C ≤ A). The circuit of FIG. 14 also processes reliability values $V_{-1}$, $V_1$, and $V_2$, each indicating whether the values $P_{-1}$, $P_1$, and $P_2$, respectively, are reliable. Typically, these reliability values are a byproduct of the pitch calculation algorithm. The circuit shown in FIG. 14 generates false values for PITCHFLAG1 and PITCHFLAG2 if any of these flags $V_{-1}$, $V_1$, $V_2$ are false. Processing of these reliability values is optional.
FIG. 15 shows dataflow within mode selector 34 for generating two logical values indicating a zero crossing rate for the frame. Modules 15002, 15004, 15006, 15008, 15010, 15012, 15014 and 15016 each count the number of zero crossings in a respective 5 millisecond subframe of the frame currently being processed. For example, module 15006 counts the number of zero crossings of the signal occurring from the time 10 milliseconds from the beginning of the frame to the time 15 ms from the beginning of the frame. Comparators 15018, 15020, 15022, 15024, 15026, 15028, 15030, and 15032, in combination with adder 15035, generate a value indicating the number of 5 millisecond (ms) subframes having zero crossings of ≥ 15. Comparator 15040 sets the flag ZC_LOW when the number of such subframes is less than 2, and comparator 15037 sets the flag ZC_HIGH when the number of such subframes is greater than 5. The value ZCt input to comparators 15018-15032 is 15, the value Zt1 input to comparator 15040 is 2, and the value Zt2 input to comparator 15037 is 5.

FIGS. 16A, 16B, and 16C show a data flow for generating two logical values indicative of short term level gradient. Mode selector 34 measures short term level gradient, an indication of transients within a frame, using a low-pass filtered version of the companded input signal amplitude. Module 16005 generates the absolute value of the input signal s(n), module 16010 compands its input signal, and low-pass filter 16015 generates a signal $A_L(n)$ that, at time instant n, is expressed by

$$A_L(n) = (63/64)\,A_L(n-1) + (1/64)\,C(|s(n)|)$$

where the companding function C( ) is the μ-law function described in CCITT G.711. Delay 16025 generates an output that is a 10 ms-delayed version of its input and subtractor 16027 generates a difference between $A_L(n)$ and $A_L(n-80)$. Module 16030 generates a signal that is an absolute value of its input.

Every 5 ms, mode selector 34 compares $A_L(n)$ with that of 10 ms ago and, if the difference $|A_L(n) - A_L(n-80)|$ exceeds a fixed relaxed threshold, increments a counter. (In the preceding expression, 80 corresponds to 8 samples per ms times 10 ms.) As shown in Fig. 16C, if this difference does not exceed a relatively stringent threshold (Lt2 = 32) for any subframe, mode selector 34 sets LVLFLAG2, weakly indicating an absence of transients. As shown in Fig. 16B, if this difference exceeds a more relaxed threshold (Lt1 = 10) for no more than one subframe (Lt3 = 2), mode selector 34 sets LVLFLAG1, strongly indicating an absence of transients.

More specifically, Fig. 16B shows delay circuits 16032-16046 that each generate a 5 ms delayed version of its input. Each of latches 16048-16062 saves a signal on its input. Latches 16048-16062 are strobed at a common time, near the end of each 40 ms speech frame, so that each latch saves a portion of the frame separated by 5 ms from the portion saved by an adjacent latch. Comparators 16064-16078 each compare the output of a respective latch to the threshold Lt1, and adder 16080 sums the comparator outputs and sends the sum to comparator 16082 for comparison to the threshold Lt3.
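The zero crossing and level gradient flags described above can be sketched in a few lines. This is a minimal illustration, not the circuit of FIGS. 15 and 16: the function names, the choice of sampling the level at the last sample of each 5 ms subframe, the "at least ZCt" comparison, and passing the companding function in as a parameter are all assumptions made for the example.

```python
SUBFRAME = 40   # samples in 5 ms at 8 samples per ms
FRAME = 320     # samples in a 40 ms frame
DELAY = 80      # 10 ms expressed in samples


def zero_crossing_flags(frame, zct=15, zt1=2, zt2=5):
    """Return (ZC_LOW, ZC_HIGH) for one 320-sample frame."""
    busy = 0  # number of 5 ms subframes with many zero crossings
    for start in range(0, FRAME, SUBFRAME):
        sub = frame[start:start + SUBFRAME]
        crossings = sum(1 for a, b in zip(sub, sub[1:]) if (a >= 0) != (b >= 0))
        if crossings >= zct:  # assumption: "at least ZCt" crossings
            busy += 1
    return busy < zt1, busy > zt2


def level_track(samples, compand, start_level=0.0):
    """A_L(n) = (63/64) A_L(n-1) + (1/64) C(|s(n)|) for each sample."""
    level, out = start_level, []
    for s in samples:
        level = (63.0 / 64.0) * level + compand(abs(s)) / 64.0
        out.append(level)
    return out


def level_gradient_flags(levels, lt1=10, lt2=32, lt3=2):
    """Return (LVLFLAG1, LVLFLAG2).

    `levels` holds A_L for the last 80 samples of the previous frame
    followed by the 320 samples of the current frame (400 values).
    """
    diffs = []
    # Assumption: one comparison at the last sample of each subframe.
    for end in range(DELAY + SUBFRAME - 1, DELAY + FRAME, SUBFRAME):
        diffs.append(abs(levels[end] - levels[end - DELAY]))
    relaxed_hits = sum(1 for d in diffs if d > lt1)
    lvlflag1 = relaxed_hits < lt3                 # at most one subframe exceeds Lt1
    lvlflag2 = not any(d > lt2 for d in diffs)    # no subframe exceeds Lt2
    return lvlflag1, lvlflag2
```

A caller would supply a μ-law companding function for `compand`; the identity function is enough to exercise the logic.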
Fig. 16C shows a circuit for generating LVLFLAG2. In Fig. 16C, delays 16132-16146 are similar to the delays shown in Fig. 16B and latches 16148-16162 are similar to the latches shown in Fig. 16B. Comparators 16164-16178 each compare an output of a respective latch to the threshold Lt2 = 32. Thus, OR gate 16180 generates a true output if any of the latched signals originating from module 16030 exceeds the threshold Lt2. Inverter 16182 inverts the output of OR gate 16180.

Fig. 17 shows a data flow for generating parameters indicative of short term energy. Short term energy is measured as the mean square energy (average energy per sample) on a frame basis as well as on a 5 ms basis. The short term energy is determined relative to a background energy Ebn. Ebn is initially set to a constant $E_0 = (100 \times (12)^{1/2})^2$. Subsequently, when a frame is determined to be mode C, Ebn is set equal to (7/8)Ebn + (1/8)E0. Thus, some of the thresholds employed in the circuit of FIG. 17 are adaptive. In Fig. 17, Et0 = 0.707 Ebn, Et1 = 5, Et2 = 2.5 Ebn, Et3 = 1.8 Ebn, Et4 = Ebn, Et5 = 0.707 Ebn, and Et6 = 16.0.

The short term energy on a 5 ms basis provides an indication of presence of speech throughout the frame using a single flag EFLAG1, which is generated by testing the short term energy on a 5 ms basis against a threshold, incrementing a counter whenever the threshold is exceeded, and testing the counter's final value against a fixed threshold. Comparing the short term energy on a frame basis to various thresholds provides indication of absence of speech throughout the frame in the form of several flags with varying degrees of confidence. These flags are denoted as EFLAG2, EFLAG3, EFLAG4, and EFLAG5.

FIG. 17 shows dataflow within mode selector 34 for generating these flags. Modules 17002, 17004, 17006, 17008, 17010, 17015, 17020, and 17022 each count the energy in a respective 5 ms subframe of the frame currently being processed. Comparators 17030, 17032, 17034, 17036, 17038, 17040, 17042, and 17044, in combination with adder 17050, count the number of subframes having an energy exceeding Et0 = 0.707 Ebn.

FIGS. 18A, 18B, and 18C show the processing of step 1060. Mode selector 34 first classifies the frame as background noise (mode C) or speech (modes A or B). Mode C tends to be characterized by low energy, relatively high spectral stationarity between the current frame and the previous frame, a relative absence of pitch stationarity between the current frame and the previous frame, and a high zero crossing rate. Background noise (mode C) is declared either on the basis of the strongest short term energy flag EFLAG5 alone or by combining weaker short term energy flags EFLAG4, EFLAG3, and EFLAG2 with other flags indicating high zero crossing rate, absence of pitch, absence of transients, etc.

More specifically, if the mode of the previous frame was A or if EFLAG2 is not true, processing proceeds to step 18045 (step 18005). Step 18005 ensures that the current frame will not be mode C if the previous frame was mode A. The current frame is mode C if (LPCFLAG1 and EFLAG3) is true or (LPCFLAG2 and EFLAG4) is true or EFLAG5 is true (steps 18010, 18015, and 18020). The current frame is mode C if ((not PITCHFLAG1) and LPCFLAG1 and ZC_HIGH) is true (step 18025) or ((not PITCHFLAG1) and (not PITCHFLAG2) and LPCFLAG2 and ZC_HIGH) is true (step 18030). Thus, the processing shown in Fig. 18A determines whether the frame corresponds to a first mode (mode C), depending on whether a speech component is substantially absent from the frame.

In step 18045, a score is calculated depending on the mode of the previous frame. If the mode of the previous frame was mode A, the score is 1 + LVLFLAG1 + EFLAG1 + ZC_LOW. If the previous mode was mode B, the score is 0 + LVLFLAG1 + EFLAG1 + ZC_LOW. If the mode of the previous frame was mode C, the score is 2 + LVLFLAG1 + EFLAG1 + ZC_LOW.
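The decision flow of FIGS. 18A to 18C can be sketched as a single function. This is a minimal illustration under stated assumptions: the flags are passed as a dictionary of booleans, the function name is hypothetical, the mode A tests are those of steps 18050 to 18080 described in the next paragraph, and falling through to mode B when no mode A test succeeds is an assumption rather than something the text states.

```python
def select_mode(prev_mode, f):
    """Return "A", "B" or "C" from the previous mode and a dict of boolean flags."""
    # Step 18005: mode C is only considered if the previous frame was
    # not mode A and the weakest energy flag EFLAG2 is set.
    if prev_mode != "A" and f["EFLAG2"]:
        if (f["LPCFLAG1"] and f["EFLAG3"]) or (f["LPCFLAG2"] and f["EFLAG4"]) or f["EFLAG5"]:
            return "C"
        if (not f["PITCHFLAG1"]) and f["LPCFLAG1"] and f["ZC_HIGH"]:
            return "C"
        if (not f["PITCHFLAG1"]) and (not f["PITCHFLAG2"]) and f["LPCFLAG2"] and f["ZC_HIGH"]:
            return "C"

    # Step 18045: score biased by the previous mode.
    bias = {"A": 1, "B": 0, "C": 2}[prev_mode]
    score = bias + int(f["LVLFLAG1"]) + int(f["EFLAG1"]) + int(f["ZC_LOW"])

    # Step 18050.
    if prev_mode == "C" or not f["LVLFLAG2"]:
        return "B"
    # Steps 18055 and 18060.
    if f["LPCFLAG1"] and f["PITCHFLAG1"] and score >= 2:
        return "A"
    # Steps 18070, 18075 and 18080.
    mixed = (f["LPCFLAG1"] and f["PITCHFLAG2"]) or (f["LPCFLAG2"] and f["PITCHFLAG1"])
    if mixed and score >= 3:
        return "A"
    return "B"  # assumption: default when no mode A test succeeds
```

Because the score bias is larger after a mode A frame than after a mode B frame, the sketch shows how the scheme favours staying in mode A once it has been entered.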
If the mode of the previous frame was mode C or not LVLFLAG2, the mode of the current frame is mode B (step 18050). The current frame is mode A if (LPCFLAG1 and PITCHFLAG1) is true, provided the score is not less than 2 (steps 18060 and 18055). The current frame is mode A if (LPCFLAG1 and PITCHFLAG2) is true or (LPCFLAG2 and PITCHFLAG1) is true, provided the score is not less than 3 (steps 18070, 18075, and 18080).

Subsequently, speech encoder 12 generates an encoded frame in accordance with one of a first coding scheme (a coding scheme for mode C), when the frame corresponds to the first mode, and an alternative coding scheme (a coding scheme for mode A or B), when the frame does not correspond to the first mode, as described in more detail below.

For mode A, only the second set of line spectral frequency vector quantization indices need to be transmitted because the first set can be inferred at the receiver due to the slowly varying nature of the vocal tract shape. In addition, the first and second open loop pitch estimates are quantized and transmitted because they are used to encode the closed loop pitch estimates in each subframe. The quantization of the second open loop pitch estimate is accomplished using a non-uniform 4-bit quantizer while the quantization of the first open loop pitch estimate is accomplished using a differential non-uniform 3-bit quantizer. Since the vector quantization indices of the LSF's for the first linear prediction analysis window are neither transmitted nor used in mode selection, they need not be calculated in mode A. This reduces the complexity of the short term predictor section of the encoder in this mode. This reduced complexity as well as the lower bit rate of the short term predictor parameters in mode A is offset by faster update of all the excitation model parameters.
For mode B, both sets of line spectral frequency vector quantization indices must be transmitted because of potential spectral nonstationarity. However, for the first set of line spectral frequencies we need search only 2 of the 4 classifications or categories. This is because the IRS vs. non-IRS selection varies very slowly with time. If the second set of line spectral frequencies were chosen from the "voiced IRS-filtered" category, then the first set can be expected to be from either the "voiced IRS-filtered" or "unvoiced IRS-filtered" categories. If the second set of line spectral frequencies were chosen from the "unvoiced IRS-filtered" category, then again the first set can be expected to be from either the "voiced IRS-filtered" or "unvoiced IRS-filtered" categories. If the second set of line spectral frequencies were chosen from the "voiced non-IRS-filtered" category, then the first set can be expected to be from either the "voiced non-IRS-filtered" or "unvoiced non-IRS-filtered" categories. Finally, if the second set of line spectral frequencies were chosen from the "unvoiced non-IRS-filtered" category, then again the first set can be expected to be from either the "voiced non-IRS-filtered" or "unvoiced non-IRS-filtered" categories. As a result only two categories of LSF codebooks need be searched for the quantization of the first set of line spectral frequencies. Furthermore, only 25 bits are needed to encode these quantization indices instead of the 26 needed for the second set of LSF's, since the optimal category for the first set can be coded using just 1 bit. For mode B, neither of the two open loop pitch estimates are transmitted since they are not used in guiding the closed loop pitch estimates. The higher complexity involved in encoding as well as the higher bit rate of the short term predictor parameters in mode B is compensated by a slower update of all the excitation model parameters.
For mode C, only the second set of line spectral frequency vector quantization indices need to be transmitted because the human ear is not as sensitive to rapid changes in spectral shape variations for noisy inputs. Further, such rapid spectral shape variations are atypical for many kinds of background noise sources. For mode C, neither of the two open loop pitch estimates are transmitted since they are not used in guiding the closed loop pitch estimation. The lower complexity involved as well as the lower bit rate of the short term predictor parameters in mode C is compensated by a faster update of the fixed codebook gain portion of the excitation model parameters.
The gain quantization tables are tailored to each of the modes. Also in each mode, the closed loop parameters are refined using a delayed decision approach. This delayed decision is employed in such a way that the overall codec delay is not increased. Such a delayed decision approach is very effective in transition regions.
In mode A, the quantization indices corresponding to the second set of short term predictor coefficients as well as the open loop pitch estimates are transmitted. Only these quantized parameters are used in the excitation modeling. The 40-msec speech frame is divided into seven subframes. The first six are 5.75 msec in length and the seventh is 5.5 msec in length. In each subframe, an interpolated set of short term predictor coefficients is used. The interpolation is done in the autocorrelation lag domain. Using this interpolated set of coefficients, a closed loop analysis by synthesis approach is used to derive the optimum pitch index, pitch gain index, fixed codebook index, and fixed codebook gain index for each subframe. The closed loop pitch index search range is centered around an interpolated trajectory of the open loop pitch estimates. The trade-off between the search range and the pitch resolution is done in a dynamic fashion depending on the closeness of the open loop pitch estimates. The fixed codebook employs zinc pulse shapes which are obtained using a weighted combination of the sinc pulse and a phase shifted version of its Hilbert transform. The fixed codebook gain is quantized in a differential manner.

The analysis by synthesis technique that is used to derive the excitation model parameters employs an interpolated set of short term predictor coefficients in each subframe. The optimal set of excitation model parameters for each subframe is determined only at the end of each 40 ms frame because of delayed decision. In deriving the excitation model parameters, all seven subframes are assumed to be of length 5.75 ms or forty-six samples. However, for the last or seventh subframe, the end of subframe updates such as the adaptive codebook update and the update of the local short term predictor state variables are carried out only for a subframe length of 5.5 ms or forty-four samples.
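The subframe arithmetic above can be checked directly. The short sketch below assumes the 8 samples per ms rate stated earlier in the text; the function name is hypothetical.

```python
SAMPLES_PER_MS = 8  # sampling rate implied by "80 corresponds to 8 samples per ms times 10 ms"


def mode_a_subframe_lengths():
    """Return (search_lengths, update_lengths) in samples for the seven subframes."""
    search = [46] * 7          # every subframe is treated as 5.75 ms during the search
    update = [46] * 6 + [44]   # the seventh subframe only updates state for 5.5 ms
    return search, update
```

The update lengths add up to exactly one 40 ms frame (320 samples), which is why the last subframe is shortened for the end of subframe updates.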
The short term predictor parameters or linear prediction filter parameters are interpolated from subframe to subframe. The interpolation is carried out in the autocorrelation domain. The normalized autocorrelation coefficients derived from the quantized filter coefficients for the second linear prediction analysis window are denoted as $\{\rho_{-1}(i)\}$ for the previous 40 ms frame and by $\{\rho_2(i)\}$ for the current 40 ms frame, for $0 \le i \le 10$, with $\rho_{-1}(0) = \rho_2(0) = 1.0$. The interpolated autocorrelation coefficients $\{\rho'_m(i)\}$ are then given by

$$\rho'_m(i) = \nu_m\,\rho_2(i) + [1 - \nu_m]\,\rho_{-1}(i), \quad 1 \le m \le 7,\ 0 \le i \le 10,$$

or in vector notation

$$\rho'_m = \nu_m\,\rho_2 + [1 - \nu_m]\,\rho_{-1}, \quad 1 \le m \le 7.$$

Here, $\nu_m$ is the interpolating weight for subframe m. The interpolated lags $\{\rho'_m(i)\}$ are subsequently converted to the short term predictor filter coefficients $\{a'_m(i)\}$.
The choice of interpolating weights affects voice quality in this mode significantly. For this reason, they must be determined carefully. These interpolating weights $\nu_m$ have been determined for subframe m by minimizing the mean square error between the actual short term spectral envelope $S_{m,J}(\omega)$ and the interpolated short term power spectral envelope $S'_{m,J}(\omega)$ over all speech frames J of a very large speech database. In other words, $\nu_m$ is determined by minimizing

$$E_m = \sum_J \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| S_{m,J}(\omega) - S'_{m,J}(\omega) \right|^2 d\omega.$$

If the actual autocorrelation coefficients for subframe m in frame J are denoted by $\{\rho_{m,J}(k)\}$, then by definition

$$S_{m,J}(\omega) = \sum_k \rho_{m,J}(k)\, e^{-j\omega k}, \qquad S'_{m,J}(\omega) = \sum_k \rho'_{m,J}(k)\, e^{-j\omega k}.$$

Substituting the above equations into the preceding equation, it can be shown that minimizing $E_m$ is equivalent to minimizing $E'_m$, where $E'_m$ is given by

$$E'_m = \sum_J \sum_k \left[ \rho_{m,J}(k) - \rho'_{m,J}(k) \right]^2,$$

or in vector notation

$$E'_m = \sum_J \left\| \rho_{m,J} - \rho'_{m,J} \right\|^2,$$

where $\|\cdot\|$ represents the vector norm. Substituting $\rho'_{m,J}$ into the above equation, differentiating with respect to $\nu_m$ and setting the result to zero gives

$$\nu_m = \frac{\sum_J \langle x_J, y_{m,J} \rangle}{\sum_J \| x_J \|^2},$$

where $x_J = \rho_{2,J} - \rho_{-1,J}$ and $y_{m,J} = \rho_{m,J} - \rho_{-1,J}$, and $\langle x_J, y_{m,J} \rangle$ is the dot product between vectors $x_J$ and $y_{m,J}$. The values of $\nu_m$ calculated by the above method using a very large speech database are further fine tuned by listening tests.
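The closed-form weight above is a one-parameter least squares fit and can be sketched as follows. The function name and the way the training data is passed in (one tuple of lag vectors per frame) are assumptions made for the example; no real speech database is implied.

```python
def fit_interpolation_weight(frames):
    """Least squares estimate of v_m for one subframe position m.

    `frames` is an iterable of (rho_prev, rho_cur, rho_actual) tuples:
    the quantized lags of the previous frame, the quantized lags of the
    current frame, and the actual lags measured in subframe m.
    """
    num = 0.0
    den = 0.0
    for rho_prev, rho_cur, rho_actual in frames:
        x = [c - p for c, p in zip(rho_cur, rho_prev)]      # x_J
        y = [a - p for a, p in zip(rho_actual, rho_prev)]   # y_{m,J}
        num += sum(xi * yi for xi, yi in zip(x, y))
        den += sum(xi * xi for xi in x)
    return num / den
```

If the actual lags really were a fixed blend of the two quantized sets, the fit recovers the blend factor exactly; on real data it returns the best compromise in the mean square sense, which the text then refines by listening tests.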
The target vector $t_{ac}$ for the adaptive codebook search is related to the speech vector s in each subframe by

$$s = H\,t_{ac} + z.$$

Here, H is the square lower triangular toeplitz matrix whose first column contains the impulse response of the interpolated short term predictor $\{a'_m(i)\}$ for the subframe m, and z is the vector containing its zero input response. The target vector $t_{ac}$ is most easily calculated by subtracting the zero input response z from the speech vector s and filtering the difference by the inverse short term predictor with zero initial states.
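Since H is lower triangular and toeplitz, solving s = H t_ac + z for t_ac amounts to forward substitution, which is the matrix view of the inverse filtering described above. The sketch below assumes the impulse response is available as a list whose first entry is nonzero; the function names are hypothetical.

```python
def apply_lower_toeplitz(h, t):
    """Compute H t where H[n][k] = h[n - k] for n >= k (zero-state filtering)."""
    return [sum(h[n - k] * t[k] for k in range(n + 1)) for n in range(len(t))]


def target_vector(s, z, h):
    """Solve s = H t + z for t by forward substitution.

    s: speech vector, z: zero input response, h: impulse response
    (first column of H, at least as long as s, with h[0] != 0).
    """
    d = [si - zi for si, zi in zip(s, z)]
    t = []
    for n in range(len(d)):
        acc = d[n] - sum(h[n - k] * t[k] for k in range(n))
        t.append(acc / h[0])
    return t
```

Filtering the recovered target through H and adding z back reproduces the speech vector, which is a convenient sanity check for an implementation.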
The adaptive codebook search in adaptive codebooks 3506 and 3507 employs a spectrally weighted mean square error $\epsilon_i$ to measure the distance between a candidate vector $r_i$ and the target vector $t_{ac}$, as given by

$$\epsilon_i = (t_{ac} - \mu_i r_i)^T\, W\, (t_{ac} - \mu_i r_i).$$

Here, $\mu_i$ is the associated gain and W is the spectral weighting matrix. W is a positive definite symmetric toeplitz matrix that is derived from the truncated impulse response of the weighted short term predictor with filter coefficients $\{a'_m(i)\,\gamma^i\}$. The weighting factor $\gamma$ is 0.8. Substituting for the optimum $\mu_i$ in the above expression, the distortion term can be rewritten as

$$\epsilon_i = t_{ac}^T W t_{ac} - \frac{[\rho_i]^2}{e_i},$$

where $\rho_i$ is the correlation term $t_{ac}^T W r_i$ and $e_i$ is the energy term $r_i^T W r_i$. Only those candidates are considered that have a positive correlation. The best candidate vectors are the ones that have positive correlations and the highest values of $[\rho_i]^2 / e_i$.

The candidate vectors $r_i$ correspond to different pitch delays. These pitch delays in samples lie in the range [20,146]. Fractional pitch delays are possible but the fractional part f is restricted to be either 0.00, 0.25, 0.50 or 0.75. The candidate vector corresponding to an integer delay L is simply read from the adaptive codebook, which is a collection of the past excitation samples. For a mixed (integer plus fraction) delay L+f, the portion of the adaptive codebook centered around the section corresponding to the integer delay L is filtered by a polyphase filter corresponding to fraction f. Incomplete candidate vectors corresponding to low delay values less than a subframe length are completed in the same manner as suggested by J. Campbell et al., supra. The polyphase filter coefficients are derived from a prototype low pass filter designed to have good passband as well as good stopband characteristics. Each polyphase filter has 8 taps.
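The selection rule above (positive correlation, largest squared correlation over energy) can be sketched as follows. The function name, the plain list-of-lists representation of W, and returning `None` to stand for "no positive correlation found" are assumptions made for the example.

```python
def search_adaptive_codebook(target, candidates, w):
    """Return (index, gain, metric) of the best candidate, or None.

    target: target vector t_ac; candidates: list of vectors r_i;
    w: spectral weighting matrix W as a list of rows.
    """
    def weighted_dot(u, v):
        # Computes u^T W v.
        return sum(u[i] * w[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

    best = None
    for index, r in enumerate(candidates):
        rho = weighted_dot(target, r)   # correlation term
        e = weighted_dot(r, r)          # energy term
        if rho <= 0.0 or e <= 0.0:
            continue                    # only positive correlations are considered
        metric = rho * rho / e          # reduction in distortion for this candidate
        if best is None or metric > best[2]:
            best = (index, rho / e, metric)   # optimum unquantized gain is rho / e
    return best
```

Maximizing the metric is the same as minimizing the distortion, because the term t_ac^T W t_ac does not depend on the candidate.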
The adaptive codebook search does not search all candidate vectors. For the first 3 subframes, a 5-bit search range is determined by the second quantized open loop pitch estimate P'-1 of the previous 40 ms frame and the first quantized open loop pitch estimate P'1 of the current 40 ms frame. If the previous mode were B, then the value of P'-1 is taken to be the last subframe pitch delay in the previous frame. For the last 4 subframes, this 5-bit search range is determined by the second quantized open loop pitch estimate P'2 of the current 40 ms frame and the first quantized open loop pitch estimate P'1 of the current 40 ms frame. For the first 3 subframes, this 5-bit search range is split into 2 4-bit ranges with each range centered around P'-1 and P'1. If these two 4-bit ranges overlap, then a single 5-bit range is used which is centered around {P'-1+P'1}/2. Similarly, for the last 4 subframes, this 5-bit search range is split into 2 4-bit ranges with each range centered around P'1 and P'2. If these two 4-bit ranges overlap, then a single 5-bit range is used which is centered around {P'1+P'2}/2.
The search range selection also determines what fractional resolution is needed for the closed loop pitch. This desired fractional resolution is determined directly from the quantized open loop pitch estimates P'-1 and P'1 for the first 3 subframes and from P'1 and P'2 for the last 4 subframes. If the two determining open loop pitch estimates are within 4 integer delays of each other, resulting in a single 5-bit search range, only 8 integer delays centered around the mid-point are searched, but the fractional pitch portion f can assume values of 0.00, 0.25, 0.50, or 0.75, and these are therefore also searched. Thus 3 bits are used to encode the integer portion while 2 bits are used to encode the fractional portion of the closed loop pitch. If the two determining open loop pitch estimates are within 8 integer delays of each other, resulting in a single 5-bit search range, only 16 integer delays centered around the mid-point are searched, but the fractional pitch portion f can assume values of 0.0 or 0.5, and these are therefore also searched. Thus 4 bits are used to encode the integer portion while 1 bit is used to encode the fractional portion of the closed loop pitch. If the two determining open loop pitch estimates are more than 8 integer delays apart, only integer delays, i.e., f=0.0 only, are searched in either the single 5-bit search range or the 2 4-bit search ranges determined. Thus all 5 bits are spent in encoding the integer portion of the closed loop pitch.
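The trade-off between search range and pitch resolution can be summarized in a small lookup. This is a sketch only: the function name and the returned dictionary are hypothetical, and treating "within 4" and "within 8" as inclusive bounds is an assumption.

```python
def pitch_search_resolution(p_first, p_second):
    """Choose integer range and fractional steps from two open loop pitch estimates."""
    gap = abs(p_first - p_second)
    if gap <= 4:        # assumption: "within 4" is inclusive
        return {"integer_delays": 8, "fractions": (0.0, 0.25, 0.5, 0.75),
                "integer_bits": 3, "fraction_bits": 2}
    if gap <= 8:        # assumption: "within 8" is inclusive
        return {"integer_delays": 16, "fractions": (0.0, 0.5),
                "integer_bits": 4, "fraction_bits": 1}
    # Estimates far apart: integer delays only, all 5 bits for the integer part.
    return {"integer_delays": 32, "fractions": (0.0,),
            "integer_bits": 5, "fraction_bits": 0}
```

In every case the number of integer delays times the number of fractional steps is 32, so the closed loop pitch always fits the same 5-bit budget; only the split between range and resolution changes.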
The search complexity may be reduced in the case of fractional pitch delays by first searching for the optimum integer delay and searching for the optimum fractional pitch delay only in its neighborhood. One of the 5-bit indices, the all zero index, is reserved for the all zero adaptive codebook vector. This is accommodated by trimming the 5-bit or 32 pitch delay search range to a 31 pitch delay search range. As indicated before, the search is restricted to only positive correlations and the all zero index is chosen if no such positive correlation is found. The adaptive codebook gain is determined after the search by quantizing the ratio of the optimum correlation to the optimum energy using a non-uniform 3-bit quantizer. This 3-bit quantizer only has positive gain values in it since only positive gains are possible.
Since delayed decision is employed, the adaptive codebook search produces the two best pitch delay or lag candidates in all subframes. Furthermore, for subframes two to six, this has to be repeated for the two best target vectors produced by the two best sets of excitation model parameters derived for the previous subframes in the current frame. This results in two best lag candidates and the associated two adaptive codebook gains for subframe one, and in four best lag candidates and the associated four adaptive codebook gains for subframes two to six, at the end of the search process. In each case, the target vector for the fixed codebook is derived by subtracting the scaled adaptive codebook vector from the target for the adaptive codebook search, i.e., $t_{sc} = t_{ac} - \mu_{opt}\, r_{opt}$, where $r_{opt}$ is the selected adaptive codebook vector and $\mu_{opt}$ is the associated adaptive codebook gain.
In mode A, the fixed codebook consists of general excitation pulse shapes constructed from the discrete sinc and cosc functions. The sinc function is defined as

$$\mathrm{sinc}(n) = \frac{\sin(\pi n)}{\pi n}, \quad n \ne 0; \qquad \mathrm{sinc}(0) = 1, \quad n = 0,$$

and the cosc function is defined as

$$\mathrm{cosc}(n) = \frac{1 - \cos(\pi n)}{\pi n}, \quad n \ne 0; \qquad \mathrm{cosc}(0) = 0, \quad n = 0.$$

With these definitions in mind, the generalized excitation pulse shapes are constructed as follows:

$$z_1(n) = A\,\mathrm{sinc}(n) + B\,\mathrm{cosc}(n+1),$$
$$z_{-1}(n) = A\,\mathrm{sinc}(n) - B\,\mathrm{cosc}(n-1).$$

The weights A and B are chosen to be 0.866 and 0.5 respectively. With the sinc and cosc functions time aligned, they correspond to what is known as zinc basis functions $z_0(n)$. Informal listening tests show that time-shifted pulse shapes improve voice quality of the synthesized speech.
The fixed codebook for mode A consists of 2 parts each having 45 vectors. The first part consists of the pulse shape $z_{-1}(n-45)$ and is 90 samples long. The ith vector is simply the vector that starts from the ith codebook entry. The second part consists of the pulse shape $z_1(n-45)$ and is 90 samples long. Here again, the ith vector is simply the vector that starts from the ith codebook entry. Both codebooks are further trimmed to reduce all small values, especially near the beginning and end of both codebooks, to zero. In addition, we note that every even sample in either codebook is identical to zero by definition. All this contributes to making the codebooks very sparse. In addition, we note that both codebooks are overlapping, with adjacent vectors having all but one entry in common.
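The pulse shapes and the overlapping codebook layout can be sketched as follows. At integer arguments sin(πn) is zero and cos(πn) is ±1, so the sketch evaluates sinc and cosc exactly rather than through floating point trigonometry. The function names, the zero-based sample indexing, and the 46-sample vector length (the mode A search subframe length) are assumptions; the trimming of small values is not modelled.

```python
import math

A, B = 0.866, 0.5      # weights given in the text
VECTOR_LEN = 46        # assumption: mode A search subframe length


def sinc(n):
    # sin(pi n) / (pi n) is zero at every nonzero integer n.
    return 1.0 if n == 0 else 0.0


def cosc(n):
    # (1 - cos(pi n)) / (pi n), using cos(pi n) = +1 for even n, -1 for odd n.
    if n == 0:
        return 0.0
    sign = 1.0 if n % 2 == 0 else -1.0
    return (1.0 - sign) / (math.pi * n)


def z_plus(n):
    return A * sinc(n) + B * cosc(n + 1)     # z_1(n)


def z_minus(n):
    return A * sinc(n) - B * cosc(n - 1)     # z_-1(n)


def build_codebook(pulse):
    """Return (samples, vectors): 90 samples of pulse(n - 45) and 45 overlapping vectors."""
    samples = [pulse(n - 45) for n in range(90)]
    vectors = [samples[i:i + VECTOR_LEN] for i in range(45)]
    return samples, vectors
```

With zero-based indexing, every even-indexed sample of both codebooks is zero, and each vector is the previous one shifted by a single sample, which is what lets a search reuse most of its work from one vector to the next.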
The overlapping nature and the sparsity of the codebooks are exploited in the codebook search, which uses the same distortion measure as in the adaptive codebook search. This measure calculates the distance between the fixed codebook target vector $t_{sc}$ and every candidate fixed codebook vector $c_i$ as

$$\epsilon_i = (t_{sc} - \mu_i c_i)^T\, W\, (t_{sc} - \mu_i c_i),$$

where W is the same spectral weighting matrix used in the adaptive codebook search and $\mu_i$ is the optimum value of the gain for that ith codebook vector. Once the optimum vector has been selected for each codebook, the codebook gain magnitude is quantized outside the search loop by quantizing the ratio of the optimum correlation to the optimum energy by a non-uniform 4-bit quantizer in odd subframes and a 3-bit differential non-uniform quantizer in even subframes. Both quantizers have zero gain as one of their entries. The optimal distortion for each codebook is then calculated and the optimal codebook is selected.
The fixed codebook index for each subframe is in the range 0-44 if the optimal codebook is from $z_{-1}(n-45)$ but is mapped to the range 45-89 if the optimal codebook is from $z_1(n-45)$. By combining the fixed codebook indices I and J of two consecutive subframes as 90I+J, we can encode the resulting index using 13 bits. This is done for subframes 1 and 2, 3 and 4, 5 and 6. For subframe 7, the fixed codebook index is simply encoded using 7 bits. The fixed codebook gain sign is encoded using 1 bit in all 7 subframes. The fixed codebook gain magnitude is encoded using 4 bits in subframes 1, 3, 5, 7 and using 3 bits in subframes 2, 4, 6.

Due to delayed decision, there are two target vectors $t_{sc}$ for the fixed codebook search in the first subframe, corresponding to the two best lag candidates and their corresponding gains provided by the closed loop adaptive codebook search. For subframes two to seven, there are four target vectors corresponding to the two best sets of excitation model parameters determined for the previous subframes so far and to the two best lag candidates and their gains provided by the adaptive codebook search in the current subframe. The fixed codebook search is therefore carried out two times in subframe one and four times in subframes two to six. But the complexity does not increase in a proportionate manner because in each subframe, the energy terms $c_i^T W c_i$ are the same. It is only the correlation terms $t_{sc}^T W c_i$ that are different in each of the two searches for subframe one and in each of the four searches in subframes two to seven.

Delayed decision search helps to smooth the pitch and gain contours in a CELP coder. Delayed decision is employed in this invention in such a way that the overall codec delay is not increased. Thus, in every subframe, the closed loop pitch search produces the M best estimates. For each of these M best estimates and N best previous subframe parameters, MN optimum pitch gain indices, fixed codebook indices, fixed codebook gain indices, and fixed codebook gain signs are derived. At the end of the subframe, these MN solutions are pruned to the L best using cumulative SNR for the current 40 ms frame as the criterion. For the first subframe, M=2, N=1 and L=2 are used. For the last subframe, M=2, N=2 and L=1 are used. For all other subframes, M=2, N=2 and L=2 are used. The delayed decision approach is particularly effective in the transition of voiced to unvoiced and unvoiced to voiced regions. This delayed decision approach results in N times the complexity of the closed loop pitch search but much less than MN times the complexity of the fixed codebook search in each subframe. This is because only the correlation terms need to be calculated MN times for the fixed codebook in each subframe, but the energy terms need to be calculated only once.

The optimal parameters for each subframe are determined only at the end of the 40 ms frame using traceback. The pruning of MN solutions to L solutions is stored for each subframe to enable the traceback. An example of how traceback is accomplished is shown in FIG. 20. The dark, thick line indicates the optimal path obtained by traceback after the last subframe.

In mode B, the quantization indices of both sets of short term predictor parameters are transmitted but not the open loop pitch estimates. The 40-msec speech frame is divided into five subframes, each 8 msec long. As in mode A, an interpolated set of filter coefficients is used to derive the pitch index, pitch gain index, fixed codebook index, and fixed codebook gain index in a closed loop analysis by synthesis fashion. The closed loop pitch search is unrestricted in its range, and only integer pitch delays are searched. The fixed codebook is a multi-innovation codebook with zinc pulse sections as well as Hadamard sections. The zinc pulse sections are well suited for transient segments while the Hadamard sections are better suited for unvoiced segments. The fixed codebook search procedure is modified to take advantage of this.
The higher complexity involved as well as the higher bit rate of the short term predictor parameters in mode B is compensated by a slower update of the excitation model parameters.

For mode B, the 40 ms speech frame is divided into five subframes. Each subframe is of length 8 ms or sixty-four samples. The excitation model parameters in each subframe are the adaptive codebook index, the adaptive codebook gain, the fixed codebook index, and the fixed codebook gain. There is no fixed codebook gain sign since it is always positive. Best estimates of these parameters are determined using an analysis by synthesis method in each subframe. The overall best estimate is determined at the end of the 40 ms frame using a delayed decision approach similar to mode A.
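The delayed decision approach used in modes A and B is, structurally, a small beam search with traceback. The sketch below shows that structure only: the function names are hypothetical, the `expand` callback stands in for the closed loop pitch and fixed codebook searches, and the additive `gain` it returns is a placeholder for the cumulative SNR criterion named in the text.

```python
def trace_back(nodes, node):
    """Follow stored parent links back to the first subframe."""
    path = []
    while node is not None:
        parent, choice = nodes[node]
        path.append(choice)
        node = parent
    return path[::-1]


def delayed_decision(num_subframes, expand, m=2, keep=2):
    """Return (choices, score) for the best path through all subframes.

    expand(subframe, history) must return a list of (choice, gain) pairs
    for one surviving path; the m best are kept per survivor. Survivors
    are pruned to `keep` after each subframe and to 1 after the last,
    mirroring M=2, N<=2, L=2 (L=1 in the last subframe).
    """
    survivors = [(0.0, None)]   # (cumulative score, node id); N=1 at the start
    nodes = []                  # node id -> (parent node id, choice)
    for sf in range(num_subframes):
        trials = []
        for cum, node in survivors:
            history = trace_back(nodes, node)
            ranked = sorted(expand(sf, history), key=lambda c: c[1], reverse=True)[:m]
            for choice, gain in ranked:
                trials.append((cum + gain, node, choice))
        trials.sort(key=lambda t: t[0], reverse=True)
        limit = 1 if sf == num_subframes - 1 else keep
        survivors = []
        for cum, parent, choice in trials[:limit]:
            nodes.append((parent, choice))      # stored pruning decision
            survivors.append((cum, len(nodes) - 1))
    best_score, best_node = survivors[0]
    return trace_back(nodes, best_node), best_score
```

Keeping two survivors lets a choice that looks slightly worse in one subframe win overall if it leads to a much better continuation, which is the behaviour the text credits for smoother pitch and gain contours in transition regions.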
The short term predictor parameters or linear prediction filter parameters are interpolated from subframe to subframe in the autocorrelation lag domain. The normalized autocorrelation lags derived from the quantized filter coefficients for the second linear prediction analysis window are denoted as $\{\rho_{-1}(i)\}$ for the previous 40 ms frame. The corresponding lags for the first and second linear prediction analysis windows for the current 40 ms frame are denoted by $\{\rho_1(i)\}$ and $\{\rho_2(i)\}$ respectively. The normalization ensures that $\rho_{-1}(0) = \rho_1(0) = \rho_2(0) = 1.0$. The interpolated autocorrelation lags $\{\rho'_m(i)\}$ are given by

$$\rho'_m(i) = \alpha_m\,\rho_{-1}(i) + \beta_m\,\rho_1(i) + [1 - \alpha_m - \beta_m]\,\rho_2(i), \quad 1 \le m \le 5,\ 0 \le i \le 10,$$

or in vector notation

$$\rho'_m = \alpha_m\,\rho_{-1} + \beta_m\,\rho_1 + [1 - \alpha_m - \beta_m]\,\rho_2, \quad 1 \le m \le 5.$$

Here, $\alpha_m$ and $\beta_m$ are the interpolating weights for subframe m. The interpolated lags $\{\rho'_m(i)\}$ are subsequently converted to the short term predictor filter coefficients $\{a'_m(i)\}$.
The choice of interpolating weights is not as critical in this mode as it is in mode A. Nevertheless, they have been determined using the same objective criteria as in mode A and fine tuning them by listening tests. The values of $\alpha_m$ and $\beta_m$ which minimize the objective criterion $E_m$ can be shown to be

$$\alpha_m = \frac{Y_m C - X_m B}{C^2 - AB}, \qquad \beta_m = \frac{X_m C - Y_m A}{C^2 - AB},$$

where

$$A = \sum_J \| \rho_{-1,J} - \rho_{2,J} \|^2,$$
$$B = \sum_J \| \rho_{1,J} - \rho_{2,J} \|^2,$$
$$C = \sum_J \langle \rho_{-1,J} - \rho_{2,J},\ \rho_{1,J} - \rho_{2,J} \rangle,$$
$$X_m = \sum_J \langle \rho_{-1,J} - \rho_{2,J},\ \rho_{m,J} - \rho_{2,J} \rangle,$$
$$Y_m = \sum_J \langle \rho_{m,J} - \rho_{2,J},\ \rho_{1,J} - \rho_{2,J} \rangle.$$

As before, $\rho_{-1,J}$ denotes the autocorrelation lag vector derived from the quantized filter coefficients of the second linear prediction analysis window of frame J-1, $\rho_{1,J}$ denotes the autocorrelation lag vector derived from the quantized filter coefficients of the first linear prediction analysis window of frame J, $\rho_{2,J}$ denotes the autocorrelation lag vector derived from the quantized filter coefficients of the second linear prediction analysis window of frame J, and $\rho_{m,J}$ denotes the actual autocorrelation lag vector derived from the speech samples in subframe m of frame J.
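The closed-form weights above are the solution of a two-parameter least squares problem and can be sketched directly from the definitions of A, B, C, X_m and Y_m. The function name and the per-frame tuple layout are assumptions made for the example.

```python
def fit_mode_b_weights(frames):
    """Return (alpha_m, beta_m) for one subframe position m.

    `frames` is an iterable of (rho_prev, rho_first, rho_second, rho_actual)
    lag vectors: previous frame second window, current frame first window,
    current frame second window, and the actual lags of subframe m.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    a_sum = b_sum = c_sum = x_sum = y_sum = 0.0
    for rho_prev, rho_first, rho_second, rho_actual in frames:
        u = [p - s for p, s in zip(rho_prev, rho_second)]     # rho_-1 - rho_2
        v = [f - s for f, s in zip(rho_first, rho_second)]    # rho_1 - rho_2
        w = [t - s for t, s in zip(rho_actual, rho_second)]   # rho_m - rho_2
        a_sum += dot(u, u)
        b_sum += dot(v, v)
        c_sum += dot(u, v)
        x_sum += dot(u, w)
        y_sum += dot(w, v)
    det = c_sum * c_sum - a_sum * b_sum
    alpha = (y_sum * c_sum - x_sum * b_sum) / det
    beta = (x_sum * c_sum - y_sum * a_sum) / det
    return alpha, beta
```

The denominator vanishes only if the two difference vectors are collinear across the whole data set, in which case the two weights cannot be separated.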
The adaptive codebook search in mode B is similar to that in mode A in that the target vector for the search is derived in the same manner and the distortion measure used in the search is the same. However, there are some differences. All integer pitch delays in the range [20,146] are searched and no fractional pitch delays are searched. As in mode A, only positive correlations are considered in the search and the all zero index corresponding to an all zero vector is assigned if no positive correlations are found. The optimal adaptive codebook index is encoded using 7 bits. The adaptive codebook gain, which is guaranteed to be positive, is quantized outside the search loop using a 3-bit non-uniform quantizer. This quantizer is different from that used in mode A.
As in mode A, delayed decision is employed so that the adaptive codebook search produces the two best pitch delay candidates in all subframes. In addition, in subframes two to five, this has to be repeated for the two best target vectors produced by the two best sets of excitation model parameters derived for the previous subframe, resulting in 4 sets of adaptive codebook indices and associated gains at the end of the subframe. In each case, the target vector for the fixed codebook search is derived by subtracting the scaled adaptive codebook vector from the target of the adaptive codebook vector.

The fixed codebook in mode B is a 9-bit multi-innovation codebook with three sections. The first is a Hadamard vector sum section, and the second and third sections are related to generalized excitation pulse shapes $z_{-1}(n)$ and $z_1(n)$ respectively. These pulse shapes have been defined earlier. The first section of this codebook and the associated search procedure are based on the publication by D. Lin, "Ultra-Fast CELP Coding Using Multi-Codebook Innovations", ICASSP92. We note that in this section, there are 256 innovation vectors and the search procedure guarantees a positive gain. The second and third sections have 64 innovation vectors each, and their search procedure can produce both positive as well as negative gains.

One component of the multi-innovation codebook is the deterministic vector-sum code constructed from the Hadamard matrix $H_m$. The code vector of the vector-sum code as used in this invention is expressed as:

$$u_i(n) = \sum_{m=1}^{4} \theta_{im}\, v_m(n), \qquad 0 \le i \le 15,$$

where the basis vectors $v_m(n)$ are obtained from the rows of the Hadamard-Sylvester matrix and $\theta_{im} = \pm 1$. The basis vectors are selected based on a sequency partition of the Hadamard matrix. The code vectors of the Hadamard vector-sum codebook are binary-valued code sequences. Compared to previously considered algebraic codes, the Hadamard vector-sum codes are constructed to possess more ideal frequency and phase characteristics. This is due to the basis vector partition scheme used in this invention for the Hadamard matrix, which can be interpreted as uniform sampling of the sequency ordered Hadamard matrix row vectors. In contrast, non-uniform sampling methods have produced inferior results.
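To make the vector-sum construction concrete, the following sketch builds a Sylvester-type Hadamard matrix, orders its rows by sequency (number of sign changes), picks basis vectors by uniform sampling of the ordered rows, and forms every signed sum of the basis vectors. The vector length of 64, the use of four basis vectors (giving 16 code vectors), the sampling stride, and the bit-to-sign convention are all assumptions made for illustration. How the 256 innovation vectors of the first section are assembled from such codes is not shown here.

```python
def sylvester_hadamard(order):
    """Sylvester construction of a Hadamard matrix; order must be a power of two."""
    rows = [[1]]
    while len(rows) < order:
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows


def sequency(row):
    """Number of sign changes along a row."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def vector_sum_codebook(basis):
    """All signed sums u_i(n) = sum_m theta_im * v_m(n) with theta_im = +/-1.

    Bit k of the index i selects theta = +1 (bit set) or -1 (bit clear);
    this bit convention is an assumption.
    """
    m = len(basis)
    length = len(basis[0])
    codebook = []
    for i in range(2 ** m):
        theta = [1 if (i >> k) & 1 else -1 for k in range(m)]
        codebook.append(
            [sum(theta[k] * basis[k][n] for k in range(m)) for n in range(length)]
        )
    return codebook


N = 64          # assumed vector length (one 64-sample subframe)
NUM_BASIS = 4   # assumed number of basis vectors, giving 2**4 = 16 code vectors

ordered = sorted(sylvester_hadamard(N), key=sequency)  # sequency-ordered rows
basis = ordered[:: N // NUM_BASIS]                     # uniform sampling of the rows
codebook = vector_sum_codebook(basis)
```

Because the basis rows are mutually orthogonal, the correlation of a target with any code vector is a signed sum of only four basis correlations, which is what makes a fast search of this kind of codebook possible.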
The second section of the multi-innovation codebook consists of the pulse shape $z_{-1}(n-63)$ and is 127 samples long. The ith vector of this section is simply the vector that starts from the ith entry of this section. The third section consists of the pulse shape $z_1(n-63)$ and is 127 samples long. Here again, the ith vector of this section is simply the vector that starts from the ith entry of this section. Both the second and third sections enjoy the advantages of an overlapping nature and sparsity that can be exploited by the search procedure, just as in the fixed codebook in mode A. As indicated earlier, the search procedure is not restricted to positive correlations, and therefore both positive as well as negative gains can result in the second and third sections.
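A short sketch may help show why the overlapping layout is convenient. A 127-sample section yields 64 vectors of 64 samples each, and consecutive vectors differ by one sample at each end, so vector energies can be updated recursively instead of being recomputed. The section contents used in any call are placeholders, since the pulse shapes are defined elsewhere in the description, and the recursive energy update is an illustrative technique rather than the search procedure of the patent.

```python
SUBFRAME_LEN = 64
SECTION_LEN = 127
NUM_VECTORS = SECTION_LEN - SUBFRAME_LEN + 1  # 64 overlapping vectors


def overlapping_vector(section, i):
    """The ith vector is the 64-sample slice starting at entry i."""
    if len(section) != SECTION_LEN:
        raise ValueError("section must be 127 samples long")
    if not 0 <= i < NUM_VECTORS:
        raise IndexError("vector index out of range")
    return section[i:i + SUBFRAME_LEN]


def energies(section):
    """Energy of every vector, updated recursively by exploiting the overlap."""
    energy = sum(x * x for x in section[:SUBFRAME_LEN])
    result = [energy]
    for i in range(1, NUM_VECTORS):
        # Moving from vector i-1 to vector i drops one sample and adds one.
        energy += section[i + SUBFRAME_LEN - 1] ** 2 - section[i - 1] ** 2
        result.append(energy)
    return result
```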
Once the optimum vector has been selected for each section, the codebook gain magnitude is quantized outside the search loop by quantizing the ratio of the optimum correlation to the optimum energy with a non-uniform 4-bit quantizer. This quantizer is different for the first section, while the second and third sections use a common quantizer. All quantizers have zero gain as one of their entries. The optimal distortion for each section is then calculated and the optimal section is finally selected.
The fixed codebook index for each subframe is in the range 0-255 if the optimal codebook vector is from the Hadamard section. If it is from the $z_{-1}(n-63)$ section and the gain sign is positive, it is mapped to the range 256-319. If it is from the $z_{-1}(n-63)$ section and the gain sign is negative, it is mapped to the range 320-383. If it is from the $z_1(n-63)$ section and the gain sign is positive, it is mapped to the range 384-447. If it is from the $z_1(n-63)$ section and the gain sign is negative, it is mapped to the range 448-511. The resulting index can be encoded using 9 bits. The fixed codebook gain magnitude is encoded using 4 bits in all subframes.

For mode C, the 40 ms frame is divided into five subframes as in mode B. Each subframe is of length 8 ms or 64 samples. The excitation model parameters in each subframe are the adaptive codebook index, the adaptive codebook gain, the fixed codebook index, and 2 fixed codebook gains, one fixed codebook gain being associated with each half of the subframe. Both are guaranteed to be positive and therefore there is no sign information associated with them. As in both modes A and B, best estimates of these parameters are determined using an analysis by synthesis method in each subframe. The overall best estimate is determined at the end of the 40 ms frame using a delayed decision method identical to that used in modes A and B.
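The mode B fixed codebook index mapping given above (0-255 for the Hadamard section, then four blocks of 64 for the two pulse-shape sections and the two gain signs) can be written as a pair of small functions. The section labels and function names are hypothetical, chosen only for this sketch. The index ranges come from the text.

```python
HADAMARD, Z_MINUS_1, Z_PLUS_1 = "hadamard", "z_-1", "z_1"  # hypothetical labels


def encode_fixed_index(section, vector_index, gain_sign=1):
    """Map (section, vector index, gain sign) to a 9-bit index in 0..511."""
    if section == HADAMARD:
        if not 0 <= vector_index < 256:
            raise ValueError("Hadamard section has 256 vectors")
        return vector_index  # gain is always positive in this section
    if section not in (Z_MINUS_1, Z_PLUS_1):
        raise ValueError("unknown section")
    if not 0 <= vector_index < 64:
        raise ValueError("pulse-shape sections have 64 vectors each")
    base = 256 if section == Z_MINUS_1 else 384
    if gain_sign < 0:
        base += 64
    return base + vector_index


def decode_fixed_index(index):
    """Inverse mapping: return (section, vector index, gain sign)."""
    if not 0 <= index < 512:
        raise ValueError("index must fit in 9 bits")
    if index < 256:
        return HADAMARD, index, 1
    offset = index - 256
    section = Z_MINUS_1 if offset < 128 else Z_PLUS_1
    sign = 1 if offset % 128 < 64 else -1
    return section, offset % 64, sign
```

Folding the gain sign into the index is what allows the separate 4-bit gain code to carry only the magnitude.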
The short term predictor parameters or linear prediction filter parameters are interpolated from subframe to subframe in the autocorrelation lag domain in exactly the same manner as in mode B. However, the interpolating weights are different from those used in mode B. They are obtained by using the procedure described for mode B but using various background noise sources as training material.
The adaptive codebook search in mode C is identical to that in mode B except that both positive as well as negative correlations are allowed in the search. The optimal adaptive codebook index is encoded using 7 bits. The adaptive codebook gain, which could be either positive or negative, is quantized outside the search loop using a 3-bit non-uniform quantizer. This quantizer is different from that used in either mode A or mode B in that it has a more restricted range and may have negative values as well. By allowing both positive as well as negative correlations in the search loop and by having a quantizer with a restricted dynamic range, periodic artifacts in the synthesized background noise due to the adaptive codebook are reduced considerably. In fact, the adaptive codebook now behaves more like another fixed codebook.
As in mode A and mode B, delayed decision is employed and the adaptive codebook search produces the two best candidates in all subframes. In addition, in subframes two to five, this has to be repeated for the two target vectors produced by the two best sets of excitation model parameters derived for the previous subframes, resulting in 4 sets of adaptive codebook indices and associated gains at the end of the subframe. In each case, the target vector for the fixed codebook search is derived by subtracting the scaled adaptive codebook vector from the target of the adaptive codebook vector.
The fixed codebook in mode C is an 8-bit multi-innovation codebook and is identical to the vector sum section in the mode B fixed multi-innovation codebook. The same search procedure described in the publication by D. Lin, "Ultra-Fast CELP Coding Using Multi-Codebook Innovations", ICASSP92, is used here. There are 256 codebook vectors and the search procedure guarantees a positive gain. The fixed codebook index is encoded using 8 bits.
Once the optimum codebook vector has been selected, the optimum correlation and optimum energy are calculated for the first half of the subframe as well as the second half of the subframe separately. The ratios of the correlation to the energy in both halves are quantized independently using a 5-bit non-uniform quantizer that has zero gain as one of its entries. The use of 2 gains per subframe ensures a smoother reproduction of the background noise. Due to the delayed decision, there are two sets of optimum fixed codebook indices and gains in subframe one and four sets in subframes two to five. The delayed decision approach in mode C is identical to that used in modes A and B. The optimal parameters for each subframe are determined at the end of the 40 ms frame using an identical procedure.

The bit allocation among various parameters is summarized in Figures 21A and 21B for mode A, Figure 22 for mode B, and Figure 23 for mode C. These parameters are packed by the packing circuitry 36 of Figure 3. These parameters are packed in the same sequence as they are tabulated in these Figures. Thus for mode A, using the same notation as in Figures 21A and 21B, they are packed into a 168 bit size packet every 40 ms in the following sequence: MODE1, LSP2, ACG1, ACG3, ACG4, ACG5, ACG7, ACG2, ACG6, PITCH1, PITCH2, ACI1, SIGN1, FCG1, ACI2, SIGN2, FCG2, ACI3, SIGN3, FCG3, ACI4, SIGN4, FCG4, ACI5, SIGN5, FCG5, ACI6, SIGN6, FCG6, ACI7, SIGN7, FCG7, FCI12, FCI34, FCI56, and FCI7. For mode B, using the same notation as in Figures 21A and 21B, the parameters are packed into a 168 bit size packet every 40 ms in the following sequence: MODE1, LSP2, ACG1, ACG2, ACG3, ACG4, ACG5, ACI1, FCG1, FCI1, ACI2, FCG2, FCI2, ACI3, FCG3, FCI3, ACI4, FCG4, FCI4, ACI5, FCG5, FCI5, LSP1, and MODE2. For mode C, using the same notation as in Figures 21A and 21B, they are packed into a 168 bit size packet every 40 ms in the following sequence: MODE1, LSP2, ACG1, ACG2, ACG3, ACG4, ACG5, ACI1, FCG2_1, FCI1, ACI2, FCG2_2, FCI2, ACI3, FCG2_3, FCI3, ACI4, FCG2_4, FCI4, ACI5, FCG2_5, FCI5, FCG1_1, FCG1_2, FCG1_3, FCG1_4, FCG1_5, and MODE2. The packing sequence in all three modes is designed to reduce the sensitivity to an error in the mode bits MODE1 and MODE2.
The packing is done from the MSB or bit 7 to the LSB or bit 0, from byte 1 to byte 21. MODE1 occupies the MSB or bit 7 of byte 1. By testing this bit, we can determine whether the compressed speech belongs to mode A or not. If it is not mode A, we test the MODE2 bit, which occupies the LSB or bit 0 of byte 21, to decide between mode B and mode C.
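The packing order just described (MSB first within each byte, bytes 1 through 21, 168 bits in total) can be illustrated with a generic bit packer. This is a sketch under stated assumptions: the field widths passed in are whatever the bit allocation figures specify and are not reproduced here, and the bit values that denote each mode are not given in this passage, so the helper only returns the two raw mode bits.

```python
PACKET_BITS = 168
PACKET_BYTES = PACKET_BITS // 8  # 21 bytes per 40 ms frame


def pack_fields(fields):
    """Pack (value, width) pairs MSB-first into a 21-byte packet.

    The first field starts at bit 7 of byte 1; the last field ends at
    bit 0 of byte 21.
    """
    bits = []
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in its field width")
        bits.extend((value >> shift) & 1 for shift in range(width - 1, -1, -1))
    if len(bits) != PACKET_BITS:
        raise ValueError("fields must total exactly 168 bits")
    packet = bytearray(PACKET_BYTES)
    for position, bit in enumerate(bits):
        packet[position // 8] |= bit << (7 - position % 8)
    return bytes(packet)


def mode_bits(packet):
    """Return (MODE1, MODE2) read from the positions described in the text."""
    mode1 = (packet[0] >> 7) & 1   # MSB (bit 7) of byte 1
    mode2 = packet[-1] & 1         # LSB (bit 0) of byte 21
    return mode1, mode2
```

Note that in the mode A sequence the last packed parameter is FCI7 rather than MODE2, so the second returned bit is only meaningful once the first bit has ruled out mode A.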
The speech decoder 46 (FIG. 4) is shown in FIG. 24 and receives the compressed speech bitstream in the same form as put out by the speech encoder of FIG. 3. The parameters are unpacked after determining whether the received mode bits indicate a first mode (Mode C), a second mode (Mode B), or a third mode (Mode A). These are then used to synthesize the speech. Speech decoder 46 synthesizes the part of the signal corresponding to the frame, depending on the second set of filter coefficients, independently of the first set of filter coefficients and the first and second pitch estimates, when the frame is determined to be the first mode (Mode C); synthesizes the part of the signal corresponding to the frame, depending on the first and second sets of filter coefficients, independently of the first and second pitch estimates, when the frame is determined to be the second mode (Mode B); and synthesizes a part of the signal corresponding to the frame, depending on the second set of filter coefficients and the first and second pitch estimates, independently of the first set of filter coefficients, when the frame is determined to be the third mode (Mode A).

In addition, the speech decoder receives a cyclic redundancy check (CRC) based bad frame indicator from the channel decoder 45 (FIG. 1). This bad frame indicator flag is used to trigger the bad frame error masking and error recovery sections (not shown) of the decoder. These can also be triggered by some built-in error detection schemes.
Speech decoder 46 tests the MSB or bit 7 of byte 1 to see if the compressed speech packet corresponds to mode A. Otherwise, the LSB or bit 0 of byte 21 is tested to see if the packet corresponds to mode B or mode C. Once the correct mode of the received compressed speech packet is determined, the parameters of the received speech frame are unpacked and used to synthesize the speech. In addition, the speech decoder receives a cyclic redundancy check (CRC) based bad frame indicator from the channel decoder 25 in Figure 1. This bad frame indicator flag is used to trigger the bad frame masking and error recovery portions of the speech decoder. These can also be triggered by some built-in error detection schemes.
In mode A, the received second set of line spectral frequency indices is used to reconstruct the quantized filter coefficients, which then are converted to autocorrelation lags. In each subframe, the autocorrelation lags are interpolated using the same weights as used in the encoder for mode A and then converted to short term predictor filter coefficients. The open loop pitch indices are converted to quantized open loop pitch values. In each subframe, these open loop values are used along with each received 5-bit adaptive codebook index to determine the pitch delay candidate. The adaptive codebook vector corresponding to this delay is determined from the adaptive codebook 103 in Figure 24. The adaptive codebook gain index for each subframe is used to obtain the adaptive codebook gain, which then is applied to the multiplier 104 to scale the adaptive codebook vector. The fixed codebook vector for each subframe is inferred from the fixed codebook 101 from the received fixed codebook index associated with that subframe, and this is scaled by the fixed codebook gain, obtained from the received fixed codebook gain index and the sign index for that subframe, by multiplier 102. Both the scaled adaptive codebook vector and the scaled fixed codebook vector are summed by summer 105 to produce an excitation signal which is enhanced by a pitch prefilter 106 as in I. A. Gerson and M. A. Jasiuk, supra. This enhanced excitation signal is used to drive the short term predictor 107, and the synthesized speech is subsequently further enhanced by a global pole-zero filter 108 with built in spectral tilt correction and energy normalization. At the end of each subframe, the adaptive codebook is updated by the excitation signal as indicated by the dotted line in Figure 24.
In mode B, both sets of line spectral frequency indices are used to reconstruct both the first and second sets of quantized filter coefficients, which subsequently are converted to autocorrelation lags. In each subframe, these autocorrelation lags are interpolated using exactly the same weights as used in the encoder in mode B and then converted to short term predictor coefficients. In each subframe, the received adaptive codebook index is used to derive the adaptive codebook vector from the adaptive codebook 103, and the received fixed codebook index is used to derive the fixed codebook vector from the fixed codebook 101. The adaptive codebook gain index and the fixed codebook gain index are used in each subframe to retrieve the adaptive codebook gain and the fixed codebook gain. The excitation vector is reconstructed by scaling the adaptive codebook vector by the adaptive codebook gain using multiplier 104, scaling the fixed codebook vector by the fixed codebook gain using multiplier 102, and summing them using summer 105. As in mode A, this is enhanced by the pitch prefilter 106 prior to synthesis by the short term predictor 107. The synthesized speech is further enhanced by the global pole-zero postfilter 108. At the end of each subframe, the adaptive codebook is updated by the excitation signal as indicated by the dotted line in Figure 24.
In mode C, the received second set of line spectral frequency indices is used to reconstruct the quantized filter coefficients, which then are converted to autocorrelation lags. In each subframe, the autocorrelation lags are interpolated using the same weights as used in the encoder for mode C and then converted to short term predictor filter coefficients. In each subframe, the received adaptive codebook index is used to derive the adaptive codebook vector from the adaptive codebook 103, and the received fixed codebook index is used to derive the fixed codebook vector from the fixed codebook 101. The adaptive codebook gain index and the fixed codebook gain indices are used in each subframe to retrieve the adaptive codebook gain and the fixed codebook gains for both halves of the subframe. The excitation vector is reconstructed by scaling the adaptive codebook vector by the adaptive codebook gain using multiplier 104, scaling the first half of the fixed codebook vector by the first fixed codebook gain using multiplier 102 and the second half of the fixed codebook vector by the second fixed codebook gain using multiplier 102, and summing the scaled adaptive and fixed codebook vectors using summer 105. As in modes A and B, this is enhanced by the pitch prefilter 106 prior to synthesis by the short term predictor 107. The synthesized speech is further enhanced by the global pole-zero postfilter 108. The parameters of the pitch prefilter and global postfilter used in each mode are different and are tailored to each mode. At the end of each subframe, the adaptive codebook is updated by the excitation signal as indicated by the dotted line in Figure 24.
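The mode C excitation reconstruction described above can be sketched in a few lines. This is an illustrative fragment only: the function and parameter names are hypothetical, the pitch prefilter, short term predictor, and postfilter stages are omitted, and the adaptive codebook is modeled as a plain history buffer that is shifted by one subframe.

```python
SUBFRAME_LEN = 64
HALF = SUBFRAME_LEN // 2


def reconstruct_excitation_mode_c(adaptive_vec, adaptive_gain,
                                  fixed_vec, fixed_gain_1, fixed_gain_2):
    """Scale the adaptive vector by one gain and each half of the fixed
    vector by its own gain, then sum the two scaled vectors."""
    if len(adaptive_vec) != SUBFRAME_LEN or len(fixed_vec) != SUBFRAME_LEN:
        raise ValueError("vectors must be one subframe (64 samples) long")
    excitation = []
    for n in range(SUBFRAME_LEN):
        fixed_gain = fixed_gain_1 if n < HALF else fixed_gain_2
        excitation.append(adaptive_gain * adaptive_vec[n] + fixed_gain * fixed_vec[n])
    return excitation


def update_adaptive_codebook(history, excitation):
    """Shift the newest excitation into the history buffer (assumed model of
    the update indicated by the dotted line in the decoder figure)."""
    return history[len(excitation):] + excitation
```

The same structure covers mode B if both fixed gains are set to the single fixed codebook gain of that mode.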
As an alternative to the illustrated embodiment, the invention may be practiced with a shorter frame, such as a 22.5 ms frame, as shown in Fig. 25. With such a frame, it might be desirable to process only one LP analysis window per frame instead of the two LP analysis windows illustrated. The analysis window might begin after a duration Tb relative to the beginning of the current frame and extend into the next frame, where the window would end after a duration Te relative to the beginning of the next frame, where Te > Tb. In other words, the total duration of an analysis window could be longer than the duration of a frame, and two consecutive windows could, therefore, encompass a particular frame. Thus, a current frame could be analyzed by processing the analysis window for the current frame together with the analysis window for the previous frame.
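The relation between the two offsets and the window length can be written out explicitly. In the sketch below, $T_f$ for the frame duration and $T_w$ for the window duration are symbols introduced here for illustration, and the numerical values in the example are hypothetical.

```latex
T_w = (T_f - T_b) + T_e
\qquad\Longrightarrow\qquad
T_w > T_f \iff T_e > T_b
```

For instance, with $T_f = 22.5$ ms and assumed offsets $T_b = 5$ ms and $T_e = 10$ ms, the window lasts $T_w = 27.5$ ms, so consecutive windows overlap by 5 ms and together cover the whole of the frame between them.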
Thus, the preferred communication system detects when noise is the predominant component of a signal frame and encodes a noise-predominated frame differently than a speech-predominated frame. This special encoding for noise avoids some of the typical artifacts produced when noise is encoded with a scheme optimized for speech. This special encoding allows improved voice quality in a low bit-rate codec system.

Additional advantages and modifications will readily occur to those skilled in the art. The invention in its broader aspects is therefore not limited to the specific details, representative apparatus, and illustrative examples shown and described. Various modifications and variations can be made to the present invention without departing from the scope or spirit of the invention, and it is intended that the present invention cover the modifications and variations provided they come within the scope of the appended claims and their equivalents.

Claims (12)

What is claimed is:
1. A method of processing a signal having a speech component, the signal being organized as a plurality of frames, the method comprising the steps, performed for each frame, of:
determining whether the frame corresponds to a first mode, depending on whether the speech component is substantially absent from the frame;
generating an encoded frame in accordance with one of a first coding scheme, when the frame corresponds to the first mode, and an alternative coding scheme, when the frame does not correspond to the first mode; and decoding the encoded frame in accordance with one of the first coding scheme, when the frame corresponds to the first mode, and the alternative coding scheme when the frame does not correspond to the first mode.
2. The method of claim 1 wherein the step of determining includes the substep of:
comparing an energy content of the frame to one or more thresholds.
3. The method of claim 1 wherein the step of determining includes the substeps of:
comparing an energy content of the frame to one or more thresholds; and subsequently updating one of the thresholds, using the energy content, when the frame corresponds to the first mode.
4. The method of claim 1, wherein the determining step includes the substep of:
comparing a spectral content of the frame to a spectral content of a previous frame.
5. The method of claim 4 wherein the comparing step includes the substeps of:
determining a set of filter coefficients corresponding to the frame; and determining another set of filter coefficients corresponding to a previous frame.
6. The method of claim 1 wherein the determining step includes the substep of:
comparing a fundamental frequency of the frame to a fundamental frequency of a previous frame.
7. The method of claim 1 wherein the step of determining includes the substep of:
comparing a number of zero crossings of the frame to one or more thresholds.
8. The method of claim 1 wherein the step of determining includes the substep of:
measuring transitions in amplitude within the frame.
9. A method of processing a signal having a speech component, the signal being organized as a plurality of frames, the method comprising the steps, performed for each frame, of:
analyzing a first part of the frame to generate a first set of filter coefficients;
analyzing a second part of the frame and a part of a next frame to generate a second set of filter coefficients;
analyzing a third part of the frame to generate a first pitch estimate;
analyzing a fourth part of the frame and a part of the next frame to generate a second pitch estimate;
determining whether the frame is one of a first mode, a second mode, and a third mode, depending on measures of energy content of the frame and spectral content of the frame;
synthesizing a part of the signal corresponding to the frame, depending on the second set of filter coefficients and the first and second pitch estimates, independently of the first set of filter coefficients, when the frame is determined to be the third mode;
synthesizing the part of the signal corresponding to the frame, depending on the first and second sets of filter coefficients, independently of the first and second pitch estimates, when the frame is determined to be the second mode; and synthesizing the part of the signal corresponding to the frame, depending on the second set of filter coefficients, independently of the first set of filter coefficients and the first and second pitch estimates when the frame is determined to be the first mode.
10. The method of claim 9, wherein the determining step includes the substep of:
determining a mode depending on a determined mode of a previous frame.
11. The method of claim 9 wherein the determining step includes the substep of:
determining the mode to be the first mode only when the determined mode of a previous frame is either the first mode or the second mode.
12. The method of claim 9, wherein the determining step includes the substep of:
determining the mode to be the third mode only when the determined mode of a previous frame is either the third mode or the second mode.
CA002165546A 1994-04-15 1995-04-17 Method of encoding a signal containing speech Abandoned CA2165546A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US22788194A 1994-04-15 1994-04-15
US227,881 1994-04-15
US229,271 1994-04-18
US08/229,271 US5734789A (en) 1992-06-01 1994-04-18 Voiced, unvoiced or noise modes in a CELP vocoder

Publications (1)

Publication Number Publication Date
CA2165546A1 true CA2165546A1 (en) 1995-11-02

Family

ID=26921843

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002165546A Abandoned CA2165546A1 (en) 1994-04-15 1995-04-17 Method of encoding a signal containing speech

Country Status (7)

Country Link
US (2) US5734789A (en)
EP (1) EP0704088B1 (en)
AT (1) ATE202232T1 (en)
CA (1) CA2165546A1 (en)
DE (1) DE69521254D1 (en)
FI (1) FI956107A (en)
WO (1) WO1995028824A2 (en)

Families Citing this family (309)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2166355T3 (en) * 1991-06-11 2002-04-16 Qualcomm Inc VARIABLE SPEED VOCODIFIER.
TW271524B (en) 1994-08-05 1996-03-01 Qualcomm Inc
US5774856A (en) * 1995-10-02 1998-06-30 Motorola, Inc. User-Customized, low bit-rate speech vocoding method and communication unit for use therewith
CA2188369C (en) * 1995-10-19 2005-01-11 Joachim Stegmann Method and an arrangement for classifying speech signals
DE69629485T2 (en) * 1995-10-20 2004-06-09 America Online, Inc. COMPRESSION SYSTEM FOR REPEATING TONES
JP4005154B2 (en) * 1995-10-26 2007-11-07 ソニー株式会社 Speech decoding method and apparatus
FR2741743B1 (en) * 1995-11-23 1998-01-02 Thomson Csf METHOD AND DEVICE FOR IMPROVING SPEECH INTELLIGIBILITY IN LOW-FLOW VOCODERS
US5956674A (en) * 1995-12-01 1999-09-21 Digital Theater Systems, Inc. Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels
US5689615A (en) * 1996-01-22 1997-11-18 Rockwell International Corporation Usage of voice activity detection for efficient coding of speech
US5774849A (en) * 1996-01-22 1998-06-30 Rockwell International Corporation Method and apparatus for generating frame voicing decisions of an incoming speech signal
JP3157116B2 (en) * 1996-03-29 2001-04-16 三菱電機株式会社 Audio coding transmission system
GB2312360B (en) * 1996-04-12 2001-01-24 Olympus Optical Co Voice signal coding apparatus
US5937374A (en) * 1996-05-15 1999-08-10 Advanced Micro Devices, Inc. System and method for improved pitch estimation which performs first formant energy removal for a frame using coefficients from a prior frame
US6047254A (en) * 1996-05-15 2000-04-04 Advanced Micro Devices, Inc. System and method for determining a first formant analysis filter and prefiltering a speech signal for improved pitch estimation
US5809459A (en) * 1996-05-21 1998-09-15 Motorola, Inc. Method and apparatus for speech excitation waveform coding using multiple error waveforms
US5751901A (en) 1996-07-31 1998-05-12 Qualcomm Incorporated Method for searching an excitation codebook in a code excited linear prediction (CELP) coder
US7788092B2 (en) * 1996-09-25 2010-08-31 Qualcomm Incorporated Method and apparatus for detecting bad data packets received by a mobile telephone using decoded speech parameters
JP2001501790A (en) * 1996-09-25 2001-02-06 クゥアルコム・インコーポレイテッド Method and apparatus for detecting bad data packets received by a mobile telephone using decoded speech parameters
US6014622A (en) 1996-09-26 2000-01-11 Rockwell Semiconductor Systems, Inc. Low bit rate speech coder using adaptive open-loop subframe pitch lag estimation and vector quantization
US5794182A (en) * 1996-09-30 1998-08-11 Apple Computer, Inc. Linear predictive speech encoding systems with efficient combination pitch coefficients computation
US6192336B1 (en) 1996-09-30 2001-02-20 Apple Computer, Inc. Method and system for searching for an optimal codevector
GB2318029B (en) * 1996-10-01 2000-11-08 Nokia Mobile Phones Ltd Audio coding method and apparatus
FI964975A (en) * 1996-12-12 1998-06-13 Nokia Mobile Phones Ltd Speech coding method and apparatus
US6148282A (en) * 1997-01-02 2000-11-14 Texas Instruments Incorporated Multimodal code-excited linear prediction (CELP) coder and method using peakiness measure
EP0976216B1 (en) * 1997-02-27 2002-11-27 Siemens Aktiengesellschaft Frame-error detection method and device for error masking, specially in gsm transmissions
JP3444131B2 (en) * 1997-02-27 2003-09-08 ヤマハ株式会社 Audio encoding and decoding device
US6167375A (en) * 1997-03-17 2000-12-26 Kabushiki Kaisha Toshiba Method for encoding and decoding a speech signal including background noise
US6064954A (en) * 1997-04-03 2000-05-16 International Business Machines Corp. Digital audio signal coding
KR100198476B1 (en) * 1997-04-23 1999-06-15 윤종용 Quantizer and the method of spectrum without noise
IL120788A (en) * 1997-05-06 2000-07-16 Audiocodes Ltd Systems and methods for encoding and decoding speech for lossy transmission networks
JP3206497B2 (en) * 1997-06-16 2001-09-10 日本電気株式会社 Signal Generation Adaptive Codebook Using Index
DE19729494C2 (en) 1997-07-10 1999-11-04 Grundig Ag Method and arrangement for coding and / or decoding voice signals, in particular for digital dictation machines
JP2001500285A (en) * 1997-07-11 2001-01-09 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Transmitter and decoder with improved speech encoder
WO1999003095A1 (en) * 1997-07-11 1999-01-21 Koninklijke Philips Electronics N.V. Transmitter with an improved harmonic speech encoder
US6058359A (en) * 1998-03-04 2000-05-02 Telefonaktiebolaget L M Ericsson Speech coding including soft adaptability feature
US6253173B1 (en) * 1997-10-20 2001-06-26 Nortel Networks Corporation Split-vector quantization for speech signal involving out-of-sequence regrouping of sub-vectors
US6006179A (en) * 1997-10-28 1999-12-21 America Online, Inc. Audio codec using adaptive sparse vector quantization with subband vector classification
US5966688A (en) * 1997-10-28 1999-10-12 Hughes Electronics Corporation Speech mode based multi-stage vector quantizer
US5999897A (en) * 1997-11-14 1999-12-07 Comsat Corporation Method and apparatus for pitch estimation using perception based analysis by synthesis
JP3357829B2 (en) * 1997-12-24 2002-12-16 株式会社東芝 Audio encoding / decoding method
US6470309B1 (en) * 1998-05-08 2002-10-22 Texas Instruments Incorporated Subframe-based correlation
JP3180762B2 (en) * 1998-05-11 2001-06-25 日本電気株式会社 Audio encoding device and audio decoding device
US6415252B1 (en) * 1998-05-28 2002-07-02 Motorola, Inc. Method and apparatus for coding and decoding speech
US6141638A (en) * 1998-05-28 2000-10-31 Motorola, Inc. Method and apparatus for coding an information signal
US6141639A (en) * 1998-06-05 2000-10-31 Conexant Systems, Inc. Method and apparatus for coding of signals containing speech and background noise
US6249758B1 (en) * 1998-06-30 2001-06-19 Nortel Networks Limited Apparatus and method for coding speech signals by making use of voice/unvoiced characteristics of the speech signals
US6453289B1 (en) 1998-07-24 2002-09-17 Hughes Electronics Corporation Method of noise reduction for speech codecs
JP4308345B2 (en) * 1998-08-21 2009-08-05 パナソニック株式会社 Multi-mode speech encoding apparatus and decoding apparatus
US6330533B2 (en) 1998-08-24 2001-12-11 Conexant Systems, Inc. Speech encoder adaptively applying pitch preprocessing with warping of target signal
US6493665B1 (en) * 1998-08-24 2002-12-10 Conexant Systems, Inc. Speech classification and parameter weighting used in codebook search
US6823303B1 (en) * 1998-08-24 2004-11-23 Conexant Systems, Inc. Speech encoder using voice activity detection in coding noise
US6449590B1 (en) * 1998-08-24 2002-09-10 Conexant Systems, Inc. Speech encoder using warping in long term preprocessing
US7117146B2 (en) * 1998-08-24 2006-10-03 Mindspeed Technologies, Inc. System for improved use of pitch enhancement with subcodebooks
US6104992A (en) * 1998-08-24 2000-08-15 Conexant Systems, Inc. Adaptive gain reduction to produce fixed codebook target signal
US6480822B2 (en) * 1998-08-24 2002-11-12 Conexant Systems, Inc. Low complexity random codebook structure
WO2000011649A1 (en) * 1998-08-24 2000-03-02 Conexant Systems, Inc. Speech encoder using a classifier for smoothing noise coding
US6507814B1 (en) * 1998-08-24 2003-01-14 Conexant Systems, Inc. Pitch determination using speech classification and prior pitch estimation
US6240386B1 (en) * 1998-08-24 2001-05-29 Conexant Systems, Inc. Speech codec employing noise classification for noise compensation
US7072832B1 (en) * 1998-08-24 2006-07-04 Mindspeed Technologies, Inc. System for speech encoding having an adaptive encoding arrangement
US6493666B2 (en) * 1998-09-29 2002-12-10 William M. Wiese, Jr. System and method for processing data from and for multiple channels
DE19845888A1 (en) * 1998-10-06 2000-05-11 Bosch Gmbh Robert Method for coding or decoding speech signal samples as well as encoders or decoders
US6463407B2 (en) 1998-11-13 2002-10-08 Qualcomm Inc. Low bit-rate coding of unvoiced segments of speech
JP3180786B2 (en) * 1998-11-27 2001-06-25 日本電気株式会社 Audio encoding method and audio encoding device
US6691084B2 (en) * 1998-12-21 2004-02-10 Qualcomm Incorporated Multiple mode variable rate speech coding
US6456964B2 (en) * 1998-12-21 2002-09-24 Qualcomm, Incorporated Encoding of periodic speech using prototype waveforms
US6311154B1 (en) 1998-12-30 2001-10-30 Nokia Mobile Phones Limited Adaptive windows for analysis-by-synthesis CELP-type speech coding
US6754265B1 (en) * 1999-02-05 2004-06-22 Honeywell International Inc. VOCODER capable modulator/demodulator
US6681203B1 (en) * 1999-02-26 2004-01-20 Lucent Technologies Inc. Coupled error code protection for multi-mode vocoders
EP1088304A1 (en) * 1999-04-05 2001-04-04 Hughes Electronics Corporation A frequency domain interpolative speech codec system
JP4218134B2 (en) * 1999-06-17 2009-02-04 ソニー株式会社 Decoding apparatus and method, and program providing medium
US6487531B1 (en) 1999-07-06 2002-11-26 Carol A. Tosaya Signal injection coupling into the human vocal tract for robust audible and inaudible voice recognition
US7092881B1 (en) * 1999-07-26 2006-08-15 Lucent Technologies Inc. Parametric speech codec for representing synthetic speech in the presence of background noise
DE69943185D1 (en) * 1999-08-10 2011-03-24 Telogy Networks Inc Background energy estimate
US6535843B1 (en) * 1999-08-18 2003-03-18 At&T Corp. Automatic detection of non-stationarity in speech signals
DE60043601D1 * 1999-08-23 2010-02-04 Panasonic Corp Speech encoder
DE69932460T2 (en) * 1999-09-14 2007-02-08 Fujitsu Ltd., Kawasaki Speech coder / decoder
US6782360B1 (en) * 1999-09-22 2004-08-24 Mindspeed Technologies, Inc. Gain quantization for a CELP speech coder
US6581032B1 (en) * 1999-09-22 2003-06-17 Conexant Systems, Inc. Bitstream protocol for transmission of encoded voice signals
US6604070B1 (en) * 1999-09-22 2003-08-05 Conexant Systems, Inc. System of encoding and decoding speech signals
US7315815B1 (en) 1999-09-22 2008-01-01 Microsoft Corporation LPC-harmonic vocoder with superframe structure
US6959274B1 (en) * 1999-09-22 2005-10-25 Mindspeed Technologies, Inc. Fixed rate speech compression system and method
US6438518B1 (en) * 1999-10-28 2002-08-20 Qualcomm Incorporated Method and apparatus for using coding scheme selection patterns in a predictive speech coder to reduce sensitivity to frame error conditions
GB2357683A (en) * 1999-12-24 2001-06-27 Nokia Mobile Phones Ltd Voiced/unvoiced determination for speech coding
WO2001052241A1 (en) * 2000-01-11 2001-07-19 Matsushita Electric Industrial Co., Ltd. Multi-mode voice encoding device and decoding device
US8645137B2 (en) 2000-03-16 2014-02-04 Apple Inc. Fast, language-independent method for user authentication by voice
WO2001078061A1 (en) * 2000-04-06 2001-10-18 Telefonaktiebolaget Lm Ericsson (Publ) Pitch estimation in a speech signal
EP1143414A1 (en) * 2000-04-06 2001-10-10 TELEFONAKTIEBOLAGET L M ERICSSON (publ) Estimating the pitch of a speech signal using previous estimates
WO2001084536A1 (en) * 2000-04-28 2001-11-08 Deutsche Telekom Ag Method for detecting a voice activity decision (voice activity detector)
US6564182B1 (en) * 2000-05-12 2003-05-13 Conexant Systems, Inc. Look-ahead pitch determination
US20020116186A1 (en) * 2000-09-09 2002-08-22 Adam Strauss Voice activity detector for integrated telecommunications processing
US6850884B2 (en) * 2000-09-15 2005-02-01 Mindspeed Technologies, Inc. Selection of coding parameters based on spectral content of a speech signal
US6842733B1 (en) 2000-09-15 2005-01-11 Mindspeed Technologies, Inc. Signal processing system for filtering spectral content of a signal for speech coding
US7457750B2 (en) * 2000-10-13 2008-11-25 At&T Corp. Systems and methods for dynamic re-configurable speech recognition
US6947888B1 (en) 2000-10-17 2005-09-20 Qualcomm Incorporated Method and apparatus for high performance low bit-rate coding of unvoiced speech
US7171355B1 (en) 2000-10-25 2007-01-30 Broadcom Corporation Method and apparatus for one-stage and two-stage noise feedback coding of speech and audio signals
CN1200403C (en) * 2000-11-30 2005-05-04 松下电器产业株式会社 Vector quantizing device for LPC parameters
US7472059B2 (en) * 2000-12-08 2008-12-30 Qualcomm Incorporated Method and apparatus for robust speech classification
US6633839B2 (en) * 2001-02-02 2003-10-14 Motorola, Inc. Method and apparatus for speech reconstruction in a distributed speech recognition system
DE60233283D1 (en) * 2001-02-27 2009-09-24 Texas Instruments Inc Concealment method in case of loss of speech frames and decoder therefor
US6658383B2 (en) 2001-06-26 2003-12-02 Microsoft Corporation Method for coding speech and music signals
US7526431B2 (en) * 2001-09-05 2009-04-28 Voice Signal Technologies, Inc. Speech recognition using ambiguous or phone key spelling and/or filtering
US7505911B2 (en) * 2001-09-05 2009-03-17 Roth Daniel L Combined speech recognition and sound recording
US7225130B2 (en) * 2001-09-05 2007-05-29 Voice Signal Technologies, Inc. Methods, systems, and programming for performing speech recognition
US7467089B2 (en) * 2001-09-05 2008-12-16 Roth Daniel L Combined speech and handwriting recognition
US7444286B2 (en) * 2001-09-05 2008-10-28 Roth Daniel L Speech recognition using re-utterance recognition
US7313526B2 (en) 2001-09-05 2007-12-25 Voice Signal Technologies, Inc. Speech recognition using selectable recognition modes
US7809574B2 (en) 2001-09-05 2010-10-05 Voice Signal Technologies Inc. Word recognition using choice lists
ITFI20010199A1 (en) 2001-10-22 2003-04-22 Riccardo Vieri SYSTEM AND METHOD TO TRANSFORM TEXTUAL COMMUNICATIONS INTO VOICE AND SEND THEM WITH AN INTERNET CONNECTION TO ANY TELEPHONE SYSTEM
US6785645B2 (en) 2001-11-29 2004-08-31 Microsoft Corporation Real-time speech and music classifier
TW589618B (en) * 2001-12-14 2004-06-01 Ind Tech Res Inst Method for determining the pitch mark of speech
US6647366B2 (en) * 2001-12-28 2003-11-11 Microsoft Corporation Rate control strategies for speech and music coding
US7206740B2 (en) * 2002-01-04 2007-04-17 Broadcom Corporation Efficient excitation quantization in noise feedback coding with general noise shaping
US7302387B2 (en) * 2002-06-04 2007-11-27 Texas Instruments Incorporated Modification of fixed codebook search in G.729 Annex E audio coding
JP4433668B2 (en) * 2002-10-31 2010-03-17 日本電気株式会社 Bandwidth expansion apparatus and method
WO2004084467A2 (en) * 2003-03-15 2004-09-30 Mindspeed Technologies, Inc. Recovering an erased voice frame with time warping
KR20050008356A (en) * 2003-07-15 2005-01-21 한국전자통신연구원 Apparatus and method for converting pitch delay using linear prediction in voice transcoding
US7596488B2 (en) * 2003-09-15 2009-09-29 Microsoft Corporation System and method for real-time jitter control and packet-loss concealment in an audio signal
US7412376B2 (en) * 2003-09-10 2008-08-12 Microsoft Corporation System and method for real-time detection and preservation of speech onset in a signal
US20050065787A1 (en) * 2003-09-23 2005-03-24 Jacek Stachurski Hybrid speech coding and system
US7426462B2 (en) * 2003-09-29 2008-09-16 Sony Corporation Fast codebook selection method in audio encoding
US7349842B2 (en) * 2003-09-29 2008-03-25 Sony Corporation Rate-distortion control scheme in audio encoding
US7325023B2 (en) * 2003-09-29 2008-01-29 Sony Corporation Method of making a window type decision based on MDCT data in audio encoding
US7283968B2 (en) 2003-09-29 2007-10-16 Sony Corporation Method for grouping short windows in audio encoding
FR2867649A1 (en) * 2003-12-10 2005-09-16 France Telecom OPTIMIZED MULTIPLE CODING METHOD
US8473286B2 (en) * 2004-02-26 2013-06-25 Broadcom Corporation Noise feedback coding system and method for providing generalized noise shaping within a simple filter structure
US7668712B2 (en) * 2004-03-31 2010-02-23 Microsoft Corporation Audio encoding and decoding with intra frames and adaptive forward error correction
US8712768B2 (en) * 2004-05-25 2014-04-29 Nokia Corporation System and method for enhanced artificial bandwidth expansion
US8788265B2 (en) * 2004-05-25 2014-07-22 Nokia Solutions And Networks Oy System and method for babble noise detection
JP5010823B2 (en) 2004-10-14 2012-08-29 三星エスディアイ株式会社 POLYMER ELECTROLYTE MEMBRANE FOR DIRECT OXIDATION FUEL CELL, ITS MANUFACTURING METHOD, AND DIRECT OXIDATION FUEL CELL SYSTEM INCLUDING THE SAME
US7707034B2 (en) * 2005-05-31 2010-04-27 Microsoft Corporation Audio codec post-filter
US7177804B2 (en) * 2005-05-31 2007-02-13 Microsoft Corporation Sub-band voice codec with multi-stage codebooks and redundant coding
US7831421B2 (en) * 2005-05-31 2010-11-09 Microsoft Corporation Robust decoder
KR101223559B1 (en) * 2005-06-24 2013-01-22 삼성에스디아이 주식회사 Method of preparing polymer membrane for fuel cell
US20100131276A1 (en) * 2005-07-14 2010-05-27 Koninklijke Philips Electronics, N.V. Audio signal synthesis
US8677377B2 (en) 2005-09-08 2014-03-18 Apple Inc. Method and apparatus for building an intelligent automated assistant
US7633076B2 (en) * 2005-09-30 2009-12-15 Apple Inc. Automated response to and sensing of user activity in portable devices
US8630849B2 (en) * 2005-11-15 2014-01-14 Samsung Electronics Co., Ltd. Coefficient splitting structure for vector quantization bit allocation and dequantization
KR100766896B1 (en) * 2005-11-29 2007-10-15 삼성에스디아이 주식회사 Polymer electrolyte for fuel cell and fuel cell system comprising same
MX2008009088A (en) * 2006-01-18 2009-01-27 Lg Electronics Inc Apparatus and method for encoding and decoding signal.
JP3981399B1 (en) * 2006-03-10 2007-09-26 松下電器産業株式会社 Fixed codebook search apparatus and fixed codebook search method
US20070188841A1 (en) * 2006-02-10 2007-08-16 Ntera, Inc. Method and system for lowering the drive potential of an electrochromic device
AU2011247874B2 (en) * 2006-03-10 2012-03-15 Iii Holdings 12, Llc Fixed codebook searching apparatus and fixed codebook searching method
ES2347825T3 (en) * 2006-03-20 2010-11-04 Mindspeed Technologies, Inc. ATTENTION OF THE TONE RECORD IN OPEN LOOP.
KR100900438B1 (en) * 2006-04-25 2009-06-01 삼성전자주식회사 Apparatus and method for voice packet recovery
US8712766B2 (en) * 2006-05-16 2014-04-29 Motorola Mobility Llc Method and system for coding an information signal using closed loop adaptive bit allocation
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
KR100788706B1 (en) * 2006-11-28 2007-12-26 삼성전자주식회사 Method for encoding and decoding of broadband voice signal
US20080129520A1 (en) * 2006-12-01 2008-06-05 Apple Computer, Inc. Electronic device with enhanced audio feedback
US7805308B2 (en) * 2007-01-19 2010-09-28 Microsoft Corporation Hidden trajectory modeling with differential cepstra for speech recognition
DE602008001787D1 (en) * 2007-02-12 2010-08-26 Dolby Lab Licensing Corp Improved ratio of speech to non-speech audio content for elderly or hearing-impaired listeners
JP5530720B2 (en) 2007-02-26 2014-06-25 ドルビー ラボラトリーズ ライセンシング コーポレイション Speech enhancement method, apparatus, and computer-readable recording medium for entertainment audio
US8977255B2 (en) 2007-04-03 2015-03-10 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
CN101308651B (en) * 2007-05-17 2011-05-04 展讯通信(上海)有限公司 Detection method of audio transient signal
US9053089B2 (en) * 2007-10-02 2015-06-09 Apple Inc. Part-of-speech tagging using latent analogy
KR101449431B1 (en) * 2007-10-09 2014-10-14 삼성전자주식회사 Method and apparatus for encoding scalable wideband audio signal
US8620662B2 (en) 2007-11-20 2013-12-31 Apple Inc. Context-aware unit selection
US10002189B2 (en) * 2007-12-20 2018-06-19 Apple Inc. Method and apparatus for searching using an active ontology
US9330720B2 (en) * 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US20090252913A1 (en) * 2008-01-14 2009-10-08 Military Wraps Research And Development, Inc. Quick-change visual deception systems and methods
US8065143B2 (en) 2008-02-22 2011-11-22 Apple Inc. Providing text input using speech data and non-speech data
US8996376B2 (en) 2008-04-05 2015-03-31 Apple Inc. Intelligent text-to-speech conversion
CN101261836B (en) * 2008-04-25 2011-03-30 清华大学 Method for enhancing excitation signal naturalism based on judgment and processing of transition frames
US10496753B2 (en) 2010-01-18 2019-12-03 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US8464150B2 (en) 2008-06-07 2013-06-11 Apple Inc. Automatic language identification for dynamic text processing
KR20100006492A (en) 2008-07-09 2010-01-19 삼성전자주식회사 Method and apparatus for deciding encoding mode
US20100030549A1 (en) 2008-07-31 2010-02-04 Lee Michael M Mobile device having human language translation capability with positional feedback
US8768702B2 (en) 2008-09-05 2014-07-01 Apple Inc. Multi-tiered voice feedback in an electronic device
US8898568B2 (en) * 2008-09-09 2014-11-25 Apple Inc. Audio user interface
US8583418B2 (en) 2008-09-29 2013-11-12 Apple Inc. Systems and methods of detecting language and natural language strings for text to speech synthesis
US8712776B2 (en) * 2008-09-29 2014-04-29 Apple Inc. Systems and methods for selective text to speech synthesis
US8676904B2 (en) 2008-10-02 2014-03-18 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US8862252B2 (en) 2009-01-30 2014-10-14 Apple Inc. Audio user interface for displayless electronic device
US8380507B2 (en) 2009-03-09 2013-02-19 Apple Inc. Systems and methods for determining the language to use for speech generated by a text to speech engine
US10706373B2 (en) 2011-06-03 2020-07-07 Apple Inc. Performing actions associated with task items that represent tasks to perform
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US10540976B2 (en) * 2009-06-05 2020-01-21 Apple Inc. Contextual voice commands
US9431006B2 (en) 2009-07-02 2016-08-30 Apple Inc. Methods and apparatuses for automatic speech recognition
US8682649B2 (en) * 2009-11-12 2014-03-25 Apple Inc. Sentiment prediction from textual data
US8781822B2 (en) * 2009-12-22 2014-07-15 Qualcomm Incorporated Audio and speech processing with optimal bit-allocation for constant bit rate applications
US8600743B2 (en) * 2010-01-06 2013-12-03 Apple Inc. Noise profile determination for voice-related feature
CN102687199B (en) 2010-01-08 2015-11-25 日本电信电话株式会社 Coding method, coding/decoding method, code device, decoding device
US8381107B2 (en) 2010-01-13 2013-02-19 Apple Inc. Adaptive audio feedback system and method
US8311838B2 (en) 2010-01-13 2012-11-13 Apple Inc. Devices and methods for identifying a prompt corresponding to a voice input in a sequence of prompts
US10553209B2 (en) 2010-01-18 2020-02-04 Apple Inc. Systems and methods for hands-free notification summaries
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US10705794B2 (en) 2010-01-18 2020-07-07 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10679605B2 (en) 2010-01-18 2020-06-09 Apple Inc. Hands-free list-reading by intelligent automated assistant
WO2011089450A2 (en) 2010-01-25 2011-07-28 Andrew Peter Nelson Jerram Apparatuses, methods and systems for a digital conversation management platform
US8682667B2 (en) 2010-02-25 2014-03-25 Apple Inc. User profiling for selecting user specific voice input processing information
US8713021B2 (en) 2010-07-07 2014-04-29 Apple Inc. Unsupervised document clustering using latent semantic density analysis
US8719006B2 (en) 2010-08-27 2014-05-06 Apple Inc. Combined statistical and rule-based part-of-speech tagging for text-to-speech synthesis
US8719014B2 (en) 2010-09-27 2014-05-06 Apple Inc. Electronic device with text error correction based on voice recognition data
US10515147B2 (en) 2010-12-22 2019-12-24 Apple Inc. Using statistical language models for contextual lookup
US10762293B2 (en) 2010-12-22 2020-09-01 Apple Inc. Using parts-of-speech tagging and named entity recognition for spelling correction
US8781836B2 (en) 2011-02-22 2014-07-15 Apple Inc. Hearing assistance system for providing consistent human speech
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US8990074B2 (en) 2011-05-24 2015-03-24 Qualcomm Incorporated Noise-robust speech coding mode classification
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US10672399B2 (en) 2011-06-03 2020-06-02 Apple Inc. Switching between text data and audio data based on a mapping
US8812294B2 (en) 2011-06-21 2014-08-19 Apple Inc. Translating phrases from one language into another using an order-based set of declarative rules
JP5752324B2 (en) * 2011-07-07 2015-07-22 ニュアンス コミュニケーションズ, インコーポレイテッド Single channel suppression of impulsive interference in noisy speech signals.
US8706472B2 (en) 2011-08-11 2014-04-22 Apple Inc. Method for disambiguating multiple readings in language conversion
US8994660B2 (en) 2011-08-29 2015-03-31 Apple Inc. Text correction processing
US8762156B2 (en) 2011-09-28 2014-06-24 Apple Inc. Speech recognition repair using contextual information
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9280610B2 (en) 2012-05-14 2016-03-08 Apple Inc. Crowd sourcing information to fulfill user requests
US10417037B2 (en) 2012-05-15 2019-09-17 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US8775442B2 (en) 2012-05-15 2014-07-08 Apple Inc. Semantic search using a single-source semantic model
WO2013185109A2 (en) 2012-06-08 2013-12-12 Apple Inc. Systems and methods for recognizing textual identifiers within a plurality of words
US9721563B2 (en) 2012-06-08 2017-08-01 Apple Inc. Name recognition system
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9547647B2 (en) 2012-09-19 2017-01-17 Apple Inc. Voice-based media searching
US8935167B2 (en) 2012-09-25 2015-01-13 Apple Inc. Exemplar-based latent perceptual modeling for automatic speech recognition
EP2947650A1 (en) 2013-01-18 2015-11-25 Kabushiki Kaisha Toshiba Speech synthesizer, electronic watermark information detection device, speech synthesis method, electronic watermark information detection method, speech synthesis program, and electronic watermark information detection program
EP2954514B1 (en) 2013-02-07 2021-03-31 Apple Inc. Voice trigger for a digital assistant
US9977779B2 (en) 2013-03-14 2018-05-22 Apple Inc. Automatic supplementation of word correction dictionaries
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US10642574B2 (en) 2013-03-14 2020-05-05 Apple Inc. Device, method, and graphical user interface for outputting captions
US10572476B2 (en) 2013-03-14 2020-02-25 Apple Inc. Refining a search based on schedule items
US10652394B2 (en) 2013-03-14 2020-05-12 Apple Inc. System and method for processing voicemail
US9733821B2 (en) 2013-03-14 2017-08-15 Apple Inc. Voice control to diagnose inadvertent activation of accessibility features
AU2014233517B2 (en) 2013-03-15 2017-05-25 Apple Inc. Training an at least partial voice command system
WO2014144579A1 (en) 2013-03-15 2014-09-18 Apple Inc. System and method for updating an adaptive speech recognition model
CN105144133B (en) 2013-03-15 2020-11-20 苹果公司 Context-sensitive handling of interrupts
WO2014144395A2 (en) 2013-03-15 2014-09-18 Apple Inc. User training by intelligent digital assistant
US10748529B1 (en) 2013-03-15 2020-08-18 Apple Inc. Voice activated device for use with a voice-based digital assistant
WO2014197336A1 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
WO2014197334A2 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
WO2014197335A1 (en) 2013-06-08 2014-12-11 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
DE112014002747T5 (en) 2013-06-09 2016-03-03 Apple Inc. Apparatus, method and graphical user interface for enabling conversation persistence over two or more instances of a digital assistant
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
KR101809808B1 (en) 2013-06-13 2017-12-15 애플 인크. System and method for emergency calls initiated by voice command
AU2014306221B2 (en) 2013-08-06 2017-04-06 Apple Inc. Auto-activating smart responses based on activities from remote devices
US10296160B2 (en) 2013-12-06 2019-05-21 Apple Inc. Method for extracting salient dialog usage from live data
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
AU2015266863B2 (en) 2014-05-30 2018-03-15 Apple Inc. Multi-command single utterance input method
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10789041B2 (en) 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9467569B2 (en) * 2015-03-05 2016-10-11 Raytheon Company Methods and apparatus for reducing audio conference noise using voice quality measures
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US20170069306A1 (en) * 2015-09-04 2017-03-09 Foundation of the Idiap Research Institute (IDIAP) Signal processing method and apparatus based on structured sparsity of phonological features
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10049663B2 (en) 2016-06-08 2018-08-14 Apple, Inc. Intelligent automated assistant for media exploration
DK179309B1 (en) 2016-06-09 2018-04-23 Apple Inc Intelligent automated assistant in a home environment
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10586535B2 (en) 2016-06-10 2020-03-10 Apple Inc. Intelligent digital assistant in a multi-tasking environment
DK201670540A1 (en) 2016-06-11 2018-01-08 Apple Inc Application integration with a digital assistant
DK179415B1 (en) 2016-06-11 2018-06-14 Apple Inc Intelligent device arbitration and control
DK179049B1 (en) 2016-06-11 2017-09-18 Apple Inc Data driven natural language event detection and classification
DK179343B1 (en) 2016-06-11 2018-05-14 Apple Inc Intelligent task discovery
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
WO2018133951A1 (en) * 2017-01-23 2018-07-26 Huawei Technologies Co., Ltd. An apparatus and method for enhancing a wanted component in a signal
DK179745B1 (en) 2017-05-12 2019-05-01 Apple Inc. SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT
DK201770431A1 (en) 2017-05-15 2018-12-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
CN110782906B (en) * 2018-07-30 2022-08-05 南京中感微电子有限公司 Audio data recovery method and device and Bluetooth equipment

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4058676A (en) * 1975-07-07 1977-11-15 International Communication Sciences Speech analysis and synthesis system
US4771465A (en) * 1986-09-11 1988-09-13 American Telephone And Telegraph Company, At&T Bell Laboratories Digital speech sinusoidal vocoder with transmission of only subset of harmonics
US5125030A (en) * 1987-04-13 1992-06-23 Kokusai Denshin Denwa Co., Ltd. Speech signal coding/decoding system based on the type of speech signal
JP2609752B2 (en) * 1990-10-09 1997-05-14 三菱電機株式会社 Voice / in-band data identification device
US5293449A (en) * 1990-11-23 1994-03-08 Comsat Corporation Analysis-by-synthesis 2.4 kbps linear predictive speech codec
US5495555A (en) * 1992-06-01 1996-02-27 Hughes Aircraft Company High quality low bit rate celp-based speech codec
US5341456A (en) * 1992-12-02 1994-08-23 Qualcomm Incorporated Method for determining speech encoding rate in a variable rate vocoder
US5459814A (en) * 1993-03-26 1995-10-17 Hughes Aircraft Company Voice activity detector for speech signals in variable background noise

Also Published As

Publication number Publication date
WO1995028824A2 (en) 1995-11-02
FI956107A (en) 1996-01-08
ATE202232T1 (en) 2001-06-15
WO1995028824A3 (en) 1995-11-16
EP0704088A1 (en) 1996-04-03
DE69521254D1 (en) 2001-07-19
EP0704088B1 (en) 2001-06-13
US5596676A (en) 1997-01-21
FI956107A0 (en) 1995-12-19
US5734789A (en) 1998-03-31

Similar Documents

Publication Publication Date Title
CA2165546A1 (en) Method of encoding a signal containing speech
EP2154679B1 (en) Method and apparatus for speech coding
US5732389A (en) Voiced/unvoiced classification of speech for excitation codebook selection in celp speech decoding during frame erasures
Spanias Speech coding: A tutorial review
US6480822B2 (en) Low complexity random codebook structure
CN1112671C (en) Method of adapting noise masking level in analysis-by-synthesis speech coder employing short-term perceptual weighting filter
US5699485A (en) Pitch delay modification during frame erasures
US20020016711A1 (en) Encoding of periodic speech using prototype waveforms
KR19990006262A (en) Speech coding method based on digital speech compression algorithm
EP1420391B1 (en) Generalized analysis-by-synthesis speech coding method, and coder implementing such method
JPH10207498A (en) Input voice coding method by multi-mode code exciting linear prediction and its coder
US20030195746A1 (en) Speech coding/decoding method and apparatus
EP1204092A2 (en) Speech decoder capable of decoding background noise signal with high quality
CN1113586A (en) Removal of swirl artifacts from CELP based speech coders
Yong et al. Efficient encoding of the long-term predictor in vector excitation coders
Yeldener et al. A mixed sinusoidally excited linear prediction coder at 4 kb/s and below
Miki et al. A pitch synchronous innovation CELP (PSI-CELP) coder for 2-4 kbit/s
Burnett et al. A mixed prototype waveform/CELP coder for sub 3 kbit/s
Juan et al. An 8-kb/s conjugate-structure algebraic CELP (CS-ACELP) speech coding
Ma et al. 400bps High-Quality Speech Coding Algorithm
JPH06130996A (en) Code excitation linear predictive encoding and decoding device
KR100269357B1 (en) Speech recognition method
Taniguchi et al. Principal axis extracting vector excitation coding: high quality speech at 8 kb/s
Jung et al. A cascaded algebraic codebook structure to improve the performance of speech coder
KR100346732B1 (en) Noise code book preparation and linear prediction coding/decoding method using noise code book and apparatus therefor

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued