CA2165546A1 - Method of encoding a signal containing speech

Info

Publication number: CA2165546A1
Application number: CA002165546A
Authority: CA
Inventors: Kumar Swaminathan; Kalyan Ganesan; Prabhat K. Gupta
Original assignee: Individual
Current assignee: DirecTV Group Inc
Priority date: 1994-04-15
Filing date: 1995-04-17
Publication date: 1995-11-02
Also published as: WO1995028824A2; FI956107A; ATE202232T1; WO1995028824A3; EP0704088A1; DE69521254D1; EP0704088B1; US5596676A; FI956107A0; US5734789A

Abstract

A method of encoding a signal containing speech is employed in a bit rate Codebook Excited Linear Predictor (CELP) communication system. The system includes a transmitter that organizes a signal containing speech into frames of 40 millisecond duration, and classifies each frame as one of three modes: voiced and stationary, unvoiced or transient, and background noise.

Description

METHOD OF ENCODING A SIGNAL CONTAINING SPEECH
BACKGROUND OF THE INVENTION
Field of the Invention
The present invention generally relates to a method of encoding a signal containing speech and more particularly to a method employing a linear predictor to encode a signal.
Description of the Related Art
A modern communication technique employs a Codebook Excited Linear Prediction (CELP) coder. The codebook is essentially a table containing excitation vectors for processing by a linear predictive filter. The technique involves partitioning an input signal into multiple portions and, for each portion, searching the codebook for the vector that produces a filter output signal that is closest to the input signal.

The typical CELP technique may distort portions of the input signal dominated by noise because the codebook and the linear predictive filter that may be optimum for speech may be inappropriate for noise.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a method of encoding a signal containing both speech and noise while avoiding some of the distortions introduced by typical CELP encoding techniques.
Additional objectives and advantages of the invention will be set forth in the description that follows and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.
To achieve the objects and in accordance with the purpose of the invention, as embodied and broadly described herein, a method of processing a signal having a speech component, the signal being organized as a plurality of frames, is used. The method comprises the steps, for each frame, of determining whether the frame corresponds to a first mode, depending on whether the speech component is substantially absent from the frame; generating an encoded frame in accordance with one of a first coding scheme, when the frame corresponds to the first mode, and a second coding scheme when the frame does not correspond to the first mode; and decoding the encoded frame in accordance with one of the first coding scheme, when the frame corresponds to the first mode, and the second coding scheme when the frame does not correspond to the first mode.
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing and other objects, aspects and advantages will be better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:
FIG. 1 is a block diagram of a transmitter in a wireless communication system according to a preferred embodiment of the invention;
FIG. 2 is a block diagram of a receiver in a wireless communication system according to the preferred embodiment of the invention;
FIG. 3 is a block diagram of the encoder in the transmitter shown in FIG. 1;
FIG. 4 is a block diagram of the decoder in the receiver shown in FIG. 2;
FIG. 5A is a timing diagram showing the alignment of linear prediction analysis windows in the encoder shown in FIG. 3;
FIG. 5B is a timing diagram showing the alignment of pitch prediction analysis windows for open loop pitch prediction in the encoder shown in FIG. 3;
FIGS. 6A and 6B are a flowchart illustrating the 26-bit line spectral frequency vector quantization process performed by the encoder of FIG. 3;
FIG. 7 is a flowchart illustrating the operation of a pitch tracking algorithm;
FIG. 8 is a block diagram showing in more detail the open loop pitch estimation of the encoder shown in FIG. 3;
FIG. 9 is a flowchart illustrating the operation of the modified pitch tracking algorithm implemented by the open loop pitch estimation shown in FIG. 8;
FIG. 10 is a flowchart showing the processing performed by the mode determination module shown in FIG. 3;
FIG. 11 is a dataflow diagram showing a part of the processing of a step of determining spectral stationarity values shown in FIG. 10;
FIG. 12 is a dataflow diagram showing another part of the processing of the step of determining spectral stationarity values;
FIG. 13 is a dataflow diagram showing another part of the processing of the step of determining spectral stationarity values;
FIG. 14 is a dataflow diagram showing the processing of the step of determining pitch stationarity values shown in FIG. 10;
FIG. 15 is a dataflow diagram showing the processing of the step of generating zero crossing rate values shown in FIG. 10;
FIG. 16 is a dataflow diagram showing the processing of the step of determining level gradient values in FIG. 10;
FIG. 17 is a dataflow diagram showing the processing of the step of determining short-term energy values shown in FIG. 10;
FIGS. 18A, 18B and 18C are a flowchart of determining the mode based on the determined values as shown in FIG. 10;
FIG. 19 is a block diagram showing in more detail the implementation of the excitation modeling circuitry of the encoder shown in FIG. 3;
FIG. 20 is a diagram illustrating a processing of the encoder shown in FIG. 3;
FIGS. 21A and 21B are a chart of speech coder parameters for mode A;
FIG. 22 is a chart of speech coder parameters for mode A;
FIG. 23 is a chart of speech coder parameters for mode A;
FIG. 24 is a block diagram illustrating a processing of the speech decoder shown in FIG. 4; and
FIG. 25 is a timing diagram showing an alternative alignment of linear prediction analysis windows.
DESCRIPTION OF A PREFERRED EMBODIMENT OF THE INVENTION
FIG. 1 shows the transmitter of the preferred communication system. Analog-to-digital (A/D) converter 11 samples analog speech from a telephone handset at an 8 kHz rate, converts to digital values and supplies the digital values to the speech encoder 12. Channel encoder 13 further encodes the signal, as may be required in a digital cellular communications system, and supplies a resulting encoded bit stream to a modulator 14. Digital-to-analog (D/A) converter 15 converts the output of the modulator 14 to Phase Shift Keying (PSK) signals. Radio frequency (RF) up converter 16 amplifies and frequency multiplies the PSK signals and supplies the amplified signals to antenna 17.
A low-pass, antialiasing, filter (not shown) filters the analog speech signal input to A/D converter 11. A high-pass, second order biquad, filter (not shown) filters the digitized samples from A/D converter 11. The transfer function is
$$H_{HP}(z) = \frac{1 - 2z^{-1} + z^{-2}}{1 - 1.8891z^{-1} + 0.89503z^{-2}}$$
The high pass filter attenuates D.C. or hum contamination that may occur in the incoming speech signal.
FIG. 2 shows the receiver of the preferred communication system. RF down converter 22 receives a signal from antenna 21 and heterodynes the signal to an intermediate frequency (IF). A/D converter 23 converts the IF signal to a digital bit stream, and demodulator 24 demodulates the resulting bit stream. At this point the reverse of the encoding process in the transmitter takes place. Channel decoder 25 and speech decoder 26 perform decoding. D/A converter 27 synthesizes analog speech from the output of the speech decoder.
Much of the processing described in this specification is performed by a general purpose signal processor executing program statements. To facilitate a description of the preferred communication system, however, the preferred communication system is illustrated in terms of block and circuit diagrams. One of ordinary skill in the art could readily transcribe these diagrams into program statements for a processor.
FIG. 3 shows the encoder 12 of FIG. 1 in more detail, including an audio preprocessor 31, linear predictive (LP) analysis and quantization module 32, and open loop pitch estimation module 33. Module 34 analyzes each frame of the signal to determine whether the frame is mode A, mode B, or mode C, as described in more detail below. Module 35 performs excitation modeling depending on the mode determined by module 34. Processor 36 compacts compressed speech bits.
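The high-pass biquad applied to the digitized samples can be sketched as a direct-form I difference equation. This is a minimal illustration rather than the implementation of the preferred system: the function name `highpass_biquad` is hypothetical, and the numerator 1 - 2z^-1 + z^-2 is an assumption (a double zero at D.C., consistent with the stated purpose of attenuating D.C. or hum). The denominator coefficients are the ones given for the transfer function.

```python
def highpass_biquad(samples):
    """Filter samples with H(z) = (1 - 2z^-1 + z^-2) / (1 - 1.8891z^-1 + 0.89503z^-2).

    The numerator is an assumed reading of the transfer function; the
    denominator coefficients are taken from the text.
    """
    b0, b1, b2 = 1.0, -2.0, 1.0          # assumed numerator (double zero at z = 1)
    a1, a2 = -1.8891, 0.89503            # denominator coefficients
    x1 = x2 = y1 = y2 = 0.0              # filter state (previous inputs and outputs)
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


# A constant (D.C.) input is rejected once the start-up transient has decayed.
dc_response = highpass_biquad([1.0] * 2000)
```

Under these assumptions the first output sample equals the input, and the response to a constant input decays toward zero because both zeros sit at z = 1.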
FIG. 4 shows the decoder 26 of FIG. 2, including a processor 41 for unpacking of compressed speech bits, module 42 for excitation signal reconstruction, filter 43, speech synthesis filter 44, and global post filter 45.
FIG. 5A shows linear prediction analysis windows. The preferred communication system employs 40 ms. speech frames. For each frame, module 32 performs LP (linear prediction) analysis on two 30 ms. windows that are spaced apart by 20 ms. The first LP window is centered at the middle, and the second LP window is centered at the leading edge of the speech frame such that the second LP window extends 15 ms. into the next frame. In other words, module 32 analyzes a first part of the frame (LP window 1) to generate a first set of filter coefficients and analyzes a second part of the frame and a part of a next frame (LP window 2) to generate a second set of filter coefficients.
FIG. 5B shows pitch analysis windows. For each frame, module 32 performs pitch analysis on two 37.625 ms. windows. The first pitch analysis window is centered at the middle, and the second pitch analysis window is centered at the leading edge of the speech frame such that the second pitch analysis window extends 18.8125 ms. into the next frame. In other words, module 32 analyzes a third part of the frame (pitch analysis window 1) to generate a first pitch estimate and analyzes a fourth part of the frame and a part of the next frame (pitch analysis window 2) to generate a second pitch estimate.
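The window alignment of FIGS. 5A and 5B can be checked with a little arithmetic. The sketch below is illustrative only: the helper names are hypothetical, times are measured from the start of the 40 ms frame, and "leading edge" is assumed to mean the boundary with the next frame (at 40 ms), which is the reading that reproduces the stated 15 ms and 18.8125 ms overhangs.

```python
SAMPLE_RATE_HZ = 8000   # A/D sampling rate given for the transmitter
FRAME_MS = 40.0         # speech frame duration


def window_span_ms(center_ms, length_ms):
    """Return (start, end) in ms of a window of the given length and center."""
    half = length_ms / 2.0
    return (center_ms - half, center_ms + half)


def overhang_ms(span, frame_ms=FRAME_MS):
    """Return how far a window extends past the end of the current frame."""
    return max(0.0, span[1] - frame_ms)


def ms_to_samples(ms):
    """Convert a duration in ms to a sample count at 8 kHz."""
    return round(ms * SAMPLE_RATE_HZ / 1000)


# LP analysis windows: 30 ms long, centers 20 ms apart.
lp1 = window_span_ms(20.0, 30.0)        # centered at the middle of the frame
lp2 = window_span_ms(40.0, 30.0)        # centered at the assumed leading edge

# Pitch analysis windows: 37.625 ms long, same centers.
pitch1 = window_span_ms(20.0, 37.625)
pitch2 = window_span_ms(40.0, 37.625)
```

With these assumptions the second LP window covers 25 ms to 55 ms, and each pitch window holds 301 samples.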
Module 32 employs multiplication by a Hamming window followed by a tenth order autocorrelation method of LP analysis. With this method of LP analysis, module 32 obtains optimal filter coefficients and optimal reflection coefficients. In addition, the residual energy after LP analysis is also readily obtained and, when expressed as a fraction of the speech energy of the windowed LP analysis buffer, is denoted as $\alpha_1$ for the first LP window and $\alpha_2$ for the second LP window. These outputs of the LP analysis are used subsequently in the mode selection algorithm as measures of spectral stationarity, as described in more detail below.
After LP analysis, module 32 bandwidth broadens the filter coefficients for the first LP window, and for the second LP window, by 25 Hz, converts the coefficients to ten line spectral frequencies (LSF), and quantizes these ten line spectral frequencies with a 26-bit LSF vector quantization (VQ), as described below.
Module 32 employs a 26-bit vector quantization (VQ) for each set of ten LSFs. This VQ provides good and robust performance across a wide range of handsets and speakers. Separate VQ codebooks are designed for "IRS filtered" and "flat unfiltered" ("non-IRS-filtered") speech material. The unquantized LSF vector is quantized by the "IRS filtered" VQ tables as well as the "flat unfiltered" VQ tables. The optimum classification is selected on the basis of the cepstral distortion measure. Within each classification, the vector quantization is carried out. Multiple candidates for each split vector are chosen on the basis of energy weighted mean square error, and an overall optimal selection is made within each classification on the basis of the cepstral distortion measure among all combinations of candidates. After the optimum classification is chosen, the quantized line spectral frequencies are converted to filter coefficients.
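The Hamming-windowed, tenth order autocorrelation method described above can be sketched with the Levinson-Durbin recursion, which yields the filter coefficients, the reflection coefficients, and the residual energy as a fraction of the windowed speech energy (the quantity denoted $\alpha_1$ or $\alpha_2$). This is a textbook sketch, not the module 32 implementation: the function names are hypothetical, the Hamming window uses the common 0.54/0.46 coefficients, and the bandwidth broadening and LSF conversion steps are omitted.

```python
import math


def hamming(n):
    """Standard Hamming window of length n (assumed 0.54/0.46 form)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def autocorrelation(x, max_lag):
    """Autocorrelation lags r[0..max_lag] of the windowed buffer x."""
    return [sum(x[i] * x[i - lag] for i in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations.

    Returns (a, refl, fraction): prediction-error filter a[0..order] with
    a[0] = 1, reflection coefficients, and residual energy divided by r[0].
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    refl = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err
        refl.append(k_m)
        new_a = a[:]
        for k in range(1, m):
            new_a[k] = a[k] + k_m * a[m - k]
        new_a[m] = k_m
        a = new_a
        err *= 1.0 - k_m * k_m          # residual energy shrinks at each order
    return a, refl, err / r[0]


def lp_analysis(samples, order=10):
    """Window a 30 ms buffer (240 samples at 8 kHz) and run LP analysis."""
    window = hamming(len(samples))
    windowed = [s * w for s, w in zip(samples, window)]
    return levinson_durbin(autocorrelation(windowed, order), order)
```

A residual fraction near 0 indicates a buffer that the predictor models well, and a value near 1 indicates little predictable structure, which is why the fraction is usable as a stationarity measure.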
More specifically, module 32 quantizes the ten line spectral frequencies for both sets with a 26-bit multi-codebook split vector quantizer that classifies the unquantized line spectral frequency vector as a "voiced IRS-filtered," "unvoiced IRS-filtered," "voiced non-IRS-filtered," or "unvoiced non-IRS-filtered" vector, where "IRS" refers to intermediate reference system filter as specified by CCITT, Blue Book, Rec. P.48.
FIG. 6 shows an outline of the LSF vector quantization process. Module 32 employs a split vector quantizer for each classification, including a 3-4-3 split vector quantizer for the "voiced IRS-filtered" and the "voiced non-IRS-filtered" categories 51 and 53. The first three LSFs use an 8-bit codebook in function modules 55 and 57, the next four LSFs use a 10-bit codebook in function modules 59 and 61, and the last three LSFs use a 6-bit codebook in function modules 63 and 65. For the "unvoiced IRS-filtered" and the "unvoiced non-IRS-filtered" categories 52 and 54, a 3-3-4 split vector quantizer is used. The first three LSFs use a 7-bit codebook in function modules 56 and 58, the next three LSFs use an 8-bit vector codebook in function modules 60 and 62, and the last four LSFs use a 9-bit codebook in function modules 64 and 66. From each split vector codebook, the three best candidates are selected in function modules 67, 68, 69, and 70 using the energy weighted mean square error criteria. The energy weighting reflects the power level of the spectral envelope at each line spectral frequency. The three best candidates for each of the three split vectors result in a total of twenty-seven combinations for each category. The search is constrained so that at least one combination would result in an ordered set of LSFs. This is usually a very mild constraint imposed on the search. The optimum combination of these twenty-seven combinations is selected in function module 71 depending on the cepstral distortion measure. Finally, the optimal category or classification is determined also on the basis of the cepstral distortion measure. The quantized LSFs are converted to filter coefficients and then to autocorrelation lags for interpolation purposes.
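The bit budget and the twenty-seven-combination search can be sketched as follows. Several things here are assumptions made for illustration: the names are hypothetical, two bits are assumed to identify the category (24 codebook bits plus 2 gives the stated 26), combinations that are not strictly increasing are simply skipped, and a plain squared error stands in for the cepstral distortion measure that the text actually uses to pick the winner.

```python
from itertools import product

# (sub-vector length, codebook bits) per split, as listed for FIG. 6.
SPLITS = {
    "voiced": ((3, 8), (4, 10), (3, 6)),     # 3-4-3 split
    "unvoiced": ((3, 7), (3, 8), (4, 9)),    # 3-3-4 split
}
CATEGORY_BITS = 2  # assumption: 2 bits select one of the four categories


def total_bits(kind):
    """Total bits for one set of ten LSFs in the given kind of category."""
    return CATEGORY_BITS + sum(bits for _, bits in SPLITS[kind])


def squared_error(target, candidate):
    """Stand-in distortion; the text uses a cepstral distortion measure."""
    return sum((t - c) ** 2 for t, c in zip(target, candidate))


def best_ordered_combination(target, candidates_per_split, distortion=squared_error):
    """Search all combinations of per-split candidates (3 x 3 x 3 = 27).

    Combinations whose concatenated LSFs are not strictly increasing are
    skipped; the lowest-distortion ordered combination is returned.
    """
    best, best_cost = None, float("inf")
    for combo in product(*candidates_per_split):
        lsf = [f for part in combo for f in part]
        if any(hi <= lo for lo, hi in zip(lsf, lsf[1:])):
            continue                      # not an ordered set of LSFs
        cost = distortion(target, lsf)
        if cost < best_cost:
            best, best_cost = lsf, cost
    return best, best_cost
```

Keeping three candidates per split and deferring the final choice to a joint measure lets the search recover from a per-split winner that would break the ordering of the full vector.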
The resulting LSF vector quantizer scheme is not only effective across speakers but also across varying degrees of IRS filtering which models the influence of the handset transducer. The codebooks of the vector quantizers are trained from a sixty talker speech database using flat as well as IRS frequency shaping. This is designed to provide consistent and good performance across several speakers and across various handsets. The average log spectral distortion across the entire TIA half rate database is approximately 1.2 dB for IRS filtered speech data and approximately 1.3 dB for non-IRS filtered speech data.

Two estimates of the pitch are determined per frame at intervals of 20 msec. These open loop pitch estimates are used in mode selection and to encode the closed loop pitch analysis if the selected mode is a predominantly voiced mode.
Module 33 determines the two pitch estimates from the two pitch analysis windows described above in connection with FIG. 5B, using a modified form of the pitch tracking algorithm shown in FIG. 7. This pitch estimation algorithm makes an initial pitch estimate in function module 73 using an error function calculated for all values in the set {22.0, 22.5, ..., 114.5}, followed by pitch tracking to yield an overall optimum pitch value. Function module 74 employs look-back pitch tracking using the error functions and pitch estimates of the previous two pitch analysis windows. Function module 75 employs look-ahead pitch tracking using the error functions of the two future pitch analysis windows. Decision module 76 compares pitch estimates depending on look-back and look-ahead pitch tracking to yield an overall optimum pitch value at output 77. The pitch estimation algorithm shown in FIG. 7 requires the error functions of two future pitch analysis windows for its look-ahead pitch tracking and thus introduces a delay of 40 ms. In order to avoid this penalty, the preferred communication system employs a modification of the pitch estimation algorithm of FIG. 7.
FIG. 8 shows the open loop pitch estimation 33 of FIG. 3 in more detail. Pitch analysis windows one and two are input to respective compute error functions 331 and 332. The outputs of these error function computations are input to a refinement of past pitch estimates 333, and the refined pitch estimates are sent to both look back and look ahead pitch tracking 334 and 335 for pitch window one. The outputs of the pitch tracking circuits are input to selector 336, which selects the open loop pitch one as the first output. The selected open loop pitch one is also input to a look back pitch tracking circuit for pitch window two, which outputs the open loop pitch two.
FIG. 9 shows the modified pitch tracking algorithm implemented by the pitch estimation circuitry of FIG. 8. The modified pitch estimation algorithm employs the same error function as in the FIG. 7 algorithm in each pitch analysis window, but the pitch tracking scheme is altered. Prior to pitch tracking for either the first or second pitch analysis window, the previous two pitch estimates of the two previous pitch analysis windows are refined in function modules 81 and 82, respectively, with both look-back pitch tracking and look-ahead pitch tracking using the error functions of the current two pitch analysis windows. This is followed by look-back pitch tracking in function module 83 for the first pitch analysis window using the refined pitch estimates and error functions of the two previous pitch analysis windows. Look-ahead pitch tracking for the first pitch analysis window in function module 84 is limited to using the error function of the second pitch analysis window. The two estimates are compared in decision module 85 to yield an overall best pitch estimate for the first pitch analysis window. For the second pitch analysis window, look-back pitch tracking is carried out in function module 86, which also uses the pitch estimate of the first pitch analysis window and its error function. No look-ahead pitch tracking is used for this second pitch analysis window, with the result that the look-back pitch estimate is taken to be the overall best pitch estimate at output 87.
FIG. 10 shows the mode determination processing performed by mode selector 34. Depending on spectral stationarity, pitch stationarity, short term energy, short term level gradient, and zero crossing rate of each 40 ms. frame, mode selector 34 classifies each frame into one of three modes: voiced and stationary mode (Mode A), unvoiced or transient mode (Mode B), and background noise mode (Mode C). More specifically, mode selector 34 generates two logical values, each indicating spectral stationarity or similarity of spectral content between the currently processed frame and the previous frame (Step 1010). Mode selector 34 generates two logical values indicating pitch stationarity, similarity of fundamental frequencies, between the currently processed frame and the previous frame (Step 1020). Mode selector 34 generates two logical values indicating the zero crossing rate of the currently processed frame (Step 1030), a rate influenced by the higher frequency components of the frame relative to the lower frequency components of the frame. Mode selector 34 generates two logical values indicating level gradients within the currently processed frame (Step 1040). Mode selector 34 generates five logical values indicating short-term energy of the currently processed frame (Step 1050). Subsequently, mode selector 34 determines the mode of the frame to be mode A, mode B, or mode C, depending on the values generated in Steps 1010-1050 (Step 1060).
FIG. 11 is a block diagram showing a processing of Step 1010 of FIG. 10 in more detail. The processing of FIG. 11 determines a cepstral distortion in dB. Module 1110 converts the quantized filter coefficients of window 2 of the current frame into the lag domain, and module 1120 converts the quantized filter coefficients of window 2 of the previous frame into the lag domain. Module 1130 interpolates the outputs of modules 1110 and 1120, and module 1140 converts the output of module 1130 back into filter coefficients. Module 1150 converts the output from module 1140 into the cepstral domain, and module 1160 converts the unquantized filter coefficients from window 1 of the current frame into the cepstral domain. Module 1170 generates the cepstral distortion $d_c$ from the outputs of 1150 and 1160.
FIG. 12 shows generation of spectral stationarity value LPCFLAG1, which is a relatively strong indicator of spectral stationarity for the frame. Mode selector 34 generates LPCFLAG1 using a combination of two techniques for measuring spectral stationarity. The first technique compares the cepstral distortion $d_c$ using comparators 1210 and 1220. In FIG. 12, the $d_{t1}$ threshold input to comparator 1210 is -8.0 and the $d_{t2}$ threshold input to comparator 1220 is -6.0.
The second technique is based on the residual energy after LPC analysis, expressed as a fraction of the LPC analysis speech buffer spectral energy. This energy is a byproduct of LPC analysis, as described above. The $\alpha_1$ input to comparator 1230 is the residual energy for the filter coefficients of window 1 and the $\alpha_2$ input to comparator 1240 is the residual energy of the filter coefficients of window 2. The $\alpha_{t1}$ input to comparators 1230 and 1240 is a threshold equal to 0.25.
FIG. 13 shows dataflow within mode selector 34 for a generation of spectral stationarity value flag LPCFLAG2, which is a relatively weak indicator of spectral stationarity. The processing shown in FIG. 13 is similar to that shown in FIG. 12, except that LPCFLAG2 is based on a relatively relaxed set of thresholds. The $d_{t2}$ input to comparator 1310 is -6.0, the $d_{t3}$ input to comparator 1320 is -4.0, the $d_{t4}$ input to comparator 1350 is -2.0, the $\alpha_{t1}$ input to comparators 1330 and 1340 is a threshold 0.25, and the $\alpha_{t2}$ input to comparators 1360 and 1370 is 0.15.
Mode selector 34 measures pitch stationarity using both the open loop pitch values of the current frame, denoted as $P_1$ for pitch window 1 and $P_2$ for pitch window 2, and the open loop pitch value of window 2 of the previous frame, denoted by $P_{-1}$. A lower range of pitch values $(P_{L1}, P_{U1})$ and an upper range of pitch values $(P_{L2}, P_{U2})$ are:
$$P_{L1} = \min(P_{-1}, P_2) - P_t, \qquad P_{U1} = \min(P_{-1}, P_2) + P_t$$
$$P_{L2} = \max(P_{-1}, P_2) - P_t, \qquad P_{U2} = \max(P_{-1}, P_2) + P_t$$
where $P_t$ is 8.0. If the two ranges are non-overlapping, i.e., $P_{L2} > P_{U1}$, then only a weak indicator of pitch stationarity, denoted by PITCHFLAG2, is possible and PITCHFLAG2 is set if $P_1$ lies within either the lower range $(P_{L1}, P_{U1})$ or upper range $(P_{L2}, P_{U2})$. If the two ranges are overlapping, i.e., $P_{L2} \le P_{U1}$, a strong indicator of pitch stationarity, denoted by PITCHFLAG1, is possible and is set if $P_1$ lies within the range $(P_L, P_U)$, where
$$P_L = \frac{P_{-1} + P_2}{2} - 2P_t, \qquad P_U = \frac{P_{-1} + P_2}{2} + 2P_t$$
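The pitch stationarity test can be sketched directly from the range definitions. The sketch makes assumptions that the text does not settle: the function name is hypothetical, the range tests are taken as inclusive, the lower bounds subtract and the upper bounds add the margin, and a flag that is "not possible" in a given case is simply left false. The reliability values handled by FIG. 14 are not modeled.

```python
def pitch_flags(p_prev, p1, p2, pt=8.0):
    """Return (PITCHFLAG1, PITCHFLAG2) for open loop pitch values.

    p_prev is the pitch of window 2 of the previous frame; p1 and p2 are
    the pitch values of windows 1 and 2 of the current frame.
    """
    low, high = min(p_prev, p2), max(p_prev, p2)
    pl1, pu1 = low - pt, low + pt        # lower range
    pl2, pu2 = high - pt, high + pt      # upper range

    flag1 = flag2 = False
    if pl2 > pu1:
        # Non-overlapping ranges: only the weak indicator is possible.
        flag2 = (pl1 <= p1 <= pu1) or (pl2 <= p1 <= pu2)
    else:
        # Overlapping ranges: the strong indicator is possible.
        mid = (p_prev + p2) / 2.0
        flag1 = (mid - 2.0 * pt) <= p1 <= (mid + 2.0 * pt)
    return flag1, flag2
```

For example, pitch values 50, 52, 54 (previous window 2, current window 1, current window 2) give overlapping ranges and set the strong flag, while 30, 32, 80 give disjoint ranges and set only the weak flag.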
FIG. 14 shows a dataflow for generating PITCHFLAG1 and PITCHFLAG2 within mode selector 34. Module 14005 generates an output equal to the input having the largest value, and module 14010 generates an output equal to the input having the smallest value. Module 14020 generates an output that is an average of the values of the two inputs. Modules 14030, 14035, 14040, 14045, 14050 and 14055 are adders. Modules 14080, 14025 and 14090 are AND gates. Module 1408? is an inverter. Modules 14065, 14070, and 14075 are each logic blocks generating a true output when input C lies within the range bounded by inputs A and B. The circuit of FIG. 14 also processes reliability values $V_{-1}$, $V_1$, and $V_2$, each indicating whether the values $P_{-1}$, $P_1$, and $P_2$, respectively, are reliable. Typically, these reliability values are a byproduct of the pitch calculation algorithm. The circuit shown in FIG. 14 generates false values for PITCHFLAG1 and PITCHFLAG2 if any of these flags $V_{-1}$, $V_1$, $V_2$ are false. Processing of these reliability values is optional.
FIG. 15 shows dataflow within mode selector 34 for generating two logical values indicating a zero crossing rate for the frame. Modules 15002, 15004, 15006, 15008, 15010, 15012, 15014 and 15016 each count the number of zero crossings in a respective 5 millisecond subframe of the frame currently being processed. For example, module 15006 counts the number of zero crossings of the signal occurring from the time 10 milliseconds from the beginning of the frame to the time 15 ms from the beginning of the frame. Comparators 15018, 15020, 15022, 15024, 15026, 15028, 15030, and 15032, in combination with adder 15035, generate a value indicating the number of 5 millisecond (ms) subframes having zero crossings of ≥ 15. Comparator 15040 sets the flag ZC_LOW when the number of such subframes is less than 2, and the comparator 15037 sets the flag ZC_HIGH when the number of such subframes is greater than 5. The value $ZC_t$ input to comparators 15018-15032 is 15, the value $Z_{t1}$ input to comparator 15040 is 2, and the value $Z_{t2}$ input to comparator 15037 is 5.
FIGS. 16A, 16B, and 16C show a data flow for generating two logical values indicative of short term level gradient. Mode selector 34 measures short term level gradient, an indication of transients within a frame, using a low-pass filtered version of the companded input signal amplitude. Module 16005 generates the absolute value of the input signal $S(n)$, module 16010 compands its input signal, and low-pass filter 16015 generates a signal $A_L(n)$ that, at time instant $n$, is expressed by
$$A_L(n) = \frac{63}{64}A_L(n-1) + \frac{1}{64}C(|S(n)|)$$
where the companding function $C(\cdot)$ is the μ-law function described in CCITT G.711. Delay 16025 generates an output that is a 10 ms-delayed version of its input and subtractor 16027 generates a difference between $A_L(n)$ and $A_L(n-80)$. Module 16030 generates a signal that is an absolute value of its input.
Every 5 ms, mode selector 34 compares $A_L(n)$ with that of 10 ms ago and, if the difference $|A_L(n) - A_L(n-80)|$ exceeds a fixed relaxed threshold, increments a counter. (In the preceding expression, 80 corresponds to 8 samples per ms times 10 ms.) As shown in FIG. 16C, if this difference does not exceed a relatively stringent threshold ($L_{t2} = 32$) for any subframe, mode selector 34 sets LVLFLAG2, weakly indicating an absence of transients. As shown in FIG. 16B, if this difference exceeds a more relaxed threshold ($L_{t1} = 10$) for no more than one subframe ($L_{t3} = 2$), mode selector 34 sets LVLFLAG1, strongly indicating an absence of transients.
More specifically, FIG. 16B shows delay circuits 16032-16046 that each generate a 5 ms delayed version of its input. Each of latches 16048-16062 saves a signal on its input. Latches 16048-16062 are strobed at a common time, near the end of each 40 ms speech frame, so that each latch saves a portion of the frame separated by 5 ms from the portion saved by an adjacent latch. Comparators 16064-16078 each compare the output of a respective latch to the threshold $L_{t1}$, and adder 16080 sums the comparator outputs and sends the sum to comparator 16082 for comparison to the threshold $L_{t3}$.
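The zero crossing and level gradient flags can be sketched as below. This is an illustration under stated assumptions, not the circuits of FIGS. 15 and 16: all function names are hypothetical, a zero crossing is taken to be a sign change between consecutive samples, a subframe counts as busy when it has at least 15 crossings, the low-pass filter is assumed to be the one-pole recursion with weights 63/64 and 1/64, and the compander is passed in as a callable because the scaling of the G.711 μ-law function is not spelled out here.

```python
SUBFRAME_SAMPLES = 40  # 5 ms at 8 kHz


def zero_crossings(block):
    """Count sign changes between consecutive samples (zero treated as positive)."""
    return sum(1 for a, b in zip(block, block[1:]) if (a >= 0) != (b >= 0))


def zero_crossing_flags(frame, zc_t=15, z_t1=2, z_t2=5):
    """Return ZC_LOW / ZC_HIGH for one 40 ms frame (320 samples)."""
    subframes = [frame[i:i + SUBFRAME_SAMPLES]
                 for i in range(0, len(frame), SUBFRAME_SAMPLES)]
    busy = sum(1 for sub in subframes if zero_crossings(sub) >= zc_t)
    return {"ZC_LOW": busy < z_t1, "ZC_HIGH": busy > z_t2}


def smooth_level(samples, compand, state=0.0):
    """One-pole low-pass of the companded magnitude (assumed recursion)."""
    out = []
    for s in samples:
        state = (63.0 / 64.0) * state + (1.0 / 64.0) * compand(abs(s))
        out.append(state)
    return out


def level_gradient_flags(diffs, l_t1=10.0, l_t2=32.0, l_t3=2):
    """Return LVLFLAG1 / LVLFLAG2 from the eight per-subframe differences
    |A_L(n) - A_L(n - 80)| latched for one frame."""
    exceed = sum(1 for d in diffs if abs(d) > l_t1)
    return {
        "LVLFLAG1": exceed < l_t3,                       # at most one subframe above Lt1
        "LVLFLAG2": all(abs(d) <= l_t2 for d in diffs),  # no subframe above Lt2
    }
```

A caller would run `smooth_level` over the incoming samples, take the difference between the current value and the value 80 samples earlier at each 5 ms boundary, and pass the eight differences for a frame to `level_gradient_flags`.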
FIG. 16C shows a circuit for generating LVLFLAG2. In FIG. 16C, delays 16132-16146 are similar to the delays shown in FIG. 16B and latches 16148-16162 are similar to the latches shown in FIG. 16B. Comparators 16164-16178 each compare an output of a respective latch to the threshold $L_{t2} = 32$. Thus, OR gate 16180 generates a true output if any of the latched signals originating from module 16030 exceeds the threshold $L_{t2}$. Inverter 16182 inverts the output of OR gate 16180.
FIG. 17 shows a data flow for generating parameters indicative of short term energy. Short term energy is measured as the mean square energy (average energy per sample) on a frame basis as well as on a 5 ms basis. The short term energy is determined relative to a background energy $E_{bn}$. $E_{bn}$ is initially set to a constant $E_0 = (100 \times 12^{1/2})^2$. Subsequently, when a frame is determined to be mode C, $E_{bn}$ is set equal to $(7/8)E_{bn} + (1/8)E_0$. Thus, some of the thresholds employed in the circuit of FIG. 17 are adaptive. In FIG. 17, $E_{t0} = 0.707E_{bn}$, $E_{t1} = 5$, $E_{t2} = 2.5E_{bn}$, $E_{t3} = 1.8E_{bn}$, $E_{t4} = E_{bn}$, $E_{t5} = 0.707E_{bn}$, and $E_{t6} = 16.0$.
The short term energy on a 5 ms basis provides an indication of presence of speech throughout the frame using a single flag EFLAG1, which is generated by testing the short term energy on a 5 ms basis against a threshold, incrementing a counter whenever the threshold is exceeded, and testing the counter's final value against a fixed threshold. Comparing the short term energy on a frame basis to various thresholds provides indication of absence of speech throughout the frame in the form of several flags with varying degrees of confidence. These flags are denoted as EFLAG2, EFLAG3, EFLAG4, and EFLAG5.
FIG. 17 shows dataflow within mode selector 34 for generating these flags. Modules 17002, 17004, 17006, 17008, 17010, 17015, 17020, and 17022 each count the energy in a respective 5 ms subframe of the frame currently being processed. Comparators 17030, 17032, 17034, 17036, 17038, 17040, 17042, and 17044, in combination with adder 17050, count the number of subframes having an energy exceeding $E_{t0} = 0.707E_{bn}$.
FIGS. 18A, 18B, and 18C show the processing of step 1060. Mode selector 34 first classifies the frame as background noise (mode C) or speech (modes A or B). Mode C tends to be characterized by low energy, relatively high spectral stationarity between the current frame and the previous frame, a relative absence of pitch stationarity between the current frame and the previous frame, and a high zero crossing rate. Background noise (mode C) is declared either on the basis of the strongest short term energy flag EFLAG5 alone or by combining weaker short term energy flags EFLAG4, EFLAG3, and EFLAG2 with other flags indicating high zero crossing rate, absence of pitch, absence of transients, etc.
More specifically, if the mode of the previous frame was A or if EFLAG2 is not true, processing proceeds to step 18045 (step 18005). Step 18005 ensures that the current frame will not be mode C if the previous frame was mode A. The current frame is mode C if (LPCFLAG1 and EFLAG3) is true or (LPCFLAG2 and EFLAG4) is true or EFLAG5 is true (steps 18010, 18015, and 18020). The current frame is mode C if ((not PITCHFLAG1) and LPCFLAG1 and ZC_HIGH) is true (step 18025) or ((not PITCHFLAG1) and (not PITCHFLAG2) and LPCFLAG2 and ZC_HIGH) is true (step 18030). Thus, the processing shown in FIG. 18A determines whether the frame corresponds to a first mode (Mode C), depending on whether a speech component is substantially absent from the frame.
In step 18045, a score is calculated depending on the mode of the previous frame. If the mode of the previous frame was mode A, the score is 1 + LVLFLAG1 + EFLAG1 + ZC_LOW. If the previous mode was mode B, the score is 0 + LVLFLAG1 + EFLAG1 + ZC_LOW. If the mode of the previous frame was mode C, the score is 2 + LVLFLAG1 + EFLAG1 + ZC_LOW.
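The background noise test of steps 18005 to 18030 and the score of step 18045 can be written as plain boolean logic. The sketch below follows the conditions as they are described for FIG. 18A; the function names and the dictionary of flags are hypothetical conveniences, and the flag names are the readings LPCFLAG1/2, PITCHFLAG1/2, EFLAG1-5, LVLFLAG1, ZC_LOW and ZC_HIGH.

```python
# Offset added to the score according to the mode of the previous frame.
PREVIOUS_MODE_OFFSET = {"A": 1, "B": 0, "C": 2}


def is_background_noise(f, prev_mode):
    """Return True when the frame is classified as mode C (steps 18005-18030)."""
    if prev_mode == "A" or not f["EFLAG2"]:
        return False                      # step 18005: skip to the score
    return bool(
        (f["LPCFLAG1"] and f["EFLAG3"])
        or (f["LPCFLAG2"] and f["EFLAG4"])
        or f["EFLAG5"]
        or (not f["PITCHFLAG1"] and f["LPCFLAG1"] and f["ZC_HIGH"])
        or (not f["PITCHFLAG1"] and not f["PITCHFLAG2"]
            and f["LPCFLAG2"] and f["ZC_HIGH"])
    )


def stationarity_score(f, prev_mode):
    """Score of step 18045: offset plus LVLFLAG1 + EFLAG1 + ZC_LOW."""
    return (PREVIOUS_MODE_OFFSET[prev_mode]
            + int(f["LVLFLAG1"]) + int(f["EFLAG1"]) + int(f["ZC_LOW"]))
```

Because the previous-mode check comes first, a frame that follows a mode A frame is never declared background noise, however strong the energy flags are.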
If the mode of the previous frame was mode C or not LVFLAG2, the mode of the current frame is mode B (step 18050). The current frame is mode A if (LPCFLAG1 and PITCHFLAG1) is true, provided the score is not less than 2 (steps 18060 and 18055). The current frame is mode A if (LPCFLAG1 and PITCHFLAG2) is true or (LPCFLAG2 and PITCHFLAG1) is true, provided the score is not less than 3 (steps 18070, 18075, and 18080). Subsequently, speech encoder 12 generates an encoded frame in accordance with one of a first coding scheme (a coding scheme for mode C), when the frame corresponds to the first mode, and an alternative coding scheme (a coding scheme for mode A or B), when the frame does not correspond to the first mode, as described in more detail below.
For mode A, only the second set of line spectral frequency vector quantization indices need to be transmitted because the first set can be inferred at the receiver due to the slowly varying nature of the vocal tract shape. In addition, the first and second open loop pitch estimates are quantized and transmitted because they are used to encode the closed loop pitch estimates in each subframe. The quantization of the second open loop pitch estimate is accomplished using a non-uniform 4-bit quantizer while the quantization of the first open loop pitch estimate is accomplished using a differential non-uniform 3-bit quantizer. Since the vector quantization indices of the LSF's for the first linear prediction analysis window are neither transmitted nor used in mode selection, they need not be calculated in mode A. This reduces the complexity of the short term predictor section of the encoder in this mode. This reduced complexity as well as the lower bit rate of the short term predictor parameters in mode A is offset by faster update of all the excitation model parameters.
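The mode selection rules of steps 18010 to 18080 described above can be sketched as a small decision function. This is only an illustrative sketch: the flags are taken as precomputed booleans, and both the order in which the tests are applied and the final fall-through to mode B are assumptions that the text does not state explicitly. The function name `select_mode` and the dictionary layout are hypothetical.

```python
def select_mode(prev_mode, f):
    """Return 'A', 'B' or 'C' for the current frame.

    prev_mode is the mode of the previous frame; f maps flag names to booleans.
    Test ordering and the final default to mode B are assumptions of this sketch.
    """
    # Steps 18010-18030: frames in which speech is substantially absent.
    if (f["LPCFLAG1"] and f["EFLAG3"]) or (f["LPCFLAG2"] and f["EFLAG4"]) or f["EFLAG5"]:
        return "C"
    if (not f["PITCHFLAG1"]) and f["LPCFLAG1"] and f["ZC_HIGH"]:
        return "C"
    if (not f["PITCHFLAG1"]) and (not f["PITCHFLAG2"]) and f["LPCFLAG2"] and f["ZC_HIGH"]:
        return "C"

    # Step 18045: score depends on the previous mode plus three flags.
    base = {"A": 1, "B": 0, "C": 2}[prev_mode]
    score = base + int(f["LVFLAG1"]) + int(f["EFLAG1"]) + int(f["ZC_LOW"])

    # Step 18050.
    if prev_mode == "C" or not f["LVFLAG2"]:
        return "B"
    # Steps 18055-18080.
    if f["LPCFLAG1"] and f["PITCHFLAG1"] and score >= 2:
        return "A"
    if ((f["LPCFLAG1"] and f["PITCHFLAG2"]) or (f["LPCFLAG2"] and f["PITCHFLAG1"])) and score >= 3:
        return "A"
    return "B"  # assumed default when no mode A condition holds


def make_flags(**set_true):
    """Helper: all flags False except those passed as keyword arguments."""
    names = ["LPCFLAG1", "LPCFLAG2", "PITCHFLAG1", "PITCHFLAG2", "EFLAG1", "EFLAG3",
             "EFLAG4", "EFLAG5", "LVFLAG1", "LVFLAG2", "ZC_LOW", "ZC_HIGH"]
    flags = {name: False for name in names}
    flags.update(set_true)
    return flags
```

For example, a frame with LPCFLAG1, PITCHFLAG1, LVFLAG1 and LVFLAG2 set is classified as mode A after a mode A frame (score 2) but as mode B after a mode B frame (score 1).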
For mode B, both sets of line spectral frequency vector quantization indices must be transmitted because of potential spectral nonstationarity. However, for the first set of line spectral frequencies we need search only 2 of the 4 classifications or categories. This is because the IRS vs. non-IRS selection varies very slowly with time. If the second set of line spectral frequencies were chosen from the "voiced IRS-filtered" category, then the first set can be expected to be from either the "voiced IRS-filtered" or "unvoiced IRS-filtered" categories. If the second set of line spectral frequencies were chosen from the "unvoiced IRS-filtered" category, then again the first set can be expected to be from either the "voiced IRS-filtered" or "unvoiced IRS-filtered" categories. If the second set of line spectral frequencies were chosen from the "voiced non-IRS-filtered" category, then the first set can be expected to be from either the "voiced non-IRS-filtered" or "unvoiced non-IRS-filtered" categories. Finally, if the second set of line spectral frequencies were chosen from the "unvoiced non-IRS-filtered" category, then again the first set can be expected to be from either the "voiced non-IRS-filtered" or "unvoiced non-IRS-filtered" categories. As a result only two categories of LSF codebooks need be searched for the quantization of the first set of line spectral frequencies. Furthermore, only 25 bits are needed to encode these quantization indices instead of the 26 needed for the second set of LSF's, since the optimal category for the first set can be coded using just 1 bit. For mode B, neither of the two open loop pitch estimates are transmitted since they are not used in guiding the closed loop pitch estimates. The higher complexity involved in encoding as well as the higher bit rate of the short term predictor parameters in mode B is compensated by a slower update of all the excitation model parameters.
For mode C, only the second set of line spectral frequency vector quantization indices need to be transmitted because the human ear is not as sensitive to rapid changes in spectral shape variations for noisy inputs. Further, such rapid spectral shape variations are atypical for many kinds of background noise sources. For mode C, neither of the two open loop pitch estimates are transmitted since they are not used in guiding the closed loop pitch estimation. The lower complexity involved as well as the lower bit rate of the short term predictor parameters in mode C is compensated by a faster update of the fixed codebook gain portion of the excitation model parameters.
The gain quantization tables are tailored to each of the modes. Also in each mode, the closed loop parameters are refined using a delayed decision approach. This delayed decision is employed in such a way that the overall codec delay is not increased. Such a delayed decision approach is very effective in transition regions.
In mode A, the quantization indices corresponding to the second set of short term predictor coefficients as well as the open loop pitch estimates are transmitted. Only these quantized parameters are used in the excitation modeling. The 40-msec speech frame is divided into seven subframes. The first six are 5.75 msec in length and the seventh is 5.5 msec in length. In each subframe, an interpolated set of short term predictor coefficients is used. The interpolation is done in the autocorrelation lag domain. Using this interpolated set of coefficients, a closed loop analysis by synthesis approach is used to derive the optimum pitch index, pitch gain index, fixed codebook index, and fixed codebook gain index for each subframe. The closed loop pitch index search range is centered around an interpolated trajectory of the open loop pitch estimates. The trade-off between the search range and the pitch resolution is done in a dynamic fashion depending on the closeness of the open loop pitch estimates. The fixed codebook employs zinc pulse shapes which are obtained using a weighted combination of the sinc pulse and a phase shifted version of its Hilbert transform. The fixed codebook gain is quantized in a differential manner.
The analysis by synthesis technique that is used to derive the excitation model parameters employs an interpolated set of short term predictor coefficients in each subframe. The optimal set of excitation model parameters for each subframe is determined only at the end of each 40 ms frame because of delayed decision. In deriving the excitation model parameters, all the seven subframes are assumed to be of length 5.75 ms or forty-six samples. However, for the last or seventh subframe, the end of subframe updates such as the adaptive codebook update and the update of the local short term predictor state variables are carried out only for a subframe length of 5.5 ms or forty-four samples.
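The mode A subframe layout above can be checked with a few lines of arithmetic. The sketch below assumes an 8 kHz sampling rate, which is not stated in this passage but follows from 5.75 ms corresponding to forty-six samples; the names are illustrative only.

```python
SAMPLE_RATE_HZ = 8000  # assumption: implied by 5.75 ms == 46 samples


def mode_a_subframe_lengths():
    """Six subframes of 46 samples followed by one subframe of 44 samples."""
    return [46] * 6 + [44]


lengths = mode_a_subframe_lengths()
frame_samples = sum(lengths)                         # samples in one frame
frame_ms = 1000 * frame_samples / SAMPLE_RATE_HZ     # frame duration in ms
subframe_ms = [1000 * n / SAMPLE_RATE_HZ for n in lengths]
```

Under that assumption the seven subframes add up to exactly one 40 ms frame.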
The short term predictor parameters or linear prediction filter parameters are interpolated from subframe to subframe. The interpolation is carried out in the autocorrelation domain. The normalized autocorrelation coefficients derived from the quantized filter coefficients for the second linear prediction analysis window are denoted as $\{\rho_{-1}(i)\}$ for the previous 40 ms frame and by $\{\rho_2(i)\}$ for the current 40 ms frame for $0 \le i \le 10$ with $\rho_{-1}(0) = \rho_2(0) = 1.0$. The interpolated autocorrelation coefficients $\{\rho'_m(i)\}$ are then given by
$$\rho'_m(i) = \nu_m\,\rho_2(i) + [1 - \nu_m]\,\rho_{-1}(i), \quad 1 \le m \le 7,\ 0 \le i \le 10,$$
or in vector notation
$$\rho'_m = \nu_m\,\rho_2 + [1 - \nu_m]\,\rho_{-1}, \quad 1 \le m \le 7.$$
Here, $\nu_m$ is the interpolating weight for subframe m. The interpolated lags $\{\rho'_m(i)\}$ are subsequently converted to the short term predictor filter coefficients $\{a'_m(i)\}$.
The choice of interpolating weights affects voice quality in this mode significantly. For this reason, they must be determined carefully. These interpolating weights $\nu_m$ have been determined for subframe m by minimizing the mean square error between the actual short term power spectral envelope $S_{m,J}(\omega)$ and the interpolated short term power spectral envelope $S'_{m,J}(\omega)$ over all speech frames J of a very large speech database. In other words, $\nu_m$ is determined by minimizing
$$E_m = \sum_J \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| S_{m,J}(\omega) - S'_{m,J}(\omega) \right|^2 d\omega.$$
If the actual autocorrelation coefficients for subframe m in frame J are denoted by $\{\rho_{m,J}(k)\}$, then by definition
$$S_{m,J}(\omega) = \sum_k \rho_{m,J}(k)\, e^{-j\omega k}, \qquad S'_{m,J}(\omega) = \sum_k \rho'_{m,J}(k)\, e^{-j\omega k}.$$
Substituting the above equations into the preceding equation, it can be shown that minimizing $E_m$ is equivalent to minimizing $E'_m$ where $E'_m$ is given by
$$E'_m = \sum_J \sum_k \left[ \rho_{m,J}(k) - \rho'_{m,J}(k) \right]^2,$$
or in vector notation
$$E'_m = \sum_J \left\| \rho_{m,J} - \rho'_{m,J} \right\|^2,$$
where $\|\cdot\|$ represents the vector norm. Substituting $\rho'_{m,J}$ into the above equation, differentiating with respect to $\nu_m$ and setting it to zero results in
$$\nu_m = \frac{\sum_J \langle x_J, y_{m,J} \rangle}{\sum_J \langle x_J, x_J \rangle},$$
where $x_J = \rho_{2,J} - \rho_{-1,J}$ and $y_{m,J} = \rho_{m,J} - \rho_{-1,J}$, and $\langle x_J, y_{m,J} \rangle$ is the dot product between vectors $x_J$ and $y_{m,J}$. The values of $\nu_m$ calculated by the above method using a very large speech database are further fine tuned by listening tests.
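The closed-form weight above is an ordinary one-parameter least squares fit, which the following sketch illustrates on plain Python lists. The function names and the tuple layout of the training frames are illustrative assumptions; the listening-test fine tuning mentioned in the text is not modeled.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def interpolation_weight(frames):
    """Least squares weight for one subframe position m.

    frames: iterable of (rho_prev, rho_cur, rho_actual) autocorrelation lag
    vectors, i.e. rho_{-1,J}, rho_{2,J} and rho_{m,J} for each training frame J.
    """
    num = 0.0
    den = 0.0
    for rho_prev, rho_cur, rho_actual in frames:
        x = [c - p for c, p in zip(rho_cur, rho_prev)]      # x_J
        y = [a - p for a, p in zip(rho_actual, rho_prev)]   # y_{m,J}
        num += dot(x, y)
        den += dot(x, x)
    return num / den


def blend(rho_prev, rho_cur, weight):
    """Interpolated lags: weight * rho_cur + (1 - weight) * rho_prev."""
    return [weight * c + (1.0 - weight) * p for c, p in zip(rho_cur, rho_prev)]
```

If the actual lags were generated exactly by the interpolation formula with some weight, `interpolation_weight` recovers that weight.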
The target vector $t_{ac}$ for the adaptive codebook search is related to the speech vector $s$ in each subframe by $s = H\,t_{ac} + z$. Here, $H$ is the square lower triangular Toeplitz matrix whose first column contains the impulse response of the interpolated short term predictor $\{a'_m(i)\}$ for the subframe m, and $z$ is the vector containing its zero input response. The target vector $t_{ac}$ is most easily calculated by subtracting the zero input response $z$ from the speech vector $s$ and filtering the difference by the inverse short term predictor with zero initial states.
The adaptive codebook search in adaptive codebooks 3506 and 3507 employs a spectrally weighted mean square error $\xi_i$ to measure the distance between a candidate vector $r_i$ and the target vector $t_{ac}$ as given by
$$\xi_i = (t_{ac} - \mu_i r_i)^T\, W\, (t_{ac} - \mu_i r_i).$$
Here, $\mu_i$ is the associated gain and $W$ is the spectral weighting matrix. $W$ is a positive definite symmetric Toeplitz matrix that is derived from the truncated impulse response of the weighted short term predictor with filter coefficients $\{a'_m(i)\,\gamma^i\}$. The weighting factor $\gamma$ is 0.8. Substituting for the optimum $\mu_i$ in the above expression, the distortion term can be rewritten as
$$\xi_i = t_{ac}^T W t_{ac} - \frac{[\rho_i]^2}{e_i},$$
where $\rho_i$ is the correlation term $t_{ac}^T W r_i$ and $e_i$ is the energy term $r_i^T W r_i$. Only those candidates are considered that have a positive correlation. The best candidate vectors are the ones that have positive correlations and the highest values of $[\rho_i]^2 / e_i$.
The candidate vectors $r_i$ correspond to different pitch delays. These pitch delays in samples lie in the range [20,146]. Fractional pitch delays are possible but the fractional part f is restricted to be either 0.00, 0.25, 0.50 or 0.75. The candidate vector corresponding to an integer delay L is simply read from the adaptive codebook, which is a collection of the past excitation samples. For a mixed (integer plus fraction) delay L+f, the portion of the adaptive codebook centered around the section corresponding to the integer delay L is filtered by a polyphase filter corresponding to fraction f. Incomplete candidate vectors corresponding to low delay values less than a subframe length are completed in the same manner as suggested by J. Campbell et al., supra. The polyphase filter coefficients are derived from a prototype low pass filter designed to have good passband as well as good stopband characteristics. Each polyphase filter has 8 taps.
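The selection rule above (keep only positive correlations and maximize the squared correlation divided by the energy) can be sketched as follows. This is a minimal illustration on plain lists, not the encoder's search: the candidate vectors and the weighting matrix are passed in directly, and the function names are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def mat_vec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in W]


def best_candidate(target, candidates, W):
    """Return (index, optimum gain) of the best candidate, or (None, 0.0).

    Only candidates with a positive correlation t^T W r are considered; among
    those, the one with the largest correlation^2 / energy is selected.
    W is assumed symmetric, so W t can be computed once and reused.
    """
    w_target = mat_vec(W, target)
    best_index, best_gain, best_metric = None, 0.0, 0.0
    for i, r in enumerate(candidates):
        corr = dot(w_target, r)              # correlation term t^T W r
        if corr <= 0.0:
            continue
        energy = dot(r, mat_vec(W, r))       # energy term r^T W r
        metric = corr * corr / energy
        if metric > best_metric:
            best_index, best_gain, best_metric = i, corr / energy, metric
    return best_index, best_gain
```

Returning `(None, 0.0)` when no candidate has a positive correlation corresponds to the case in which the all zero adaptive codebook vector would be chosen.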
The adaptive codebook search does not search all candidate vectors. For the first 3 subframes, a 5-bit search range is determined by the second quantized open loop pitch estimate $P'_{-1}$ of the previous 40 ms frame and the first quantized open loop pitch estimate $P'_1$ of the current 40 ms frame. If the previous mode were B, then the value of $P'_{-1}$ is taken to be the last subframe pitch delay in the previous frame. For the last 4 subframes, this 5-bit search range is determined by the second quantized open loop pitch estimate $P'_2$ of the current 40 ms frame and the first quantized open loop pitch estimate $P'_1$ of the current 40 ms frame. For the first 3 subframes, this 5-bit search range is split into two 4-bit ranges with each range centered around $P'_{-1}$ and $P'_1$. If these two 4-bit ranges overlap, then a single 5-bit range is used which is centered around $\{P'_{-1} + P'_1\}/2$. Similarly, for the last 4 subframes, this 5-bit search range is split into two 4-bit ranges with each range centered around $P'_1$ and $P'_2$. If these two 4-bit ranges overlap, then a single 5-bit range is used which is centered around $\{P'_1 + P'_2\}/2$.
The search range selection also determines what fractional resolution is needed for the closed loop pitch. This desired fractional resolution is determined directly from the quantized open loop pitch estimates $P'_{-1}$ and $P'_1$ for the first 3 subframes and from $P'_1$ and $P'_2$ for the last 4 subframes. If the two determining open loop pitch estimates are within 4 integer delays of each other resulting in a single 5-bit search range, only 8 integer delays centered around the mid-point are searched but the fractional pitch portion f can assume values of 0.00, 0.25, 0.50, or 0.75 and these are therefore also searched. Thus 3 bits are used to encode the integer portion while 2 bits are used to encode the fractional portion of the closed loop pitch. If the two determining open loop pitch estimates are within 8 integer delays of each other resulting in a single 5-bit search range, only 16 integer delays centered around the mid-point are searched but the fractional pitch portion f can assume values of 0.0 or 0.5 and these are therefore also searched. Thus 4 bits are used to encode the integer portion while 1 bit is used to encode the fractional portion of the closed loop pitch. If the two determining open loop pitch estimates are more than 8 integer delays apart, only integer delays, i.e., f = 0.0 only, are searched in either the single 5-bit search range or the two 4-bit search ranges determined. Thus all 5 bits are spent in encoding the integer portion of the closed loop pitch.
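The trade-off between search range and pitch resolution described above can be summarized in a short sketch. It assumes that "within 4" and "within 8" integer delays are inclusive bounds, which the text does not spell out; the function name and return layout are illustrative.

```python
def pitch_search_plan(p_a, p_b):
    """Split of the 5-bit closed loop pitch code for one subframe.

    p_a, p_b: the two determining quantized open loop pitch estimates.
    Returns (integer_bits, fraction_bits, allowed_fractions).
    Inclusive thresholds are an assumption of this sketch.
    """
    gap = abs(p_a - p_b)
    if gap <= 4:
        # 8 integer delays around the mid-point, quarter-sample resolution.
        return 3, 2, (0.0, 0.25, 0.5, 0.75)
    if gap <= 8:
        # 16 integer delays around the mid-point, half-sample resolution.
        return 4, 1, (0.0, 0.5)
    # Estimates far apart: integer delays only.
    return 5, 0, (0.0,)
```

In every case the integer and fractional parts together use the same 5 bits, so closer open loop estimates buy finer resolution at no extra bit cost.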
The search complexity may be reduced in the case of fractional pitch delays by first searching for the optimum integer delay and searching for the optimum fractional pitch delay only in its neighborhood. One of the 5-bit indices, the all zero index, is reserved for the all zero adaptive codebook vector. This is accommodated by trimming the 5-bit or 32 pitch delay search range to a 31 pitch delay search range. As indicated before, the search is restricted to only positive correlations and the all zero index is chosen if no such positive correlation is found. The adaptive codebook gain is determined after the search by quantizing the ratio of the optimum correlation to the optimum energy using a non-uniform 3-bit quantizer. This 3-bit quantizer only has positive gain values in it since only positive gains are possible.
Since delayed decision is employed, the adaptive codebook search produces the two best pitch delay or lag candidates in all subframes. Furthermore, for subframes two to six, this has to be repeated for the two best target vectors produced by the two best sets of excitation model parameters derived for the previous subframes in the current frame. This results in two best lag candidates and the associated two adaptive codebook gains for subframe one and in four best lag candidates and the associated four adaptive codebook gains for subframes two to six at the end of the search process. In each case, the target vector for the fixed codebook is derived by subtracting the scaled adaptive codebook vector from the target for the adaptive codebook search, i.e., $t_{sc} = t_{ac} - \mu_{opt}\, r_{opt}$, where $r_{opt}$ is the selected adaptive codebook vector and $\mu_{opt}$ is the associated adaptive codebook gain.
In mode A, the fixed codebook consists of general excitation pulse shapes constructed from the discrete sinc and cosc functions. The sinc function is defined as
$$\mathrm{sinc}(n) = \frac{\sin(\pi n)}{\pi n},\ n \ne 0; \qquad \mathrm{sinc}(0) = 1,\ n = 0,$$
and the cosc function is defined as
$$\mathrm{cosc}(n) = \frac{1 - \cos(\pi n)}{\pi n},\ n \ne 0; \qquad \mathrm{cosc}(0) = 0,\ n = 0.$$
With these definitions in mind, the generalized excitation pulse shapes are constructed as follows:
$$z_1(n) = A\,\mathrm{sinc}(n) + B\,\mathrm{cosc}(n+1),$$
$$z_{-1}(n) = A\,\mathrm{sinc}(n) - B\,\mathrm{cosc}(n-1).$$
The weights A and B are chosen to be 0.866 and 0.5 respectively. With the sinc and cosc functions time aligned, they correspond to what is known as zinc basis functions $z_0(n)$. Informal listening tests show that time-shifted pulse shapes improve voice quality of the synthesized speech.
The fixed codebook for mode A consists of 2 parts each having 45 vectors. The first part consists of the pulse shape $z_{-1}(n-45)$ and is 90 samples long. The ith vector is simply the vector that starts from the ith codebook entry. The second part consists of the pulse shape $z_1(n-45)$ and is 90 samples long. Here again, the ith vector is simply the vector that starts from the ith codebook entry. Both codebooks are further trimmed to reduce all small values, especially near the beginning and end of both codebooks, to zero. In addition, we note that every even sample in either codebook is identical to zero by definition. All this contributes to making the codebooks very sparse. In addition, we note that both codebooks are overlapping, with adjacent vectors having all but one entry in common.
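The pulse shapes and the overlapping codebook layout described above can be reproduced numerically. The sketch below assumes 0-based sample indexing (n = 0..89) and a vector length of 46 samples (one mode A subframe); the trimming of small values is not modeled, and the function names are illustrative.

```python
import math

A, B = 0.866, 0.5  # weights given in the text


def sinc(n):
    return 1.0 if n == 0 else math.sin(math.pi * n) / (math.pi * n)


def cosc(n):
    return 0.0 if n == 0 else (1.0 - math.cos(math.pi * n)) / (math.pi * n)


def z_plus(n):
    """Pulse shape z_1(n)."""
    return A * sinc(n) + B * cosc(n + 1)


def z_minus(n):
    """Pulse shape z_{-1}(n)."""
    return A * sinc(n) - B * cosc(n - 1)


def mode_a_codebook(shape):
    """90 samples of shape(n - 45) for n = 0..89 (0-based indexing assumed)."""
    return [shape(n - 45) for n in range(90)]


def code_vector(codebook, i, length=46):
    """The ith vector starts at the ith codebook entry."""
    return codebook[i:i + length]
```

With these definitions the even-indexed samples of both codebooks vanish up to floating point rounding, and adjacent vectors share all but one entry, which is what makes the sparse, overlapping search cheap.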
The overlapping nature and the sparsity of the codebooks are exploited in the codebook search, which uses the same distortion measure as in the adaptive codebook search. This measure calculates the distance between the fixed codebook target vector $t_{sc}$ and every candidate fixed codebook vector $c_i$ as
$$\xi_i = (t_{sc} - \mu_i c_i)^T\, W\, (t_{sc} - \mu_i c_i),$$
where $W$ is the same spectral weighting matrix used in the adaptive codebook search and $\mu_i$ is the optimum value of the gain for that ith codebook vector. Once the optimum vector has been selected for each codebook, the codebook gain magnitude is quantized outside the search loop by quantizing the ratio of the optimum correlation to the optimum energy by a non-uniform 4-bit quantizer in odd subframes and a 3-bit differential non-uniform quantizer in even subframes. Both quantizers have zero gain as one of their entries. The optimal distortion for each codebook is then calculated and the optimal codebook is selected.
The fixed codebook index for each subframe is in the range 0-44 if the optimal codebook is from $z_{-1}(n-45)$ but is mapped to the range 45-89 if the optimal codebook is from $z_1(n-45)$. By combining the fixed codebook indices of two consecutive subframes I and J as 90I+J, we can encode the resulting index using 13 bits. This is done for subframes 1 and 2, 3 and 4, 5 and 6. For subframe 7, the fixed codebook index is simply encoded using 7 bits. The fixed codebook gain sign is encoded using 1 bit in all 7 subframes. The fixed codebook gain magnitude is encoded using 4 bits in subframes 1, 3, 5, 7 and using 3 bits in subframes 2, 4, 6.
Due to delayed decision, there are two target vectors $t_{sc}$ for the fixed codebook search in the first subframe corresponding to the two best lag candidates and their corresponding gains provided by the closed loop adaptive codebook search. For subframes two to seven, there are four target vectors corresponding to the two best sets of excitation model parameters determined for the previous subframes so far and to the two best lag candidates and their gains provided by the adaptive codebook search in the current subframe. The fixed codebook search is therefore carried out two times in subframe one and four times in subframes two to six. But the complexity does not increase in a proportionate manner because in each subframe, the energy terms $c_i^T W c_i$ are the same. It is only the correlation terms $t_{sc}^T W c_i$ that are different in each of the two searches for subframe one and in each of the four searches in subframes two to seven.
Delayed decision search helps to smooth the pitch and gain contours in a CELP coder. Delayed decision is employed in this
invention in such a way that the overall codec delay is not increased. Thus, in every subframe, the closed loop pitch search produces the M best estimates. For each of these M best estimates and N best previous subframe parameters, MN optimum pitch gain indices, fixed codebook indices, fixed codebook gain indices, and fixed codebook gain signs are derived. At the end of the subframe, these MN solutions are pruned to the L best using cumulative SNR for the current 40 ms frame as the criterion. For the first subframe, M=2, N=1 and L=2 are used. For the last subframe, M=2, N=2 and L=1 are used. For all other subframes, M=2, N=2 and L=2 are used. The delayed decision approach is particularly effective in the transition of voiced to unvoiced and unvoiced to voiced regions. This delayed decision approach results in N times the complexity of the closed loop pitch search but much less than MN times the complexity of the fixed codebook search in each subframe. This is because only the correlation terms need to be calculated MN times for the fixed codebook in each subframe but the energy terms need to be calculated only once.
The optimal parameters for each subframe are determined only at the end of the 40 ms frame using traceback. The pruning of MN solutions to L solutions is stored for each subframe to enable the traceback. An example of how traceback is accomplished is shown in FIG. 20. The dark, thick line indicates the optimal path obtained by traceback after the last subframe.
In mode B, the quantization indices of both sets of short term predictor parameters are transmitted but not the open loop pitch estimates. The 40-msec speech frame is divided into five subframes, each 8 msec long. As in mode A, an interpolated set of
filter coefficients is used to derive the pitch index, pitch gain index, fixed codebook index, and fixed codebook gain index in a closed loop analysis by synthesis fashion. The closed loop pitch search is unrestricted in its range, and only integer pitch delays are searched. The fixed codebook is a multi-innovation codebook with zinc pulse sections as well as Hadamard sections. The zinc pulse sections are well suited for transient segments while the Hadamard sections are better suited for unvoiced segments. The fixed codebook search procedure is modified to take advantage of this.
The higher complexity involved as well as the higher bit rate of the short term predictor parameters in mode B is compensated by a slower update of the excitation model parameters.
For mode B, the 40 ms speech frame is divided into five subframes. Each subframe is of length 8 ms or sixty-four samples. The excitation model parameters in each subframe are the adaptive codebook index, the adaptive codebook gain, the fixed codebook index, and the fixed codebook gain. There is no fixed codebook gain sign since it is always positive. Best estimates of these parameters are determined using an analysis by synthesis method in each subframe. The overall best estimate is determined at the end of the 40 ms frame using a delayed decision approach similar to mode A.
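The delayed decision approach used in modes A and B amounts to a small trellis search: each surviving path is extended by a few candidates per subframe, the extended paths are pruned, and the winning path is known only after the last subframe. The sketch below is a generic illustration under stated assumptions, not the encoder's procedure: `expand` is a hypothetical callback standing in for the closed loop searches, an additive score stands in for the cumulative SNR criterion, and full parameter histories are kept in place of stored back-pointers.

```python
def delayed_decision(num_subframes, expand, keep=2):
    """Return (score, parameter history) of the best path over one frame.

    expand(k, history) -> list of (params, score_increment) for subframe k,
    given the parameters already chosen on this path (hypothetical callback
    returning the M best candidates).
    keep is L for every subframe except the last, where only one path survives.
    """
    survivors = [(0.0, [])]  # N = 1 before the first subframe
    for k in range(num_subframes):
        pool = []
        for score, history in survivors:
            for params, increment in expand(k, history):
                pool.append((score + increment, history + [params]))
        # Prune the M*N extended paths to the L best by cumulative score.
        pool.sort(key=lambda item: item[0], reverse=True)
        limit = 1 if k == num_subframes - 1 else keep
        survivors = pool[:limit]
    return survivors[0]
```

The benefit shows up when the locally best candidate in one subframe leads to a poor continuation: a path that was second best early on can still win at the end of the frame.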
The short term predictor parameters or linear prediction filter parameters are interpolated from subframe to subframe in the autocorrelation lag domain. The normalized autocorrelation lags derived from the quantized filter coefficients for the second linear prediction analysis window are denoted as $\{\rho_{-1}(i)\}$ for the previous 40 ms frame. The corresponding lags for the first and second linear prediction analysis windows for the current 40 ms frame are denoted by $\{\rho_1(i)\}$ and $\{\rho_2(i)\}$ respectively. The normalization ensures that $\rho_{-1}(0) = \rho_1(0) = \rho_2(0) = 1.0$. The interpolated autocorrelation lags $\{\rho'_m(i)\}$ are given by
$$\rho'_m(i) = \alpha_m\,\rho_{-1}(i) + \beta_m\,\rho_1(i) + [1 - \alpha_m - \beta_m]\,\rho_2(i), \quad 1 \le m \le 5,\ 0 \le i \le 10,$$
or in vector notation
$$\rho'_m = \alpha_m\,\rho_{-1} + \beta_m\,\rho_1 + [1 - \alpha_m - \beta_m]\,\rho_2, \quad 1 \le m \le 5.$$
Here, $\alpha_m$ and $\beta_m$ are the interpolating weights for subframe m. The interpolated lags $\{\rho'_m(i)\}$ are subsequently converted to the short term predictor filter coefficients $\{a'_m(i)\}$.
The choice of interpolating weights is not as critical in this mode as it is in mode A. Nevertheless, they have been determined using the same objective criteria as in mode A and fine tuning them by listening tests. The values of $\alpha_m$ and $\beta_m$ which minimize the objective criterion $E_m$ can be shown to be
$$\alpha_m = \frac{Y_m C - X_m B}{C^2 - AB}, \qquad \beta_m = \frac{X_m C - Y_m A}{C^2 - AB},$$
where
$$A = \sum_J \left\| \rho_{-1,J} - \rho_{2,J} \right\|^2,$$
$$B = \sum_J \left\| \rho_{1,J} - \rho_{2,J} \right\|^2,$$
$$C = \sum_J \langle \rho_{-1,J} - \rho_{2,J},\ \rho_{1,J} - \rho_{2,J} \rangle,$$
$$X_m = \sum_J \langle \rho_{-1,J} - \rho_{2,J},\ \rho_{m,J} - \rho_{2,J} \rangle,$$
$$Y_m = \sum_J \langle \rho_{m,J} - \rho_{2,J},\ \rho_{1,J} - \rho_{2,J} \rangle.$$
As before, $\rho_{-1,J}$ denotes the autocorrelation lag vector derived from the quantized filter coefficients of the second linear prediction analysis window of frame J-1, $\rho_{1,J}$ denotes the autocorrelation lag vector derived from the quantized filter coefficients of the first linear prediction analysis window of frame J, $\rho_{2,J}$ denotes the autocorrelation lag vector derived from the quantized filter coefficients of the second linear prediction analysis window of frame J, and $\rho_{m,J}$ denotes the actual autocorrelation lag vector derived from the speech samples in subframe m of frame J.
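The two mode B weights follow from a two-parameter least squares fit, which can be checked numerically. The sketch below accumulates the five sums over a set of training frames and solves the resulting 2x2 normal equations in closed form; the names `X` and `Y` for the two cross terms, the function names, and the tuple layout are illustrative assumptions.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    """Element-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def mode_b_weights(frames):
    """Least squares (alpha, beta) for one subframe position m.

    frames: iterable of (rho_prev, rho_1, rho_2, rho_actual) lag vectors.
    """
    A = B = C = X = Y = 0.0
    for rho_prev, rho_1, rho_2, rho_actual in frames:
        u = sub(rho_prev, rho_2)      # rho_{-1,J} - rho_{2,J}
        v = sub(rho_1, rho_2)         # rho_{1,J}  - rho_{2,J}
        w = sub(rho_actual, rho_2)    # rho_{m,J}  - rho_{2,J}
        A += dot(u, u)
        B += dot(v, v)
        C += dot(u, v)
        X += dot(u, w)
        Y += dot(w, v)
    det = C * C - A * B
    alpha = (Y * C - X * B) / det
    beta = (X * C - Y * A) / det
    return alpha, beta


def blend3(rho_prev, rho_1, rho_2, alpha, beta):
    """Interpolated lags for given weights."""
    return [alpha * p + beta * a + (1.0 - alpha - beta) * b
            for p, a, b in zip(rho_prev, rho_1, rho_2)]
```

When the actual lags are generated exactly by the interpolation formula, the function returns the generating weights, provided the difference vectors are not collinear across the training set.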
The adaptive codebook search in mode B is similar to that in mode A in that the target vector for the search is derived in the same manner and the distortion measure used in the search is the same. However, there are some differences. All integer pitch delays in the range [20,146] are searched and no fractional pitch delays are searched. As in mode A, only positive correlations are considered in the search and the all zero index corresponding to an all zero vector is assigned if no positive correlations are found. The optimal adaptive codebook index is encoded using 7 bits. The adaptive codebook gain, which is guaranteed to be positive, is quantized outside the search loop using a 3-bit non-uniform quantizer. This quantizer is different from that used in mode A.
As in mode A, delayed decision is employed so that the adaptive codebook search produces the two best pitch delay candidates in all subframes. In addition, in subframes two to five, this has to be repeated for the two best target vectors produced by the two best sets of excitation model parameters derived for the previous subframes, resulting in 4 sets of adaptive codebook indices and associated gains at the end of the subframe. In each case, the target vector for the fixed codebook search is derived by subtracting the scaled adaptive codebook vector from the target of the adaptive codebook vector.
The fixed codebook in mode B is a 9-bit multi-innovation codebook with three sections. The first is a Hadamard vector sum section and the second and third sections are related to generalized excitation pulse shapes $z_{-1}(n)$ and $z_1(n)$ respectively. These pulse shapes have been defined earlier. The first section of this codebook and the associated search procedure are based on the publication by D. Lin, "Ultra-Fast CELP Coding Using Multi-Codebook Innovations", ICASSP92. We note that in this section, there are 256 innovation vectors and the search procedure guarantees a positive gain. The second and third sections have 64 innovation vectors each and their search procedure can produce both positive as well as negative gains.
One component of the multi-innovation codebook is the deterministic vector-sum code constructed from the Hadamard matrix $H_m$. The code vector of the vector-sum code as used in this invention is expressed as

UL ' S ~im v m~n),0 ~ ~15, .. 1 wher~ the ba_iD vector~ vmtn) are ~lhtA1n~ from th- rowD of th-P-' r~-SylveDter mAtrix and ~im ~ ~ 1 The ba~i3 vector~ Are D~lected ba~ed on a 2e r partition of th~ P-' -d mAtrix The cod- vectorD of th I - rd vector-~u~ _ ~' are v~lues and binary valu d cote ~s,~ e Cp~red to previou~ly con~id-ered Alg~'~rAic codes, the HadamArd vector-~um cod-s are con-~.a Lo~ to pOD~ mor- lde_l f , ~ r and ph~e char~cteri~-ticD ThL~ i~ due to the b_si~ v ctor p~rtition ~chem~ u~ed in thi~ r {~ for th~ ~A~- r~ m~tri~ which can be i.,L~ ed a~
unLorm 1 { g of th~ ord~red r rd matris row vec-tor~. In contr_~t, non-unlform F ,l{'"J m thod~ h~vo ~_ 1u {nf~-{gr ro~ult-.
The second section of the multi-innovation codebook consists of the pulse shape $z_{-1}(n-63)$ and is 127 samples long. The ith vector of this section is simply the vector that starts from the ith entry of this section. The third section consists of the pulse shape $z_1(n-63)$ and is 127 samples long. Here again, the ith vector of this section is simply the vector that starts from the ith entry of this section. Both the second and third sections enjoy the advantages of an overlapping nature and sparsity that can be exploited by the search procedure just as in the fixed codebook in mode A. As indicated earlier, the search procedure is not restricted to positive correlations and therefore both positive as well as negative gains can result in the second and third sections.
Once the optimum vector has been selected for each section, the codebook gain magnitude is quantized outside the search loop by quantizing the ratio of the optimum correlation to the optimum energy by a non-uniform 4-bit quantizer in all subframes. This quantizer is different for the first section while the second and third sections use a common quantizer. All quantizers have zero gain as one of their entries. The optimal distortion for each section is then calculated and the optimal section is finally selected.
The fixed codebook index for each subframe is in the range 0-255 if the optimal codebook vector is from the Hadamard section. If it is from the $z_{-1}(n-63)$ section and the gain sign is positive, it is mapped to the range 256-319. If it is from the $z_{-1}(n-63)$ section and the gain sign is negative, it is mapped to the range 320-383. If it is from the $z_1(n-63)$ section and the gain sign is positive, it is mapped to the range 384-447. If it is from the $z_1(n-63)$ section and the gain sign is negative, it is mapped to the range 448-511. The resulting index can be encoded using 9 bits. The fixed codebook gain magnitude is encoded using 4 bits in all subframes.
For mode C, the 40 ms frame is divided into five subframes as in mode B. Each subframe is of length 8 ms or 64 samples. The excitation model parameters in each subframe are the adaptive codebook index, the adaptive codebook gain, the fixed codebook index, and 2 fixed codebook gains, one fixed codebook gain being associated with each half of the subframe. Both are guaranteed to be positive and therefore there is no sign information associated with them. As in both modes A and B, best estimates of these parameters are determined using an analysis by synthesis method in each subframe. The overall best estimate is determined at the end of the 40 ms frame using a delayed decision method identical to that used in modes A and B.
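The mode B fixed codebook index mapping described above (0-255 for the Hadamard section, then four blocks of 64 for the two pulse sections and the two gain signs) can be written as a small function. The section labels and the function name are hypothetical; only the index ranges come from the text.

```python
def mode_b_fixed_index(section, local_index, negative_gain=False):
    """Map (section, index within section, gain sign) to the 9-bit index.

    section: 'hadamard' (first), 'z_minus' (second) or 'z_plus' (third);
    these labels are illustrative names for the three codebook sections.
    """
    if section == "hadamard":
        if not 0 <= local_index < 256 or negative_gain:
            raise ValueError("Hadamard section: index 0-255, positive gain only")
        return local_index
    if not 0 <= local_index < 64:
        raise ValueError("pulse sections hold 64 vectors each")
    base = {"z_minus": 256, "z_plus": 384}[section]
    return base + (64 if negative_gain else 0) + local_index
```

Because the sign of the gain is folded into the index for the pulse sections, no separate sign bit is needed and the largest value, 511, still fits in 9 bits.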
The short term predictor parameters or linear prediction filter parameters are interpolated from subframe to subframe in the autocorrelation lag domain in exactly the same manner as in mode B. However, the interpolating weights $\alpha_m$ and $\beta_m$ are different from those used in mode B. They are obtained by using the procedure described for mode B but using various background noise sources as training material.
The adaptive codebook search in mode C is identical to that in mode B except that both positive as well as negative correlations are allowed in the search. The optimal adaptive codebook index is encoded using 7 bits. The adaptive codebook gain, which could be either positive or negative, is quantized outside the search loop using a 3-bit non-uniform quantizer. This quantizer is different from that used in either mode A or mode B in that it has a more restricted range and may have negative values as well. By allowing both positive as well as negative correlations in the search loop and by having a quantizer with a restricted dynamic range, periodic artifacts in the synthesized background noise due to the adaptive codebook are reduced considerably. In fact, the adaptive codebook now behaves more like another fixed codebook.
As in mode A and mode B, delayed decision is employed and the adaptive codebook search produces the two best candidates in all subframes. In addition, in subframes two to five, this has to be repeated for the two target vectors produced by the two best sets of excitation model parameters derived for the previous subframes, resulting in 4 sets of adaptive codebook indices and associated gains at the end of the subframe. In each case, the target vector for the fixed codebook search is derived by subtracting the scaled adaptive codebook vector from the target of the adaptive codebook vector.
The fixed codebook in mode C is an 8-bit multi-innovation codebook and is identical to the Hadamard vector sum section in the mode B fixed multi-innovation codebook. The same search procedure described in the publication by D. Lin, "Ultra-Fast CELP Coding Using Multi-Codebook Innovations", ICASSP92, is used here. There are 256 codebook vectors and the search procedure guarantees a positive gain. The fixed codebook index is encoded using 8 bits.
Once the optimum codebook vector has been selected, the optimum correlation and optimum energy are calculated for the first half of the subframe as well as the second half of the subframe separately. The ratios of the correlation to the energy in both halves are quantized independently using a 5-bit non-uniform quantizer that has zero gain as one of its entries. The use of 2 gains per subframe ensures a smoother reproduction of the background noise. Due to the delayed decision, there are two sets of optimum fixed codebook indices and gains in subframe one and four sets in subframes two to five. The delayed decision approach in mode C is identical to that used in modes A and B. The optimal parameters for each subframe are determined at the end of the 40 ms frame using an identical trace-back procedure.

The bit allocation among various parameters is summarized in Figures 21A and 21B for mode A, Figure 22 for mode B, and Figure 23 for mode C. These parameters are packed by the packing circuitry 36 of Figure 3. These parameters are packed in the same sequence as they are tabulated in these Figures. Thus for mode A, using the same notation as in Figures 21A and 21B, they are packed into a 168 bit size packet every 40 ms in the following sequence: MODE1, LSP2, ACG1, ACG3, ACG4, ACG5, ACG7, ACG2, ACG6, PITCH1, PITCH2, ACI1, SIGN1, FCG1, ACI2, SIGN2, FCG2, ACI3, SIGN3, FCG3, ACI4, SIGN4, FCG4, ACI5, SIGN5, FCG5, ACI6, SIGN6, FCG6, ACI7, SIGN7, FCG7, FCI12, FCI34, FCI56, and FCI7. For mode B, using the same notation as in Figures 21A and 21B, the parameters are packed into a 168 bit size packet every 40 ms in the following sequence: MODE1, LSP2, ACG1, ACG2, ACG3, ACG4, ACG5, ACI1, FCG1, FCI1, ACI2, FCG2, FCI2, ACI3, FCG3, FCI3, ACI4, FCG4, FCI4, ACI5, FCG5, FCI5, LSP1, and MODE2. For mode C, using the same notation as in Figures 21A and 21B, they are packed into a 168 bit size packet every 40 ms in the following sequence: MODE1, LSP2, ACG1, ACG2, ACG3, ACG4, ACG5, ACI1, FCG2_1, FCI1, ACI2, FCG2_2, FCI2, ACI3, FCG2_3, FCI3, ACI4, FCG2_4, FCI4, ACI5, FCG2_5, FCI5, FCG1_1, FCG1_2, FCG1_3, FCG1_4, FCG1_5, and MODE2. The packing sequence in all three modes is designed to reduce the sensitivity to an error in the mode bits MODE1 and MODE2.
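The two-gains-per-subframe computation described above for mode C can be sketched as follows. This is an illustrative sketch only: the 32-entry table `EXAMPLE_LEVELS` is a made-up non-uniform table that merely satisfies the stated properties (5 bits, zero as one entry), and nearest-level quantization in the gain domain is an assumption, since the text does not state the quantizer's decision rule. The function name and arguments are hypothetical.

```python
import numpy as np

# Hypothetical 5-bit (32-entry) non-uniform gain table with zero as one entry.
EXAMPLE_LEVELS = np.concatenate(([0.0], np.geomspace(0.05, 4.0, 31)))

def half_subframe_gain_indices(target, codebook_vector, levels=EXAMPLE_LEVELS):
    """Quantize correlation/energy separately for each half of a subframe."""
    target = np.asarray(target, dtype=float)
    codebook_vector = np.asarray(codebook_vector, dtype=float)
    half = len(target) // 2
    halves = ((target[:half], codebook_vector[:half]),
              (target[half:], codebook_vector[half:]))
    indices = []
    for t, v in halves:
        correlation = float(np.dot(t, v))
        energy = float(np.dot(v, v))
        # Guard against an all-zero half of the codebook vector.
        gain = correlation / energy if energy > 0.0 else 0.0
        # Assumed decision rule: pick the nearest table entry.
        indices.append(int(np.argmin(np.abs(levels - gain))))
    return indices
```

With a target that is silent in its second half, the second index selects the zero-gain entry, which is the case the zero entry appears intended to cover.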
The packing is done from the MSB or bit 7 to the LSB or bit 0, from byte 1 to byte 21. MODE1 occupies the MSB or bit 7 of byte 1. By testing this bit, we can determine whether the compressed speech belongs to mode A or not. If it is not mode A, we test MODE2, which occupies the LSB or bit 0 of byte 21, to decide between mode B and mode C.
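The MSB-first packing and the two-step mode test can be sketched as follows. The bit positions (bit 7 of byte 1, bit 0 of byte 21) and the 21-byte packet size come from the text; the bit values that denote each mode are not stated there, so `mode_a_flag` and `mode_b_flag` are hypothetical parameters, and the helper names are illustrative.

```python
PACKET_BYTES = 21  # 168 bits per 40 ms frame

def pack_fields(fields):
    """Pack (value, width) fields MSB first, from byte 1 onward."""
    bits = []
    for value, width in fields:
        for shift in range(width - 1, -1, -1):
            bits.append((value >> shift) & 1)
    if len(bits) != PACKET_BYTES * 8:
        raise ValueError("fields must total exactly 168 bits")
    packet = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        packet.append(byte)
    return bytes(packet)

def detect_mode(packet, mode_a_flag=1, mode_b_flag=1):
    """Test MODE1 first, then MODE2 only if the packet is not mode A."""
    if len(packet) != PACKET_BYTES:
        raise ValueError("expected a 21-byte packet")
    mode1 = (packet[0] >> 7) & 1      # MSB (bit 7) of byte 1
    if mode1 == mode_a_flag:          # assumed polarity, not given in the text
        return "A"
    mode2 = packet[-1] & 1            # LSB (bit 0) of byte 21
    return "B" if mode2 == mode_b_flag else "C"
```

Placing the two mode bits at opposite ends of the packet means a single localized burst of errors is unlikely to corrupt both, which is one plausible reading of the stated design goal.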
The speech decoder 46 (FIG. 4) is shown in FIG. 24 and receives the compressed speech bitstream in the same form as put out by the speech encoder of FIG. 3. The parameters are unpacked after determining whether the received mode bits indicate a first mode (mode C), a second mode (mode B), or a third mode (mode A). These are then used to synthesize the speech. Speech decoder 46 synthesizes the part of the signal corresponding to the frame, depending on the second set of filter coefficients, independently of the first set of filter coefficients and the first and second pitch estimates, when the frame is determined to be the first mode (mode C); synthesizes the part of the signal corresponding to the frame, depending on the first and second sets of filter coefficients, independently of the first and second pitch estimates, when the frame is determined to be the second mode (mode B); and synthesizes a part of the signal corresponding to the frame, depending on the second set of filter coefficients and the first and second pitch estimates, independently of the first set of filter coefficients, when the frame is determined to be the third mode (mode A). In addition, the speech decoder receives a cyclic redundancy check (CRC) based bad frame indicator from the channel decoder 45 (FIG. 1). This bad frame indicator flag is used to trigger the bad frame error masking and error recovery sections (not shown) of the decoder. These can also be triggered by some built-in error detection schemes.
Speech decoder 46 tests the MSB or bit 7 of byte 1 to see if the compressed speech packet corresponds to mode A. Otherwise, the LSB or bit 0 of byte 21 is tested to see if the packet corresponds to mode B or mode C. Once the correct mode of the received compressed speech packet is determined, the parameters of the received speech frame are unpacked and used to synthesize the speech. In addition, the speech decoder receives a cyclic redundancy check (CRC) based bad frame indicator from the channel decoder 25 in Figure 1. This bad frame indicator flag is used to trigger the bad frame masking and error recovery portions of the speech decoder. These can also be triggered by some built-in error detection schemes.
In mode A, the received second set of line spectral frequency indices are used to reconstruct the quantized filter coefficients which then are converted to autocorrelation lags. In each subframe, the autocorrelation lags are interpolated using the same weights as used in the encoder for mode A and then converted to short term predictor filter coefficients. The open loop pitch indices are converted to quantized open loop pitch values. In each subframe, these open loop values are used along with each received 5-bit adaptive codebook index to determine the pitch delay candidate. The adaptive codebook vector corresponding to this delay is determined from the adaptive codebook 103 in Figure 24. The adaptive codebook gain index for each subframe is used to obtain the adaptive codebook gain which then is applied to the multiplier 104 to scale the adaptive codebook vector. The fixed codebook vector for each subframe is inferred from the fixed codebook 101 from the received fixed codebook index associated with that subframe and this is scaled by the fixed codebook gain, obtained from the received fixed codebook gain index and the sign index for that subframe, by multiplier 102. Both the scaled adaptive codebook vector and the scaled fixed codebook vector are summed by summer 105 to produce an excitation signal which is enhanced by a pitch prefilter 106 as in I. A. Gerson and M. A. Jasiuk, supra. This excitation signal is used to drive the short term predictor 107 and the synthesized speech is subsequently further enhanced by a global pole-zero filter 109 with built-in spectral tilt correction and energy normalization. At the end of each subframe, the adaptive codebook is updated by the excitation signal as indicated by the dotted line in Figure 24.
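The excitation reconstruction and adaptive codebook update described above can be sketched as follows. This is a simplified illustration under several assumptions that the text does not confirm: the adaptive codebook is modeled as a buffer of past excitation samples, delays shorter than the subframe are handled by periodic repetition, and the buffer is updated with the excitation before any prefiltering. The pitch prefilter 106, short term predictor 107, and postfilter are omitted, and all function names are hypothetical.

```python
import numpy as np

def adaptive_codebook_vector(history, delay, length):
    """Read `length` samples starting `delay` samples back in the past excitation."""
    history = np.asarray(history, dtype=float)
    if not 1 <= delay <= len(history):
        raise ValueError("delay must lie within the stored history")
    out = np.empty(length)
    for n in range(length):
        # Assumption: for delay < length, repeat the last `delay` samples.
        out[n] = history[len(history) - delay + (n % delay)]
    return out

def decode_subframe(history, delay, adaptive_gain, fixed_vector, fixed_gain, sign):
    """Sum the scaled codebook vectors, then update the adaptive codebook."""
    history = np.asarray(history, dtype=float)
    fixed_vector = np.asarray(fixed_vector, dtype=float)
    acb = adaptive_codebook_vector(history, delay, len(fixed_vector))
    # Multiplier 104 scales the adaptive vector, multiplier 102 the fixed
    # vector (gain magnitude with separately transmitted sign); summer 105 adds.
    excitation = adaptive_gain * acb + sign * fixed_gain * fixed_vector
    # Shift the new excitation into the buffer, keeping its length constant.
    new_history = np.concatenate((history, excitation))[-len(history):]
    return excitation, new_history
```

Calling `decode_subframe` once per subframe and feeding `new_history` back in reproduces the feedback path that the dotted line in the decoder figure denotes.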
In mode B, both sets of line spectral frequency indices are used to reconstruct both the first and second sets of quantized filter coefficients which subsequently are converted to autocorrelation lags. In each subframe, these autocorrelation lags are interpolated using exactly the same weights as used in the encoder in mode B and then converted to short term predictor coefficients. In each subframe, the received adaptive codebook index is used to derive the adaptive codebook vector from the adaptive codebook 103 and the received fixed codebook index is used to derive the fixed codebook vector from the fixed codebook 101. The adaptive codebook gain index and the fixed codebook gain index are used in each subframe to retrieve the adaptive codebook gain and the fixed codebook gain. The excitation vector is reconstructed by scaling the adaptive codebook vector by the adaptive codebook gain using multiplier 104, scaling the fixed codebook vector by the fixed codebook gain using multiplier 102, and summing them using summer 105. As in mode A, this is enhanced by the pitch prefilter 106 prior to synthesis by the short term predictor 107. The synthesized speech is further enhanced by the global pole-zero postfilter 108. At the end of each subframe, the adaptive codebook is updated by the excitation signal as indicated by the dotted line in Figure 24.
In mode C, the received second set of line spectral frequency indices are used to reconstruct the quantized filter coefficients which then are converted to autocorrelation lags. In each subframe, the autocorrelation lags are interpolated using the same weights as used in the encoder for mode C and then converted to short term predictor filter coefficients. In each subframe, the received adaptive codebook index is used to derive the adaptive codebook vector from the adaptive codebook 103 and the received fixed codebook index is used to derive the fixed codebook vector from the fixed codebook 101. The adaptive codebook gain index and the fixed codebook gain indices are used in each subframe to retrieve the adaptive codebook gain and the fixed codebook gains for both halves of the subframe. The excitation vector is reconstructed by scaling the adaptive codebook vector by the adaptive codebook gain using multiplier 104, scaling the first half of the fixed codebook vector by the first fixed codebook gain using multiplier 102 and the second half of the fixed codebook vector by the second fixed codebook gain using multiplier 102, and summing the scaled adaptive and fixed codebook vectors using summer 105. As in modes A and B, this is enhanced by the pitch prefilter 106 prior to synthesis by the short term predictor 107. The synthesized speech is further enhanced by the global pole-zero postfilter 108. The parameters of the pitch prefilter and global postfilter used in each mode are different and are tailored to each mode. At the end of each subframe, the adaptive codebook is updated by the excitation signal as indicated by the dotted line in Figure 24.
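The mode C excitation, with a separate fixed codebook gain for each half of the subframe, can be sketched as follows. The function name and argument names are hypothetical, an even subframe length is assumed, and the gains are taken as already dequantized.

```python
import numpy as np

def mode_c_excitation(acb_vector, acb_gain, fcb_vector, fcb_gain_first, fcb_gain_second):
    """Scale each half of the fixed vector by its own gain, then add the adaptive part."""
    acb_vector = np.asarray(acb_vector, dtype=float)
    fcb_vector = np.asarray(fcb_vector, dtype=float)
    half = len(fcb_vector) // 2
    scaled_fcb = np.concatenate((fcb_gain_first * fcb_vector[:half],
                                 fcb_gain_second * fcb_vector[half:]))
    # acb_gain may be negative in mode C; the fixed gains are non-negative.
    return acb_gain * acb_vector + scaled_fcb
```

A second-half gain of zero, for instance, leaves only the scaled adaptive codebook contribution in that half of the subframe.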
As an alternative to the illustrated embodiment, the invention may be practiced with a shorter frame, such as a 22.5 ms frame, as shown in Fig. 25. With such a frame, it might be desirable to process only one LP analysis window per frame instead of the two LP analysis windows illustrated. The analysis window might begin after a duration Tb relative to the beginning of the current frame and extend into the next frame where the window would end after a duration Te relative to the beginning of the next frame, where Te > Tb. In other words, the total duration of an analysis window could be longer than the duration of a frame, and two consecutive windows could, therefore, encompass a particular frame. Thus, a current frame could be analyzed by processing the analysis window for the current frame together with the analysis window for the previous frame.
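The window timing described above can be checked with a little arithmetic. In the sketch below, the 22.5 ms frame length comes from the text, while the values Tb = 5 ms and Te = 10 ms are made-up examples chosen only to satisfy Te > Tb; the function names are hypothetical.

```python
def analysis_window(frame_index, frame_ms, tb_ms, te_ms):
    """Return (start, end) in ms of the LP analysis window for one frame."""
    start = frame_index * frame_ms + tb_ms        # Tb after the frame begins
    end = (frame_index + 1) * frame_ms + te_ms    # Te after the next frame begins
    return start, end

def consecutive_windows_cover_frame(frame_index, frame_ms, tb_ms, te_ms):
    """Check that the previous and current windows together span the frame."""
    prev_start, prev_end = analysis_window(frame_index - 1, frame_ms, tb_ms, te_ms)
    cur_start, cur_end = analysis_window(frame_index, frame_ms, tb_ms, te_ms)
    frame_start = frame_index * frame_ms
    frame_end = frame_start + frame_ms
    # The windows must overlap (no gap) and reach both frame boundaries.
    return prev_start <= frame_start and prev_end >= cur_start and cur_end >= frame_end
```

Each window lasts frame_ms - Tb + Te, so with the example values it is 27.5 ms long, 5 ms more than the frame, and the previous window's tail overlaps the first Te ms of the current frame.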
Thus, the preferred communication system detects when noise is the predominant component of a signal frame and encodes a noise-predominated frame differently than a speech-predominated frame. This special encoding for noise avoids some of the typical artifacts produced when noise is encoded with a scheme optimized for speech. This special encoding allows improved voice quality in a low bit-rate codec system.
Additional advantages and modifications will readily occur to those skilled in the art. The invention in its broader aspects is therefore not limited to the specific details, representative apparatus, and illustrative examples shown and described. Various modifications and variations can be made to the present invention without departing from the scope or spirit of the invention, and it is intended that the present invention cover the modifications and variations provided they come within the scope of the appended claims and their equivalents.

Claims

What is claimed is:

1. A method of processing a signal having a speech component, the signal being organized as a plurality of frames, the method comprising the steps, performed for each frame, of:
determining whether the frame corresponds to a first mode, depending on whether the speech component is substantially absent from the frame;
generating an encoded frame in accordance with one of a first coding scheme, when the frame corresponds to the first mode, and an alternative coding scheme, when the frame does not correspond to the first mode; and
decoding the encoded frame in accordance with one of the first coding scheme, when the frame corresponds to the first mode, and the alternative coding scheme, when the frame does not correspond to the first mode.

2. The method of claim 1 wherein the step of determining includes the substep of:
comparing an energy content of the frame to one or more thresholds.

3. The method of claim 1 wherein the step of determining includes the substeps of:
comparing an energy content of the frame to one or more thresholds; and
subsequently updating one of the thresholds, using the energy content, when the frame corresponds to the first mode.

4. The method of claim 1, wherein the determining step includes the substep of:
comparing a spectral content of the frame to a spectral content of a previous frame.

5. The method of claim 4 wherein the comparing step includes the substeps of:
determining a set of filter coefficients corresponding to the frame; and
determining another set of filter coefficients corresponding to a previous frame.

6. The method of claim 1 wherein the determining step includes the substep of:
comparing a fundamental frequency of the frame to a fundamental frequency of a previous frame.

7. The method of claim 1 wherein the step of determining includes the substep of:
comparing a number of zero crossings of the frame to one or more thresholds.

8. The method of claim 1 wherein the step of determining includes the substep of:
measuring transitions in amplitude within the frame.

9. A method of processing a signal having a speech component, the signal being organized as a plurality of frames, the method comprising the steps, performed for each frame, of:
analyzing a first part of the frame to generate a first set of filter coefficients;
analyzing a second part of the frame and a part of a next frame to generate a second set of filter coefficients;
analyzing a third part of the frame to generate a first pitch estimate;
analyzing a fourth part of the frame and a part of the next frame to generate a second pitch estimate;
determining whether the frame is one of a first mode, a second mode, and a third mode, depending on measures of energy content of the frame and spectral content of the frame;
synthesizing a part of the signal corresponding to the frame, depending on the second set of filter coefficients and the first and second pitch estimates, independently of the first set of filter coefficients, when the frame is determined to be the third mode;
synthesizing the part of the signal corresponding to the frame, depending on the first and second sets of filter coefficients, independently of the first and second pitch estimates, when the frame is determined to be the second mode; and
synthesizing the part of the signal corresponding to the frame, depending on the second set of filter coefficients, independently of the first set of filter coefficients and the first and second pitch estimates, when the frame is determined to be the first mode.

10. The method of claim 9, wherein the determining step includes the substep of:
determining a mode depending on a determined mode of a previous frame.

11. The method of claim 9 wherein the determining step includes the substep of:
determining the mode to be the first mode only when the determined mode of a previous frame is either the first mode or the second mode.

12. The method of claim 9, wherein the determining step includes the substep of:
determining the mode to be the third mode only when the determined mode of a previous frame is either the third mode or the second mode.