CN103474074B - Pitch estimation method and apparatus - Google Patents


Info

Publication number
CN103474074B
Authority
CN
China
Prior art keywords
pitch period
maximum
normalized autocorrelation
autocorrelation functions
voice signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310409433.8A
Other languages
Chinese (zh)
Other versions
CN103474074A (en
Inventor
闫建新
张勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Guangsheng Research And Development Institute Co ltd
Original Assignee
Shenzhen Rising Source Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Rising Source Technology Co ltd filed Critical Shenzhen Rising Source Technology Co ltd
Priority to CN201310409433.8A priority Critical patent/CN103474074B/en
Publication of CN103474074A publication Critical patent/CN103474074A/en
Application granted granted Critical
Publication of CN103474074B publication Critical patent/CN103474074B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The present invention relates to a pitch estimation method and apparatus. The apparatus comprises a signal preprocessing unit, a normalized autocorrelation function computing unit, and a pitch period post-processing unit. The method comprises: S1, preprocessing the speech signal by removing the DC component, applying perceptual weighting, and down-sampling the signal; S2, computing the normalized autocorrelation function values of the preprocessed speech signal; S3, determining the maximum of the normalized autocorrelation function values within the pitch period search range, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal. The invention largely overcomes frequency-doubling and frequency-halving errors in pitch period estimation, improves the noise robustness of the pitch period estimation method, reduces the computational complexity of the algorithm, and improves the efficiency of the corresponding digital audio/speech coding. The invention is applicable to pitch search in a wide variety of speech coding/decoding algorithms.

Description

Pitch estimation method and apparatus
Technical field
The present invention relates to speech coding technology and, more particularly, to a pitch estimation method and apparatus.
Background art
The pitch period is the period of vocal cord vibration during speech. Pitch period estimation is an important problem in speech coding: its accuracy directly affects the coding quality and efficiency of a speech coder. Accurate pitch periodicity analysis makes it possible to remove redundancy from speech effectively, reduce the number of coded bits, and achieve high-quality speech coding at low bit rates. However, because of the particular nature of speech, accurate pitch period search faces the following difficulties:
(1) The speech signal varies in a very complex way; the glottal excitation waveform is not a perfectly periodic pulse train, and the period of the speech waveform is time-varying.
(2) The beginning and end of an utterance lack the periodicity of vocal cord vibration, and for some transitional sounds between unvoiced and voiced speech it is hard to decide whether the signal is periodic or aperiodic, so the pitch period cannot be estimated.
(3) It is difficult to remove the influence of the vocal tract from the speech signal and extract only the information related to vocal cord vibration.
(4) The difficulty of precisely defining the start and end of each pitch period within a voiced segment limits reliable pitch measurement, not only because the speech signal itself is quasi-periodic (that is, the pitch varies) but also because the waveform is affected by formants, noise, and so on.
(5) In practical applications, background noise degrades pitch detection performance. This matters particularly in mobile communication environments, where the waveform is often heavily contaminated by noise.
(6) The wide range over which the pitch period varies also makes accurate pitch detection difficult.
At present, no general method can extract the speech pitch period accurately and reliably under all conditions. Traditional pitch detection methods can be divided into time-domain and frequency-domain methods. In the time domain, traditional pitch period algorithms include pitch estimation based on the average magnitude difference function (AMDF) and pitch detection based on the short-time autocorrelation function (ACF). Both algorithms are introduced in the following reference:
Chu, Wai C. Speech Coding Algorithms: Foundation and Evolution of Standardized Coders. John Wiley & Sons, Inc., 2003, pp. 33-45.
In the frequency domain, Griffin and Lim proposed a pitch period estimation scheme (D. W. Griffin, J. S. Lim. Multiband Excitation Vocoder. IEEE Trans. ASSP, 1988, 36(8)) for the multi-band excitation (MBE) speech coding algorithm. This pitch period estimation algorithm uses a closed-loop analysis-by-synthesis approach that matches the frequency-domain waveform of the signal to obtain the optimal pitch period estimate.
In practice, time-domain pitch search algorithms are widely used because they are simple and perform well. For example, the current speech coding standards G.729 and AMR-WB both use an improved time-domain short-time autocorrelation function (ACF) pitch detection algorithm (Bao Changchun. Fundamentals of Low Bit Rate Digital Speech Coding. Beijing: Beijing University of Technology Press, 2001.2.). However, the time-domain ACF method is prone to "frequency-doubling" and "frequency-halving" errors, and the AMDF method cannot effectively track rapid changes in speech frequency. Frequency-domain methods generally use the cepstrum, which introduces logarithm operations that greatly increase the computational load, and they are susceptible to noise.
Summary of the invention
The technical problem to be solved by the present invention is, in view of the above defects of the prior art, to provide a low-complexity, efficient pitch estimation method and apparatus that largely overcome frequency-doubling and frequency-halving errors in pitch period estimation and improve noise robustness.
The technical solution adopted by the present invention to solve this technical problem is a pitch estimation method comprising the following steps:
S1, preprocessing the speech signal by removing the DC component, applying perceptual weighting, and down-sampling the signal;
S2, computing the normalized autocorrelation function values of the preprocessed speech signal using the following formula:
$$\rho(\tau) = \frac{\sum_{n=0}^{N-1} s(n)\, s(n-\tau)}{\sqrt{\sum_{n=0}^{N-1} s^{2}(n)\, \sum_{n=0}^{N-1} s^{2}(n-\tau)}},$$
where $\rho(\tau)$ is the normalized autocorrelation function value, $s(n)$ is the perceptually weighted speech signal, $\tau$ is the pitch period candidate being searched, and $N$ is the length of one frame of the signal after down-sampling;
S3, determining the maximum of the normalized autocorrelation function values within the pitch period search range, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal.
In one embodiment, step S1 further comprises:
S11, resampling the speech signal to an internal sampling rate;
S12, high-pass filtering the resampled speech signal to remove the DC component;
S13, applying perceptual weighting to the high-pass-filtered speech signal;
S14, low-pass filtering the perceptually weighted speech signal and down-sampling it by a factor of 2.
In one embodiment, the internal sampling rate is 12.8 kHz and the cutoff frequency of the high-pass filter is 50 Hz.
In one embodiment, step S3 further comprises:
S31, dividing the pitch period search range into a first interval, a second interval, and a third interval according to the sampling rate of the speech signal, and obtaining the normalized autocorrelation function maximum of each interval and the corresponding pitch period candidate;
S32, selecting, according to a given weight parameter, the normalized autocorrelation function maximum of the pitch period search range from the three interval maxima, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal.
In one embodiment, step S32 further comprises: determining whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, taking the pitch period candidate corresponding to the second-interval maximum as the pitch period estimate of the speech signal; otherwise, further determining whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the first-interval maximum and the weight parameter; if so, taking the pitch period candidate corresponding to the third-interval maximum as the pitch period estimate of the speech signal; otherwise, taking the pitch period candidate corresponding to the first-interval maximum as the pitch period estimate of the speech signal.
In one embodiment, the first, second, and third intervals are [L_min, 39], [40, 79], and [80, L_max], respectively, where L_min is the start of the pitch period search range and L_max is its end.
To solve its technical problem, the present invention also proposes a pitch estimation apparatus comprising:
a signal preprocessing unit, which preprocesses the speech signal by removing the DC component, applying perceptual weighting, and down-sampling the signal;
a normalized autocorrelation function computing unit, which computes the normalized autocorrelation function values of the preprocessed speech signal using the following formula:
$$\rho(\tau) = \frac{\sum_{n=0}^{N-1} s(n)\, s(n-\tau)}{\sqrt{\sum_{n=0}^{N-1} s^{2}(n)\, \sum_{n=0}^{N-1} s^{2}(n-\tau)}},$$
where $\rho(\tau)$ is the normalized autocorrelation function value, $s(n)$ is the perceptually weighted speech signal, $\tau$ is the pitch period candidate being searched, and $N$ is the length of one frame of the signal after down-sampling;
a pitch period post-processing unit, which determines the maximum of the normalized autocorrelation function values within the pitch period search range and takes the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal.
In one embodiment, the signal preprocessing unit further resamples the speech signal to an internal sampling rate, then high-pass filters the resampled speech signal to remove the DC component, then applies perceptual weighting to the high-pass-filtered speech signal, and finally low-pass filters the perceptually weighted speech signal and down-samples it by a factor of 2.
In one embodiment, the pitch period post-processing unit further divides the pitch period search range into a first interval, a second interval, and a third interval according to the sampling rate of the speech signal, obtains the normalized autocorrelation function maximum of each interval and the corresponding pitch period candidate, selects, according to a given weight parameter, the normalized autocorrelation function maximum of the pitch period search range from the three interval maxima, and takes the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal.
In one embodiment, the pitch period post-processing unit selects the normalized autocorrelation function maximum of the pitch period search range from the three interval maxima according to the weight parameter as follows: it determines whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, it takes the pitch period candidate corresponding to the second-interval maximum as the pitch period estimate of the speech signal; otherwise, it further determines whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the first-interval maximum and the weight parameter; if so, it takes the pitch period candidate corresponding to the third-interval maximum as the pitch period estimate of the speech signal; otherwise, it takes the pitch period candidate corresponding to the first-interval maximum as the pitch period estimate of the speech signal.
The pitch estimation method and apparatus of the present invention are based on normalized autocorrelation function pitch detection and introduce preprocessing and post-processing techniques into pitch period estimation. They largely overcome frequency-doubling and frequency-halving errors in pitch period estimation, improve the noise robustness of the pitch period estimation method, reduce the computational complexity of the algorithm, and improve the efficiency of the corresponding digital audio/speech coding. The invention is applicable to pitch search in a wide variety of speech coding/decoding algorithms.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments. In the drawings:
Fig. 1 is a flowchart of the pitch estimation method according to an embodiment of the invention;
Fig. 2 is a flowchart of a specific embodiment of step 110 in Fig. 1;
Fig. 3 is a flowchart of a specific embodiment of step 130 in Fig. 1;
Fig. 4 is a block diagram of the pitch estimation apparatus according to an embodiment of the invention.
Detailed description of the invention
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
Fig. 1 shows a flowchart of the pitch estimation method 100 according to an embodiment of the invention. As shown in Fig. 1, the pitch estimation method 100 comprises:
In step 110, the speech signal is preprocessed by removing the DC component, applying perceptual weighting, and down-sampling the signal.
In step 120, the normalized autocorrelation function values of the preprocessed speech signal are computed. The invention uses the following normalized autocorrelation function:
$$\rho(\tau) = \frac{\sum_{n=0}^{N-1} s(n)\, s(n-\tau)}{\sqrt{\sum_{n=0}^{N-1} s^{2}(n)\, \sum_{n=0}^{N-1} s^{2}(n-\tau)}},$$
where $\rho(\tau)$ is the normalized autocorrelation function value, $s(n)$ is the perceptually weighted speech signal, $\tau$ is the pitch period candidate being searched, and $N$ is the length of one frame of the signal after down-sampling.
In step 130, the maximum of the normalized autocorrelation function values within the pitch period search range is determined, and the pitch period candidate corresponding to that maximum is taken as the pitch period estimate of the speech signal.
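Steps 120 and 130 can be illustrated with a minimal Python sketch. It assumes the normalization uses the square root of the product of the two energies, and that enough past samples are available for the delayed term $s(n-\tau)$. The function names and the `frame_start` argument are illustrative choices, not part of the patent text.

```python
import numpy as np

def normalized_autocorrelation(s, frame_start, frame_len, lag):
    """rho(lag) for the frame s[frame_start : frame_start + frame_len].

    Assumes frame_start >= lag so that the delayed samples s(n - lag) exist.
    """
    cur = s[frame_start:frame_start + frame_len]
    past = s[frame_start - lag:frame_start - lag + frame_len]
    num = np.dot(cur, past)
    den = np.sqrt(np.dot(cur, cur) * np.dot(past, past))
    # Return 0 for an all-zero frame instead of dividing by zero.
    return float(num / den) if den > 0.0 else 0.0

def search_pitch(s, frame_start, frame_len, lag_min, lag_max):
    """Plain step 130: the lag with the largest rho over [lag_min, lag_max]."""
    lags = np.arange(lag_min, lag_max + 1)
    rho = np.array([normalized_autocorrelation(s, frame_start, frame_len, int(l))
                    for l in lags])
    best = int(np.argmax(rho))
    return int(lags[best]), float(rho[best])

# Example: a sinusoid with a period of 50 samples.
n = np.arange(600)
signal = np.sin(2.0 * np.pi * n / 50.0)
print(search_pitch(signal, frame_start=300, frame_len=128, lag_min=20, lag_max=90))
```

Because the search range in this example stops below 100, the second multiple of the period is excluded. With a wider range, lags 50 and 100 would score almost equally, which is the ambiguity the interval-based post-processing of step 130 addresses.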
The invention introduces signal preprocessing into pitch period estimation. Fig. 2 shows a flowchart of a specific embodiment of the signal preprocessing step 110 shown in Fig. 1. As shown in Fig. 2, the signal preprocessing step 110 further comprises:
In step 111, the speech signal is resampled to the internal sampling rate (Fs = 12.8 kHz).
In the following step 112, the resampled speech signal is high-pass filtered. The cutoff frequency of the high-pass filter can be 50 Hz; its purpose is to remove the DC component.
Then, in step 113, perceptual weighting is applied to the high-pass-filtered speech signal.
In the final step 114, the perceptually weighted speech signal is low-pass filtered and down-sampled by a factor of 2, which limits the signal bandwidth to 3.2 kHz.
Further, in a preferred embodiment, numerical filtering can also be added to the signal preprocessing step 110 to remove formants and high-frequency noise, so that the pitch period is estimated more accurately.
Before the pitch period search, the invention preprocesses the input speech signal. This both filters out the high-frequency part, which contributes nothing to pitch period estimation, and reduces the computational complexity of the algorithm.
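The preprocessing chain of steps 112 to 114 can be sketched as follows, assuming the input has already been resampled to 12.8 kHz (step 111). The filter designs here are assumptions chosen for brevity: a first-order high-pass with a 50 Hz cutoff and a 31-tap windowed-sinc low-pass. The patent does not specify either filter, and `perceptual_weighting` is a hypothetical identity placeholder, since the weighting filter depends on linear-prediction analysis that the text does not detail.

```python
import numpy as np

FS_INTERNAL = 12800.0  # internal sampling rate given in the text (Hz)

def remove_dc(x, fs=FS_INTERNAL, fc=50.0):
    """Step 112 (assumed design): first-order high-pass, cutoff fc."""
    x = np.asarray(x, dtype=float)
    rc = 1.0 / (2.0 * np.pi * fc)
    a = rc / (rc + 1.0 / fs)
    y = np.zeros(len(x))
    prev_x, prev_y = 0.0, 0.0
    for n, xn in enumerate(x):
        prev_y = a * (prev_y + xn - prev_x)
        prev_x = xn
        y[n] = prev_y
    return y

def perceptual_weighting(x):
    """Step 113 placeholder: identity, NOT the codec's weighting filter."""
    return x

def lowpass_and_halve(x, num_taps=31):
    """Step 114 (assumed design): low-pass at fs/4, then keep every 2nd sample."""
    k = np.arange(num_taps) - (num_taps - 1) / 2.0
    h = 0.5 * np.sinc(0.5 * k) * np.hamming(num_taps)  # ideal cutoff 0.25 cycles/sample
    h /= h.sum()                                       # unity gain at DC
    return np.convolve(x, h, mode="same")[::2]

def preprocess(x):
    return lowpass_and_halve(perceptual_weighting(remove_dc(x)))

# One second of a 200 Hz tone riding on a DC offset of 0.5.
t = np.arange(12800) / FS_INTERNAL
y = preprocess(0.5 + np.sin(2.0 * np.pi * 200.0 * t))
print(len(y), FS_INTERNAL / 2.0)  # output length and the new sampling rate in Hz
```

Halving 12.8 kHz gives a 6.4 kHz sampling rate, whose Nyquist limit is the 3.2 kHz bandwidth stated in step 114; the shorter frames and lag range are where the reduction in search cost comes from.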
The invention also introduces pitch period post-processing into pitch period estimation. Fig. 3 shows a flowchart of a specific embodiment of the pitch period post-processing step 130 shown in Fig. 1. As shown in Fig. 3, the pitch period post-processing step 130 further comprises:
In step 131, the pitch period search range is divided into a first interval, a second interval, and a third interval according to the sampling rate of the speech signal, and the normalized autocorrelation function maximum of each interval and the corresponding pitch period candidate are obtained.
In one embodiment, the pitch period search range is [L_min, L_max], where L_min is the start of the pitch period search range and L_max is its end. Given the sampling frequency of the speech signal described above, the pitch period search range can be divided into the following three intervals: the first interval [L_min, 39], the second interval [40, 79], and the third interval [80, L_max], so that the correct pitch period estimate can be determined among these three intervals. In a specific embodiment, L_min and L_max can be 0 and 256, respectively. Based on these three intervals, the maximum $\rho(\tau)$ of each interval and the corresponding pitch period candidate $\tau$ are obtained, denoted $\rho_{max1}$, $\rho_{max2}$, $\rho_{max3}$ and $\tau_1$, $\tau_2$, $\tau_3$.
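Step 131 can be sketched in Python as below. The sketch assumes that the values $\rho(\tau)$ for every lag in the search range have already been computed and stored in an array; the function name `interval_maxima` and the `bounds` argument are illustrative, with the boundaries 39 and 79 taken from the embodiment above.

```python
import numpy as np

def interval_maxima(rho, lag_min, lag_max, bounds=(39, 79)):
    """Per-interval maxima for [lag_min, 39], [40, 79], [80, lag_max].

    rho[k] must hold rho(lag_min + k). Returns a list of three
    (rho_max_i, tau_i) pairs, one per interval.
    """
    edges = [(lag_min, bounds[0]),
             (bounds[0] + 1, bounds[1]),
             (bounds[1] + 1, lag_max)]
    result = []
    for lo, hi in edges:
        seg = rho[lo - lag_min:hi - lag_min + 1]
        k = int(np.argmax(seg))
        result.append((float(seg[k]), lo + k))
    return result

# Toy values: one distinct peak in each interval.
rho = np.zeros(101)                      # lags 20..120
rho[30 - 20], rho[60 - 20], rho[100 - 20] = 0.5, 0.9, 0.7
print(interval_maxima(rho, 20, 120))     # [(0.5, 30), (0.9, 60), (0.7, 100)]
```

If the lags are counted at the 6.4 kHz rate implied by halving 12.8 kHz (an inference, not stated explicitly in the text), lag 40 corresponds to a pitch of 160 Hz and lag 80 to 80 Hz, so each interval spans one octave of pitch frequency between those boundaries.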
In step 132, according to a given weight parameter, the normalized autocorrelation function maximum of the pitch period search range is selected from the three interval maxima, and the pitch period candidate corresponding to that maximum is taken as the pitch period estimate of the speech signal.
In one embodiment, with a weight parameter $c$ selected (a value near 1.0, for example 0.97), the optimal pitch period candidate $\tau_{opt}$ can be determined as follows:
First, it is determined whether the normalized autocorrelation function maximum $\rho_{max2}$ of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum $\rho_{max1}$ of the first interval and the weight parameter $c$. If so, the pitch period candidate $\tau_2$ corresponding to $\rho_{max2}$ is taken as the pitch period estimate of the speech signal. Otherwise, it is further determined whether the normalized autocorrelation function maximum $\rho_{max3}$ of the third interval is greater than or equal to the product of $\rho_{max1}$ and the weight parameter $c$. If so, the pitch period candidate $\tau_3$ corresponding to $\rho_{max3}$ is taken as the pitch period estimate of the speech signal; otherwise, the pitch period candidate $\tau_1$ corresponding to $\rho_{max1}$ is taken as the pitch period estimate of the speech signal.
The corresponding mathematical notation is as follows:
Let $\tau_{opt} = \tau_1$, $\rho_{max} = \rho_{max1}$;
if $\rho_{max2} \ge c\,\rho_{max}$, then $\rho_{max} = \rho_{max2}$, $\tau_{opt} = \tau_2$;
if $\rho_{max3} \ge c\,\rho_{max}$, then $\rho_{max} = \rho_{max3}$, $\tau_{opt} = \tau_3$.
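The selection rule of step 132 can be sketched in two ways, because the prose description and the mathematical notation, read literally, are not identical: the prose compares both later maxima against $c\,\rho_{max1}$ and stops at the first success, while the notation updates $\rho_{max}$ after each successful comparison. The sketch below implements both readings; the function names are illustrative, and which reading the authors intended is not settled by the text.

```python
def select_pitch_prose(maxima, c=0.97):
    """Reading of the prose: compare against c * rho_max1, first hit wins."""
    (r1, t1), (r2, t2), (r3, t3) = maxima
    if r2 >= c * r1:
        return t2
    if r3 >= c * r1:
        return t3
    return t1

def select_pitch_sequential(maxima, c=0.97):
    """Reading of the notation: rho_max is updated after each accepted interval."""
    rho_max, tau_opt = maxima[0]
    for r, t in maxima[1:]:
        if r >= c * rho_max:
            rho_max, tau_opt = r, t
    return tau_opt

# Each entry is (rho_max_i, tau_i) for the first, second, and third interval.
same = [(0.90, 30), (0.50, 60), (0.88, 100)]
differ = [(0.50, 30), (0.60, 60), (0.90, 100)]
print(select_pitch_prose(same), select_pitch_sequential(same))      # 100 100
print(select_pitch_prose(differ), select_pitch_sequential(differ))  # 60 100
```

In both readings, a weight $c$ slightly below 1.0 lets a candidate in a later (longer-lag) interval replace the first-interval candidate even when its correlation is marginally smaller.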
Further, in a preferred embodiment, the pitch period post-processing step 130 can also use the normalized autocorrelation function to judge the unvoiced/voiced character of the speech signal, improving the accuracy of pitch period estimation.
Based on the pitch estimation method introduced above, the invention also proposes a pitch estimation apparatus. Fig. 4 shows a block diagram of the pitch estimation apparatus 400 according to an embodiment of the invention. As shown in Fig. 4, the pitch estimation apparatus 400 comprises a signal preprocessing unit 410, a normalized autocorrelation function computing unit 420, and a pitch period post-processing unit 430. The signal preprocessing unit 410 preprocesses the input speech signal by removing the DC component, applying perceptual weighting, and down-sampling the signal. The normalized autocorrelation function computing unit 420 computes the normalized autocorrelation function values of the speech signal preprocessed by the signal preprocessing unit 410 using the following formula:
$$\rho(\tau) = \frac{\sum_{n=0}^{N-1} s(n)\, s(n-\tau)}{\sqrt{\sum_{n=0}^{N-1} s^{2}(n)\, \sum_{n=0}^{N-1} s^{2}(n-\tau)}},$$
where $\rho(\tau)$ is the normalized autocorrelation function value, $s(n)$ is the perceptually weighted speech signal, $\tau$ is the pitch period candidate being searched, and $N$ is the length of one frame of the signal after down-sampling. The pitch period post-processing unit 430 determines the maximum of the normalized autocorrelation function values within the pitch period search range and takes the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal.
In a specific embodiment, the signal preprocessing unit 410 first resamples the input speech signal to the internal sampling rate (Fs = 12.8 kHz), then high-pass filters the resampled speech signal (the cutoff frequency of the filter can be 50 Hz; its purpose is to remove the DC component), then applies perceptual weighting to the high-pass-filtered speech signal, and finally low-pass filters the perceptually weighted speech signal and down-samples it by a factor of 2, which limits the signal bandwidth to 3.2 kHz. This both filters out the high-frequency part, which contributes nothing to pitch period estimation, and reduces the computational complexity of the algorithm.
In a specific embodiment, the pitch period post-processing unit 430 divides the pitch period search range into a first interval, a second interval, and a third interval according to the sampling rate of the speech signal, for example the first interval [L_min, 39], the second interval [40, 79], and the third interval [80, L_max], where L_min is the start of the pitch period search range and L_max is its end. It then obtains the maximum $\rho(\tau)$ of each interval and the corresponding pitch period candidate $\tau$, denoted $\rho_{max1}$, $\rho_{max2}$, $\rho_{max3}$ and $\tau_1$, $\tau_2$, $\tau_3$. According to a given weight parameter $c$ (a value near 1.0, for example 0.97), the pitch period post-processing unit 430 can also determine the optimal pitch period candidate $\tau_{opt}$ as follows:
Let $\tau_{opt} = \tau_1$, $\rho_{max} = \rho_{max1}$;
if $\rho_{max2} \ge c\,\rho_{max}$, then $\rho_{max} = \rho_{max2}$, $\tau_{opt} = \tau_2$;
if $\rho_{max3} \ge c\,\rho_{max}$, then $\rho_{max} = \rho_{max3}$, $\tau_{opt} = \tau_3$.
The pitch estimation method and apparatus of the present invention are based on normalized autocorrelation function pitch detection and introduce preprocessing and post-processing techniques into pitch period estimation. They largely overcome frequency-doubling and frequency-halving errors in pitch period estimation, improve the noise robustness of the pitch period estimation method, and improve the efficiency of the corresponding digital audio/speech coding. A performance comparison between the pitch search algorithm of the invention and that of AMR-WB+ is given below:
1. Performance test method: the sequence-average signal-to-noise ratio (SNR) is computed, defined as follows:
$$\overline{\mathrm{segSNR}} = \frac{1}{N_{SF}} \sum_{i=0}^{N_{SF}-1} \mathrm{segSNR}_i,$$
where $N$ ($N = 256$) is the length of one frame of the speech signal, $N_{SF}$ is the total number of frames in a speech sequence, $x_w(n)$ is the original signal after perceptual weighting, and its counterpart is the coded/decoded speech signal after perceptual weighting.
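The averaging over frames can be sketched as follows. The per-frame definition of $\mathrm{segSNR}_i$ does not survive in the text, so the sketch assumes the conventional form, 10 log10 of the ratio between the energy of the weighted original frame and the energy of the weighted coding error. That formula, the function name, and the symbol `xw_hat` for the weighted decoded signal are all assumptions.

```python
import numpy as np

def average_seg_snr(xw, xw_hat, frame_len=256):
    """Mean of per-frame SNR values in dB over all complete frames.

    xw     : perceptually weighted original signal
    xw_hat : perceptually weighted decoded signal (hypothetical name)
    The per-frame SNR formula is an assumed, conventional definition.
    """
    xw = np.asarray(xw, dtype=float)
    xw_hat = np.asarray(xw_hat, dtype=float)
    n_sf = len(xw) // frame_len          # N_SF, the number of frames
    snrs = []
    for i in range(n_sf):
        a = xw[i * frame_len:(i + 1) * frame_len]
        b = xw_hat[i * frame_len:(i + 1) * frame_len]
        signal_energy = max(float(np.sum(a ** 2)), 1e-12)
        noise_energy = max(float(np.sum((a - b) ** 2)), 1e-12)  # avoid log of 0
        snrs.append(10.0 * np.log10(signal_energy / noise_energy))
    return float(np.mean(snrs))

# A decoded signal that is uniformly 10% too small gives about 20 dB per frame.
print(average_seg_snr(np.ones(512), 0.9 * np.ones(512)))
```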
2. Test results
Table 1. Sequence-average SNR of the two algorithms (mono)
Table 2. Sequence-average SNR of the two algorithms (stereo)
3. Analysis of test results
(1) The test results show that the algorithm proposed by the invention performs slightly better than the pitch period search algorithm of AMR-WB+, with computational complexity comparable to (indeed slightly lower than) that of the AMR-WB+ algorithm.
(2) The results in Table 1 and Table 2 show that the two sequences es02 and s_cl_mt_2_org are coded worst, while s_cl_ft_3_org is coded best. Analysis of the sequences shows that es02 and s_cl_mt_2_org are middle-aged male voices and s_cl_ft_3_org is a young female voice. Analysis of the algorithm attributes this to the choice of the parameter that the algorithm sets to prevent detection of a doubled period. This parameter is an empirical value, and the current algorithm mainly considers female and child voices, whose pitch period varies widely and rapidly, whereas the pitch period of a middle-aged male voice varies gently and over a comparatively small range.
(3) The tests also included some typical noisy speech sequences, s_no_ft_9_org, s_no_2t_1_org, s_no_2t_2_org, s_no_2t_3_org, and s_no_ft_1_org, which contain, for example, heavy background noise such as that of an airport. The test results show that the noise robustness of the algorithm of the invention is better than that of the AMR-WB+ algorithm.

Claims (4)

1. A pitch estimation method, characterized by comprising the following steps:
S1, preprocessing the speech signal by removing the DC component, applying perceptual weighting, and down-sampling the signal;
S2, computing the normalized autocorrelation function values of the preprocessed speech signal using the following formula:
$$\rho(\tau) = \frac{\sum_{n=0}^{N-1} s(n)\, s(n-\tau)}{\sqrt{\sum_{n=0}^{N-1} s^{2}(n)\, \sum_{n=0}^{N-1} s^{2}(n-\tau)}},$$
where $\rho(\tau)$ is the normalized autocorrelation function value, $s(n)$ is the perceptually weighted speech signal, $\tau$ is the pitch period candidate being searched, and $N$ is the length of one frame of the signal after down-sampling;
S3, determining the maximum of the normalized autocorrelation function values within the pitch period search range, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal;
wherein step S1 further comprises:
S11, resampling the speech signal to an internal sampling rate;
S12, high-pass filtering the resampled speech signal to remove the DC component;
S13, applying perceptual weighting to the high-pass-filtered speech signal;
S14, low-pass filtering the perceptually weighted speech signal and down-sampling it by a factor of 2;
and step S3 further comprises:
S31, dividing the pitch period search range into a first interval, a second interval, and a third interval according to the sampling rate of the speech signal, and obtaining the normalized autocorrelation function maximum of each interval and the corresponding pitch period candidate;
S32, selecting, according to a given weight parameter, the normalized autocorrelation function maximum of the pitch period search range from the three interval maxima, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal, specifically comprising: determining whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, taking the pitch period candidate corresponding to the normalized autocorrelation function maximum of the second interval as the pitch period estimate of the speech signal; otherwise, further determining whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, taking the pitch period candidate corresponding to the normalized autocorrelation function maximum of the third interval as the pitch period estimate of the speech signal; otherwise, taking the pitch period candidate corresponding to the normalized autocorrelation function maximum of the first interval as the pitch period estimate of the speech signal.
2. The method according to claim 1, characterized in that the internal sampling rate is 12.8 kHz and the cutoff frequency of the high-pass filtering is 50 Hz.
3. The method according to claim 1, characterized in that the first interval, the second interval, and the third interval are [L_min, 39], [40, 79], and [80, L_max], respectively, where L_min is the start of the pitch period search range and L_max is the end of the pitch period search range.
4. a pitch estimation device, is characterized in that, comprising:
Signal Pretreatment unit, removes DC component, perceptual weighting and signal down-sampling to voice signalPretreatment;
Normalized autocorrelation functions computing unit, uses following formula to calculate returning of described pretreated voice signalOne changes auto-correlation function value:
ρ ( τ ) = Σ n = 0 N - 1 s ( n ) s ( n - τ ) Σ n = 0 N - 1 s 2 ( n ) Σ n = 0 N - 1 s 2 ( n - τ ) ,
where ρ(τ) denotes the normalized autocorrelation function value, s(n) is the perceptually weighted voice signal, τ denotes the pitch period candidate value being searched, and N is the length of one frame of the signal after down-sampling;
a pitch period post-processing unit, which determines the maximum of the normalized autocorrelation function values within the pitch period search range and determines the pitch period candidate value corresponding to that maximum as the pitch period estimate of the voice signal;
wherein the signal preprocessing unit further resamples the voice signal to an internal sampling rate, then high-pass filters the resampled voice signal to remove the DC component, then applies perceptual weighting to the high-pass-filtered voice signal, and finally applies low-pass filtering and 1/2 down-sampling to the perceptually weighted voice signal;
the pitch period post-processing unit further divides the pitch period search range, according to the sampling rate of the voice signal, into a first interval, a second interval and a third interval, obtains the normalized autocorrelation function maximum of each interval and the corresponding pitch period candidate value, and, according to a given weight parameter, selects the normalized autocorrelation function maximum of the pitch period search range from the maxima of the three intervals and determines the pitch period candidate value corresponding to that maximum as the pitch period estimate of the voice signal;
wherein the pitch period post-processing unit selects the normalized autocorrelation function maximum of the pitch period search range from the maxima of the three intervals, according to the given weight parameter, specifically by: judging whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, determining the pitch period candidate value corresponding to the normalized autocorrelation function maximum of the second interval as the pitch period estimate of the voice signal; otherwise, further judging whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, determining the pitch period candidate value corresponding to the normalized autocorrelation function maximum of the third interval as the pitch period estimate of the voice signal; otherwise, determining the pitch period candidate value corresponding to the normalized autocorrelation function maximum of the first interval as the pitch period estimate of the voice signal.
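The normalized autocorrelation computation and the three-interval decision recited in the claims above can be illustrated with a minimal Python sketch. It is not the patentee's implementation. The function names, the frame length `N_FRAME = 128`, the search limits `L_MIN = 17` and `L_MAX = 115`, and the weight value `1.0` are all hypothetical choices made for illustration; only the interval boundaries 39/40 and 79/80 and the order of the comparisons are taken from the claims. The sketch also assumes the denominator of ρ(τ) is the square root of the product of the two energies.

```python
import math


def normalized_autocorrelation(s, tau, n_frame, start):
    """rho(tau) over the frame s[start : start + n_frame].

    Requires start >= tau so that s(n - tau) is available as history.
    """
    num = energy_cur = energy_lag = 0.0
    for n in range(start, start + n_frame):
        cur, lag = s[n], s[n - tau]
        num += cur * lag
        energy_cur += cur * cur
        energy_lag += lag * lag
    den = math.sqrt(energy_cur * energy_lag)
    return num / den if den > 0.0 else 0.0  # silent frame: treat as uncorrelated


def interval_peak(s, lo, hi, n_frame, start):
    """Return (rho_max, tau) over the candidate lags lo..hi inclusive."""
    return max((normalized_autocorrelation(s, tau, n_frame, start), tau)
               for tau in range(lo, hi + 1))


def select_pitch(peaks, weight):
    """Apply the claimed comparison order to three (rho_max, tau) pairs."""
    (r1, t1), (r2, t2), (r3, t3) = peaks
    threshold = r1 * weight      # first-interval maximum times the weight
    if r2 >= threshold:          # second interval is tested first
        return t2
    if r3 >= threshold:          # then the third interval
        return t3
    return t1                    # otherwise fall back to the first interval


def estimate_pitch(s, n_frame, start, l_min, l_max, weight):
    """Split [l_min, l_max] into the three claimed intervals and pick a lag."""
    intervals = [(l_min, 39), (40, 79), (80, l_max)]
    peaks = [interval_peak(s, lo, hi, n_frame, start) for lo, hi in intervals]
    return select_pitch(peaks, weight)


# Hypothetical demo: a sinusoid with a period of 50 samples, with L_MAX
# samples of history placed in front of the analysed frame.
L_MIN, L_MAX, N_FRAME = 17, 115, 128
signal = [math.sin(2.0 * math.pi * n / 50.0) for n in range(L_MAX + N_FRAME)]
print(estimate_pitch(signal, N_FRAME, L_MAX, L_MIN, L_MAX, weight=1.0))
```

Because the threshold is the first-interval maximum multiplied by the weight, the weight alone controls how readily a second- or third-interval candidate displaces the first-interval candidate. The value to use is not stated in this part of the text, so it is left as a parameter.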
CN201310409433.8A 2013-09-09 2013-09-09 Pitch estimation method and apparatus Active CN103474074B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310409433.8A CN103474074B (en) 2013-09-09 2013-09-09 Pitch estimation method and apparatus

Publications (2)

Publication Number Publication Date
CN103474074A CN103474074A (en) 2013-12-25
CN103474074B true CN103474074B (en) 2016-05-11

Family

ID=49798895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310409433.8A Active CN103474074B (en) 2013-09-09 2013-09-09 Pitch estimation method and apparatus

Country Status (1)

Country Link
CN (1) CN103474074B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105185385B (en) * 2015-08-11 2019-11-15 东莞市凡豆信息科技有限公司 Voice fundamental frequency estimation method based on gender anticipation with the mapping of multiband parameter
CN107039051B (en) * 2016-02-03 2019-11-26 重庆工商职业学院 Fundamental frequency detection method based on ant group optimization
CN106205638B (en) * 2016-06-16 2019-11-08 清华大学 A kind of double-deck fundamental tone feature extracting method towards audio event detection
EP3306609A1 (en) * 2016-10-04 2018-04-11 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for determining a pitch information
CN108830232B (en) * 2018-06-21 2021-06-15 浙江中点人工智能科技有限公司 Voice signal period segmentation method based on multi-scale nonlinear energy operator
CN109119097B (en) * 2018-10-30 2021-06-08 Oppo广东移动通信有限公司 Pitch detection method, device, storage medium and mobile terminal
CN110390953B (en) * 2019-07-25 2023-11-17 腾讯科技(深圳)有限公司 Method, device, terminal and storage medium for detecting howling voice signal

Citations (3)

Publication number Priority date Publication date Assignee Title
US4486900A (en) * 1982-03-30 1984-12-04 At&T Bell Laboratories Real time pitch detection by stream processing
US5127053A (en) * 1990-12-24 1992-06-30 General Electric Company Low-complexity method for improving the performance of autocorrelation-based pitch detectors
CN101149924A (en) * 2006-09-18 2008-03-26 华为技术有限公司 Method and device for implementing open-loop pitch search

Non-Patent Citations (1)

Title
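Research on open-loop pitch analysis algorithms based on the normalized autocorrelation function; Zhao Danming (赵丹明); China Master's Theses Full-text Database, Information Science and Technology; 20130315; 1-52 *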

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN108831504A (en) * 2018-06-13 2018-11-16 西安蜂语信息科技有限公司 Determination method, apparatus, computer equipment and the storage medium of pitch period
CN108831504B (en) * 2018-06-13 2020-12-04 西安蜂语信息科技有限公司 Method and device for determining pitch period, computer equipment and storage medium

Also Published As

Publication number Publication date
CN103474074A (en) 2013-12-25

Similar Documents

Publication Publication Date Title
CN103474074B (en) Pitch estimation method and apparatus
CN103854662B (en) Adaptive voice detection method based on multiple domain Combined estimator
CN102054480B (en) Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT)
US10510363B2 (en) Pitch detection algorithm based on PWVT
CN111128213B (en) Noise suppression method and system for processing in different frequency bands
CN103440872B (en) The denoising method of transient state noise
CN104183245A (en) Method and device for recommending music stars with tones similar to those of singers
Ding et al. A DCT-based speech enhancement system with pitch synchronous analysis
Mittal et al. Study of characteristics of aperiodicity in Noh voices
Cabral et al. Glottal spectral separation for parametric speech synthesis.
CN101625858B (en) Method for extracting short-time energy frequency value in voice endpoint detection
CN103258543B (en) Method for expanding artificial voice bandwidth
CN105679312A (en) Phonetic feature processing method of voiceprint identification in noise environment
CN104269180A (en) Quasi-clean voice construction method for voice quality objective evaluation
CN104599677A (en) Speech reconstruction-based instantaneous noise suppressing method
CN110349598A (en) A kind of end-point detecting method under low signal-to-noise ratio environment
CN112116909A (en) Voice recognition method, device and system
CN109102823A (en) A kind of sound enhancement method based on subband spectrum entropy
CN104658547A (en) Method for expanding artificial voice bandwidth
Patil et al. Effectiveness of Teager energy operator for epoch detection from speech signals
Shannon et al. MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition.
Govind et al. Epoch extraction in high pass filtered speech using hilbert envelope
Park et al. Pitch detection based on signal-to-noise-ratio estimation and compensation for continuous speech signal
Park et al. Improving pitch detection through emphasized harmonics in time-domain
Graf et al. Low-Complexity Pitch Estimation Based on Phase Differences Between Low-Resolution Spectra.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20220513

Address after: 510530 No. 10, Nanxiang 2nd Road, Science City, Luogang District, Guangzhou, Guangdong

Patentee after: Guangdong Guangsheng research and Development Institute Co.,Ltd.

Address before: 518057 6th floor, software building, No. 9, Gaoxin Zhongyi Road, high tech Zone, Nanshan District, Shenzhen, Guangdong Province

Patentee before: SHENZHEN RISING SOURCE TECHNOLOGY Co.,Ltd.