Publication number: CN1464685 A
Publication type: Application
Application number: CN 02115372
Publication date: Dec. 31, 2003
Filing date: June 13, 2002
Priority date: June 13, 2002
Publication number variants: 02115372.8, CN 02115372, CN 1464685 A, CN 1464685A, CN-A-1464685, CN02115372, CN02115372.8, CN1464685 A, CN1464685A
Inventor: 余泊
Applicant: 优创科技(深圳)有限公司
Method for processing audio stream playback in a network terminal buffer
CN 1464685 A
Abstract
The invention discloses a method for processing audio stream playback in a network terminal buffer, addressing the pauses and stuttering that occur in voice communication over networks. For real-time audio (e.g. voice) communication over a packet-switched network (e.g. an IP network), a jitter buffer is provided at the receiving end. After the receiving end receives audio packets, it first decodes them in normal order and then places the decoded data in the jitter buffer. When the jitter buffer is close to full, the audio data is down-sampled so that the audio stream plays back faster; when the jitter buffer is close to empty, the audio data is up-sampled so that the stream plays back more slowly; when the amount of audio data in the jitter buffer is within the normal range, the stream is played back at the original sampling rate.
Claims (8), translated from Chinese
1. A method for processing audio stream playback in a network terminal buffer, characterized in that: when audio data packets transmitted over a packet-switched network enter a jitter buffer, the distance D between the read pointer and the write pointer is calculated; during initialization, the normal range of the distance D between the read and write pointers and the length of data read from the jitter buffer at one time are set; during operation, when the distance D between the read and write pointers is normal, the audio stream is played back at normal speed; when the distance D is greater than or less than the normal range, an audio resampling unit resamples the data read from the jitter buffer, whose length is greater than or less than the normal playback length, so as to restore it to the normal playback length.
2. The method for processing audio stream playback in a network terminal buffer according to claim 1, characterized in that the jitter buffer is a ring jitter buffer.
3. The method for processing audio stream playback in a network terminal buffer according to claim 1 or 2, characterized in that the distance D between the read and write pointers is the difference between the read pointer offset and the write pointer offset into the jitter buffer.
4. The method for processing audio stream playback in a network terminal buffer according to claim 1 or 2, characterized in that when D is less than the lower limit of the normal range, the length of data read from the ring buffer is less than the normal playback length, and up-sampling is applied.
5. The method for processing audio stream playback in a network terminal buffer according to claim 1 or 2, characterized in that when D is greater than the upper limit of the normal range, the length of data read from the ring buffer is greater than the normal playback length, and down-sampling is applied.
6. The method for processing audio stream playback in a network terminal buffer according to claim 3, characterized in that when the read pointer offset is less than the write pointer offset, the distance D between the read and write pointers is the write pointer offset minus the read pointer offset.
7. The method for processing audio stream playback in a network terminal buffer according to claim 3, characterized in that when the read pointer offset is greater than the write pointer offset, the distance D between the read and write pointers is the write pointer offset minus the read pointer offset, plus the length of the ring jitter buffer.
8. The method for processing audio stream playback in a network terminal buffer according to claim 1 or 2, characterized in that the method may be implemented either in hardware or in software.
Description, translated from Chinese
A method for processing audio stream playback in a network terminal buffer

Technical Field

The present invention relates to a method for processing an audio stream, and in particular to a method for processing audio stream playback in a network terminal buffer so that calls over a packet-switched network play back continuously and smoothly.

Background

With the rapid development of packet-switched networks, real-time voice communication that was once carried mainly on circuit-switched networks has begun to move on a large scale to IP packet-switched networks in the form of voice over IP (VoIP), and will eventually form networks in which audio, data, and other streaming media converge. Traditional circuit-switched networks and packet-switched networks differ in several respects. A circuit-switched network provides a dedicated end-to-end line with fixed bandwidth, so it does not suffer from the packet loss, reordering, and delay jitter found on packet-switched networks. The efficient use of transmission bandwidth, the demand for network convergence, and the flexibility of deployment, network management, and capacity expansion make it necessary to carry audio signals over packet-switched networks. However, packet loss, reordering, and delay jitter on such networks can seriously degrade the quality of real-time audio communication, causing discontinuities such as stuttering and dropouts.

The effect of packet loss on audio quality is reflected mainly in the packet loss rate. If the loss rate is low, for example 1% to 2%, the listener does not perceive an obvious drop in audio quality. As the loss rate rises, the listener perceives the audio as intermittent or discontinuous. With a small number of interruptions the listener may still understand the speaker, but with many interruptions the listener can no longer follow what is being said, and communication breaks down. A small amount of packet loss can be compensated by exploiting the redundancy of the audio signal, that is, by interpolating the lost audio packets; a large amount of packet loss degrades audio quality in any case.

Reordering and delay jitter both amount to jitter in the playback time of audio packets that must be played continuously and in order. Because real-time audio communication is a continuous process, each audio packet must be played in sequence at a fixed time interval after decoding, so jitter in packet arrival times causes audible discontinuities. If the jitter is too large, some packets arrive too late to be worth playing and are discarded.

One way to handle packet jitter is to place a jitter buffer at the receiving end. Arriving packets are first placed in this buffer; during playback, audio packets are taken out at uniform time intervals on a first-in, first-out basis and sent to the audio playback device. As long as the playback process does not read the jitter buffer empty, continuous playback of the audio stream is ensured. The larger the jitter buffer, the more jitter it can smooth, but an oversized buffer increases playback delay, and excessive delay is also undesirable because it makes real-time communication difficult. The size of the jitter buffer can be set according to specific requirements or adjusted dynamically according to network conditions. Even so, this does not effectively guarantee continuous playback of the audio stream.

Summary of the Invention

The object of the present invention is to provide a method for processing audio stream playback in a network terminal buffer that gives good call quality and maintains real-time communication.

The invention is implemented as follows. When audio data packets transmitted over a packet-switched network enter the jitter buffer, the distance D between the read pointer and the write pointer is calculated. During initialization, the normal range of D and the length of data read from the jitter buffer at one time are set. During operation, when D is normal, the audio stream is played back at normal speed; when D is greater than or less than the normal range, an audio resampling unit resamples the data read from the jitter buffer, whose length is greater than or less than the normal playback length, so as to restore it to the normal playback length.

The jitter buffer is a ring jitter buffer.

The distance D between the read and write pointers is the difference between the read pointer offset and the write pointer offset into the jitter buffer.

When D is less than the lower limit of the normal range, the length of data read from the ring buffer is less than the normal playback length, and up-sampling is applied.

When D is greater than the upper limit of the normal range, the length of data read from the ring buffer is greater than the normal playback length, and down-sampling is applied.

When the read pointer offset is less than the write pointer offset, the distance D between the read and write pointers is the write pointer offset minus the read pointer offset.

When the read pointer offset is greater than the write pointer offset, the distance D between the read and write pointers is the write pointer offset minus the read pointer offset, plus the length of the ring jitter buffer.

The method may be implemented either in hardware or in software.

With the above method, when the distance D between the read and write pointers is outside the normal range, the block of voice data read from the jitter buffer, which is longer or shorter than the normal length, is resampled so that playback speeds up or slows down. As a result, the read and write pointers of the ring jitter buffer always stay a certain distance apart, which ensures continuous, smooth playback of the real-time audio stream transmitted over the packet-switched network, keeps calls continuous and fluent, and reduces discontinuities such as stuttering and dropouts.

Brief Description of the Drawings

The invention is described in further detail below with reference to the drawings and specific embodiments.

Figure 1 is a schematic diagram of audio communication over a packet-switched network; Figure 2 is a schematic diagram of the transmitting part of an audio communication terminal;

Figure 3 is a schematic diagram of the receiving part of an audio communication terminal; Figure 4 is a schematic diagram of the ring jitter buffer.

Detailed Description

As shown in Figure 1, each terminal connects to the packet-switched network through some access method, and each audio terminal can send audio packets to, or receive them from, another audio terminal over the network. Multiple audio terminals can also form a conference network for multi-party calls.

As shown in Figure 2, the audio input signal is first sent to the audio capture unit, which converts the audio signal from analog to digital, that is, quantizes it. This step generally quantizes the audio signal into signed digital samples with 16-bit precision. The samples are then fed to the audio encoder, which compresses the data to save network bandwidth. The compressed audio is passed to the packetizing and transmitting unit, which generally encapsulates audio for real-time communication according to the Real-time Transport Protocol (RTP) standard, then in User Datagram Protocol (UDP) packets, and finally in Internet Protocol (IP) packets for transmission onto the network. The network interface unit generally comprises the physical layer and data link layer of the network stack, such as an Ethernet interface chip or a modem; the compressed audio packets finally pass through the network interface unit onto the packet-switched network.

Before an audio packet reaches the destination terminal, it passes through a series of network transmission elements, which may include various switching and routing devices. Different routes and network conditions produce different transmission delays, so audio packets sent in order at equal time intervals arrive at the receiver at non-uniform times; this is the jitter in packet reception. In addition, packets sent in order at equal intervals may travel over different routes, so that packets sent in the order p1, p2, p3, ... may arrive in the order p1, p3, p2, ....

As shown in Figure 3, after traveling over the packet-switched network the audio packets arrive at the network interface unit of the receiver, which strips physical-layer addressing and similar information and restores the Internet Protocol (IP) packets. Each packet is then passed to the audio stream receiving and unpacking unit, which removes the IP header, the User Datagram Protocol (UDP) header, and the Real-time Transport Protocol (RTP) header and restores the audio packet. The restored audio packets are passed to the audio packet de-reordering unit, which outputs the packets in normal time order. These packets are then passed to the audio decoding unit, which produces the linear code of the audio signal. The decoded linear code is written continuously into the ring jitter buffer for temporary storage. Because of the jitter introduced by network transmission, writes of the audio data stream into the ring jitter buffer are non-uniform in time, whereas playback of the audio signal must be continuous and uniform in time. Reading audio data from the ring jitter buffer for playback is therefore asynchronous with writing to it.
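The text does not say how the de-reordering unit restores packet order. One common approach, sketched here as an assumption rather than as the patent's method, is to sort a small window of pending packets by their 16-bit RTP sequence number, allowing for wraparound. The type `audio_packet` and the functions `seq_before` and `reorder_window` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t seq; /* RTP sequence number (16 bits, wraps around) */
    /* payload omitted in this sketch */
} audio_packet;

/* True if sequence number a precedes b, allowing for 16-bit wraparound:
 * a precedes b when (b - a) mod 65536 lies in 1..32767. */
int seq_before(uint16_t a, uint16_t b)
{
    uint16_t diff = (uint16_t)(b - a);
    return diff != 0 && diff < 0x8000u;
}

/* Insertion sort of a small window of packets into playout order.
 * Assumption: the window is short, so O(n^2) is acceptable. */
void reorder_window(audio_packet *pkts, size_t n)
{
    assert(pkts != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        audio_packet key = pkts[i];
        size_t j = i;
        while (j > 0 && seq_before(key.seq, pkts[j - 1].seq)) {
            pkts[j] = pkts[j - 1]; /* shift later packets right */
            j--;
        }
        pkts[j] = key;
    }
}
```

Applied to packets received as p1, p3, p2, the window comes out as p1, p2, p3. A real receiver would also have to decide how long to wait for a missing packet before giving up on it, which this sketch leaves out.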

The processing and playback of audio data in the ring buffer is described in detail below. It involves the jitter buffer, the jitter buffer readout unit, the jitter buffer readout control unit, the audio resampling unit, and the audio playback unit in Figure 3. In Figure 3, solid lines are data flow and dashed lines are control flow.

As shown in Figure 4, the jitter buffer is in fact a contiguous storage area of length N. Its start address is denoted by offset 0 and its end address by offset N-1; the current write address pointer is denoted by offset W, the current read address pointer by offset R, and the distance between the current write pointer and read pointer by D. A write to the jitter buffer stores audio data at the location the write pointer currently points to, then increments the write pointer offset by 1 modulo the buffer length N; in C, after each datum written, W=(W+1)%N. A read from the jitter buffer increments the read pointer offset by 1 after each data unit read; in C, after each datum read, R=(R+1)%N. These operations ensure that when a pointer reaches the top of the jitter buffer, at offset N-1, the next read or write automatically wraps to the bottom of the buffer, at offset 0. In effect the top of the buffer is joined to its bottom, forming a ring buffer in which the read and write pointers rotate in the same direction (clockwise or counterclockwise). To prevent conflicts between the pointers, the read pointer must always follow behind the write pointer; this is judged by the distance D between them. If W is greater than R, then D=W-R; if W is less than R, then D=N+W-R. As long as D is greater than 0, the read pointer is guaranteed to follow behind the write pointer. The distance D is also fed as a control signal to the jitter buffer readout control unit; D in effect plays the role of the error signal in a closed-loop control system and is the basis for readout operations on the jitter buffer.
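The pointer arithmetic described for Figure 4 can be sketched in C as follows. This is a minimal illustration, not the patent's implementation: the names `jitter_buffer`, `jb_write`, `jb_read`, and `jb_distance` and the capacity 1024 are assumptions. The text does not define D when W equals R, so the sketch returns 0 in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JB_CAPACITY 1024 /* N: illustrative size, not taken from the text */

typedef struct {
    int16_t data[JB_CAPACITY]; /* decoded 16-bit linear samples */
    size_t  w;                 /* write offset W */
    size_t  r;                 /* read offset R */
} jitter_buffer;

/* Store one sample at W, then advance: W = (W + 1) % N. */
void jb_write(jitter_buffer *jb, int16_t sample)
{
    jb->data[jb->w] = sample;
    jb->w = (jb->w + 1) % JB_CAPACITY;
}

/* Fetch one sample at R, then advance: R = (R + 1) % N. */
int16_t jb_read(jitter_buffer *jb)
{
    int16_t sample = jb->data[jb->r];
    jb->r = (jb->r + 1) % JB_CAPACITY;
    return sample;
}

/* Distance D: W - R when W >= R, otherwise N + W - R.
 * Assumption: W == R is treated as D = 0 (empty buffer). */
size_t jb_distance(const jitter_buffer *jb)
{
    assert(jb->w < JB_CAPACITY && jb->r < JB_CAPACITY);
    return (jb->w >= jb->r) ? jb->w - jb->r
                            : JB_CAPACITY + jb->w - jb->r;
}
```

Neither `jb_write` nor `jb_read` checks for a full or empty buffer here; in the scheme described in the text, that protection comes from the control loop keeping D inside its normal range.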

Once the jitter buffer readout control unit has the distance D between the read and write pointers, it uses the value of D to decide the size of the audio data block read each time the audio playback device needs data. The audio playback unit generally requires a block of a given length at equal time intervals (for example, every 30 ms); if the sampling rate of the audio signal is 8000 samples per second (8 kHz), a block of 240 samples is needed every 30 ms. If D is within the normal range, the readout control unit directs the readout unit to read a block of normal length (for example, 240 audio samples) each time the playback unit needs data. In this case the audio resampling unit performs no operation on the data and passes the block transparently to the audio playback unit.
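The block length in this example follows directly from the sampling rate and the playback interval. A minimal sketch of that arithmetic, with the function name `frame_samples` introduced here purely for illustration:

```c
#include <assert.h>

/* Samples needed per playback interval:
 * rate (samples per second) * interval (ms) / 1000. */
unsigned frame_samples(unsigned sample_rate_hz, unsigned frame_ms)
{
    assert(frame_ms > 0);
    return sample_rate_hz * frame_ms / 1000u;
}
```

With the figures from the text, `frame_samples(8000, 30)` gives 240.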

If the distance D between the read and write pointers is outside the normal range, that is, too large or too small, an adjustment is needed. If D is too small, random jitter in packet arrival times may cause the jitter buffer to be read empty from time to time, so that W falls behind R, in other words the read pointer runs ahead of the write pointer in the ring buffer; stuttering and dropouts result. If D is too large, then for the same reason W may lap R by a full circle, which also causes stuttering and dropouts.

In addition, some packet loss always occurs in network transmission, so in the long-run statistical average D tends to shrink over time. Another factor also affects D: the sampling clock frequency at the audio sender may differ from the playback clock frequency at the audio receiver. When the sender's sampling clock is faster than the receiver's playback clock, D grows over time; when the sender's sampling clock is slower than the receiver's playback clock, D shrinks over time.

Suppose that, given the factors above, D is allowed to vary between d1 and d2. When d1<D<d2, the jitter buffer readout unit, under the control of its control unit, reads an audio data block of length L from the jitter buffer each time. When D<d1, the jitter buffer is likely to be read empty soon. The control unit then directs the readout unit, the next time the audio playback unit needs a frame, to read a frame shorter than the normal length L, say of length L1 (L1<L). The audio resampling unit converts the block of length L1 into a block of length L by an interpolation operation suited to the perceptual characteristics of audio, so that the playback unit still receives audio data of standard length. This is equivalent to slowing the playback of that frame; as long as the difference between L and L1 is not too large (for example, ((L-L1)/L)<1%), there is no perceptible change in subjective listening. Because playback has been slowed, D can be expected to grow, and once D returns to the normal range (d1<D<d2) audio can be played back at normal speed again.
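The text does not specify the interpolation used by the audio resampling unit. As a stand-in, the sketch below uses plain linear interpolation to convert a block of one length into a block of another; it is not the perceptually tuned operation the patent refers to, and the name `resample_linear` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Resample in[0..in_len) to out[0..out_len) by linear interpolation.
 * The first and last input samples map to the first and last output samples. */
void resample_linear(const int16_t *in, size_t in_len,
                     int16_t *out, size_t out_len)
{
    assert(in_len >= 2 && out_len >= 2);
    for (size_t i = 0; i < out_len; i++) {
        /* Position of output sample i on the input time axis. */
        double pos = (double)i * (double)(in_len - 1) / (double)(out_len - 1);
        size_t k = (size_t)pos;
        if (k >= in_len - 1)
            k = in_len - 2; /* keep in[k + 1] inside the block */
        double frac = pos - (double)k;
        double v = (1.0 - frac) * in[k] + frac * in[k + 1];
        /* Round to nearest; v stays within the int16_t range because it is
         * a weighted average of two int16_t values. */
        out[i] = (int16_t)(v < 0 ? v - 0.5 : v + 0.5);
    }
}
```

Stretching a frame of L1 = 238 samples to L = 240 would be a call such as `resample_linear(frame, 238, out, 240)`, a change of about 0.8%, which is inside the 1% bound given as an example in the text. The same function also shortens a block when `out_len` is smaller than `in_len`.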

When D>d2, the jitter buffer is likely to be written full soon. The control unit then directs the readout unit, the next time the audio playback unit needs a frame, to read a frame longer than the normal length L, say of length L2 (L2>L). The audio resampling unit converts the block of length L2 into a block of length L by a decimation operation suited to the perceptual characteristics of audio, so that the playback unit still receives audio data of standard length. This is equivalent to speeding up the playback of that frame; again, as long as the difference between L and L2 is not too large (for example, ((L2-L)/L)<1%), there is no perceptible change in subjective listening. Because playback has been sped up, D can be expected to shrink, and once D returns to the normal range (d1<D<d2) audio can be played back at normal speed again.
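The three cases for D amount to a small decision rule in the readout control unit. The sketch below is one way to express it, under stated assumptions: the struct `jb_read_policy` and the function `jb_read_length` are hypothetical names, and because the text only covers D<d1, d1<D<d2, and D>d2, the boundary values D=d1 and D=d2 are treated here as normal.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t d1;      /* lower bound of the normal range for D */
    size_t d2;      /* upper bound of the normal range for D */
    size_t normal;  /* L:  normal block length */
    size_t shorter; /* L1: block length read when the buffer is draining */
    size_t longer;  /* L2: block length read when the buffer is filling */
} jb_read_policy;

/* Choose how many samples to read for the next playback frame. */
size_t jb_read_length(const jb_read_policy *p, size_t d)
{
    assert(p->d1 < p->d2);
    assert(p->shorter < p->normal && p->normal < p->longer);
    if (d < p->d1)
        return p->shorter; /* slow down: L1 samples get stretched to L */
    if (d > p->d2)
        return p->longer;  /* speed up: L2 samples get squeezed to L */
    return p->normal;      /* normal range: block passes through unchanged */
}
```

For example, with L = 240, L1 = 238, and L2 = 242, each adjusted frame moves D by two samples, so recovery from a large excursion is gradual; that gradualness is what keeps the speed change below the audibility bound suggested in the text.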

If the jitter in packet arrival times or the packet loss rate is too large, the ring buffer may be read empty or written full, and exception handling can be applied. When the buffer is read empty, the previous frame of voice data can be replayed; if the buffer is still empty at the next read, a silence signal is played. If the ring buffer is written full, all unplayed data in the buffer is automatically discarded and normal reading and writing of the buffer resumes.
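The read-empty handling can be sketched as a small state machine: replay the last frame on the first empty read, then output silence on any further consecutive empty reads. The names `underrun_state`, `conceal_underrun`, and `note_played` are hypothetical, and the frame length of 240 samples is taken from the 30 ms, 8000 samples per second example in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_LEN 240 /* L: 30 ms at 8000 samples per second */

typedef struct {
    int16_t last_frame[FRAME_LEN]; /* most recently played frame */
    int     consecutive_empty;     /* reads in a row that found the buffer empty */
} underrun_state;

/* Record a normally played frame and reset the underrun counter. */
void note_played(underrun_state *st, const int16_t *frame)
{
    assert(st != NULL && frame != NULL);
    memcpy(st->last_frame, frame, sizeof st->last_frame);
    st->consecutive_empty = 0;
}

/* Fill out[0..FRAME_LEN) for a read that found the buffer empty. */
void conceal_underrun(underrun_state *st, int16_t *out)
{
    assert(st != NULL && out != NULL);
    if (st->consecutive_empty == 0)
        memcpy(out, st->last_frame, sizeof st->last_frame); /* replay once */
    else
        memset(out, 0, FRAME_LEN * sizeof out[0]);          /* then silence */
    st->consecutive_empty++;
}
```

For the written-full case, one possible way to discard all unplayed data in a ring buffer is to set the read offset equal to the write offset; the text does not say how the discard is carried out, so that detail is an assumption.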

The method may be implemented either in hardware or in software.

US910672330. Dez. 201311. Aug. 2015Sling Media, Inc.Fast-start streaming and buffering of streaming content for personal media player
US913125330. Juni 20108. Sept. 2015Sling Media, Inc.Selection and presentation of context-relevant supplemental content and advertising
US914382726. Febr. 201422. Sept. 2015Sling Media, Inc.Systems and methods for securely place shifting media content
US916097426. Aug. 200913. Okt. 2015Sling Media, Inc.Systems and methods for transcoding and place shifting media content
US917892323. Dez. 20093. Nov. 2015Echostar Technologies L.L.C.Systems and methods for remotely controlling a media server via a network
US919161026. Nov. 200817. Nov. 2015Sling Media Pvt Ltd.Systems and methods for creating logical media streams for media storage and playback
US922578515. Sept. 201429. Dez. 2015Sling Media, Inc.Systems and methods for establishing connections between devices communicating over a network
US92373002. Dez. 201412. Jan. 2016Sling Media Inc.Personal video recorder functionality for placeshifting systems
US925324125. Aug. 20142. Febr. 2016Sling Media Inc.Personal media broadcasting system with output buffer
US927505428. Dez. 20091. März 2016Sling Media, Inc.Systems and methods for searching media content
US93569841. Aug. 201431. Mai 2016Sling Media, Inc.Capturing and sharing media content
US94797376. Aug. 200925. Okt. 2016Echostar Technologies L.L.C.Systems and methods for event programming via a remote media player
US949152310. Sept. 20128. Nov. 2016Echostar Technologies L.L.C.Method for effectively implementing a multi-room television system
US949153821. März 20138. Nov. 2016Sling Media Pvt Ltd.Adaptive gain control for digital audio samples in a media stream
US95100351. Sept. 201529. Nov. 2016Sling Media, Inc.Systems and methods for securely streaming media content
US952583810. Aug. 200920. Dez. 2016Sling Media Pvt. Ltd.Systems and methods for virtual remote control of streamed media
US956547910. Aug. 20097. Febr. 2017Sling Media Pvt Ltd.Methods and apparatus for seeking within a media stream using scene detection
US958475729. Juli 201128. Febr. 2017Sling Media, Inc.Apparatus and method for effectively implementing a wireless television system
US960022228. Febr. 201421. März 2017Sling Media Inc.Systems and methods for projecting images from a computer system
US971691022. Dez. 201525. Juli 2017Sling Media, L.L.C.Personal video recorder functionality for placeshifting systems
US978147330. Aug. 20163. Okt. 2017Echostar Technologies L.L.C.Method for effectively implementing a multi-room television system
WO2015192451A1 *13. Aug. 201423. Dez. 2015中兴通讯股份有限公司Audio play method and device
WO2017054377A1 *25. Jan. 20166. Apr. 2017青岛海信电器股份有限公司Audio data processing method, apparatus and system
WO2017059678A1 *16. Mai 201613. Apr. 2017乐视控股(北京)有限公司Real-time voice receiving device and delay reduction method in real-time voice call
Classifications
International Classification: H04L12/16, H04M11/10, H04M11/06
Legal Events
Date | Code | Event | Description
Dec 31, 2003 | C06 | Publication
Feb 15, 2006 | C02 | Deemed withdrawal of patent application after publication (patent law 2001)