US20020122491A1 - Video decoder architecture and method for using same - Google Patents


Info

Publication number
US20020122491A1
Authority
US
United States
Prior art keywords: frame, picture, coefficients
Legal status: Abandoned
Application number
US09/827,796
Inventor
Marta Karczewicz
Ragip Kurceren
Current Assignee
Nokia Oyj
Original Assignee
Nokia Oyj
Application filed by Nokia Oyj
Priority to US09/827,796 (US20020122491A1)
Priority to US09/883,887 (US6765963B2)
Assigned to Nokia Corporation; assignors: Marta Karczewicz, Ragip Kurceren
Priority to US09/925,769 (US6920175B2)
Priority to KR1020037008568A (KR100626419B1)
Priority to BRPI0206191A (BRPI0206191B1)
Priority to CA002431866A (CA2431866C)
Priority to EP02716096.9A (EP1356684B1)
Priority to PCT/FI2002/000004 (WO2002054776A1)
Priority to HU0400560A (HU228605B1)
Priority to EEP200300315A (EE04829B1)
Priority to MXPA03005985A (MXPA03005985A)
Priority to JP2002555537A (JP4109113B2)
Priority to CNB028034414A (CN1225125C)
Publication of US20020122491A1
Priority to US10/869,092 (US20040240560A1)
Priority to US10/869,455 (US20040223549A1)
Priority to US10/869,628 (US7477689B2)
Priority to HK04105644A (HK1062868A1)
Priority to JP2007178813A (JP5128865B2)


Classifications

    • H04N21/4384: Accessing a communication channel involving operations to reduce the access time, e.g. fast-tuning for reducing channel switching latency
    • H04N19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/164: Adaptive coding controlled by feedback from the receiver or from the transmission channel
    • H04N19/172: Adaptive coding where the coding unit is a picture, frame or field
    • H04N19/176: Adaptive coding where the coding unit is a block, e.g. a macroblock
    • H04N19/39: Hierarchical coding techniques involving multiple description coding [MDC]
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/58: Motion compensation with long-term prediction, i.e. the reference frame for a current frame not being the temporally closest one
    • H04N19/61: Transform coding in combination with predictive coding
    • H04N19/65: Coding using error resilience
    • H04N19/89: Pre- or post-processing involving detection of transmission errors at the decoder
    • H04N21/23424: Server-side processing involving splicing one content stream with another, e.g. for inserting or substituting an advertisement
    • H04N21/44016: Client-side processing involving splicing one content stream with another, e.g. for substituting a video clip
    • H04N21/64792: Controlling the complexity of the content stream, e.g. by dropping packets

Definitions

  • This invention relates generally to the field of multimedia applications. More particularly, this invention relates to a decoder/decompressor and method for decoding streaming video.
  • Multimedia applications that include audio and streaming video information have come into greater use.
  • MPEG standards, established by the Moving Picture Experts Group, are the most widely accepted international standards in the field of multimedia applications.
  • The ITU Telecommunication Standardization Sector has developed video coding standards through its Video Coding Experts Group (VCEG).
  • Other standards include JPEG and Motion JPEG, established by the Joint Photographic Experts Group.
  • The purpose of video coding is to remove redundancy in the image sequence so that the encoded data rate is commensurate with the bandwidth available to transport the video sequence, while keeping the distortion between the original and reconstructed images as small as possible.
  • The redundancy in video sequences can be categorized into spatial and temporal redundancy. Spatial redundancy refers to the correlation between neighboring pixels within a frame, while temporal redundancy refers to the correlation between neighboring frames.
  • For every pixel of an image, color information must be provided. Typically, color information is coded in terms of the primary color components red, green and blue (RGB) or using a related luminance/chrominance model, known as the YUV model.
  • A typical video codec employs three types of frames: intra frames (I-frames), predicted frames (P-frames) and bi-directional frames (B-frames). An I-frame is coded independently of other frames, exploiting only the spatial correlation of the pixels within the frame. Coding of P-frames exploits both spatial and temporal redundancy between successive frames. Since the objects in a typical video sequence do not change rapidly from one frame to the next, i.e., adjacent frames are highly correlated, higher compression efficiency is achieved using P-frames.
  • The terms frame and picture are used interchangeably in the art.
  • A frame contains all the color and brightness information that is needed to display a picture.
  • A picture is divided into a number of blocks, which are grouped into macroblocks. Each block contains a number of lines, each line holding a number of samples of luminance or chrominance pixel values from a frame.
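  • As a rough illustration of this partitioning, the sketch below splits a luminance frame into 16×16 macroblocks of four 8×8 blocks. The sizes and the QCIF frame dimensions are illustrative assumptions, not values taken from this patent.

```python
import numpy as np

def partition_into_macroblocks(frame, mb_size=16, block_size=8):
    """Split a luminance frame into macroblocks of smaller blocks.

    mb_size and block_size are illustrative defaults (16x16 macroblocks
    of 8x8 blocks, as in H.263-family codecs); a real codec fixes these
    in the standard.
    """
    h, w = frame.shape
    macroblocks = []
    for y in range(0, h, mb_size):
        for x in range(0, w, mb_size):
            mb = frame[y:y + mb_size, x:x + mb_size]
            # Each macroblock is further divided into block_size x block_size blocks.
            blocks = [mb[by:by + block_size, bx:bx + block_size]
                      for by in range(0, mb_size, block_size)
                      for bx in range(0, mb_size, block_size)]
            macroblocks.append(((y, x), blocks))
    return macroblocks

frame = np.zeros((144, 176), dtype=np.uint8)   # QCIF luminance plane
print(len(partition_into_macroblocks(frame)))  # (144/16) * (176/16) = 99
```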
  • FIG. 1A is a diagram of an MPEG audio and video decoder 120 that performs decompression of the video and/or audio data which has been compressed and coded according to the MPEG algorithm.
  • The system decoder 110 reads the encoded MPEG data stream 101, containing interlaced compressed video and/or audio data, and generates the necessary timing information: the Video Presentation Time Stamp (VPTS) 104; the System Clock Reference (SCR) 105, also referred to as the system time clock (STC); the Audio Presentation Time Stamp (APTS) 106; and the separated video encoded bitstream 102 and audio encoded bitstream 103.
  • The video decoder 111 decompresses the video data stream 102 and generates a decompressed video signal 107.
  • The audio decoder 112 decompresses the audio data stream 103 and generates the decompressed audio signal 108.
  • The decompressed video signal 107 is coupled to a display unit, while the decompressed audio signal 108 is coupled to an audio speaker or other audio generation means.
  • The MPEG encoded/compressed data stream may contain a plurality of encoded/compressed video data packets or blocks and a plurality of encoded/compressed audio data packets or blocks.
  • An MPEG encoder encodes/compresses the video packets based on video frames, also referred to as pictures. These pictures or frames are source or reconstructed image data consisting of three rectangular matrices of multiple-bit numbers representing the luminance and chrominance signals. For example, H.263+ uses four luminance blocks and two chrominance blocks of 8×8 pixels each.
  • FIGS. 2A-2C illustrate the types of encoded/compressed video frames commonly utilized in the MPEG standard.
  • FIG. 2A depicts an Intra-frame or I-type frame 200 .
  • The I-type frame or picture is a frame of video data that is coded without using information from the past or the future, and it is utilized as the basis for decoding/decompression of other frame types.
  • FIG. 2B is a representation of a Predictive-frame or P-type frame 210 .
  • The P-type frame or picture is a frame that is encoded/compressed using motion compensated prediction from an I-type or P-type frame in its past, in this case I₁ (200). That is, a previous frame is used to encode/compress the present frame of video data.
  • Reference numeral 205a represents the motion compensated prediction information used to create the P-type frame 210.
  • FIG. 2C depicts a Bi-directional-frame or B-type of frame 220 .
  • The B-type frame or picture is a frame that is encoded/compressed using motion compensated prediction derived from the I-type or P-type reference frame in its past (200 in this example), from the I-type or P-type reference frame in its future (210 in this example), or from a combination of both.
  • B-type frames are usually inserted between I-type frames or P-type frames.
  • FIG. 2D represents a group of pictures in what is called display order: I₁ B₂ B₃ P₄ B₅ P₆.
  • FIG. 2D illustrates the B-type frames inserted between I-type and P-type frames and the direction in which motion compensation information flows.
  • Motion compensation refers to using motion vectors from one frame to the next to improve the efficiency of predicting pixel values for encoding/compression and decoding/decompression.
  • The method of prediction uses the motion vectors to provide offset values and error data that refer to a past or a future frame of video data whose decoded pixel values may be used, together with the error data, to compress/encode or decompress/decode a given frame of video data.
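  • As a minimal sketch of the idea, the following code forms the prediction frame P(x, y) = R(x + Δx, y + Δy) from whole-pixel, per-block motion vectors. Real codecs add sub-pixel interpolation and more elaborate vector handling; all names and the clamping policy here are illustrative assumptions.

```python
import numpy as np

def motion_compensate(reference, motion_vectors, block=8):
    """Build the prediction frame P(x, y) = R(x + dx, y + dy) block by block.

    motion_vectors[by, bx] holds the integer (dy, dx) displacement for the
    block at block-row by, block-column bx.
    """
    h, w = reference.shape
    prediction = np.zeros_like(reference)
    for y in range(0, h, block):
        for x in range(0, w, block):
            dy, dx = motion_vectors[y // block, x // block]
            # Clamp so the displaced block stays inside the reference frame.
            sy = min(max(y + dy, 0), h - block)
            sx = min(max(x + dx, 0), w - block)
            prediction[y:y + block, x:x + block] = reference[sy:sy + block, sx:sx + block]
    return prediction
```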
  • For the display order I₁ B₂ B₃ P₄ B₅ P₆ B₇ P₈ B₉ B₁₀ P₁₁ … Pₙ₋₃ Bₙ₋₂ Pₙ₋₁ Iₙ, the decoding order is I₁ P₄ B₂ B₃ P₆ B₅ P₈ B₇ P₁₁ B₉ B₁₀ … Pₙ₋₁ Bₙ₋₂ Iₙ.
  • The decoding order differs from the display order because the B-type frames need future I-type or P-type frames to be decoded.
  • P-type frames require that the previous I-type or P-type reference frame be available.
  • For example, P₄ requires I₁ to be decoded, so the encoded/compressed I₁ frame must be available.
  • Similarly, the frame P₆ requires that P₄ be available in order to decode/decompress frame P₆.
  • B-type frames, such as frame B₃, require past and future I-type or P-type reference frames, such as I₁ and P₄, in order to be decoded.
  • B-type frames are frames inserted between I-type frames, P-type frames, or a combination thereof during encoding, and are not necessary for faithful reproduction of an image.
  • The frames before an I-type frame, such as Pₙ₋₁ in the example, are not needed to decode an I-type frame, and no future frames require Pₙ₋₁ in order to be decoded/decompressed.
  • The video frames are buffered before being displayed. There is usually a one-frame delay between decoding and display, and this difference leads to a condition known as tearing: the frame being displayed is overwritten by the newly decoded frame.
  • FIG. 1B depicts tearing.
  • A decoded/decompressed frame 132 of data representing the image of a closed door 133 is currently stored in a buffer 135.
  • This decoded/decompressed frame is currently being displayed on display unit 140.
  • During this display period, another decoded/decompressed frame 130, with data representing the image of an open door 131, is stored in buffer 135.
  • The display unit 140 will then start displaying using information from the new frame now stored in buffer 135.
  • The result is a partial display of the first stored image 141 and a partial display of the new stored image 142.
  • Video streaming has emerged as one of the essential applications over the fixed Internet and, in the near future, over 3G multimedia networks.
  • In streaming, the server starts sending the pre-encoded video bitstream to the receiver upon a request from the receiver, which plays the stream as it is received, with little or no delay.
  • The problem with video streaming is that the best-effort nature of today's networks causes variations in the effective bandwidth available to a user as network conditions change.
  • The server should then scale the bitrate of the compressed video to accommodate these variations. In the case of conversational services, which are characterized by real-time encoding and point-to-point delivery, this is achieved by adjusting, on the fly, the source encoding parameters, such as the quantization parameter or frame rate, based on network feedback. In typical streaming scenarios, where an already encoded video bitstream is to be streamed to the client, this solution cannot be applied, and a situation similar to the tearing described above would occur.
  • A new picture or frame type, and a method of using same, is provided.
  • This novel frame type is referred to as an SP-picture.
  • Since temporal redundancies are not exploited in I-frames, the compression efficiency of I-frame coding is significantly lower than that of predictive coding.
  • The proposed method allows the use of motion compensated predictive coding to exploit temporal redundancy in the sequence while still allowing perfect reconstruction of the frame using different reference frames.
  • This new picture type provides for error resilience/recovery, bandwidth scalability, bitstream switching, processing scalability, random access and other functions.
  • The SP-type picture provides for, among other functions, switching between different bitstreams, random access, fast forward and fast error-recovery by replacing I-pictures, increasing the coding efficiency.
  • SP-pictures have the property that identical SP-frames may be obtained even when they are predicted using different reference frames.
  • FIG. 1A is a prior art block diagram of an MPEG decoding system.
  • FIG. 1B is a drawing showing the problem of tearing within prior art devices where frame data from two different frames stored consecutively in frame buffer memory is displayed on a display device.
  • FIGS. 2A-2D are diagrams showing the prior art encoding/compression of video frames.
  • FIG. 3 is a block diagram of a generic motion-compensated predictive video coding system (encoder).
  • FIG. 4 is a block diagram of a generic motion-compensated predictive video coding system (decoder).
  • FIG. 5 is an illustration showing switching between bitstreams 1 and 2 using SP-pictures.
  • FIG. 6 is a block diagram of a decoder in accordance with the preferred embodiment of the invention.
  • FIG. 7 is an illustration of random access using SP-pictures.
  • FIG. 8 is an illustration of a fast-forward process using SP-pictures.
  • FIG. 9 is a set of graphs comparing coding efficiencies of I, P, and SP-pictures.
  • FIG. 10 is a set of graphs comparing the performance of SP-pictures and I-pictures when used at fixed one-second intervals.
  • FIG. 11 is a set of graphs comparing the performance of SP-pictures and I-pictures in a fast-forward application, at one-second intervals.
  • The simplest way of achieving bandwidth scalability in the case of pre-encoded sequences is to produce multiple independent streams of different bandwidth and quality.
  • The server then dynamically switches between the streams to accommodate variations of the bandwidth available to the client.
  • A new decoder architecture is provided which has the property that identical frames may be obtained even when they are predicted using different reference frames.
  • The picture type obtained using this structure will be called an SP-frame; it may also be referred to as an SP-picture.
  • A system for P-frame encoding and decoding is provided and is shown in FIGS. 3 and 4.
  • A communication system comprising the encoder 300 of FIG. 3 and the decoder 400 of FIG. 4 is operable to communicate a multimedia sequence between a sequence generator and a sequence receiver.
  • Other elements of the video sequence generator and receiver are not shown for purposes of simplicity.
  • The communication path between the sequence generator and the receiver may take various forms, including but not limited to a radio link.
  • Encoder 300, shown in FIG. 3, is coupled to receive video input on line 301 in the form of a frame to be encoded, I(x, y), called the current frame.
  • The video input may be provided to a motion estimation and coding block 370 via line 305 and to an input of a subtractor 307.
  • The motion estimation and coding block 370 may also be coupled to frame memory 350 to receive a previously coded and transmitted frame R(x, y), called the reference frame.
  • The motion estimation and coding block 370 may also be coupled to multiplexor 380 to provide motion information for the bitstream.
  • The current frame I(x, y) is partitioned into rectangular regions of M×N pixels.
  • By (x, y) we denote the location of a pixel within the frame.
  • These blocks may be encoded using either only spatial correlation (intra-coded blocks) or both spatial and temporal correlation (inter-coded blocks). In what follows, we concentrate on inter blocks.
  • Each inter-coded block is predicted from one of the previously coded and transmitted reference frames, which at a given instant is available in the frame memory 350 of the encoder 300 and in the frame memory 450 of the decoder 400 in FIG. 4.
  • The frame memory 350 is coupled to a Motion Compensated (MC) prediction block 360.
  • The MC prediction block 360 is operable to generate a prediction frame P(x, y), which is provided to an input of subtractor 307 and to adder 345.
  • The MC prediction block 360 is also coupled to the motion estimation and coding block 370 to receive motion information.
  • The prediction information may be represented by a two-dimensional motion vector (Δx, Δy), where Δx is the horizontal displacement and Δy is the vertical displacement of the pixels between the current frame and the reference frame.
  • The motion estimation and coding block 370 calculates the motion vectors (Δx, Δy).
  • The motion vectors together with the reference frame are used to construct the prediction frame P(x, y): P(x, y) = R(x + Δx, y + Δy).
  • The prediction error E(x, y), i.e., the difference between the current frame and the prediction frame P(x, y), is calculated as E(x, y) = I(x, y) − P(x, y).
  • The prediction error E(x, y) is represented as a weighted sum of transform basis functions f_ij(x, y); the weights c_err(i, j) corresponding to the basis functions are called transform coefficients. These coefficients are subsequently quantized in quantization block 320: I_err(i, j) = Q(c_err(i, j), QP), where I_err(i, j) are the quantized transform coefficients.
  • The operation of quantization introduces a loss of information: the quantized coefficients can be represented with a smaller number of bits.
  • The level of compression (loss of information) is controlled by adjusting the value of the quantization parameter (QP).
  • The quantization block 320 is coupled to both a multiplexor 380 and an inverse quantization block 330, which in turn feeds an inverse transform block 340.
  • Blocks 330 and 340 reconstruct the prediction error, which is added to the MC prediction frame P(x, y) by adder 345, and the result is stored in frame memory 350.
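  • The loop formed by blocks 310 through 350 can be summarized in a few lines of code. The sketch below uses a separable orthonormal DCT and a uniform scalar quantizer as stand-ins for the codec's actual transform and Q(·, QP); it shows the structure of the loop, not the integer arithmetic of any particular standard.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix; a stand-in for the codec's transform."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    scale = np.sqrt(np.where(k == 0, 1.0 / n, 2.0 / n))
    return scale * np.cos(np.pi * (2 * j + 1) * k / (2 * n))

T = dct_matrix(8)

def encode_block(current, predicted, qp):
    """One inter-block pass of the encoder loop in FIG. 3."""
    error = current.astype(float) - predicted   # E(x, y) = I(x, y) - P(x, y)
    coeff = T @ error @ T.T                     # c_err(i, j), transform of E
    level = np.round(coeff / qp)                # I_err(i, j) = Q(c_err, QP)
    recon_error = T.T @ (level * qp) @ T        # blocks 330/340: E_c(x, y)
    reconstructed = predicted + recon_error     # adder 345, stored in memory 350
    return level, reconstructed
```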
  • Motion vectors and quantized coefficients are further encoded using Variable Length Codes (VLC), which further reduce the number of bits needed for their representation.
  • The encoded motion vectors and quantized coefficients, as well as other additional information needed to represent each coded frame of the image sequence, constitute a bitstream 415 which is transmitted to the decoder 400 of FIG. 4.
  • The bitstream may be multiplexed by multiplexor 380 before transmission.
  • FIG. 4 shows the decoder 400 of the communication system.
  • Bitstream 415 is received from encoder 300 of FIG. 3.
  • Bitstream 415 is demultiplexed via demultiplexor 410 .
  • Dequantized coefficients d_err(i, j) are calculated in the inverse quantization block 420: d_err(i, j) = Q⁻¹(I_err(i, j), QP).
  • The pixels of the current coded frame are reconstructed by finding the prediction pixels in the reference frame R(x, y) using the received motion vectors and then adding the decoded prediction error in adder 435, resulting in the decoded video: I_c(x, y) = R(x + Δx, y + Δy) + E_c(x, y).
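  • The decoder-side mirror of the loop is even shorter. This sketch reuses T and the uniform quantizer from the encoder sketch above, with the same caveat that both are stand-ins for the standard's actual operations.

```python
def decode_block(level, reference_block, qp):
    """Reconstruct one inter block as in FIG. 4: I_c = R(x+dx, y+dy) + E_c."""
    dequant = level * qp              # d_err(i, j) = Q^-1(I_err(i, j), QP)
    recon_error = T.T @ dequant @ T   # E_c(x, y), inverse transform
    return reference_block + recon_error
```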
  • One of the key requirements for video streaming is to scale the transmission bitrate of the compressed video according to the changing network conditions.
  • For conversational services, this is achieved by adjusting, on the fly, the source encoding parameters, such as the quantization parameter or frame rate, based on network feedback.
  • In the typical streaming scenario, where an already encoded video bitstream is to be streamed to the client, the above solution cannot be applied.
  • The simplest way of achieving bandwidth scalability in the case of pre-encoded sequences is to produce multiple independent streams of different bandwidth and quality.
  • The server then dynamically switches between the streams to accommodate variations of the bandwidth available to the client. Since the encoding algorithms employ motion-compensated prediction, switching between bitstreams at arbitrary P-type pictures, although possible, would lead to visual artifacts due to the mismatch between the reconstructed frames at the same time instant in the different bitstreams, and these artifacts would further propagate in time.
  • VCR functionalities, such as random access or "Fast Forward" and "Fast Backward" (increased playback rate), are also desired for streaming video content. A user may skip a portion of the video and restart playing at any I-frame location. Similarly, an increased playback rate, i.e., fast-forwarding, can be achieved by transmitting only I-pictures.
  • An embodiment of the present invention provides a novel picture type, the SP-picture, which allows switching from one bitstream to another and enables VCR-like functionalities without introducing any mismatch, while still utilizing motion compensated prediction.
  • One of the properties of SP-pictures is that identical SP-frames may be obtained even when different reference frames are used for their prediction.
  • FIG. 5 shows two bitstreams corresponding to the same sequence encoded at different bitrates: bitstream 1 (510) and bitstream 2 (520).
  • SP-pictures may be placed at the locations at which one wants to allow switching from one bitstream to another (pictures S₁ (513) and S₂ (523)).
  • When switching from bitstream 1 to bitstream 2, picture S₁₂ (550) is transmitted instead of S₂ (523). Pictures S₂ (523) and S₁₂ (550) are represented by different bitstreams, i.e., S₂ uses previously reconstructed frames from bitstream 2 as its references while S₁₂ uses previously reconstructed frames from bitstream 1; however, their reconstructed values are identical.
  • Application of SP-pictures to enable random access is depicted in FIG. 7.
  • SP-pictures are placed at fixed intervals within bitstream 1 (720), e.g. picture S₁ (730), which is being streamed to the client.
  • For each of these SP-pictures, a corresponding pair of pictures is generated and stored as another bitstream (bitstream 2 (740)):
  • an I-picture, I₂ (750), at the temporal location preceding the SP-picture; and
  • an SP-picture, S₂ (710), at the same temporal location as the SP-picture.
  • Bitstream 1 (720) may then be accessed at a location corresponding to an I-picture in bitstream 2 (740). For example, to access bitstream 1 at frame I₂, first the pictures I₂ and S₂ from bitstream 2 are transmitted, and then the following pictures from bitstream 1 are transmitted; the server logic is sketched below.
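  • In server terms, the access scheme of FIG. 7 reduces to picking which coded pictures to send. The helper below is a hypothetical sketch: the function name, the list/dict containers and the indexing convention are all assumptions, not structures from the patent.

```python
def pictures_for_random_access(access_index, bitstream1, bitstream2_pairs):
    """Return the pictures to transmit when entering bitstream 1 mid-stream.

    bitstream1 is a list of coded pictures; bitstream2_pairs maps an
    access index to its stored (I-picture, switching SP-picture) pair.
    """
    i_pic, sp_pic = bitstream2_pairs[access_index]
    # Send I2 and S2 from bitstream 2, then continue with the pictures
    # of bitstream 1 that follow the access point.
    return [i_pic, sp_pic] + list(bitstream1[access_index + 1:])
```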
  • FIG. 8 is an illustration of a fast-forward process using SP-pictures. If bitstream 2 consists only of SP-pictures predicted from each other at larger temporal intervals (e.g. 1 s), as illustrated, SP-pictures can be used to obtain "Fast Forward" functionality. Furthermore, "Fast Forward" can start and stop at any location in the bitstream, and "Fast Backward" functionality can be obtained similarly.
  • The principle of the Video Redundancy Coding (VRC) method is to divide the sequence of pictures into two or more threads in such a way that all camera pictures are assigned to one of the threads in a round-robin fashion, as sketched below. Each thread is coded independently. At regular intervals, all threads converge into a so-called sync frame, and from this sync frame a new thread series is started. If one of these threads is damaged because of a packet loss, the remaining threads stay intact and can be used to predict the next sync frame.
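  • The thread assignment itself is simple to state in code. The sketch below is a hypothetical illustration of the round-robin rule with periodic sync frames; the parameter names and the choice of two threads are assumptions, not values from the VRC specification.

```python
def assign_vrc_threads(num_pictures, num_threads=2, sync_interval=10):
    """Assign picture indices to VRC threads in round-robin fashion."""
    threads = [[] for _ in range(num_threads)]
    for n in range(num_pictures):
        if n % sync_interval == 0:
            for t in threads:      # sync frame: all threads converge here
                t.append(n)
        else:
            threads[n % num_threads].append(n)
    return threads

print(assign_vrc_threads(8, sync_interval=4))
# [[0, 2, 4, 6], [0, 1, 3, 4, 5, 7]]
```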
  • An SP-frame comprises blocks encoded using only spatial correlation among the pixels (intra blocks) and blocks encoded using both spatial and temporal correlation (inter blocks).
  • For each inter block, the prediction of the block, P(x, y), is formed using the received motion vectors and a reference frame; the transform coefficients of this prediction are denoted c_pred.
  • The quantized values of c_pred are denoted I_pred, and the dequantized values of I_pred are denoted d_pred.
  • Quantized coefficients I_err for the prediction error are received by the decoder.
  • The dequantized values of these coefficients are denoted d_err.
  • Each pixel S(x, y) in the inter block is decoded as a weighted sum of the basis functions f_ij(x, y), where the weight values d_rec are called dequantized reconstruction image coefficients.
  • The values of d_rec have to be such that coefficients c_rec exist from which d_rec can be obtained by quantization followed by dequantization.
  • The values S(x, y) can be further normalized and filtered.
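  • The defining constraint, that d_rec must be reachable by quantization followed by dequantization, can be shown with a uniform quantizer standing in for the standard's Q and Q⁻¹. This is a sketch of the principle only; the patent's exact conditions on d_rec are not reproduced here.

```python
import numpy as np

def sp_reconstruct_coefficients(d_pred, d_err, qp):
    """Force the reconstruction coefficients onto the quantizer's grid.

    c_rec = d_pred + d_err is quantized and dequantized, so the returned
    d_rec is exactly representable: some quantized level i_rec reproduces it.
    """
    c_rec = d_pred + d_err
    i_rec = np.round(c_rec / qp)   # quantized reconstruction image coefficients
    d_rec = i_rec * qp             # lies on the reconstruction grid by construction
    return i_rec, d_rec
```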
  • How to utilize SP-frames to switch between different bitstreams is illustrated in FIG. 5.
  • SP-pictures should be placed at the locations at which one wants to allow switching from one bitstream to another (pictures S₁ (513) and S₂ (523) in FIG. 5).
  • When switching from bitstream 1 (510) to bitstream 2 (520), another picture of this type is transmitted: picture S₁₂ (550) is transmitted instead of S₂ (523).
  • Pictures S₂ (523) and S₁₂ (550) in FIG. 5 are represented by different bitstreams; however, their reconstructed values are identical.
  • The coefficients d_rec are calculated so as to guarantee this, as described below.
  • There are two types of SP-frames: those placed within a bitstream, e.g. S₁ (513) and S₂ (523) in FIG. 5, and those, such as S₁₂ in FIG. 5, that are sent when switching between bitstreams (from bitstream 1 to bitstream 2).
  • The encodings of S₂ (523) and S₁₂ (550) are such that their reconstructed frames are identical although they use different reference frames, as described below.
  • Motion vectors and the quantized prediction error coefficients are encoded using VLC and the corresponding bitstream is transmitted to the decoder.
  • The encoding of S₁₂ (550) follows the same procedure as the encoding of S₂ (523), with the following exceptions:
  • The first difference is that the reference frames are the reconstructed frames obtained from decoding bitstream 1 up to the current frame.
  • The updated quantized prediction error coefficients and the motion vectors are transmitted to the decoder.
  • The resulting quantized reconstruction coefficients I_rec of S₁₂ and of S₂ are identical, i.e., S₂ and S₁₂ are identical.
  • Thus, although S₁₂ (550) and S₂ (523) have different reference frames, they have identical reconstruction values.
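  • The reason the two reconstructions coincide can be seen in the quantized coefficient domain. In the sketch below, the error coefficients for S₁₂ are chosen so that the decoder, after adding back its own quantized prediction, lands exactly on the quantized reconstruction of S₂. A uniform quantizer again stands in for the standard's, and all names are illustrative.

```python
import numpy as np

def switching_sp_coefficients(i_rec_s2, c_pred_from_bs1, qp):
    """Compute the quantized error coefficients transmitted in S12.

    i_rec_s2: quantized reconstruction coefficients of S2;
    c_pred_from_bs1: transform coefficients of the prediction formed
    from bitstream 1's reference frames.
    """
    i_pred = np.round(c_pred_from_bs1 / qp)   # quantized prediction coefficients
    i_err_12 = i_rec_s2 - i_pred              # difference in the quantized domain
    # The decoder adds its own i_pred back and recovers i_rec_s2 exactly,
    # whatever reference frames it started from: no mismatch, no drift.
    assert np.array_equal(i_pred + i_err_12, i_rec_s2)
    return i_err_12
```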
  • An SP-picture has the same syntax as a P-picture; however, the interpretation of some syntax elements differs for Inter- and Copy-type macroblocks.
  • Macroblocks of type "Inter" are reconstructed as follows:
  • The transform/inverse transform may be performed both vertically and horizontally, in the same manner as in H.263.
  • DC0, DC1, DC2 and DC3 are the DC coefficients of the 2×2 chroma DC block, computed from the DC coefficients DCC(i, j) of the four 4×4 chroma blocks:
  • DC0 = (DCC(0,0) + DCC(1,0) + DCC(0,1) + DCC(1,1))/2
  • DC1 = (DCC(0,0) − DCC(1,0) + DCC(0,1) − DCC(1,1))/2
  • DC2 = (DCC(0,0) + DCC(1,0) − DCC(0,1) − DCC(1,1))/2
  • DC3 = (DCC(0,0) − DCC(1,0) − DCC(0,1) + DCC(1,1))/2
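  • In code, the 2×2 chroma DC transform is a four-line butterfly. Note that the fourth coefficient, DC3, is an inference completing the sign pattern of the listing above, and treating the four DCC values as a 2×2 array is an illustrative choice.

```python
import numpy as np

def chroma_dc_transform(dcc):
    """2x2 transform of the chroma DC coefficients DCC(i, j)."""
    a, b = dcc[0, 0], dcc[1, 0]
    c, d = dcc[0, 1], dcc[1, 1]
    return np.array([(a + b + c + d) / 2,   # DC0
                     (a - b + c - d) / 2,   # DC1
                     (a + b - c - d) / 2,   # DC2
                     (a - b - c + d) / 2])  # DC3
```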
  • For each transform coefficient K of the prediction, L_pred = (K × A(QP) + 0.5 × 2^20)/2^20, where A(QP) is defined below in the following example from the art.
  • The quantization/dequantization process may perform 'normal' quantization/dequantization as well as take care of the above transform process, which does not normalize the transform coefficients. 32 different QP values may be used.
  • For luma quantization/dequantization the value QP_luma is used; for chroma, a different value, QP_chroma, derived from QP_luma by a fixed relation, is used.
  • Where QP is used in the following, we mean QP_luma or QP_chroma, depending on which is appropriate.
  • LEVEL = (K × A(QP) + f × 2^20)/2^20, where f is in the range (0, 0.5) and f has the same sign as K.
  • K′ = LEVEL × B(QP)
  • L_pred = (K × A(QP) + 0.5 × 2^20)/2^20
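  • These formulas translate directly into integer arithmetic. In the sketch below, a_table and b_table stand for the standard's A(QP) and B(QP) arrays (32 entries each), whose actual values are not reproduced here; the sign handling implements the requirement that f take the sign of K.

```python
def quantize(k, qp, a_table, f=1.0 / 3):
    """LEVEL = (K * A(QP) + f * 2**20) / 2**20, f in (0, 0.5) taking K's sign."""
    sign = 1 if k >= 0 else -1
    return sign * ((abs(k) * a_table[qp] + int(f * (1 << 20))) >> 20)

def dequantize(level, qp, b_table):
    """K' = LEVEL * B(QP)."""
    return level * b_table[qp]
```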
  • First consider SP-pictures placed within a single bitstream (pictures S₁ (513) and S₂ (523) of FIG. 5).
  • The encoding and decoding of an SP-picture which is transmitted when switching from one bitstream to another (picture S₁₂ (550) in place of picture S₂ (523)) are described further below.
  • The prediction error coefficients for luminance may be obtained as follows:
  • FIG. 9 illustrates a comparison of the coding efficiency of the picture types, namely I-, P- and SP-frames, in terms of PSNR as a function of bitrate for the selected sequences (the Container and Hall sequences). These results are generated by encoding every frame in the sequence with the same picture type, i.e., I, P or SP, except the first frame, which is always an I-frame. As can be observed from FIG. 9, the coding efficiency of SP-pictures is worse than that of P-frames but significantly better than that of I-frames. Although the coding efficiency of each picture type is important, note that SP-frames provide functionalities that are usually achieved only with I-frames.
  • FIG. 10 illustrates the results obtained with the following conditions:
  • The first frame is encoded as an I-picture, frames at fixed intervals (in this case 1 s) are encoded as I- or SP-pictures, and the rest of the frames are encoded as P-pictures.
  • Also included in FIG. 10 is the performance achieved when all frames are encoded as P-frames. In this case none of the functionalities mentioned earlier can be obtained, but it provides a benchmark for comparing both the SP- and I-picture cases. As can be observed from FIG. 10, SP-pictures, while providing the same functionalities as I-pictures, yield significantly better PSNR as a function of bitrate. For example, for the Hall sequence at around 40 kbps, there is a 2-2.5 dB improvement when SP-frames are used instead of I-frames, with a 0.5 dB penalty relative to the all-P-frame benchmark.
  • FIG. 11 demonstrates the performance improvement from using SP-pictures instead of I-pictures for "Fast Forward". The performance achieved by using only P-frames is also included; in this case restarting play is not possible without a mismatch, but it provides another benchmark for comparing the other schemes. As can be observed from FIG. 11, there is a significant improvement with SP-pictures over I-pictures. For the Container sequence at 10 kbps, an improvement of 5.5 dB can be obtained.
  • The coding efficiency of SP-frames is improved by using a quantization value for the predicted frame different from that used for the prediction error coefficients.
  • The changes required in H.26L in order to implement this embodiment of the present invention are described below. Although H.26L is used as an example standard, embodiments of the present invention, and any variations and modifications therefrom, are deemed to be within the spirit and scope of the invention.
  • K_rec = K_pred + L_err × F(QP1)
  • L_rec = (K_rec × A(QP2) + 0.5 × 2^20)/2^20
  • QP1 is given by PQP and QP2 by SPQP.
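  • Combining the two steps gives a one-line dequantizer for SP-frames with separate quantization parameters. Here f_table and a_table stand in for the standard's F(QP) and A(QP) arrays, whose values are not reproduced, and the rounding of negative values is simplified relative to the standard's integer arithmetic.

```python
def sp_dequantize(k_pred, l_err, qp1, qp2, f_table, a_table):
    """K_rec = K_pred + L_err * F(QP1); L_rec = (K_rec * A(QP2) + 0.5 * 2**20) / 2**20.

    qp1 comes from PQP (prediction error), qp2 from SPQP (predicted frame).
    """
    k_rec = k_pred + l_err * f_table[qp1]
    return (k_rec * a_table[qp2] + (1 << 19)) >> 20  # 0.5 * 2**20 == 1 << 19
```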
  • Dequantization of the chroma components is performed in a similar manner, with the following differences.
  • In step 2, an additional 2×2 transform of the DC coefficients is performed after the 4×4 transform.
  • The values of QP1 and QP2 are changed according to the relation between the QP values used for luma and chroma specified above.
  • For quantization of the chroma components, in step 1, an additional 2×2 transform of the DC coefficients is performed after the 4×4 transform.
  • The values of QP1 and QP2 are changed according to the relation between the QP values used for luma and chroma specified above.
  • A novel SP-picture encoding/decoding method which uses different quantization values for the predicted block and for the prediction error coefficients has been provided.
  • The use of two different values of QP allows a trade-off between the coding efficiency of SP-pictures placed within a single bitstream and that of SP-pictures used when switching from one bitstream to another.
  • The lower the value of SPQP with respect to PQP, the higher the coding efficiency of an SP-picture placed within a single bitstream, but the larger the number of bits required when switching to this picture.
  • The choice of the SPQP value can therefore be application dependent.
  • When switching is expected to be infrequent, the SPQP value should be small.
  • When switching is frequent, the SPQP value should be kept close to PQP, since SP-pictures sent during switching from one bitstream to another will then have a large share of the overall bandwidth.

Abstract

A decoder and method for using a new picture or frame type are provided. This type is referred to as an SP-picture. Since temporal redundancies are not exploited in I-frames, the compression efficiency of I-frame coding is significantly lower than that of predictive coding. The method allows the use of motion compensated predictive coding to exploit temporal redundancy in the sequence while still allowing perfect reconstruction of the frame using different reference frames. Methods using this new picture type provide for error resilience/recovery, bandwidth scalability, bitstream switching, processing scalability, random access and other functions.
The SP-type picture provides for, among other functions, switching between different bitstreams, random access, fast forward and fast error-recovery by replacing I-pictures to increase the coding efficiency. As will be demonstrated, SP-pictures have the property that identical SP-frames may be obtained even when they are predicted using different reference frames.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application is related to and claims priority from Provisional Application No. 60/259,529, filed on Jan. 3, 2001, incorporated herein by reference. [0001]
  • BACKGROUND
  • This invention relates generally to the field of the multimedia applications. More particularly, this invention relates to a decoder/decompressor and method for decoding streaming video. [0002]
  • Multimedia applications that include audio and streaming video information have come into greater use. Several multimedia groups have established and proposed standards for compressing/encoding and decompressing/decoding audio and video information. MPEG standards, established by the Moving Picture Experts Group, are the most widely accepted international standards in the field of multimedia applications. The ITU Telecommunication Standardization Sector has developed video coding standards through its Video Coding Experts Group (VCEG). Other standards are JPEG and Motion JPEG, established by the Joint Photographic Experts Group. [0003]
  • The following are incorporated herein by reference: [0004]
  • Gisle Bjontegaard, "H.26L Test Model Long Term Number 5 (TML-5) draft0", document Q15-K-59, ITU-T Video Coding Experts Group (Question 15) Meeting, Oregon, USA, Aug. 22-25, 2000.
  • Keiichi Hibi, "Report of the Ad Hoc Committee on H.26L Development", document Q15-H-07, ITU-T Video Coding Experts Group (Question 15) Meeting, Berlin, Aug. 3-6, 1999.
  • Gary S. Greenbaum, "Remarks on the H.26L Project: Streaming Video Requirements for Next Generation Video Compression Standards", document Q15-G-11, ITU-T Video Coding Experts Group (Question 15) Meeting, Monterey, Feb. 16-19, 1999.
  • G. Bjontegaard, "Recommended Simulation Conditions for H.26L", document Q15-I-62, ITU-T Video Coding Experts Group (Question 15) Meeting, Red Bank, N.J., Oct. 19-22, 1999.
  • G. Bjontegaard, "H.26L Test Model Long Term Number 6 (TML-6) draft0", document VCEG-L45, ITU-T Video Coding Experts Group Meeting, Eibsee, Germany, Jan. 9-12, 2001.
  • Michael Orzessek and Peter Sommer, ATM & MPEG-2: Integrating Digital Video into Broadband Networks, Prentice Hall, Upper Saddle River, N.J. [0005]
  • The purpose of video coding is to remove redundancy in the image sequence so that the encoded data rate is commensurate with the bandwidth available to transport the video sequence, while keeping the distortion between the original and reconstructed images as small as possible. The redundancy in video sequences can be categorized into spatial and temporal redundancy. Spatial redundancy refers to the correlation between neighboring pixels within a frame, while temporal redundancy refers to the correlation between neighboring frames. [0006]
  • For every pixel of an image, color information must be provided. Typically, color information is coded in terms of the primary color components red, green and blue (RGB) or using a related luminance/chrominance model, known as the YUV model. [0007]
  • A typical video codec employs three types of frames: intra frames (I-frames), predicted frames (P-frames) and bi-directional frames (B-frames). An I-frame is coded independently of other frames, exploiting only the spatial correlation of the pixels within the frame. Coding of P-frames exploits both spatial and temporal redundancy between successive frames. Since the objects in a typical video sequence do not change rapidly from one frame to the next, i.e., adjacent frames are highly correlated, higher compression efficiency is achieved using P-frames. The terms frame and picture are used interchangeably in the art. A frame contains all the color and brightness information that is needed to display a picture. A picture is divided into a number of blocks, which are grouped into macroblocks. Each block contains a number of lines, each line holding a number of samples of luminance or chrominance pixel values from a frame. [0008]
  • FIGS. 1 and 2 show multimedia coding using MPEG as an example. FIG. 1A is a diagram of an MPEG audio and video decoder 120 that performs decompression of the video and/or audio data which has been compressed and coded according to the MPEG algorithm. The system decoder 110 reads the encoded MPEG data stream 101, containing interlaced compressed video and/or audio data, and generates the necessary timing information: the Video Presentation Time Stamp (VPTS) 104; the System Clock Reference (SCR) 105, also referred to as the system time clock (STC); the Audio Presentation Time Stamp (APTS) 106; and the separated video encoded bitstream 102 and audio encoded bitstream 103. The video decoder 111 decompresses the video data stream 102 and generates a decompressed video signal 107. The audio decoder 112 decompresses the audio data stream 103 and generates the decompressed audio signal 108. The decompressed video signal 107 is coupled to a display unit, while the decompressed audio signal 108 is coupled to an audio speaker or other audio generation means. [0009]
  • The MPEG encoded/compressed data stream may contain a plurality of encoded/compressed video data packets or blocks and a plurality of encoded/compressed audio data packets or blocks. An MPEG encoder encodes/compresses the video packets based on video frames, also referred to as pictures. These pictures or frames are source or reconstructed image data consisting of three rectangular matrices of multiple-bit numbers representing the luminance and chrominance signals. For example, H.263+ uses four luminance blocks and two chrominance blocks of 8×8 pixels each. FIGS. 2A-2C illustrate the types of encoded/compressed video frames commonly utilized in the MPEG standard. FIG. 2A depicts an Intra-frame or I-type frame 200. The I-type frame or picture is a frame of video data that is coded without using information from the past or the future and is utilized as the basis for decoding/decompression of other frame types. FIG. 2B is a representation of a Predictive-frame or P-type frame 210. The P-type frame or picture is a frame that is encoded/compressed using motion compensated prediction from an I-type or P-type frame in its past, in this case I₁ (200). That is, a previous frame is used to encode/compress the present frame of video data. Reference numeral 205a represents the motion compensated prediction information used to create the P-type frame 210. FIG. 2C depicts a Bi-directional-frame or B-type frame 220. The B-type frame or picture is a frame that is encoded/compressed using motion compensated prediction derived from the I-type or P-type reference frame in its past (200 in this example), from the I-type or P-type reference frame in its future (210 in this example), or from a combination of both. B-type frames are usually inserted between I-type and P-type frames. FIG. 2D represents a group of pictures in what is called display order: I₁ B₂ B₃ P₄ B₅ P₆. FIG. 2D illustrates the B-type frames inserted between I-type and P-type frames and the direction in which motion compensation information flows. [0010]
  • Motion compensation refers to using motion vectors from one frame to the next to improve the efficiency of predicting pixel values for encoding/compression and decoding/decompression. The method of prediction uses the motion vectors to provide offset values and error data that refer to a past or a future frame of video data having decoded pixel values that may be used with the error data to compress/encode or decompress/decode a given frame of video data. [0011]
  • The capability to decode/decompress P-type frames requires the availability of the previous I-type or P-type reference frame, and a B-type frame requires the availability of the subsequent I-type or P-type reference frame. For example, consider an encoded/compressed data stream with the following frame sequence or display order: [0012]
  • I₁ B₂ B₃ P₄ B₅ P₆ B₇ P₈ B₉ B₁₀ P₁₁ … Pₙ₋₃ Bₙ₋₂ Pₙ₋₁ Iₙ. [0013]
  • The decoding order for the given display order is: [0014]
  • I₁ P₄ B₂ B₃ P₆ B₅ P₈ B₇ P₁₁ B₉ B₁₀ … Pₙ₋₁ Bₙ₋₂ Iₙ. [0015]
  • The decoding order differs from the display order because the B-type frames need future I-type or P-type frames to be decoded. P-type frames require that the previous I-type reference frame be available. For example, P₄ requires I₁ to be decoded, so the encoded/compressed I₁ frame must be available. Similarly, the frame P₆ requires that P₄ be available in order to decode/decompress frame P₆. B-type frames, such as frame B₃, require past and future I-type or P-type reference frames, such as I₁ and P₄, in order to be decoded. B-type frames are frames inserted between I-type frames, P-type frames, or a combination thereof during encoding, and are not necessary for faithful reproduction of an image. The frames before an I-type frame, such as Pₙ₋₁ in the example, are not needed to decode an I-type frame, and no future frames require Pₙ₋₁ in order to be decoded/decompressed. [0016]
  • One problem with decoding is that the display process may be slower than the decoding process. For example, a 240×16 picture requires 3072 clock cycles to decode (76.8 μs at 40 MHz), while it takes 200 μs to display 16 lines of video data at a 75 Hz refresh rate (13 μs × 16 ≈ 200 μs). The video frames are buffered before being displayed. There is usually a one-frame delay between decoding and display, and this difference leads to a condition known as tearing: the frame being displayed is overwritten by the newly decoded frame. [0017]
  • FIG. 1B depicts tearing. A decoded/decompressed frame 132 of data representing the image of a closed door 133 is currently stored in a buffer 135. This decoded/decompressed frame is currently being displayed on display unit 140. During this display period, another decoded/decompressed frame 130, with data representing the image of an open door 131, is stored in buffer 135. The display unit 140 will then start displaying using information from the new frame now stored in buffer 135. The result is a partial display of the first stored image 141 and a partial display of the new stored image 142. [0018]
  • Video streaming has emerged as one of the essential applications over the fixed Internet and, in the near future, over 3G multimedia networks. In streaming applications, the server starts sending the pre-encoded video bitstream to the receiver upon a request from the receiver, which plays the stream as it is received, with little or no delay. The problem with video streaming is that the best-effort nature of today's networks causes variations in the effective bandwidth available to a user as network conditions change. The server should then scale the bitrate of the compressed video to accommodate these variations. In the case of conversational services, which are characterized by real-time encoding and point-to-point delivery, this is achieved by adjusting, on the fly, the source encoding parameters, such as the quantization parameter or frame rate, based on network feedback. In typical streaming scenarios, where an already encoded video bitstream is to be streamed to the client, this solution cannot be applied, and a situation similar to the tearing described above would occur. [0019]
  • Thus, there is a need to provide bandwidth scalability which will allow a server to dynamically switch between the streams of encoded video in order to accommodate variations of the bandwidth available to the client. [0020]
  • The above-mentioned references are exemplary only and are not meant to be limiting in respect to the resources and/or technologies available to those skilled in the art. [0021]
  • SUMMARY
• A new picture or frame type and a method of using the same are provided. This novel frame type is referred to as an SP-picture. Because temporal redundancies are not exploited in I-frames, the compression efficiency of I-frame coding is significantly lower than that of predictive coding. The proposed method allows the use of motion-compensated predictive coding to exploit temporal redundancy in the sequence while still allowing perfect reconstruction of the frame using different reference frames. This new picture type provides for error resilience/recovery, bandwidth scalability, bitstream switching, processing scalability, random access and other functions. [0022]
• The SP-type picture provides for, among other functions, switching between different bitstreams, random access, fast forward and fast error recovery; by replacing I-pictures, it increases coding efficiency. As will be demonstrated, SP-pictures have the property that identical SP-frames may be obtained even when they are predicted using different reference frames. [0023]
  • These and other features, aspects, and advantages of embodiments of the present invention will become apparent with reference to the following description in conjunction with the accompanying drawings. It is to be understood, however, that the drawings are designed solely for the purposes of illustration and not as a definition of the limits of the invention, for which reference should be made to the appended claims. [0024]
  • BRIEF DESCRIPTIONS OF THE DRAWINGS
  • FIG. 1A is a prior art block diagram of an MPEG decoding system. [0025]
  • FIG. 1B is a drawing showing the problem of tearing within prior art devices where frame data from two different frames stored consecutively in frame buffer memory is displayed on a display device. [0026]
• FIGS. 2A-2D are diagrams showing the prior art encoding/compression of video frames. [0027]
  • FIG. 3 is a block diagram of a generic motion-compensated predictive video coding system (encoder). [0028]
  • FIG. 4 is a block diagram of a generic motion-compensated predictive video coding system (decoder). [0029]
• FIG. 5 is an illustration showing switching between bitstreams 1 and 2 using SP-pictures. [0030]
  • FIG. 6 is a block diagram of a decoder in accordance with the preferred embodiment of the invention. [0031]
  • FIG. 7 is an illustration of random access using SP-pictures. [0032]
  • FIG. 8 is an illustration of a fast-forward process using SP-pictures. [0033]
  • FIG. 9 is a set of graphs comparing coding efficiencies of I, P, and SP-pictures. [0034]
• FIG. 10 is a set of graphs comparing the performance of SP-pictures and I-pictures when used at fixed one-second intervals. [0035]
• FIG. 11 is a set of graphs comparing the performance of SP-pictures and I-pictures in a fast-forward application, at one-second intervals. [0036]
  • DETAILED DESCRIPTION
• The simplest way of achieving bandwidth scalability in the case of pre-encoded sequences is to produce multiple independent streams of different bandwidth and quality. The server dynamically switches between the streams to accommodate variations in the bandwidth available to the client. [0037]
• Now assume that we have multiple bitstreams, corresponding to the same video sequence, generated independently with different encoding parameters, such as the quantization parameter. Since the encoding parameters differ for each bitstream, the reconstructed frames of the different bitstreams at the same time instant will not be the same. Therefore, switching between bitstreams, i.e., starting to decode a different bitstream, at an arbitrary location would lead to visual artifacts due to the mismatch between the reference frames used to obtain the predicted frame P(x, y). Furthermore, the visual artifacts would not be confined to the switched frame but would propagate in time due to motion-compensated coding. [0038]
  • The main observation is that perfect (mismatch-free) switching between bitstreams, in the current standards, is possible only at the positions where the future frames/regions do not use any information previous to the current switching location and the information at the current location is made available to the client. [0039]
• In prior art solutions, the approach adopted is to insert periodic I-frames during encoding and to allow switching only at these I-frames. Since I-frames are reconstructed independently of previous frames, switching at these frames does not cause any mismatch. However, I-frames require many more bits than predicted frames; thus a new type of picture or frame is needed, as are an architecture and methods of using said new frame type. [0040]
• A new decoder architecture is provided which has the property that identical frames may be obtained even when they are predicted using different reference frames. The picture type obtained using this structure will be called an SP-frame; it may also be referred to as an SP-picture. [0041]
• A system for P-frame encoding and decoding is provided and is shown in FIGS. 3 and 4. Referring to FIGS. 3 and 4, a communication system comprising an encoder 300 of FIG. 3 and a decoder 400 of FIG. 4 is operable to communicate a multimedia sequence between a sequence generator and a sequence receiver. Other elements of the video sequence generator and receiver are not shown for purposes of simplicity. The communication path between the sequence generator and receiver may take various forms, including but not limited to a radio link. [0042]
• Encoder 300 is shown in FIG. 3 coupled to receive video input on line 301 in the form of a frame to be encoded, I(x, y), called the current frame. The video input may be provided to a motion estimation and coding block 370 through line 305 and to an input of a subtractor 307. The motion estimation and coding block 370 may also be coupled to frame memory 350 to receive a previously coded and transmitted frame R(x, y), called the reference frame. The motion estimation and coding block 370 may also be coupled to multiplexor 380 to provide motion information for the bitstream. [0043]
• The current frame I(x, y) is partitioned into rectangular regions of M×N pixels; (x, y) denotes the location of a pixel within the frame. These blocks may be encoded using either only spatial correlation (intra-coded blocks) or both spatial and temporal correlation (inter-coded blocks). In what follows, we concentrate on inter blocks. Each inter-coded block may be predicted from one of the previously coded and transmitted reference frames, which at a given instant is available in the Frame Memory 350 of the encoder 300 and in the Frame Memory 450 of the decoder 400 in FIG. 4. The frame memory 350 is coupled to a Motion Compensated (MC) prediction block 360. The MC prediction block 360 is operable to generate a prediction frame P(x, y), which is provided to an input of subtractor 307 and adder 345. MC prediction block 360 is also coupled to the motion estimation and coding block 370 to receive motion information. [0044]
• The prediction information may be represented by a two-dimensional motion vector (Δx, Δy), where Δx is the horizontal and Δy the vertical displacement of the pixels between the current frame and the reference frame. The motion estimation and coding block 370 calculates the motion vectors (Δx, Δy). In the Motion Compensated (MC) Prediction block 360, the motion vectors together with the reference frame are used to construct the prediction frame P(x, y): [0045]
  • P(x, y)=R(x+Δx, y+Δy).
  • Subsequently, the prediction error E(x, y), i.e., the difference between the current frame and the prediction frame P(x, y) is calculated by:[0046]
  • E(x, y)=I(x, y)−P(x, y).
• In transform block 310, the prediction error for each K×L block is represented as a weighted sum of transform basis functions f.sub.ij(x, y): [0047]
• E(x, y) = Σ (i=1..K) Σ (j=1..L) c.sub.err(i, j) · f.sub.ij(x, y)
• The weights c.sub.err(i, j) corresponding to the basis functions are called transform coefficients. These coefficients are subsequently quantized in quantization block 320: [0048]
• I.sub.err(i, j) = Q(c.sub.err(i, j), QP)
• where I.sub.err(i, j) are the quantized coefficients. The operation of quantization introduces a loss of information: the quantized coefficients can be represented with a smaller number of bits. The level of compression (loss of information) is controlled by adjusting the value of the quantization parameter (QP). [0049]
• The quantization block 320 is coupled to both a multiplexor 380 and an inverse quantization block 330, which in turn is coupled to an inverse transform block 340. Blocks 330 and 340 provide the prediction error, which is added to the MC-predicted frame P(x, y) by adder 345, and the result is stored in frame memory 350. [0050]
• Motion vectors and quantized coefficients are further encoded using Variable Length Codes (VLC), which further reduce the number of bits needed for their representation. The encoded motion vectors and quantized coefficients, as well as other additional information needed to represent each coded frame of the image sequence, constitute a bitstream 415 which is transmitted to the decoder 400 of FIG. 4. The bitstream may be multiplexed by multiplexor 380 before transmission. [0051]
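• The encoder path just described (motion-compensated prediction, differencing, transform, quantization, and local reconstruction into frame memory) can be summarized in code. The following is a minimal sketch, assuming whole-pel motion vectors, a 4×4 orthonormal DCT as the transform and a uniform scalar quantizer driven by QP; all names are illustrative, not part of any standard codec API.

```python
import numpy as np

N = 4
# 4x4 orthonormal DCT-II matrix, a stand-in for the basis functions f_ij
DCT = np.array([[np.sqrt((1 if u == 0 else 2) / N) *
                 np.cos((2 * x + 1) * u * np.pi / (2 * N))
                 for x in range(N)] for u in range(N)])

def transform(block):            # c = T b T^t
    return DCT @ block @ DCT.T

def inverse_transform(coeff):    # b = T^t c T
    return DCT.T @ coeff @ DCT

def quantize(coeff, qp):         # I_err = Q(c_err, QP)
    return np.round(coeff / qp).astype(int)

def dequantize(levels, qp):      # d_err = Q^-1(I_err, QP)
    return levels * qp

def encode_inter_block(cur_block, ref, x0, y0, mv, qp):
    dx, dy = mv                                            # (Δx, Δy)
    pred = ref[y0 + dy:y0 + dy + N, x0 + dx:x0 + dx + N]   # P = R(x+Δx, y+Δy)
    err = cur_block - pred                                 # E = I - P
    levels = quantize(transform(err), qp)                  # sent with the MV
    # Encoder-side reconstruction, stored in frame memory for prediction
    recon = pred + inverse_transform(dequantize(levels, qp))
    return levels, recon
```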
• FIG. 4 shows the decoder 400 of the communication system. Bitstream 415 is received from encoder 300 of FIG. 3 and demultiplexed by demultiplexor 410. The dequantized coefficients d.sub.err(i, j) are calculated in the inverse quantization block 420: [0052]
• d.sub.err(i, j) = Q.sup.-1(I.sub.err(i, j), QP)
• In inverse transform block 430, the dequantized coefficients are used to obtain the compressed prediction error: [0053]
• E.sub.c(x, y) = Σ (i=1..K) Σ (j=1..L) d.sub.err(i, j) · f.sub.ij(x, y)
• The pixels of the current coded frame are reconstructed by finding the prediction pixels in the reference frame R(x,y) using the received motion vectors and then adding the compressed prediction error in adder 435, resulting in the decoded video: [0054]
  • I.sub.c(x, y)=R(x+Δx, y+Δy)+E.sub.c(x, y).
  • These values can be further normalized and filtered. [0055]
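• The decoder counterpart of FIG. 4, under the same assumptions and reusing the illustrative helpers from the encoder sketch above: the received levels are dequantized and inverse-transformed into the compressed prediction error E.sub.c, which is added to the motion-compensated prediction taken from the reference frame in frame memory.

```python
def decode_inter_block(levels, ref, x0, y0, mv, qp):
    dx, dy = mv
    pred = ref[y0 + dy:y0 + dy + N, x0 + dx:x0 + dx + N]  # P(x, y)
    e_c = inverse_transform(dequantize(levels, qp))       # E_c(x, y)
    return pred + e_c                                     # I_c = P + E_c
```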
• One of the key requirements for video streaming is the ability to scale the transmission bitrate of the compressed video according to changing network conditions. In the case of conversational services that are characterized by real-time encoding and point-to-point delivery, this is achieved by adjusting, on the fly, the source encoding parameters, such as the quantization parameter or frame rate, based on network feedback. In typical streaming scenarios, when an already encoded video sequence is to be streamed to the client, the above solution cannot be applied. [0056]
• The simplest way of achieving bandwidth scalability in the case of pre-encoded sequences is to produce multiple independent streams of different bandwidth and quality. The server dynamically switches between the streams to accommodate variations in the bandwidth available to the client. Since the encoding algorithms employ motion-compensated prediction, switching between bitstreams at arbitrary P-type pictures, although possible, would lead to visual artifacts due to the mismatch between the reconstructed frames of the different bitstreams at the same time instant. The visual artifacts would further propagate in time. [0057]
• In the current video encoding standards, perfect (mismatch-free) switching between bitstreams is possible only at the positions where the future frames/regions do not use any information prior to the current switching location, i.e., at I-frames. Furthermore, by placing I-frames at fixed (e.g., 1 sec) intervals, VCR functionalities, such as random access or “Fast Forward” and “Fast Backward” (increased playback rate) for streaming video content, are achieved. A user may skip a portion of video and restart playing at any I-frame location. Similarly, increased playback rate, i.e., fast-forwarding, can be achieved by transmitting only I-pictures. [0058]
• It is, however, well known that I-frames require many more bits than motion-compensated predicted frames. An embodiment of the present invention provides a novel picture type, the SP-picture, which allows switching from one bitstream to another and enables VCR-like functionalities without introducing any mismatch, while still utilizing motion-compensated prediction. One of the properties of SP-pictures is that identical SP-frames may be obtained even when different reference frames are used. [0059]
  • Bitstream Switching [0060]
• An example of how to utilize SP-frames to switch between different bitstreams is illustrated in FIG. 5. FIG. 5 shows two bitstreams corresponding to the same sequence encoded at different bitrates: bitstream 1 510 and bitstream 2 520. Within each encoded bitstream, SP-pictures may be placed at the locations at which one wants to allow switching from one bitstream to another (pictures S.sub.1 513 and S.sub.2 523). When switching from bitstream 1 to bitstream 2, another SP-picture is transmitted; this is shown in FIG. 5 as picture S.sub.12 550. Pictures S.sub.2 523 and S.sub.12 550 are represented by different bitstreams; S.sub.2 uses previously reconstructed frames from bitstream 2 as its reference frames while S.sub.12 uses previously reconstructed frames from bitstream 1, yet their reconstructed values are identical. [0061]
  • Random Access [0062]
• Application of SP-pictures to enable random access is depicted in FIG. 7. SP-pictures are placed at fixed intervals within bitstream 1 (720) (e.g., picture S.sub.1 (730)), which is being streamed to the client. To each one of these SP-pictures there corresponds a pair of pictures generated and stored as another bitstream (bitstream 2 (740)): [0063]
• an I-picture, I.sub.2 (750), at the temporal location preceding the SP-picture; and [0064]
• an SP-picture, S.sub.2 (710), at the same temporal location as the SP-picture in bitstream 1. [0065]
• Bitstream 1 (720) may then be accessed at any location corresponding to an I-picture in bitstream 2 (740). For example, to access bitstream 1 at frame I.sub.2, the pictures I.sub.2 and S.sub.2 from bitstream 2 are transmitted first, and then the following pictures from bitstream 1 are transmitted. [0066]
  • Fast-forward [0067]
• FIG. 8 is an illustration of a fast-forward process using SP-pictures. If bitstream 2 consists only of SP-pictures predicted from each other but at larger temporal intervals (e.g., 1 sec), as illustrated, SP-pictures can be used to obtain “Fast Forward” functionality. Furthermore, “Fast Forward” can start and stop at any location in the bitstream. Similarly, “Fast Backward” functionality can be obtained. [0068]
  • Video Redundancy Coding [0069]
• SP-pictures have other uses in applications in which they do not act as replacements of I-pictures. Video Redundancy Coding (VRC) can be given as an example. “The principle of the VRC method is to divide the sequence of pictures into two or more threads in such a way that all camera pictures are assigned to one of the threads in a round-robin fashion. Each thread is coded independently. In regular intervals, all threads converge into a so-called sync frame. From this sync frame, a new thread series is started. If one of these threads is damaged because of a packet loss, the remaining threads stay intact and can be used to predict the next sync frame. It is possible to continue the decoding of the damaged thread, which leads to slight picture degradation, or to stop its decoding which leads to a drop of the frame rate. Sync frames are always predicted out of one of the undamaged threads. This means that the number of transmitted I-pictures can be kept small, because there is no need for complete re-synchronization.” For the sync frame, more than one representation (P-picture) is sent, each one using a reference picture from a different thread. Due to the usage of P-pictures, these representations are not identical. Therefore, mismatch is introduced when some of the representations cannot be decoded and their counterparts are used when decoding the following threads. The usage of SP-pictures as sync frames eliminates this problem. [0070]
  • Error Resiliency/Recovery [0071]
• Multiple representations of a single frame in the form of SP-frames predicted from different reference pictures, e.g., the immediately preceding reconstructed frame and a reconstructed frame further back in time, can be used to increase error resilience. Consider the case in which an already encoded bitstream is being streamed and a packet loss has led to a frame loss. The client signals the lost frame(s) to the sender, which responds by sending the next SP-frame in the representation that uses only frames already received by the client. [0072]
• We have described the application of SP-pictures in different application/functionality scenarios. Note that the bitstreams in the applications discussed above could have different bitrates, frame sizes and frame rates. Depending on the client's available bandwidth and decoding and viewing capabilities, the appropriate streams can be streamed; moreover, the streams can be changed dynamically to accommodate any changes in these conditions. In the following, we provide a detailed description of SP-picture encoding/decoding within the context of H.26L. [0073]
• An SP-frame comprises blocks encoded using only spatial correlation among the pixels (intra blocks) and blocks encoded using both spatial and temporal correlation (inter blocks). [0074]
  • For each inter block: [0075]
  • The prediction of this block, P(x,y), is formed using received motion vectors and a reference frame. [0076]
• The transform coefficients c.sub.pred for P(x,y) corresponding to the basis functions f.sub.ij(x,y) are calculated. [0077]
  • The quantized values of c.sub.pred are denoted as I.sub.pred and the dequantized values of I.sub.pred are denoted as d.sub.pred. [0078]
• Quantized coefficients I.sub.err for the prediction error are received by the decoder. The dequantized values of these coefficients will be denoted d.sub.err. [0079]
• The value of each pixel S(x,y) in the inter block is decoded as a weighted sum of the basis functions f.sub.ij(x,y), where the weight values d.sub.rec will be called dequantized reconstruction image coefficients. The values of d.sub.rec have to be such that coefficients c.sub.rec exist from which d.sub.rec can be obtained by quantization followed by dequantization. In addition, the values d.sub.rec have to fulfill one of the following conditions: [0080]
• d.sub.rec = d.sub.pred + d.sub.err; or
• c.sub.rec = c.sub.pred + d.sub.err.
  • Values S(x,y) can be further normalized and filtered. [0081]
• How to utilize SP-frames to switch between different bitstreams is explained in FIG. 5. Within each encoded bitstream, SP-pictures should be placed at locations at which one wants to allow switching from one bitstream to another (pictures S.sub.1 (513) and S.sub.2 (523) in FIG. 5). When switching from bitstream 1 (510) to bitstream 2 (520), another picture of this type will be transmitted (in FIG. 5, picture S.sub.12 (550) will be transmitted instead of S.sub.2 (523)). Pictures S.sub.2 (523) and S.sub.12 (550) in FIG. 5 are represented by different bitstreams; however, their reconstructed values are identical. [0082]
• The invention is described in view of certain embodiments. Variations and modifications are deemed to be within the spirit and scope of the invention. The following describes the preferred implementation of the invention as illustrated in FIG. 6. [0083]
  • In the preferred mode of implementation coefficients d.sub.rec are calculated as follows: [0084]
  • Form prediction of current block, P(x,y), using received motion vectors and the reference frame. [0085]
• Calculate transform coefficients c.sub.pred for P(x,y) corresponding to basis functions f.sub.ij(x,y) (Transform block 660). Quantize these coefficients (Quantization block 670). The quantized values will be referred to as quantized prediction image coefficients and denoted as I.sub.pred. [0086]
• Obtain quantized reconstruction image coefficients I.sub.rec by adding the received quantized coefficients for the prediction error, I.sub.err, to I.sub.pred, i.e., I.sub.rec = I.sub.pred + I.sub.err. [0087]
• Dequantize I.sub.rec. The dequantized coefficients, the output of the Inverse Quantization block, are equal to d.sub.rec. [0088]
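• A sketch of this preferred d.sub.rec computation, again reusing the illustrative helpers from the encoder sketch above. The essential difference from the P-frame decoder is that the prediction itself is transformed and quantized, the received error levels are added in the quantized (level) domain, and only the sum is dequantized before the inverse transform.

```python
def sp_reconstruct_block(l_err, ref, x0, y0, mv, qp):
    dx, dy = mv
    pred = ref[y0 + dy:y0 + dy + N, x0 + dx:x0 + dx + N]  # P(x, y)
    l_pred = quantize(transform(pred), qp)  # I_pred, quantized prediction
    l_rec = l_pred + l_err                  # I_rec = I_pred + I_err
    d_rec = dequantize(l_rec, qp)           # dequantized reconstruction
    return inverse_transform(d_rec)         # S(x, y), before filtering
```

• Because both the prediction and the error live in the same quantized level domain before dequantization, the reconstruction is an exact integer sum; this is what allows two SP-pictures with different reference frames to reconstruct identically.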
  • In the following, we describe the encoding of SP-frames for the decoder structure described as the preferred embodiment of the invention. [0089]
• As can be observed from FIG. 5, there are two types of SP-frames: the SP-frames placed within a bitstream, e.g., S.sub.1 (513) and S.sub.2 (523) in FIG. 5, and the SP-frames (S.sub.12 in FIG. 5) that are sent when there is a switch between bitstreams (from bitstream 1 to bitstream 2). The encodings of S.sub.2 (523) and S.sub.12 (550) are such that their reconstructed frames are identical although they use different reference frames, as described below. [0090]
• First, we describe the encoding of SP-frames placed within the bitstream, e.g., S.sub.1 (513) and S.sub.2 (523) in FIG. 5. The original frame is partitioned into blocks and each inter-coded block is predicted from one of the earlier reconstructed frames. The prediction frame is formed as described above. The transform coefficients c.sub.orig and c.sub.pred are calculated for the original frame I(x,y) and the prediction frame P(x,y), respectively. c.sub.orig and c.sub.pred are quantized to obtain I.sub.orig and I.sub.pred, respectively. The quantized prediction error coefficients I.sub.err are then obtained by subtracting I.sub.pred from I.sub.orig, i.e., I.sub.err = I.sub.orig − I.sub.pred. The motion vectors and the quantized prediction error coefficients are encoded using VLC and the corresponding bitstream is transmitted to the decoder. [0091]
  • Let I[0092] 2.sub.err and I2.sub.pred denote the quantized coefficients of the prediction error and the prediction frame, respectively, obtained from encoding of S.sub.2. With the procedure described above. Note that in this case, quantized reconstruction image coefficients are given by I2.sub.rec=I2.sub.err−I2.sub.pred. Assume that there will be a switch from bitstream 1 (510) to bitstream 2 (520) at S.sub.2 (523). The encoding of S.sub.12 (550) follows the same procedures as in the encoding of S.sub.2 (523) with the following exceptions: The first difference is that the reference frames are the reconstructed frames obtained from the decoding of the bitstream 1 up to the current frame. Secondly, the quantized prediction error coefficients are calculated as follows: 112.sub.err=I2.sub.rec−I12.sub.pred where and I12.pred denotes the quantized prediction image coefficients. The updated quantized prediction error coefficients and the motion vectors are transmitted to the decoder.
• When decoding frame S.sub.12 (550), using the reconstructed frames from bitstream 1 before the switch, the coefficients I12.sub.pred are constructed and added to the received quantized prediction error coefficients I12.sub.err as described above, i.e., I12.sub.rec = I12.sub.err + I12.sub.pred = (I2.sub.rec − I12.sub.pred) + I12.sub.pred = I2.sub.rec. Note that I12.sub.rec and I2.sub.rec are identical, i.e., S.sub.2 and S.sub.12 reconstruct identically. In summary, although S.sub.12 (550) and S.sub.2 (523) have different reference frames, they have identical reconstruction values. [0093]
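• A small numerical check of this identity, under the illustrative helpers above. The level matrices are arbitrary stand-ins: whatever prediction levels I12.sub.pred arise from bitstream 1's reference frames, forming the transmitted error as I2.sub.rec − I12.sub.pred makes the decoded levels match I2.sub.rec exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
i2_rec = rng.integers(-8, 8, (4, 4))     # reconstruction levels of S2
i12_pred = rng.integers(-8, 8, (4, 4))   # prediction levels for S12

i12_err = i2_rec - i12_pred              # levels transmitted with S12
i12_rec = i12_err + i12_pred             # decoder-side addition
assert np.array_equal(i12_rec, i2_rec)   # identical reconstructions
```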
• The changes required in H.26L in order to implement this embodiment of the present invention are described below. Although H.26L is used as an example standard, embodiments of the present invention and any variations and modifications therefrom are deemed to be within the spirit and scope of the invention. [0094]
  • SP-Picture Decoding [0095]
• An additional picture type, Ptype Code_number=5, is added to H.26L for signaling an SP-picture: [0096]
• An SP-picture has the same syntax as a P-picture. However, the interpretation of some of the syntax elements differs for Inter and Copy type macroblocks. The macroblocks with type “Inter” are reconstructed as follows: [0097]
  • 1. Decode levels (both a magnitude and a sign) of the prediction error coefficients, L.sub.err, and motion vectors for the macroblock. [0098]
  • 2. After motion compensation, for each 4×4 block in the predicted macroblock, perform forward transform. [0099]
• An example of a forward transform is provided by Gisle Bjontegaard, “H.26L Test Model Long Term Number 5 (TML-5) draft0”, document Q15-K-59, ITU-T Video Coding Experts Group (Question 15) Meeting, Oregon, USA, Aug. 22-25, 2000. Instead of the DCT, an integer transform with basically the same coding property as a 4×4 DCT is used. The transformation of four pixels a, b, c, d into four transform coefficients A, B, C, D may be defined by: [0100]
  • A=13a+13b+13c+13d
  • B=17a+7b−7c−17d
  • C=13a−13b−13c+13d
  • D=7a−17b+17c−7d
  • The inverse transformation of transform coefficients A,B,C,D into 4 pixels a′,b′,c′,d′ is defined by:[0101]
  • a′=13A+17B+13C+7D
  • b′=13A+7B−13C−17D
  • c′=13A−7B−13C+17D
  • d′=13A−17B+13C−7D
• Because the expressions above are not normalized, a′ = 676a. Normalization may be performed in the quantization/dequantization process, together with a final shift after inverse quantization. [0102]
  • The transform/inverse transform may be performed both vertically and horizontally in the same manner as in H.263. [0103]
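• The quoted 4-point transform, applied separably, can be checked directly. The sketch below (illustrative code, not TML source) confirms the unnormalized gain noted above: one pass scales by 676, so a full 2-D round trip scales by 676.sup.2, which the quantization stage and the final shift later remove.

```python
import numpy as np

T = np.array([[13,  13,  13,  13],
              [17,   7,  -7, -17],
              [13, -13, -13,  13],
              [ 7, -17,  17,  -7]])

def forward_4x4(block):
    return T @ block @ T.T       # vertical pass, then horizontal pass

def inverse_4x4(coeff):
    return T.T @ coeff @ T       # yields 676^2 x block before shifting

b = np.arange(16).reshape(4, 4)
assert np.array_equal(inverse_4x4(forward_4x4(b)), 676 * 676 * b)
```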
• For the chroma component, an additional 2×2 transform for the DC coefficients may be performed. The two-dimensional 2×2 transform procedure is illustrated below; DC0, DC1, DC2, DC3 are the DC coefficients of the 2×2 chroma blocks. [0104]
  DC0 DC1   two-dimensional 2×2 transform =>   DCC(0,0) DCC(1,0)
  DC2 DC3                                      DCC(0,1) DCC(1,1)
  • Definition of transform:[0105]
  • DCC(0,0)=(DC0+DC1+DC2+DC3)/2
  • DCC(1,0)=(DC0−DC1+DC2−DC3)/2
  • DCC(0,1)=(DC0+DC1−DC2−DC3)/2
  • DCC(1,1)=(DC0−DC1−DC2+DC3)/2
  • Definition of inverse transform:[0106]
  • DC0=(DCC(0,0)+DCC(1,0)+DCC(0,1)+DCC(1,1))/2
  • DC1=(DCC(0,0)−DCC(1,0)+DCC(0,1)−DCC(1,1))/2
  • DC2=(DCC(0,0)+DCC(1,0)−DCC(0,1)−DCC(1,1))/2
  • DC3=(DCC(0,0)−DCC(1,0)−DCC(0,1)+DCC(1,1))/2
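• The 2×2 chroma DC transform above is a Hadamard-style butterfly with a /2 normalization and, as the symmetric forward and inverse definitions suggest, it is its own inverse. A short self-check (illustrative code only):

```python
import numpy as np

H2 = np.array([[1,  1],
               [1, -1]])

def chroma_dc_2x2(dc):           # DCC = (H2 . DC . H2) / 2
    return (H2 @ dc @ H2) / 2

dc = np.array([[10.0, 4.0],
               [6.0, 2.0]])      # DC0 DC1 / DC2 DC3
assert np.allclose(chroma_dc_2x2(chroma_dc_2x2(dc)), dc)
```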
• Quantize the obtained coefficients K: L.sub.pred = (K × A(QP) + 0.5 × 2.sup.20) / 2.sup.20, where A(QP) is defined below in the following example from the art. [0107]
  • An example of quantizing is provided by Gisle Bjontegaard, “H.26L Test Model Long Term Number 5 (TML-5) draft0”, document Q15-K-59, ITU-T Video Coding Experts Group (Question 15) Meeting, Oregon, USA Aug. 22-25, 2000. [0108]
  • The quantization/dequantization process may perform ‘normal’ quantization/dequantization as well as take care of the above transform process which did not contain normalization of transform coefficients. 32 different QP values may be used. [0109]
  • The QP signaled in the bitstream applies for luma quantization/dequantization referred to as QP.sub.luma. For chroma quantization/dequantization a different value—QP.sub.chroma—is used. The relation between the two is: [0110]
• QP.sub.luma: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 [0112]
• QP.sub.chroma: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 17 18 19 20 20 21 22 22 23 23 24 24 25 25 [0114]
  • When QP is used in the following we mean QP.sub.luma or QP.sub.chroma depending on what is appropriate. [0115]
  • Two arrays of numbers are used for quantization/dequantization. [0116]
• A(QP = 0, ..., 31): [0117]
• 620, 553, 492, 439, 391, 348, 310, 276, 246, 219, 195, 174, 155, 138, 123, 110, 98, 87, 78, 69, 62, 55, 49, 44, 39, 35, 31, 27, 24, 22, 19, 17 [0118]
• B(QP = 0, ..., 31): [0119]
• 3881, 4351, 4890, 5481, 6154, 6914, 7761, 8718, 9781, 10987, 12339, 13828, 15523, 17435, 19561, 21873, 24552, 27656, 30847, 34870, 38807, 43747, 49103, 54683, 61694, 68745, 77615, 89113, 100253, 109366, 126635, 141533 [0120]
• The relation between A( ) and B( ) is: A(QP) × B(QP) × 676.sup.2 = 2.sup.40. [0121]
  • It is assumed that a coefficient K is quantized in the following way: [0122]
• LEVEL = (K × A(QP) + f × 2.sup.20) / 2.sup.20, where f is in the range (0 to 0.5) and f has the same sign as K. [0123]
  • Dequantization:[0124]
  • K′=LEVEL×B(QP)
• After the inverse transform, this results in pixel values that are 2.sup.20 too high. A shift of 20 bits (with rounding) is therefore needed on the reconstruction side. The definition of the transform and quantization is designed so that no overflow will occur with the use of 32-bit arithmetic. [0125]
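• The quantization/dequantization just described can be sketched as follows, using the A(QP) and B(QP) tables quoted above. Variable names follow the text; the rounding details are a reasonable reading of the TML description, not verified against the reference software.

```python
A = [620, 553, 492, 439, 391, 348, 310, 276, 246, 219, 195, 174,
     155, 138, 123, 110, 98, 87, 78, 69, 62, 55, 49, 44, 39, 35,
     31, 27, 24, 22, 19, 17]
B = [3881, 4351, 4890, 5481, 6154, 6914, 7761, 8718, 9781, 10987,
     12339, 13828, 15523, 17435, 19561, 21873, 24552, 27656, 30847,
     34870, 38807, 43747, 49103, 54683, 61694, 68745, 77615, 89113,
     100253, 109366, 126635, 141533]

def quantize_level(k, qp, f=0.5):
    """LEVEL = (K x A(QP) + f x 2^20) / 2^20, with f taking K's sign."""
    sign = 1 if k >= 0 else -1
    return sign * int((abs(k) * A[qp] + f * 2**20) // 2**20)

def dequantize_level(level, qp):
    """K' = LEVEL x B(QP); still scaled 2^20 too high before the shift."""
    return level * B[qp]

# Round trip for one coefficient: the 20-bit shift (with rounding) after
# the inverse transform absorbs both the 2^20 scale and the 676^2 gain.
k = 676 * 676                               # transform gain of a unit pixel
k_rec = dequantize_level(quantize_level(k, 10), 10)
print((k_rec + (1 << 19)) >> 20)            # -> 1, the unit pixel recovered
```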
• The quantized prediction coefficients L.sub.pred are obtained from the coefficients K by: [0126]
• L.sub.pred = (K × A(QP) + 0.5 × 2.sup.20) / 2.sup.20
• Add the quantized prediction image coefficients L.sub.pred to the prediction error coefficient levels, i.e., L.sub.rec = L.sub.err + L.sub.pred. [0127]
• 3. The coefficients L.sub.rec are dequantized and the inverse transform is performed on these dequantized levels. The reconstructed values are equal to the result of the inverse transformation shifted by 20 bits (with rounding), as described above. [0128]
  • For Copy type macroblocks, only steps 2 and 3 are performed. While applying the deblocking filter, both Inter and Copy macroblocks are treated as Intra macroblocks with coefficients represented by L.sub.rec. [0129]
  • SP-Picture Encoding [0130]
• The encoding of an SP-picture placed within a single bitstream (pictures S.sub.1 (513) and S.sub.2 (523) of FIG. 5) is described first. The encoding and decoding of the SP-picture which is transmitted when switching from one bitstream to another (picture S.sub.12 (550) replacing picture S.sub.2 (523)) are described afterwards. [0131]
  • When encoding an SP-picture placed within a bitstream, the prediction error coefficients for luminance may be obtained as follows: [0132]
• 1. After motion compensation, the forward transform is performed for each 4×4 block in the predicted macroblock and in the original image. For the chroma component, an additional 2×2 transform for the DC coefficients is performed. The transform coefficients for the original image are denoted K.sub.orig and for the predicted image K.sub.pred. [0133]
  • 2. Transform coefficients for the predicted blocks are quantized. Obtained levels are denoted as L.sub.pred. [0134]
• 3. The prediction error coefficients are obtained by K.sub.err = K.sub.orig − L.sub.pred × 2.sup.20 / A(QP), and can then be quantized. [0135]
• Let us assume that we want to encode the SP-picture, denoted S.sub.12, to switch from bitstream 1 to bitstream 2. The reconstructed values of this picture have to be identical to the reconstructed values of the SP-picture in bitstream 2, denoted S.sub.2, to which we are switching. The bitstream of the Intra macroblocks in frame S.sub.2 is copied to S.sub.12. The encoding of Inter macroblocks is performed as follows: [0136]
• 1. Form the predicted frame for S.sub.12 by performing motion estimation with the reference frames being the pictures preceding S.sub.1 in bitstream 1. [0137]
• 2. Perform the 4×4 forward transform for each 4×4 block in the predicted macroblock. An additional 2×2 transform for the DC coefficients of the chroma component is performed. [0138]
• 3. Quantize the obtained coefficients and subtract the quantized coefficient levels from the corresponding L.sub.rec of the S.sub.2 picture; a code sketch of this step follows. The resulting levels are the levels of the prediction error which will be transmitted to the decoder. [0139]
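• In code, step 3 of the within-bitstream procedure and the corresponding subtraction for a switching frame look roughly as follows, reusing the illustrative quantization helpers above. The factor 2.sup.20/A(QP) maps quantized levels back to the transform-coefficient scale (the F(QP) factor made explicit in the second embodiment below).

```python
def sp_error_within_bitstream(k_orig, l_pred, qp):
    # K_err = K_orig - L_pred x 2^20 / A(QP); K_err is then quantized
    return k_orig - l_pred * (2**20 / A[qp])

def sp_switching_levels(l_pred_other, l_rec_target):
    # Levels transmitted for S12: quantized prediction levels computed
    # from bitstream 1's references, subtracted from S2's L_rec
    return l_rec_target - l_pred_other
```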
• FIG. 9 illustrates a comparison of the coding efficiency of each picture type, namely I, P and SP frames, in terms of PSNR as a function of bitrate for the selected sequences (the container and hall sequences). These results are generated by encoding every frame in the sequence with the same picture type, i.e., I, P or SP, except the first frame, which is always an I-frame. As can be observed from FIG. 9, the coding efficiency of an SP-picture is worse than that of P-frames but significantly better than that of I-frames. Although the coding efficiency of each picture type is important, it should be noted that SP-frames provide functionalities that are usually achieved only with I-frames. [0140]
• In the following, we illustrate the simulation results when SP- and I-frames are introduced at fixed intervals. FIG. 10 illustrates the results obtained under the following conditions: the first frame is encoded as an I-picture and, at fixed intervals (in this case 1 sec), frames are encoded as I- or SP-pictures, while the rest of the frames are encoded as P-pictures. [0141]
• Also included in FIG. 10 is the performance achieved when all the frames are encoded as P-frames. Note that in this case none of the functionalities mentioned earlier can be obtained, but it provides a benchmark for comparison with both the SP- and I-picture cases. As can be observed from FIG. 10, SP-pictures, while providing the same functionalities as I-pictures, yield significantly better performance in terms of PSNR as a function of bitrate. For example, for the Hall sequence around 40 kbps, there is a 2-2.5 dB improvement when SP-frames are used instead of I-frames, at a 0.5 dB penalty relative to the all-P-frame benchmark. [0142]
• FIG. 11 demonstrates the performance improvement from using SP-pictures instead of I-frames for “Fast Forward”. We also include the performance achieved by using only P-frames; note again that in this case restarting playback is not possible without a mismatch. Nevertheless, this provides another benchmark for comparison of the other schemes. As can be observed from FIG. 11, there is a significant improvement with SP-pictures over I-pictures. For the container sequence at 10 kbps, an improvement of 5.5 dB can be obtained. [0143]
• In another embodiment of the present invention, the coding efficiency of SP-frames is improved by using a quantization value for the predicted frame that is separate from the one used for the prediction error coefficients. The changes required in H.26L in order to implement this embodiment of the present invention are described below. Although H.26L is used as an example standard, embodiments of the present invention and any variations and modifications therefrom are deemed to be within the spirit and scope of the invention. [0144]
• Another Embodiment for SP-Picture Decoding [0145]
• A picture type Ptype Code_number=5 is added to the H.26L standard to signal an SP-picture. If Ptype indicates an SP-picture, an additional codeword SPQP (5 bits) follows the PQP codeword. Otherwise, an SP-picture has the same syntax as a P-picture; however, the interpretation of some of the syntax elements may differ for Inter and Copy type macroblocks. [0146]
• An additional array of numbers is used when decoding an SP-picture: [0147]
• F(QP) = 2.sup.20 / A(QP)
  • where constant A(QP) is defined above in the section on quantization. [0148]
  • Decoding of the luma component is described first. The macroblocks with type “Inter” are reconstructed as follows: [0149]
  • 1. Decode levels (both a magnitude and a sign) of the prediction error coefficients, L.sub.err, and motion vectors for the macroblock. [0150]
• 2. After motion compensation, for each 4×4 block in the predicted macroblock, perform the forward transform as described above. Using the resulting prediction image coefficients K.sub.pred and the prediction error coefficient levels L.sub.err, calculate the coefficients: [0151]
  • K.sub.rec=(K.sub.pred+L.sub.err×F(QP.sub.1))
  • and quantize them:[0152]
• L.sub.rec = (K.sub.rec × A(QP.sub.2) + 0.5 × 2.sup.20) / 2.sup.20.
  • The value of QP.sub.1 is given by PQP and QP.sub.2 by SPQP. [0153]
  • 3. The coefficients, L.sub.rec, are dequantized using QP.sub.2 and the inverse transform is performed for these dequantized levels. Dequantization and inverse transform are performed as described in Gisle Bjontegaard, “H.26L Test Model Long Term Number 5 (TML-5) draft0”, document Q15-K-59, ITU-T Video Coding Experts Group (Question 15) Meeting, Oregon, USA Aug. 22-25, 2000. The reconstructed values are equal to the result of the inverse transformation shifted by 20 bit (with rounding). [0154]
• For Copy type macroblocks, only steps 2 and 3 are performed. While applying the deblocking filter, both Inter and Copy macroblocks are treated as Intra macroblocks with coefficients represented by L.sub.rec. [0155]
• Dequantization of the chroma component is performed in a similar manner, with the following differences: in step 2, an additional 2×2 transform for the DC coefficients is performed after the 4×4 transform, and the values of QP.sub.1 and QP.sub.2 are changed according to the relation between the QP values used for luma and chroma specified above. [0156]
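• A sketch of this two-QP reconstruction for one coefficient, reusing the illustrative helpers above: the error levels are rescaled to the coefficient domain with F(QP.sub.1), added to the prediction coefficients, and the sum is requantized and later dequantized with QP.sub.2.

```python
def F(qp):
    return 2**20 / A[qp]

def sp_reconstruct_two_qp(k_pred, l_err, qp1, qp2):
    k_rec = k_pred + l_err * F(qp1)      # K_rec in the coefficient domain
    l_rec = quantize_level(k_rec, qp2)   # requantize with QP2 (f = 0.5)
    return dequantize_level(l_rec, qp2)  # d_rec, fed to inverse transform
```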
• Another Embodiment for SP-Picture Encoding [0157]
  • When encoding an SP-picture placed within a bitstream, the prediction error coefficients for luminance can be obtained as follows: [0158]
  • 1. After motion compensation, for each 4×4 block in the predicted macroblock and in the original image, forward transform is performed. The transform coefficients for the original image are denoted as K.sub.orig and for the predicted image as K.sub.pred. [0159]
  • 2. Transform coefficients for the predicted blocks are quantized using QP=QP.sub.2 as specified above with f=0.5. The resulting levels are denoted as L.sub.pred. [0160]
  • 3. The prediction error coefficients K.sub.err=K.sub.orig−L.sub.pred×F(QP.sub.2) can be quantized using one of the methods described in G. Bjontegaard, “H.26L Test Model Long Term Number 6 (TML-6) draft0”, document VCEG-L45, ITU-T Video Coding Experts Group Meeting, Eibsee, Germany, Jan. 09-12, 2001, with QP=QP.sub.1. [0161]
• The same procedure is used to calculate the coefficients for chrominance, with the following differences: in step 1, an additional 2×2 transform for the DC coefficients is performed after the 4×4 transform, and the values of QP.sub.1 and QP.sub.2 are changed according to the relation between the QP values used for luma and chroma specified above. [0162]
• Let us assume that we want to encode the SP-picture, denoted S.sub.12, to switch from bitstream 1 (510) to bitstream 2 (520) in FIG. 5. The reconstructed values of this picture have to be identical to the reconstructed values of the SP-picture in bitstream 2 (520), denoted S.sub.2 (523), to which we are switching. The bitstream of the Intra macroblocks in frame S.sub.2 (523) is copied to S.sub.12 (550). The encoding of Inter macroblocks is performed as follows: [0163]
• 1. Form the predicted frame for S.sub.12 by performing motion estimation with the reference frames being the pictures preceding S.sub.1 (513) in bitstream 1 (510). [0164]
• 2. Perform the 4×4 forward transform for each 4×4 block in the predicted macroblock, followed by an additional 2×2 transform for the DC coefficients of the chroma component. [0165]
• 3. Quantize the obtained coefficients and subtract the quantized coefficient levels from the corresponding levels L.sub.rec of the S.sub.2 picture, using the QP value specified by the SPQP codeword of frame S.sub.2. The resulting levels are the levels of the prediction error which will be transmitted to the decoder through a communication system. Notice that for frame S.sub.12 the QP values specified by PQP and SPQP are the same, equal to the value given by the SPQP codeword of frame S.sub.2. [0166]
• A novel SP-picture encoding/decoding method which uses different quantization values for the predicted block and the prediction error coefficients has been provided. The use of two different QP values allows a trade-off between the coding efficiency of SP-pictures placed within a single bitstream and that of SP-pictures used when switching from one bitstream to another. The lower the value of SPQP with respect to PQP, the higher the coding efficiency of an SP-picture placed within a single bitstream, while, on the other hand, a larger number of bits is required when switching to this picture. The choice of the SPQP value can be application dependent. For example, when SP-pictures are used to facilitate random access, one can expect that the SP-frames placed within a single bitstream will have the major influence on compression efficiency, and therefore the SPQP value should be small. On the other hand, when SP-pictures are used for streaming rate control, the SPQP value should be kept close to PQP, since the SP-pictures sent during switching from one bitstream to another will have a large share of the overall bandwidth. [0167]
• While the preferred embodiment and various alternative embodiments of the invention have been disclosed and described in detail herein, it will be obvious to those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope thereof. [0168]

Claims (9)

What is claimed is:
1. A decoder for decoding encoded data wherein identical frames may be obtained even when they are predicted using different reference frames, said decoder comprising:
means for forming a prediction of a current block of data using a plurality of motion vectors and a reference frame;
means for calculating a plurality of transform coefficients for said current block of data corresponding to a set of basis functions;
means for quantizing said coefficients creating a plurality of quantized prediction image coefficients;
means for obtaining a plurality of quantized reconstruction image coefficients by adding said received quantized coefficients for the prediction error to said plurality of quantized prediction image coefficients; and
means for dequantizing said plurality of quantized reconstruction image coefficients.
2. A decoder for decoding a block of encoded data wherein identical frames may be obtained even when they are predicted using different reference frames, said decoder comprising:
frame memory for storing a reference frame;
demultiplexor for receiving and demultiplexing said encoded data into motion information and a current frame;
motion compensation predictor coupled to said demultiplexor and said frame memory for receiving said motion information and constructing a prediction of the current block based on said motion information and reference frame;
transformer coupled to said motion compensation predictor for creating a plurality of transform coefficients;
quantisationizer coupled to said transformer for quantizing said plurality of coefficients; and
adder coupled to said quantisationizer and said demultiplexor for adding current frame information and said quantized plurality of coefficients to form a reconstructed frame.
3. The decoder of claim 2, further comprising:
inverse quantizationizer coupled to said adder; and
inverse transformer coupled to said inverse quantizationizer.
4. A method for encoding a frame of video data, comprising the steps of:
forming a prediction of a current block of data using a plurality of motion vectors and a reference frame;
calculating a plurality of transform coefficients for said current block of data corresponding to a set of basis functions;
quantizing said coefficients creating a plurality of quantized prediction image coefficients;
obtaining a plurality of quantized reconstruction image coefficients by adding said received quantized coefficients for the prediction error to said plurality of quantized prediction image coefficients; and
dequantizing said plurality of quantized reconstruction image coefficients.
5. A method for switching between a plurality of bitstreams in a data communication system, wherein said bitstreams correspond to a same data sequence but are encoded at different bitrates, said method comprising the steps of:
placing a first picture within each of said plurality of bitstreams in locations at which switching from one of said plurality of bitstreams to another one of said plurality of bitstreams is desired;
transmitting a second picture wherein said first picture and said second picture are represented by different bitstreams, but wherein said first picture and said second picture reconstructed values are identical.
6. A method for enabling access in a data stream, said method comprising the steps of:
placing a plurality of SP-pictures at fixed intervals within a first bitstream;
generating an I-picture and an SP-picture for each one of said plurality of SP-pictures in said first bitstream;
storing said I-picture in a second bitstream at a temporal location preceding said each one of said plurality of SP-pictures in said first bitstream; and
storing said SP-picture in said second bitstream at same temporal locations as each of said SP-pictures in said first bitstream.
7. The method of claim 6, wherein said second bitstream comprises only SP-pictures predicted from each other, but at longer temporal periods.
8. A method for providing Video Redundancy Coding (VRC), comprising the steps of:
dividing a sequence of pictures into a plurality of threads wherein all pictures are assigned to one of said plurality of threads in a round-robin fashion;
coding each of said plurality of threads independently;
creating a frame, wherein all of said threads converge; and
starting a second plurality of threads from said frame.
9. A method for providing error control in a data stream between a sender and a client in a communication system, said method comprising:
creating a plurality of representations of a frame in the form of a plurality of SP-pictures predicted from different reference pictures;
signaling said sender information regarding lost frames and a one of said plurality of representations received by said client; and
sending said client a SP-picture which is the next picture in said one of plurality of representations received by client.
US09/827,796 2001-01-03 2001-04-06 Video decoder architecture and method for using same Abandoned US20020122491A1 (en)

Priority Applications (18)

Application Number Priority Date Filing Date Title
US09/827,796 US20020122491A1 (en) 2001-01-03 2001-04-06 Video decoder architecture and method for using same
US09/883,887 US6765963B2 (en) 2001-01-03 2001-06-18 Video decoder architecture and method for using same
US09/925,769 US6920175B2 (en) 2001-01-03 2001-08-09 Video coding architecture and methods for using same
CNB028034414A CN1225125C (en) 2001-01-03 2002-01-03 Switching between bit streams in video transmission
PCT/FI2002/000004 WO2002054776A1 (en) 2001-01-03 2002-01-03 Switching between bit-streams in video transmission
JP2002555537A JP4109113B2 (en) 2001-01-03 2002-01-03 Switching between bitstreams in video transmission
BRPI0206191A BRPI0206191B1 (en) 2001-01-03 2002-01-03 method for transmitting video, encoder, decoder, and signal information representing encoded information
CA002431866A CA2431866C (en) 2001-01-03 2002-01-03 Switching between bit-streams in video transmission
EP02716096.9A EP1356684B1 (en) 2001-01-03 2002-01-03 Switching between bit-streams in video transmission
KR1020037008568A KR100626419B1 (en) 2001-01-03 2002-01-03 Switching between bit-streams in video transmission
HU0400560A HU228605B1 (en) 2001-01-03 2002-01-03 Method of forwarding video information, encoder and decoder for coding and decoding video information, and coded cideo information signal
EEP200300315A EE04829B1 (en) 2001-01-03 2002-01-03 Video data transmission method, encoder and decoder, and signal containing video data
MXPA03005985A MXPA03005985A (en) 2001-01-03 2002-01-03 Switching between bit-streams in video transmission.
US10/869,092 US20040240560A1 (en) 2001-01-03 2004-06-16 Video decoder architecture and method for using same
US10/869,628 US7477689B2 (en) 2001-01-03 2004-06-16 Video decoder architecture and method for using same
US10/869,455 US20040223549A1 (en) 2001-01-03 2004-06-16 Video decoder architecture and method for using same
HK04105644A HK1062868A1 (en) 2001-01-03 2004-07-30 Switching between bit-streams in video transmission
JP2007178813A JP5128865B2 (en) 2001-01-03 2007-07-06 Switching between bitstreams in video transmission

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US25952901P 2001-01-03 2001-01-03
US09/827,796 US20020122491A1 (en) 2001-01-03 2001-04-06 Video decoder architecture and method for using same

Related Child Applications (3)

Application Number Title Priority Date Filing Date
US09/883,887 Continuation-In-Part US6765963B2 (en) 2001-01-03 2001-06-18 Video decoder architecture and method for using same
US09/883,887 Continuation US6765963B2 (en) 2001-01-03 2001-06-18 Video decoder architecture and method for using same
US10/869,092 Continuation US20040240560A1 (en) 2001-01-03 2004-06-16 Video decoder architecture and method for using same

Publications (1)

Publication Number Publication Date
US20020122491A1 true US20020122491A1 (en) 2002-09-05

Family

ID=34078815

Family Applications (3)

Application Number Title Priority Date Filing Date
US09/827,796 Abandoned US20020122491A1 (en) 2001-01-03 2001-04-06 Video decoder architecture and method for using same
US10/250,838 Active 2024-06-09 US7706447B2 (en) 2001-01-03 2002-01-03 Switching between bit-streams in video transmission
US10/869,092 Abandoned US20040240560A1 (en) 2001-01-03 2004-06-16 Video decoder architecture and method for using same

Family Applications After (2)

Application Number Title Priority Date Filing Date
US10/250,838 Active 2024-06-09 US7706447B2 (en) 2001-01-03 2002-01-03 Switching between bit-streams in video transmission
US10/869,092 Abandoned US20040240560A1 (en) 2001-01-03 2004-06-16 Video decoder architecture and method for using same

Country Status (2)

Country Link
US (3) US20020122491A1 (en)
ZA (1) ZA200304086B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004023819A2 (en) * 2002-09-06 2004-03-18 Koninklijke Philips Electronics N.V. Content-adaptive multiple description motion compensation for improved efficiency and error resilience
EP1496706A1 (en) * 2002-04-16 2005-01-12 Matsushita Electric Industrial Co., Ltd. Image encoding method and image decoding method
US20050262257A1 (en) * 2004-04-30 2005-11-24 Major R D Apparatus, system, and method for adaptive-rate shifting of streaming content
US20060140591A1 (en) * 2004-12-28 2006-06-29 Texas Instruments Incorporated Systems and methods for load balancing audio/video streams
US20080104652A1 (en) * 2006-11-01 2008-05-01 Swenson Erik R Architecture for delivery of video content responsive to remote interaction
US20080104520A1 (en) * 2006-11-01 2008-05-01 Swenson Erik R Stateful browsing
US20080101466A1 (en) * 2006-11-01 2008-05-01 Swenson Erik R Network-Based Dynamic Encoding
US20080184128A1 (en) * 2007-01-25 2008-07-31 Swenson Erik R Mobile device user interface for remote interaction
US20090238267A1 (en) * 2002-02-08 2009-09-24 Shipeng Li Methods And Apparatuses For Use In Switching Between Streaming Video Bitstreams
US7925774B2 (en) 2008-05-30 2011-04-12 Microsoft Corporation Media streaming using an index file
US8265140B2 (en) 2008-09-30 2012-09-11 Microsoft Corporation Fine-grained client-side control of scalable media delivery
US8325800B2 (en) 2008-05-07 2012-12-04 Microsoft Corporation Encoding streaming media as a high bit rate layer, a low bit rate layer, and one or more intermediate bit rate layers
US8379851B2 (en) * 2008-05-12 2013-02-19 Microsoft Corporation Optimized client side rate control and indexed file layout for streaming media
US9247260B1 (en) 2006-11-01 2016-01-26 Opera Software Ireland Limited Hybrid bitmap-mode encoding
US10944982B1 (en) * 2016-11-08 2021-03-09 Amazon Technologies, Inc. Rendition switch indicator
CN115460466A (en) * 2022-08-23 2022-12-09 北京泰豪智能工程有限公司 Video picture customization method and system in video communication

Families Citing this family (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002045372A2 (en) * 2000-11-29 2002-06-06 British Telecommunications Public Limited Company Transmitting and receiving real-time data
KR100425676B1 (en) * 2001-03-15 2004-04-03 엘지전자 주식회사 Error recovery method for video transmission system
WO2003009581A1 (en) * 2001-07-19 2003-01-30 British Telecommunications Public Limited Company Video stream switching
CN1557072A (en) * 2001-09-21 2004-12-22 Data communications method and system using buffer size to calculate transmission rate for congestion control
WO2003049373A1 (en) * 2001-11-30 2003-06-12 British Telecommunications Public Limited Company Data transmission
US7020203B1 (en) * 2001-12-21 2006-03-28 Polycom, Inc. Dynamic intra-coded macroblock refresh interval for video error concealment
JP3923898B2 (en) * 2002-01-18 2007-06-06 株式会社東芝 Image coding method and apparatus
EP1359722A1 (en) * 2002-03-27 2003-11-05 BRITISH TELECOMMUNICATIONS public limited company Data streaming system and method
US20060133514A1 (en) * 2002-03-27 2006-06-22 Walker Matthew D Video coding and transmission
CA2479585A1 (en) * 2002-03-27 2003-10-09 Timothy Ralph Jebb Data structure for data streaming system
GB0306296D0 (en) * 2003-03-19 2003-04-23 British Telecomm Data transmission
US7471724B2 (en) * 2003-06-23 2008-12-30 Vichip Corp. Limited Method and apparatus for adaptive multiple-dimensional signal sequences encoding/decoding
JP2008500760A (en) * 2004-05-25 2008-01-10 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Method and apparatus for encoding digital video data
JP4559811B2 (en) * 2004-09-30 2010-10-13 株式会社東芝 Information processing apparatus and information processing method
US7609765B2 (en) 2004-12-02 2009-10-27 Intel Corporation Fast multi-frame motion estimation with adaptive search strategies
CN101107860B (en) * 2005-01-18 2013-07-31 汤姆森特许公司 Method and apparatus for estimating channel induced distortion
US9661376B2 (en) * 2005-07-13 2017-05-23 Polycom, Inc. Video error concealment method
US20070019742A1 (en) * 2005-07-22 2007-01-25 Davis Kevin E Method of transmitting pre-encoded video
DE102005049017B4 (en) * 2005-10-11 2010-09-23 Carl Zeiss Imaging Solutions Gmbh Method for segmentation in an n-dimensional feature space and method for classification based on geometric properties of segmented objects in an n-dimensional data space
US20070098274A1 (en) * 2005-10-28 2007-05-03 Honeywell International Inc. System and method for processing compressed video data
WO2007061509A2 (en) * 2005-11-22 2007-05-31 Nlight Photonics Corporation Modular diode laser assembly
US20070115617A1 (en) * 2005-11-22 2007-05-24 Nlight Photonics Corporation Modular assembly utilizing laser diode subassemblies with winged mounting blocks
US20070116071A1 (en) * 2005-11-22 2007-05-24 Nlight Photonics Corporation Modular diode laser assembly
US20070116077A1 (en) * 2005-11-22 2007-05-24 Nlight Photonics Corporation Vertically displaced stack of multi-mode single emitter laser diodes
WO2007111473A1 (en) * 2006-03-27 2007-10-04 Electronics And Telecommunications Research Institute Scalable video encoding and decoding method using switching pictures and apparatus thereof
US8358693B2 (en) * 2006-07-14 2013-01-22 Microsoft Corporation Encoding visual data with computation scheduling and allocation
US8311102B2 (en) * 2006-07-26 2012-11-13 Microsoft Corporation Bitstream switching in multiple bit-rate video streaming environments
US8340193B2 (en) * 2006-08-04 2012-12-25 Microsoft Corporation Wyner-Ziv and wavelet video coding
EP2052549A4 (en) * 2006-08-17 2011-12-07 Ericsson Telefon Ab L M Error recovery for rich media
US9094686B2 (en) * 2006-09-06 2015-07-28 Broadcom Corporation Systems and methods for faster throughput for compressed video data decoding
US7388521B2 (en) * 2006-10-02 2008-06-17 Microsoft Corporation Request bits estimation for a Wyner-Ziv codec
CN101523908A (en) * 2006-10-02 2009-09-02 艾利森电话股份有限公司 Multimedia management
CN101569201B (en) * 2006-11-07 2011-10-05 三星电子株式会社 Method and apparatus for encoding and decoding based on intra prediction
KR100846512B1 (en) * 2006-12-28 2008-07-17 삼성전자주식회사 Method and apparatus for video encoding and decoding
US8340192B2 (en) * 2007-05-25 2012-12-25 Microsoft Corporation Wyner-Ziv coding with multiple side information
EP2383920B1 (en) 2007-12-20 2014-07-30 Optis Wireless Technology, LLC Control channel signaling using a common signaling field for transport format and redundancy version
US20110090965A1 (en) * 2009-10-21 2011-04-21 Hong Kong Applied Science and Technology Research Institute Company Limited Generation of Synchronized Bidirectional Frames and Uses Thereof
EP2458861A1 (en) * 2010-11-25 2012-05-30 ST-Ericsson SA Bit rate regulation module and method for regulating bit rate
KR101187530B1 (en) * 2011-03-02 2012-10-02 한국과학기술원 Rendering strategy for monoscopic, stereoscopic and multi-view computer generated imagery, system using the same and recording medium for the same
US9635374B2 (en) 2011-08-01 2017-04-25 Apple Inc. Systems and methods for coding video data using switchable encoders and decoders
US20130083845A1 (en) 2011-09-30 2013-04-04 Research In Motion Limited Methods and devices for data compression using a non-uniform reconstruction space
EP2595382B1 (en) 2011-11-21 2019-01-09 BlackBerry Limited Methods and devices for encoding and decoding transform domain filters
WO2014055826A2 (en) * 2012-10-05 2014-04-10 Huawei Technologies Co., Ltd. Improved architecture for hybrid video codec
EP2920962A4 (en) 2012-11-13 2016-07-20 Intel Corp Content adaptive transform coding for next generation video
CN104737542B (en) 2013-01-30 2018-09-25 Intel Corporation Content-adaptive entropy coding for next-generation video
EP2804374A1 (en) 2013-02-22 2014-11-19 Thomson Licensing Coding and decoding methods of a picture block, corresponding devices and data stream
EP2804375A1 (en) 2013-02-22 2014-11-19 Thomson Licensing Coding and decoding methods of a picture block, corresponding devices and data stream
US8881213B2 (en) * 2013-03-13 2014-11-04 Verizon Patent And Licensing Inc. Alignment of video frames
JP6225446B2 (en) * 2013-03-26 2017-11-08 Fujitsu Limited Moving image data distribution apparatus, method, program, and system
US9609336B2 (en) * 2013-04-16 2017-03-28 FastVDO LLC Adaptive coding, transmission and efficient display of multimedia (ACTED)
US9462306B2 (en) * 2013-07-16 2016-10-04 The Hong Kong University Of Science And Technology Stream-switching in a content distribution system
US10271062B2 (en) * 2016-03-18 2019-04-23 Google Llc Motion vector prediction through scaling
EP3603090A4 (en) 2017-03-27 2020-08-19 Nokia Technologies Oy An apparatus, a method and a computer program for video coding and decoding

Family Cites Families (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2891773B2 (en) 1990-03-15 1999-05-17 Thomson Multimedia S.A. Method and apparatus for processing digital image sequences
EP0731614B1 (en) 1995-03-10 2002-02-06 Kabushiki Kaisha Toshiba Video coding/decoding apparatus
JP2827997B2 (en) * 1995-12-28 1998-11-25 NEC Corporation Image signal Hadamard transform encoding device and decoding device
IT1285258B1 (en) 1996-02-26 1998-06-03 Cselt Centro Studi Lab Telecom HANDLING DEVICE FOR COMPRESSED VIDEO SEQUENCES.
US5708732A (en) * 1996-03-06 1998-01-13 Hewlett-Packard Company Fast DCT domain downsampling and inverse motion compensation
KR100403077B1 (en) * 1996-05-28 2003-10-30 Matsushita Electric Industrial Co., Ltd. Image predictive decoding apparatus and method thereof, and image predictive coding apparatus and method thereof
GB2318246B (en) * 1996-10-09 2000-11-15 Sony Uk Ltd Processing digitally encoded signals
US6480541B1 (en) * 1996-11-27 2002-11-12 Realnetworks, Inc. Method and apparatus for providing scalable pre-compressed digital video with reduced quantization based artifacts
ES2184137T3 (en) * 1996-12-10 2003-04-01 British Telecomm VIDEO CODING
DE69814212T2 (en) 1997-02-28 2004-04-01 Matsushita Electric Industrial Co., Ltd., Kadoma Device for converting image signals
US6370276B2 (en) 1997-04-09 2002-04-09 Matsushita Electric Industrial Co., Ltd. Image predictive decoding method, image predictive decoding apparatus, image predictive coding method, image predictive coding apparatus, and data storage media
WO1998054910A2 (en) 1997-05-27 1998-12-03 Koninklijke Philips Electronics N.V. Method of switching video sequences and corresponding switching device and decoding system
US6501798B1 (en) 1998-01-22 2002-12-31 International Business Machines Corporation Device for generating multiple quality level bit-rates in a video encoder
US6611624B1 (en) * 1998-03-13 2003-08-26 Cisco Systems, Inc. System and method for frame accurate splicing of compressed bitstreams
FR2782437B1 (en) * 1998-08-14 2000-10-13 Thomson Multimedia Sa MPEG STREAM SWITCHING METHOD
JP2000115783A (en) * 1998-10-06 2000-04-21 Canon Inc Decoder and its method
US6434195B1 (en) * 1998-11-20 2002-08-13 General Instrument Corporation Splicing of video data in progressively refreshed video streams
US7046910B2 (en) * 1998-11-20 2006-05-16 General Instrument Corporation Methods and apparatus for transcoding progressive I-slice refreshed MPEG data streams to enable trick play mode features on a television appliance
JP3855522B2 (en) 1999-02-23 2006-12-13 Matsushita Electric Industrial Co., Ltd. Video converter
GB9908809D0 (en) 1999-04-16 1999-06-09 Sony Uk Ltd Signal processor
FR2795272B1 (en) * 1999-06-18 2001-07-20 Thomson Multimedia Sa MPEG STREAM SWITCHING METHOD
US6735249B1 (en) 1999-08-11 2004-05-11 Nokia Corporation Apparatus, and associated method, for forming a compressed motion vector field utilizing predictive motion coding
GB2353653B (en) 1999-08-26 2003-12-31 Sony Uk Ltd Signal processor
GB2353655B (en) * 1999-08-26 2003-07-23 Sony Uk Ltd Signal processor
US6765963B2 (en) * 2001-01-03 2004-07-20 Nokia Corporation Video decoder architecture and method for using same
US6920175B2 (en) 2001-01-03 2005-07-19 Nokia Corporation Video coding architecture and methods for using same
US6804301B2 (en) * 2001-08-15 2004-10-12 General Instrument Corporation First pass encoding of I and P-frame complexity for compressed digital video
US6956600B1 (en) * 2001-09-19 2005-10-18 Bellsouth Intellectual Property Corporation Minimal decoding method for spatially multiplexing digital video pictures
US6996173B2 (en) 2002-01-25 2006-02-07 Microsoft Corporation Seamless switching of scalable video bitstreams
CN100380980C (en) 2002-04-23 2008-04-09 Nokia Corporation Method and device for indicating quantizer parameters in a video coding system
DE602004029551D1 (en) * 2003-01-28 2010-11-25 Thomson Licensing STAGGERCASTING IN ROBUST MODE

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5386234A (en) * 1991-11-13 1995-01-31 Sony Corporation Interframe motion predicting method and picture signal coding/decoding apparatus
US6175595B1 (en) * 1995-07-19 2001-01-16 U.S. Philips Corporation Method and device for decoding digital video bitstreams and reception equipment including such a device
US6163575A (en) * 1995-10-20 2000-12-19 Nokia Mobile Phones Limited Motion vector field coding
US6212235B1 (en) * 1996-04-19 2001-04-03 Nokia Mobile Phones Ltd. Video encoder and decoder using motion-based segmentation and merging
US6137834A (en) * 1996-05-29 2000-10-24 Sarnoff Corporation Method and apparatus for splicing compressed information streams
US6516002B1 (en) * 1997-03-21 2003-02-04 Scientific-Atlanta, Inc. Apparatus for using a receiver model to multiplex variable-rate bit streams having timing constraints
US6012091A (en) * 1997-06-30 2000-01-04 At&T Corporation Video telecommunications server and method of providing video fast forward and reverse
US6493389B1 (en) * 1998-03-31 2002-12-10 Koninklijke Philips Electronics N.V. Method and device for modifying data in an encoded data stream
US6414999B1 (en) * 1998-05-22 2002-07-02 Sony Corporation Editing method and editing apparatus
US6658056B1 (en) * 1999-03-30 2003-12-02 Sony Corporation Digital video decoding, buffering and frame-rate converting method and apparatus

Cited By (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090238267A1 (en) * 2002-02-08 2009-09-24 Shipeng Li Methods And Apparatuses For Use In Switching Between Streaming Video Bitstreams
US20140086308A1 (en) * 2002-02-08 2014-03-27 Microsoft Corporation Switching Between Streaming Video Bitstreams
US9686546B2 (en) * 2002-02-08 2017-06-20 Microsoft Technology Licensing, Llc Switching between streaming video bitstreams
US8576919B2 (en) * 2002-02-08 2013-11-05 Microsoft Corporation Methods and apparatuses for use in switching between streaming video bitstreams
US10834388B2 (en) 2002-04-16 2020-11-10 Godo Kaisha Ip Bridge 1 Picture coding method and picture decoding method
US8675729B2 (en) 2002-04-16 2014-03-18 Panasonic Corporation Picture coding method and picture decoding method
US9516307B2 (en) 2002-04-16 2016-12-06 Godo Kaisha Ip Bridge 1 Picture coding method and picture decoding method
US20080069218A1 (en) * 2002-04-16 2008-03-20 Shinya Kadono Picture coding method and picture decoding method
US8787448B2 (en) 2002-04-16 2014-07-22 Panasonic Intellectual Property Corporation Of America Picture coding method and picture decoding method
EP1496706A1 (en) * 2002-04-16 2005-01-12 Matsushita Electric Industrial Co., Ltd. Image encoding method and image decoding method
US20050031030A1 (en) * 2002-04-16 2005-02-10 Shinya Kadono Picture encoding method and image decoding method
US10021389B2 (en) 2002-04-16 2018-07-10 Godo Kaisha Ip Bridge 1 Picture coding method and picture decoding method
US10869034B2 (en) 2002-04-16 2020-12-15 Godo Kaisha Ip Bridge 1 Picture coding method and picture decoding method
US7505518B2 (en) 2002-04-16 2009-03-17 Panasonic Corporation Picture encoding method and image decoding method
US20090135917A1 (en) * 2002-04-16 2009-05-28 Shinya Kadono Picture coding method and picture decoding method
EP1496706A4 (en) * 2002-04-16 2005-05-25 Matsushita Electric Ind Co Ltd Image encoding method and image decoding method
US10148951B2 (en) 2002-04-16 2018-12-04 Godo Kaisha Ip Bridge 1 Picture coding method and picture decoding method
US10812792B2 (en) 2002-04-16 2020-10-20 Godo Kaisha Ip Bridge 1 Picture coding method and picture decoding method
US10542252B2 (en) 2002-04-16 2020-01-21 Godo Kaisha Ip Bridge 1 Picture coding method and picture decoding method
WO2004023819A2 (en) * 2002-09-06 2004-03-18 Koninklijke Philips Electronics N.V. Content-adaptive multiple description motion compensation for improved efficiency and error resilience
WO2004023819A3 (en) * 2002-09-06 2004-05-21 Koninkl Philips Electronics Nv Content-adaptive multiple description motion compensation for improved efficiency and error resilience
US20050262257A1 (en) * 2004-04-30 2005-11-24 Major R D Apparatus, system, and method for adaptive-rate shifting of streaming content
US9407564B2 (en) 2004-04-30 2016-08-02 Echostar Technologies L.L.C. Apparatus, system, and method for adaptive-rate shifting of streaming content
US8868772B2 (en) 2004-04-30 2014-10-21 Echostar Technologies L.L.C. Apparatus, system, and method for adaptive-rate shifting of streaming content
US10225304B2 (en) 2004-04-30 2019-03-05 Dish Technologies Llc Apparatus, system, and method for adaptive-rate shifting of streaming content
US20060140591A1 (en) * 2004-12-28 2006-06-29 Texas Instruments Incorporated Systems and methods for load balancing audio/video streams
US20080104520A1 (en) * 2006-11-01 2008-05-01 Swenson Erik R Stateful browsing
US20080101466A1 (en) * 2006-11-01 2008-05-01 Swenson Erik R Network-Based Dynamic Encoding
US8711929B2 (en) * 2006-11-01 2014-04-29 Skyfire Labs, Inc. Network-based dynamic encoding
US20080104652A1 (en) * 2006-11-01 2008-05-01 Swenson Erik R Architecture for delivery of video content responsive to remote interaction
US8443398B2 (en) 2006-11-01 2013-05-14 Skyfire Labs, Inc. Architecture for delivery of video content responsive to remote interaction
US8375304B2 (en) 2006-11-01 2013-02-12 Skyfire Labs, Inc. Maintaining state of a web page
US9247260B1 (en) 2006-11-01 2016-01-26 Opera Software Ireland Limited Hybrid bitmap-mode encoding
US20080184128A1 (en) * 2007-01-25 2008-07-31 Swenson Erik R Mobile device user interface for remote interaction
US8630512B2 (en) 2007-01-25 2014-01-14 Skyfire Labs, Inc. Dynamic client-server video tiling streaming
US20080181498A1 (en) * 2007-01-25 2008-07-31 Swenson Erik R Dynamic client-server video tiling streaming
US8325800B2 (en) 2008-05-07 2012-12-04 Microsoft Corporation Encoding streaming media as a high bit rate layer, a low bit rate layer, and one or more intermediate bit rate layers
US9571550B2 (en) 2008-05-12 2017-02-14 Microsoft Technology Licensing, Llc Optimized client side rate control and indexed file layout for streaming media
US8379851B2 (en) * 2008-05-12 2013-02-19 Microsoft Corporation Optimized client side rate control and indexed file layout for streaming media
US8819754B2 (en) 2008-05-30 2014-08-26 Microsoft Corporation Media streaming with enhanced seek operation
US8370887B2 (en) 2008-05-30 2013-02-05 Microsoft Corporation Media streaming with enhanced seek operation
US7949775B2 (en) 2008-05-30 2011-05-24 Microsoft Corporation Stream selection for enhanced media streaming
US7925774B2 (en) 2008-05-30 2011-04-12 Microsoft Corporation Media streaming using an index file
US8265140B2 (en) 2008-09-30 2012-09-11 Microsoft Corporation Fine-grained client-side control of scalable media delivery
US10944982B1 (en) * 2016-11-08 2021-03-09 Amazon Technologies, Inc. Rendition switch indicator
CN115460466A (en) * 2022-08-23 2022-12-09 Beijing Taihao Intelligent Engineering Co., Ltd. Video picture customization method and system in video communication

Also Published As

Publication number Publication date
US7706447B2 (en) 2010-04-27
ZA200304086B (en) 2004-07-15
US20040240560A1 (en) 2004-12-02
US20040114684A1 (en) 2004-06-17

Similar Documents

Publication Publication Date Title
US20020122491A1 (en) Video decoder architecture and method for using same
US6920175B2 (en) Video coding architecture and methods for using same
US6765963B2 (en) Video decoder architecture and method for using same
US7693220B2 (en) Transmission of video information
Sullivan et al. Video compression-from concepts to the H.264/AVC standard
Ostermann et al. Video coding with H.264/AVC: tools, performance, and complexity
Wiegand et al. Overview of the H.264/AVC video coding standard
US7324595B2 (en) Method and/or apparatus for reducing the complexity of non-reference frame encoding using selective reconstruction
US7532808B2 (en) Method for coding motion in a video sequence
RU2322770C2 (en) Method and device for indicating quantizer parameters in a video coding system
US7724818B2 (en) Method for coding sequences of pictures
KR100323489B1 (en) Method and device for transcoding a bitstream containing video data
JP4510465B2 (en) Coding of transform coefficients in an image/video encoder and/or decoder
US8374236B2 (en) Method and apparatus for improving the average image refresh rate in a compressed video bitstream
US6040875A (en) Method to compensate for a fade in a digital video input sequence
Golston Comparing media codecs for video content
WO2005091632A1 (en) Transmission of video information
KR100626419B1 (en) Switching between bit-streams in video transmission
JP2001148852A (en) Image information converter and image information conversion method
EP1739970A1 (en) Method for encoding and transmission of real-time video conference data
Nemethova Principles of video coding
Notebaert Bit rate transcoding of H.264/AVC based on rate shaping and requantization
JP2001148855A (en) Image information converter and image information conversion method
JPH11177988A (en) Method and device for encoding a moving image signal in accordance with a time signal

Legal Events

Date Code Title Description
AS Assignment

Owner name: NOKIA CORPORATION, FINLAND

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KARCZEWICZ, MARTA;KURCEREN, RAGIP;REEL/FRAME:012010/0792

Effective date: 20010709

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION