WO2000018131A1 - Intra-frame quantizer selection for video compression - Google Patents

Intra-frame quantizer selection for video compression

Info

Publication number
WO2000018131A1
Authority
WO
WIPO (PCT)
Prior art keywords
region
image
interest
quantization
quantization level
Prior art date
Application number
PCT/US1999/021834
Other languages
French (fr)
Inventor
Ravi Krishnamurthy
Sriram Sethuraman
Original Assignee
Sarnoff Corporation
Priority date
Filing date
Publication date
Application filed by Sarnoff Corporation
Priority to EP99948365A (EP1114558A1)
Priority to JP2000571666A (JP2002525989A)
Publication of WO2000018131A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H04N19/198Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters including smoothing of a sequence of encoding parameters, e.g. by averaging, by choice of the maximum, minimum or median value
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • H04N19/126Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/149Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • H04N19/15Data rate or code amount at the encoder output by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
    • H04N19/152Data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
    • H04N19/154Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • H04N19/162User input
    • H04N19/192Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding the adaptation method, adaptation tool or adaptation type being iterative or recursive
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding

Definitions

  • the present invention relates to image processing, and, in particular, to video compression.
  • the goal of video compression processing is to encode image data to reduce the number of bits used to represent a sequence of video images while maintaining an acceptable level of quality in the decoded video sequence. This goal is particularly important in certain applications, such as videophone or video conferencing over POTS (plain old telephone service) or ISDN (integrated services digital network) lines, where the existence of limited transmission bandwidth requires careful control over the bit rate, that is, the number of bits used to encode each image in the video sequence.
  • POTS plain old telephone service
  • ISDN integrated services digital network
  • images may be designated as the following different types of frames for compression processing:
      o An intra (I) frame, which is encoded using only intra-frame compression techniques;
      o A predicted (P) frame, which is encoded using inter-frame compression techniques based on a previous I or P frame, and which can itself be used as a reference frame to encode one or more other frames;
      o A bi-directional (B) frame, which is encoded using bi-directional inter-frame compression techniques based on a previous I or P frame and a subsequent I or P frame, and which cannot be used to encode another frame; and
      o A PB frame, which corresponds to two images (a P frame and a B frame in between the P frame and the previous I/P frame) that are encoded as a single frame (as in the H.263 video compression algorithm).
  • these different types of frames typically require different numbers of bits to encode.
  • a block-based transform such as a discrete cosine transform (DCT)
  • DCT discrete cosine transform
  • the resulting transform coefficients for each block are then quantized for subsequent encoding (e.g., run-length encoding followed by variable-length encoding).
  • the degree to which the transform coefficients are quantized directly affects both the number of bits used to represent the image data and the quality of the resulting decoded image.
  • This degree of quantization is also referred to as the quantization level, which is often represented by a specified quantizer value that is used to quantize the transform coefficients.
  • higher quantization levels imply fewer bits and lower quality.
  • the quantizer is often used as the primary variable for controlling the tradeoff between bit rate and image quality. Visual quality of video depends not only on global measures (like peak signal-to-noise ratio (PSNR)), but also on how the error is distributed in space and time.
  • the other possibility is to vary the quantizer from macroblock-to-macroblock within the constraints of the coding standard being used (for example, in H.263, the quantizer level can change by a value of at most 2 in either direction).
  • Examples of such schemes are given in the H.263+ TMN8 (Test Model Near-Term 8) and TMN9 documents (see, e.g., ITU - Telecommunications Standardization Sector, "Video Codec Test Model, Near-Term, Version 9 (TMN9)", Document Q15-C-15, December 1997).
  • TMN8 Test Model Near-Term 8
  • TMN9 Video Codec Test Model, Near-Term, Version 9
  • some video compression algorithms, such as H.263, allow the quantizers to vary from macroblock to macroblock within a frame, although such algorithms often limit the magnitude of change in quantization level between horizontally adjacent macroblocks (e.g., a maximum change of ±2 levels).
  • this ability to vary the quantization level within a frame enables the video compression processing to judiciously allocate the available number of bits for encoding different regions of a frame differently, for example, allocating more bits (i.e., a lower quantization level) to specific regions of interest (ROI).
  • the foreground consists of a talking head centered on a relatively constant background
  • the present invention is directed to a scheme for selecting quantizers for use in encoding frames having one or more regions of interest.
  • the present invention is a method for processing image data, comprising the steps of: (a) identifying one or more sets of image data corresponding to a region of interest in an image; (b) identifying one or more sets of image data corresponding to a transition region in the image located between the region of interest and a least-important region in the image; (c) selecting a first quantization level for each set of image data in the region of interest; (d) selecting a second quantization level for each set of image data in the transition region; (e) selecting a third quantization level for each set of image data in the least-important region; and (f) encoding the image based on the selected first, second, and third quantization levels.
  • Fig. 1 shows an example of a typical image that can be encoded using the present invention
  • Fig. 2 shows a flow diagram of the image processing implemented according to one embodiment of the present invention for an image, such as the image of Fig. 1.
  • Fig. 1 shows an example of a typical image 100 that can be encoded using the present invention.
  • the image in Fig. 1 consists of the head and shoulders of a person 102 positioned in front of background imagery 104, where the image data corresponding to the head of person 102 varies more in time (i.e., from frame to frame) than the background imagery.
  • Such a scene is typical of videophone and video-conferencing applications.
  • the person in the foreground is of greater importance to the viewer of image 100 than the background imagery.
  • image 100 is encoded such that, during playback, the video quality of the more-important foreground imagery is greater than the video quality of the less important background imagery.
  • This variation in playback video quality within an image is achieved by allowing the quantizers used during the video compression processing to encode the macroblocks of image 100 to vary within the image.
  • the selection of quantizers follows a particular scheme, described as follows.
  • image 100 is divided into three different regions: a foreground region 106 (also referred to as a region of interest (ROI)) consisting of those macroblocks corresponding to the head of person 102, a background region 108 (also referred to as the least-important region) consisting of macroblocks corresponding to background imagery 104 (including the shoulders of person 102), and a transition region 110 consisting of macroblocks located between the foreground region and the background region.
  • ROI region of interest
  • all of the macroblocks corresponding to the foreground region 106 are encoded using the same quantizer QP2
  • all of the macroblocks corresponding to the background region 108 are encoded using the same quantizer QP0
  • Fig. 2 shows a flow diagram of the image processing implemented according to one embodiment of the present invention for an image, such as image 100 of Fig. 1.
  • the present invention is typically implemented by a video processor that performs various conventional image processing routines, such as motion estimation, motion-compensated inter-frame differencing, transform application, quantization, run-length encoding, and variable-length encoding, as part of its overall video compression algorithm. Not all of this processing is shown in Fig. 2, which begins with the selection of a bit target for the present image (i.e., a desired number of bits to be used to encode the present image) based on a suitable bit rate scheme (step 202).
  • a bit target for the present image i.e., a desired number of bits to be used to encode the present image
  • suitable bit rate scheme step 202
  • the image is analyzed in step 204 to identify those macroblocks corresponding to one or more regions of interest (e.g., foreground region 106 corresponding to the head of person 102 in image 100 of Fig. 1).
  • the analysis of step 204 is referred to as segmentation analysis, which, for purposes of the present invention, can be implemented using any suitable scheme, including automatic schemes or interactive schemes in which the regions of interest are explicitly identified by the user (e.g., a participant in a video-conference located either at the encoder or the decoder).
  • the macroblocks corresponding to one or more transition regions are then identified (step 206).
  • a macroblock is defined as being part of a transition region if it borders on at least one side a macroblock that is part of a region of interest identified in step 204. The rest of the macroblocks in the image are identified as being part of the least-important background region.
  • a transition region need not necessarily be defined by a single contiguous set of macroblocks. The same is true for the background region.
  • an image may have two or more different regions of interest and two or more different corresponding transition regions.
  • an initial quantization level is selected for each region (step 208).
  • the macroblocks of each region are encoded using a single uniform quantization level, where the quantization level may be different for the different regions.
  • the quantization level differs between different regions of interest and/or between different transition regions, as long as the quantization level is constant within each particular region. For example, a first region of interest may be more important than a second region of interest. In that case, it might be desirable to assign a lower quantizer to the first region of interest than to the second region of interest.
  • each region of interest will still have a corresponding transition region that is encoded using its own, possibly different quantizer.
  • the initial quantization levels are selected based on information related to the previously encoded image in the video sequence. This initial selection of quantizers may be based on the previous frame's actual quantizer assignments and bit expenditure, as well as on comparison of the current bit target and the current motion-compensated distortion with those of the previous frame. For example, if the previous frame's bit expenditure was higher than the previous bit target or if the current bit target is lower than the previous bit target or if the current distortion is higher than the previous distortion, then the previous quantizer assignments may need to be increased for the initial selection for the current frame.
  • the quantizer used to encode a transition region should be fairly close to the quantizers used to encode both the corresponding foreground region of interest and the least-important background region.
  • the difference between horizontally adjacent quantizers is already constrained (e.g., never more than 2).
  • the quantizers actually increase from foreground to transition and from transition to background, so that the quality in the regions of interest can be optimized compared to the quality in the other regions.
  • transition regions frequently contain occlusions and artifacts surrounding the region of interest (like a talking head, for example) and using a lower quantizer here (as compared to the rest of the least-important region) can be expected to improve the overall visual quality of the video/image.
  • the image is encoded using those quantizers (step 210).
  • the number of bits used to encode the image is then compared with the bit target (step 212). If the number of bits used is sufficiently close to the bit target (e.g., within a specified tolerance), then processing is terminated.
  • otherwise, the quantizers are adjusted (step 214) and processing returns to step 210 to re-encode the image using the adjusted quantizers. Steps 210-214 are repeated iteratively until the bit target is sufficiently satisfied.
  • the quantization level selected for the region of interest (QP2 in Fig. 1) is preferably first decreased.
  • the quantization level selected for the background region is preferably first increased.
  • this increase in QP0 may result in the need to increase the quantizer assigned to the transition region (QP1), which may in turn make it necessary to increase the quantizer assigned to the foreground region of interest (QP2).
  • the quantizer selection algorithm of the present invention may be unable to match sufficiently the frame's bit target.
  • the frame rate may not be very significant. As such, variations can be allowed in the frame-level bit expenditure, and/or conformance to channel requirements can be achieved by varying the instantaneous frame rate.
  • the present invention has been described in the context of a multi-pass encoding strategy that assigns different quantizer step sizes to different regions of an image while meeting a frame-level bit target and ensuring spatial and temporal smoothness in frame quality. This results in improved visual quality.
  • because the scheme is computationally intensive, it may not be usable in real-time applications.
  • the invention can also be implemented as a real-time "pseudo-multi-pass" scheme based on modeling the rate-distortion curves at different quantization parameters.
  • in this scheme, the number of bits required to encode a macroblock is modeled according to the following equation:
  • $R_q = \frac{X_q \cdot S^{(1 + Q/Q_d)}}{Q}$
  • where R_q is the number of bits required to code a macroblock using quantization parameter Q; X_q is the model constant at Q; S is the distortion of the macroblock; and Q_d is the model coefficient in the exponent of S.
  • the big skip between an I frame and the following P frame is used to initialize the model.
  • a P frame is used in this interval (but not coded) to calculate initial model parameters by encoding all macroblocks at all possible values of Q.
  • This model is constantly updated as the sequence is coded. As such, the model adapts very well to scene content.
  • the frame level rate control provides a frame-level bit target.
  • the quantization parameters are selected for the different regions. The important region is given a quantizer of QP-2, the transition region is given QP, and the background is given a quantizer of (QP+2). This ensures near-transmittability of the quantization parameters (DQUANTs).
  • the value of QP that comes closest to the frame-level bit target is chosen.
  • the present invention provides the twin advantages of frame-level rate control and the ability to adapt the quantizer to reflect the importance of the region, while maintaining spatial and temporal smoothness of the quantizer.
  • the present invention enables a video compression algorithm to meet a frame-level bit target, while ensuring spatial and temporal smoothness in frame quality, thus resulting in improved visual perception during playback.
  • quantization level corresponds to a specified quantizer parameter that is used to quantize each transform coefficient
  • the present invention can also be implemented in alternative embodiments, such as those in which quantization level corresponds to a quantization table in which each transform coefficient in a block of coefficients is assigned its own, possibly different quantizer value.
  • the present invention can be embodied in the form of methods and apparatuses for practicing those methods.
  • the present invention can also be embodied in the form of program code embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
  • the present invention can also be embodied in the form of program code, for example, whether stored in a storage medium, loaded into and/or executed by a machine, or transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
  • when implemented on a general-purpose processor, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits.
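
The real-time "pseudo-multi-pass" scheme in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the garbled operator in the rate equation is read as an exponent, giving R_q = X_q * S^(1 + Q/Q_d) / Q; the quantizer range 1..31, the candidate range for QP, the region labels "R"/"T"/"B", and every numeric value below are hypothetical and not taken from the patent. Model initialization and the running update of X_q are not shown.

```python
# Sketch of the model-based quantizer choice: the region of interest gets QP-2,
# the transition region QP, and the background QP+2, and the QP whose predicted
# frame size is closest to the bit target is selected.

OFFSETS = {"R": -2, "T": 0, "B": 2}  # region label -> offset from the base QP


def predicted_bits(q, s, x_q, q_d):
    """Predicted bits for one macroblock: X_q * S**(1 + Q/Q_d) / Q."""
    return x_q * s ** (1.0 + q / q_d) / q


def clamp_q(q, lo=1, hi=31):
    """Keep a quantizer inside an assumed legal range of 1..31."""
    return max(lo, min(hi, q))


def frame_bits(qp, macroblocks, x_by_q, q_d):
    """Sum the predicted bits over (region_label, distortion) pairs."""
    total = 0.0
    for label, s in macroblocks:
        q = clamp_q(qp + OFFSETS[label])
        total += predicted_bits(q, s, x_by_q[q], q_d)
    return total


def choose_qp(macroblocks, bit_target, x_by_q, q_d, candidates=range(3, 30)):
    """Return the candidate QP whose predicted frame size is nearest the target."""
    return min(
        candidates,
        key=lambda qp: abs(frame_bits(qp, macroblocks, x_by_q, q_d) - bit_target),
    )


# Made-up frame: 12 ROI, 14 transition and 73 background macroblocks.
macroblocks = [("R", 6.0)] * 12 + [("T", 4.0)] * 14 + [("B", 2.0)] * 73
x_by_q = {q: 60.0 for q in range(1, 32)}  # hypothetical model constants
q_d = 64.0                                # hypothetical model coefficient
bit_target = 1500.0

qp = choose_qp(macroblocks, bit_target, x_by_q, q_d)
print("base QP:", qp, "predicted bits:", round(frame_bits(qp, macroblocks, x_by_q, q_d)))
```

Because every candidate is evaluated through the model rather than by a real encoding pass, the cost is one model evaluation per macroblock per candidate, which is what makes the approach plausible for real-time use.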

Abstract

An image is divided into one or more (e.g., foreground) regions of interest with transition regions defined between each region of interest and the relatively least-important (e.g., background) region. Each region is encoded using a single selected quantization level, where quantizer values can differ between different regions. In general, in order to optimize video quality while still meeting target bit allocations, the quantizer assigned to a region of interest is preferably lower than the quantizer assigned to the corresponding transition region, which is itself preferably lower than the quantizer assigned to the background region. The present invention can be implemented iteratively to adjust the quantizer values as needed to meet the frame's specified bit target. The present invention can also be implemented using a non-iterative scheme that can be more easily implemented in real time. The present invention enables a video compression algorithm to meet a frame-level bit target, while ensuring spatial and temporal smoothness in frame quality, thus resulting in improved visual perception during playback.
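
The iterative variant mentioned in the abstract (encode, compare against the bit target, adjust, re-encode) can be sketched as below. This is a minimal illustration, not the patented procedure: `toy_encode_bits` is a hypothetical stand-in for a real encoding pass, the adjustment policy (raise the background quantizer first when over budget, lower the region-of-interest quantizer first when under budget, then pull the other regions along so neighbouring tiers differ by at most 2) is one plausible reading of the text, and all numbers are made up.

```python
MAX_DELTA = 2          # assumed limit on the quantizer gap between adjacent tiers
QP_MIN, QP_MAX = 1, 31  # assumed legal quantizer range


def toy_encode_bits(qps, mb_counts, complexity=4000.0):
    """Hypothetical encoder model: each macroblock costs complexity / QP bits."""
    return sum(n * complexity / qp for qp, n in zip(qps, mb_counts))


def adjust(qps, over_budget):
    """Move one step toward the target while keeping tier gaps <= MAX_DELTA."""
    roi, trans, back = qps
    if over_budget:
        back = min(QP_MAX, back + 1)           # coarsen the background first
        trans = max(trans, back - MAX_DELTA)   # pull the transition region along
        roi = max(roi, trans - MAX_DELTA)      # and, if needed, the ROI
    else:
        roi = max(QP_MIN, roi - 1)             # refine the ROI first
        trans = min(trans, roi + MAX_DELTA)
        back = min(back, trans + MAX_DELTA)
    return roi, trans, back


def select_quantizers(initial_qps, mb_counts, bit_target, tolerance=0.05, max_passes=50):
    """Re-encode until the bit count is within tolerance of the target."""
    qps = tuple(initial_qps)
    for _ in range(max_passes):
        bits = toy_encode_bits(qps, mb_counts)
        if abs(bits - bit_target) <= tolerance * bit_target:
            return qps, bits, True
        new_qps = adjust(qps, over_budget=bits > bit_target)
        if new_qps == qps:   # quantizers are pinned at a limit; give up
            break
        qps = new_qps
    return qps, toy_encode_bits(qps, mb_counts), False


# (ROI, transition, background) macroblock counts for a made-up 99-macroblock frame.
qps, bits, met = select_quantizers((8, 10, 12), (12, 14, 73), bit_target=30000.0)
print(qps, round(bits), met)
```

The `max_passes` guard and the early exit reflect the possibility that no quantizer assignment meets the target, in which case the caller has to accept a miss or change the frame rate.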

Description

INTRA-FRAME QUANTIZER SELECTION FOR VIDEO COMPRESSION
BACKGROUND OF THE INVENTION
Field of the Invention
The present invention relates to image processing, and, in particular, to video compression.
Cross-Reference to Related Applications
This application claims the benefit of the filing date of U.S. provisional application no. 60/100,939, filed on 09/18/98 as attorney docket no. SAR 12728PROV.
Description of the Related Art
The goal of video compression processing is to encode image data to reduce the number of bits used to represent a sequence of video images while maintaining an acceptable level of quality in the decoded video sequence. This goal is particularly important in certain applications, such as videophone or video conferencing over POTS (plain old telephone service) or ISDN (integrated services digital network) lines, where the existence of limited transmission bandwidth requires careful control over the bit rate, that is, the number of bits used to encode each image in the video sequence.
Furthermore, in order to satisfy the transmission and other processing requirements of a video conferencing system, it is often desirable to have a relatively steady flow of bits in the encoded video bitstream.
Achieving a relatively uniform bit rate can be very difficult, especially for video compression algorithms that encode different images within a video sequence using different compression techniques. Depending on the video compression algorithm, images may be designated as the following different types of frames for compression processing:
o An intra (I) frame, which is encoded using only intra-frame compression techniques;
o A predicted (P) frame, which is encoded using inter-frame compression techniques based on a previous I or P frame, and which can itself be used as a reference frame to encode one or more other frames;
o A bi-directional (B) frame, which is encoded using bi-directional inter-frame compression techniques based on a previous I or P frame and a subsequent I or P frame, and which cannot be used to encode another frame; and
o A PB frame, which corresponds to two images (a P frame and a B frame in between the P frame and the previous I/P frame) that are encoded as a single frame (as in the H.263 video compression algorithm).
Depending on the actual image data to be encoded, these different types of frames typically require different numbers of bits to encode. For example, I frames typically require the greatest number of bits, while B frames typically require the least number of bits.
In a typical transform-based video compression algorithm, a block-based transform, such as a discrete cosine transform (DCT), is applied to blocks of image data corresponding either to pixel values or pixel differences generated, for example, based on a motion-compensated inter-frame differencing scheme. The resulting transform coefficients for each block are then quantized for subsequent encoding (e.g., run-length encoding followed by variable-length encoding). The degree to which the transform coefficients are quantized directly affects both the number of bits used to represent the image data and the quality of the resulting decoded image. This degree of quantization is also referred to as the quantization level, which is often represented by a specified quantizer value that is used to quantize the transform coefficients. In general, higher quantization levels imply fewer bits and lower quality. As such, the quantizer is often used as the primary variable for controlling the tradeoff between bit rate and image quality.
Visual quality of video depends not only on global measures (like peak signal-to-noise ratio (PSNR)), but also on how the error is distributed in space and time. Thus, it is important to maintain smoothness of the quantizer (which is closely related to the local distortion) across the picture. In fact, in many scenes, the ideal quantizer selection is a uniform value across the scene. However, such a scheme will not support the moving of bits to a region of interest from less-important regions, and furthermore, will provide very little control over the bits used to encode the picture. Thus, it cannot be used in constant (or near-constant) bit-rate applications (like videophone and video-conferencing over POTS or ISDN).
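
The tradeoff described above (a higher quantization level gives fewer bits and lower quality) can be seen in a small numeric sketch. The quantizer here is a simplified uniform one with step size 2*QP and truncation toward zero, which is an assumption for illustration and not the exact rule of any particular standard; the coefficient values are made up, and the count of nonzero levels is used only as a rough proxy for the bits the entropy coder would spend.

```python
def quantize(coeffs, qp):
    """Map each coefficient to an integer level using step size 2*qp."""
    step = 2 * qp
    return [int(c / step) for c in coeffs]  # int() truncates toward zero


def dequantize(levels, qp):
    """Reconstruct approximate coefficients from the integer levels."""
    step = 2 * qp
    return [level * step for level in levels]


def nonzero_count(levels):
    """Number of levels that still have to be entropy coded."""
    return sum(1 for level in levels if level != 0)


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Made-up transform coefficients for one block, largest magnitudes first.
coeffs = [312.0, -95.0, 41.0, -18.0, 9.0, -6.0, 3.0, 1.0]

for qp in (2, 8, 16):
    levels = quantize(coeffs, qp)
    recon = dequantize(levels, qp)
    print(f"QP={qp:2d}  nonzero levels={nonzero_count(levels)}  MSE={mse(coeffs, recon):.3f}")
```

As QP grows, more coefficients collapse to zero, which shortens the run-length and variable-length codes, while the reconstruction error rises.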
The other possibility is to vary the quantizer from macroblock to macroblock within the constraints of the coding standard being used (for example, in H.263, the quantizer level can change by a value of at most 2 in either direction). Examples of such schemes are given in the H.263+ TMN8 (Test Model Near-Term 8) and TMN9 documents (see, e.g., ITU - Telecommunications Standardization Sector, "Video Codec Test Model, Near-Term, Version 9 (TMN9)", Document Q15-C-15, December 1997). In these schemes, while the frame-level bit target can be accurately met, there are many, possibly large, quantizer changes, both spatially and temporally, which show up in the moving video as annoying artifacts.
SUMMARY OF THE INVENTION
As described in the previous section, some video compression algorithms, such as H.263, allow the quantizers to vary from macroblock to macroblock within a frame, although such algorithms often limit the magnitude of change in quantization level between horizontally adjacent macroblocks (e.g., a maximum change of ±2 levels). In an application with limited bandwidth, this ability to vary the quantization level within a frame enables the video compression processing to judiciously allocate the available number of bits for encoding different regions of a frame differently, for example, allocating more bits (i.e., a lower quantization level) to specific regions of interest (ROI). For example, in the classic videophone or video-conferencing paradigm in which the foreground consists of a talking head centered on a relatively constant background, it may be advantageous to assign lower quantization levels to the foreground ROI than to the less-important background in order to satisfy the bit rate requirements while optimizing video quality.
The present invention is directed to a scheme for selecting quantizers for use in encoding frames having one or more regions of interest. According to one embodiment, the present invention is a method for processing image data, comprising the steps of: (a) identifying one or more sets of image data corresponding to a region of interest in an image; (b) identifying one or more sets of image data corresponding to a transition region in the image located between the region of interest and a least-important region in the image; (c) selecting a first quantization level for each set of image data in the region of interest; (d) selecting a second quantization level for each set of image data in the transition region; (e) selecting a third quantization level for each set of image data in the least-important region; and (f) encoding the image based on the selected first, second, and third quantization levels.
BRIEF DESCRIPTION OF THE DRAWINGS
Other aspects, features, and advantages of the present invention will become more fully apparent from the following detailed description, the appended claims, and the accompanying drawings in which:
Fig. 1 shows an example of a typical image that can be encoded using the present invention; and
Fig. 2 shows a flow diagram of the image processing implemented according to one embodiment of the present invention for an image, such as the image of Fig. 1.
DETAILED DESCRIPTION
Fig. 1 shows an example of a typical image 100 that can be encoded using the present invention. The image in Fig. 1 consists of the head and shoulders of a person 102 positioned in front of background imagery 104, where the image data corresponding to the head of person 102 varies more in time (i.e., from frame to frame) than the background imagery. Such a scene is typical of videophone and video-conferencing applications. In general, during playback, the person in the foreground is of greater importance to the viewer of image 100 than the background imagery. According to the present invention, when bit rate is limited, image 100 is encoded such that, during playback, the video quality of the more-important foreground imagery is greater than the video quality of the less-important background imagery. This variation in playback video quality within an image is achieved by allowing the quantizers used to encode the macroblocks of image 100 during the video compression processing to vary within the image. According to the present invention, the selection of quantizers follows a particular scheme, described as follows.
As shown in Fig. 1, image 100 is divided into three different regions: a foreground region 106 (also referred to as a region of interest (ROI)) consisting of those macroblocks corresponding to the head of person 102, a background region 108 (also referred to as the least-important region) consisting of macroblocks corresponding to background imagery 104 (including the shoulders of person 102), and a transition region 110 consisting of macroblocks located between the foreground region and the background region. According to the present invention, all of the macroblocks corresponding to the foreground region 106 are encoded using the same quantizer QP2, all of the macroblocks corresponding to the background region 108 are encoded using the same quantizer QP0, and all of the macroblocks corresponding to the transition region 110 are encoded using the same quantizer QP1, where, typically, QP0 >= QP1 >= QP2.
Fig. 2 shows a flow diagram of the image processing implemented according to one embodiment of the present invention for an image, such as image 100 of Fig. 1. The present invention is typically implemented by a video processor that performs various conventional image processing routines, such as motion estimation, motion-compensated inter-frame differencing, transform application, quantization, run-length encoding, and variable-length encoding, as part of its overall video compression algorithm. Not all of this processing is shown in Fig. 2, which begins with the selection of a bit target for the present image (i.e., a desired number of bits to be used to encode the present image) based on a suitable bit rate scheme (step 202).
After selecting a bit target, the image is analyzed in step 204 to identify those macroblocks corresponding to one or more regions of interest (e.g., foreground region 106 corresponding to the head of person 102 in image 100 of Fig. 1). The analysis of step 204 is referred to as segmentation analysis, which, for purposes of the present invention, can be implemented using any suitable scheme, including automatic schemes or interactive schemes in which the regions of interest are explicitly identified by the user (e.g., a participant in a video-conference located either at the encoder or the decoder). After identifying the regions of interest, the macroblocks corresponding to one or more transition regions are then identified (step 206). In one embodiment, a macroblock is defined as being part of a transition region if it borders on at least one side a macroblock that is part of a region of interest identified in step 204. The rest of the macroblocks in the image are identified as being part of the least-important background region.
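By way of illustration only, the identification of transition-region macroblocks in step 206 might be sketched as follows. The sketch assumes that "borders on at least one side" means the four horizontal and vertical neighbors (diagonal neighbors are not counted), and that the segmentation analysis of step 204 has already produced a boolean mask; the function name and the label strings are hypothetical and are not part of the described embodiment.

```python
def classify_macroblocks(roi_mask):
    """Label each macroblock as 'roi', 'transition', or 'background'.

    roi_mask is a list of rows of booleans, True where a (hypothetical)
    segmentation step has marked the macroblock as region of interest.
    """
    rows, cols = len(roi_mask), len(roi_mask[0])
    labels = [["background"] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if roi_mask[r][c]:
                labels[r][c] = "roi"
                continue
            # Assumption: only the four side neighbors count as "bordering".
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and roi_mask[nr][nc]:
                    labels[r][c] = "transition"
                    break
    return labels
```

Macroblocks that are neither in a region of interest nor adjacent to one keep the default background label, which corresponds to the least-important region.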
In the example of image 100 in Fig. 1, there is only one region of interest 106, one corresponding transition region 110, and one background region 108. Note that a transition region need not necessarily be defined by a single contiguous set of macroblocks. The same is true for the background region. Depending on the particular image and the particular application, an image may have two or more different regions of interest and two or more different corresponding transition regions.
After identifying the macroblocks corresponding to the different regions, an initial quantization level is selected for each region (step 208). According to the present invention, the macroblocks of each region are encoded using a single uniform quantization level, where the quantization level may be different for the different regions. In general, when there are two or more different regions of interest and/or two or more different transition regions, the quantization level may differ between different regions of interest and/or between different transition regions, as long as the quantization level is constant within each particular region. For example, a first region of interest may be more important than a second region of interest. In that case, it might be desirable to assign a lower quantizer to the first region of interest than to the second region of interest. In any case, each region of interest will still have a corresponding transition region that is encoded using its own, possibly different, quantizer.
In one implementation, the initial quantization levels are selected based on information related to the previously encoded image in the video sequence. This initial selection of quantizers may be based on the previous frame's actual quantizer assignments and bit expenditure, as well as on comparison of the current bit target and the current motion-compensated distortion with those of the previous frame. For example, if the previous frame's bit expenditure was higher than the previous bit target or if the current bit target is lower than the previous bit target or if the current distortion is higher than the previous distortion, then the previous quantizer assignments may need to be increased for the initial selection for the current frame.
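The initial selection described above might, purely as a sketch, be expressed as follows. The text states only the conditions under which the previous assignments may need to be increased; the increment of one level, the choice to leave the quantizers unchanged otherwise, and all names are assumptions made for illustration. The bounds of 1 and 31 reflect the quantizer range of H.263.

```python
def initial_quantizers(prev_qps, prev_bits, prev_target, cur_target,
                       prev_distortion, cur_distortion,
                       step=1, qp_min=1, qp_max=31):
    """Derive starting quantizers for the current frame from the previous one.

    prev_qps maps a region name to the quantizer used in the previous frame.
    """
    needs_increase = (
        prev_bits > prev_target              # previous frame overshot its target
        or cur_target < prev_target          # fewer bits are available now
        or cur_distortion > prev_distortion  # current frame is harder to code
    )
    delta = step if needs_increase else 0    # assumed: otherwise reuse as-is
    return {region: min(qp_max, max(qp_min, qp + delta))
            for region, qp in prev_qps.items()}
```

The result serves only as a starting point; the iterative adjustment of steps 210-214 refines it.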
In order to avoid abrupt variations in quality between regions, it is desirable that the quantizer used to encode a transition region be fairly close to the quantizers used to encode both the corresponding foreground region of interest and the least-important background region. In some video compression algorithms, such as those conforming to the H.263 framework, the difference between horizontally adjacent quantizers is already constrained (e.g., never more than 2). Also, it is preferable that the quantizers actually increase from foreground to transition and from transition to background, so that the quality in the regions of interest can be optimized compared to the quality in the other regions. Note that transition regions frequently contain occlusions and artifacts surrounding the region of interest (like a talking head, for example) and using a lower quantizer here (as compared to the rest of the least-important region) can be expected to improve the overall visual quality of the video/image.
After selecting initial quantization levels for the various regions, the image is encoded using those quantizers (step 210). The number of bits used to encode the image is then compared with the bit target (step 212). If the number of bits used is sufficiently close to the bit target (e.g., within a specified tolerance), then processing is terminated. Otherwise, if the number of bits used is either too much smaller or too much greater than the bit target, then one or more of the quantizers are appropriately adjusted (step 214) and processing returns to step 210 to re-encode the image using the adjusted quantizers. Steps 210-214 are repeated iteratively until the bit target is sufficiently satisfied.
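The iteration of steps 210-214 can be sketched as a simple control loop. In the sketch below, the encoder and the adjustment rule are passed in as callables; `toy_encode` and `shift_all` are hypothetical stand-ins used only to make the example runnable, and the cap on the number of passes is an added safeguard that the text does not mention.

```python
def encode_to_target(encode, adjust, qps, bit_target, tolerance, max_passes=20):
    """Repeat encode / compare / adjust until the bit count is near the target."""
    bits = encode(qps)                                  # step 210
    for _ in range(max_passes):
        error = bits - bit_target                       # step 212
        if abs(error) <= tolerance:
            break
        qps = adjust(qps, too_many_bits=error > 0)      # step 214
        bits = encode(qps)                              # back to step 210
    return qps, bits


def toy_encode(qps):
    """Hypothetical bit count: fewer bits as each quantizer grows."""
    return sum(60000 // qp for qp in qps.values())


def shift_all(qps, too_many_bits):
    """Placeholder adjustment: move every quantizer by one level."""
    delta = 1 if too_many_bits else -1
    return {name: min(31, max(1, qp + delta)) for name, qp in qps.items()}


start = {"roi": 10, "transition": 12, "background": 13}
final_qps, final_bits = encode_to_target(toy_encode, shift_all, start,
                                         bit_target=14000, tolerance=500)
```

With these stand-ins the loop stops after a single adjustment. A real encoder would replace `toy_encode`, and the adjustment rule would follow the region-ordered scheme of the description rather than the uniform shift used here.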
If the number of bits used is too small relative to the bit target, then the quantization level selected for the region of interest (QP2 in Fig. 1) is preferably first decreased. Depending on the existing differences between the quantizers for the different regions and the constraints applied by the overall video compression algorithm related to the magnitude of quantization-level changes between horizontally adjacent macroblocks, it may also be necessary to decrease the quantizer assigned to the transition region (QP1), which may in turn make it necessary to decrease the quantizer assigned to the background region (QP0). For example, assume that initially QP2=10, QP1=12, and QP0=13, and that the maximum allowable quantizer change is 2. Assume further that the number of bits used to encode the image based on these quantizers is too low. In order to optimize video quality for the assigned bit target, it is desirable to decrease QP2 to 9. This change results in the need to decrease QP1 to 11 to avoid violating the maximum allowable horizontal quantizer change between macroblocks of 2. In this situation, QP0 will not have to be changed. However, if the number of bits is still too low, all three quantizers will have to be decremented on the next iteration.
Similarly, if the number of bits is too large relative to the bit target, then the quantization level selected for the background region (QP0) is preferably first increased. Here, too, depending on the situation, this increase in QP0 may result in the need to increase the quantizer assigned to the transition region (QP1), which may in turn make it necessary to increase the quantizer assigned to the foreground region of interest (QP2).
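The adjustment order described in the two preceding paragraphs can be illustrated with the worked example given there. The sketch below assumes a change of one level per iteration (as in the example, where QP2 goes from 10 to 9), a maximum allowable difference of 2 between adjacent regions, and the H.263 quantizer range of 1 to 31; the function names are hypothetical.

```python
MAX_STEP = 2  # assumed maximum quantizer difference between adjacent regions


def add_bits(qp2, qp1, qp0, max_step=MAX_STEP, qp_min=1):
    """Too few bits used: lower the ROI quantizer first, then propagate outward."""
    qp2 = max(qp_min, qp2 - 1)
    qp1 = min(qp1, qp2 + max_step)  # keep transition within max_step of the ROI
    qp0 = min(qp0, qp1 + max_step)  # keep background within max_step of transition
    return qp2, qp1, qp0


def remove_bits(qp2, qp1, qp0, max_step=MAX_STEP, qp_max=31):
    """Too many bits used: raise the background quantizer first, then propagate inward."""
    qp0 = min(qp_max, qp0 + 1)
    qp1 = max(qp1, qp0 - max_step)
    qp2 = max(qp2, qp1 - max_step)
    return qp2, qp1, qp0


first = add_bits(10, 12, 13)   # QP2 -> 9 forces QP1 -> 11; QP0 stays at 13
second = add_bits(*first)      # now all three quantizers are decremented
```

Starting from QP2=10, QP1=12, QP0=13, the first call yields (9, 11, 13) and the second yields (8, 10, 12), matching the example above.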
In general, when too few bits are used, it is desirable to add bits first to the foreground region of interest, and, when too many bits are used, it is desirable to remove bits first from the least-important background region. In low-activity scenes, the quantizer selection algorithm of the present invention may be unable to match the frame's bit target sufficiently. In many cases, especially in low-activity situations, the frame rate may not be very significant. As such, variations can be allowed in the frame-level bit expenditure, and/or conformance to channel requirements can be achieved by varying the instantaneous frame rate.
The present invention has been described in the context of a multi-pass encoding strategy that assigns different quantizer step sizes to different regions of an image while meeting a frame-level bit target and ensuring spatial and temporal smoothness in frame quality. This results in improved visual quality. However, because the scheme is computationally intensive, it may not be usable in real-time applications.
The invention can also be implemented as a real-time "pseudo-multi-pass" scheme based on modeling the rate-distortion curves at different quantization parameters. According to this scheme, the number of bits required to encode a macroblock is modeled according to the following equation:
$$R_q = \frac{X_q \, S^{(1 + Q/Q_d)}}{Q}$$
where:
R_q is the number of bits required to code a macroblock using quantization parameter Q;
X_q is the model constant at Q;
S is the distortion of the macroblock; and
Q_d is the model coefficient in the exponent of S.
The big skip between an I frame and the following P frame is used to initialize the model. A P frame is used in this interval (but not coded) to calculate initial model parameters by encoding all macroblocks at all possible values of Q. This model is constantly updated as the sequence is coded. As such, the model adapts very well to scene content. When encoding of a frame is begun, the frame-level rate control provides a frame-level bit target. Based on the above rate-distortion model, the quantization parameters are selected for the different regions. The important region is given a quantizer of (QP-2), the transition region is given QP, and the background is given a quantizer of (QP+2). This ensures near-transmittability of the quantization parameters (DQUANTs). The value of QP that comes closest to the frame-level bit target is chosen.
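As a non-authoritative sketch, the model-based selection just described might look as follows: for each candidate QP, the modeled bit counts of all macroblocks are summed using (QP-2), QP, and (QP+2) for the three regions, and the QP whose total is closest to the frame-level bit target is kept. The data layout (a list of region and distortion pairs, a table of X_q indexed by Q), the function names, and the quantizer range of 1 to 31 are assumptions made for illustration; how the model parameters are estimated and updated is not shown.

```python
def predicted_bits(q, s, x, q_d):
    """Modeled bits for one macroblock: X_q * S**(1 + Q/Q_d) / Q."""
    return x[q] * s ** (1 + q / q_d) / q


def select_qp(macroblocks, x, q_d, bit_target, qp_min=1, qp_max=31):
    """Pick the QP whose modeled frame total is closest to the bit target.

    macroblocks: list of (region, distortion) pairs, with region one of
    'roi', 'transition', 'background'. x maps each Q to its constant X_q.
    """
    offsets = {"roi": -2, "transition": 0, "background": 2}
    best_qp, best_err = None, None
    # Keep QP-2 and QP+2 inside the legal quantizer range.
    for qp in range(qp_min + 2, qp_max - 2 + 1):
        total = sum(predicted_bits(qp + offsets[region], s, x, q_d)
                    for region, s in macroblocks)
        err = abs(total - bit_target)
        if best_err is None or err < best_err:
            best_qp, best_err = qp, err
    return best_qp
```

Because the selection relies on the model rather than on repeated encoding, only one actual encoding pass per frame is needed, which is the sense in which the scheme is "pseudo-multi-pass".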
The present invention provides the twin advantages of frame-level rate control and the ability to adapt the quantizer to reflect the importance of the region, while maintaining spatial and temporal smoothness of the quantizer. As such, the present invention enables a video compression algorithm to meet a frame-level bit target, while ensuring spatial and temporal smoothness in frame quality, thus resulting in improved visual perception during playback.
Although the invention has been described in the context of the talking head paradigm of videophone and video-conferencing applications, the invention is also applicable for different kinds of schemes, preferably where the different regions are fairly contiguous.
Similarly, although the present invention has been described in the context of embodiments in which quantization level corresponds to a specified quantizer parameter that is used to quantize each transform coefficient, the present invention can also be implemented in alternative embodiments, such as those in which quantization level corresponds to a quantization table in which each transform coefficient in a block of coefficients is assigned its own, possibly different, quantizer value.
The present invention can be embodied in the form of methods and apparatuses for practicing those methods. The present invention can also be embodied in the form of program code embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. The present invention can also be embodied in the form of program code, for example, whether stored in a storage medium, loaded into and/or executed by a machine, or transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processor, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits.
It will be further understood that various changes in the details, materials, and arrangements of the parts which have been described and illustrated in order to explain the nature of this invention may be made by those skilled in the art without departing from the principle and scope of the invention as expressed in the following claims.

Claims

What is claimed is: 1. A method for processing image data, comprising the steps of: (a) identifying one or more sets of image data corresponding to a region of interest in an image; (b) identifying one or more sets of image data corresponding to a transition region in the image located between the region of interest and a least-important region in the image; (c) selecting a first quantization level for each set of image data in the region of interest; (d) selecting a second quantization level for each set of image data in the transition region; (e) selecting a third quantization level for each set of image data in the least-important region; and (f) encoding the image based on the selected first, second, and third quantization levels.
2. The invention of claim 1, wherein the first quantization level is lower than the second quantization level and the second quantization level is lower than the third quantization level.
3. The invention of claim 1, further comprising the steps of: (g) comparing the number of bits used to encode the image in step (f) to a bit target for the image; (h) adjusting one or more of the first, second, and third quantization levels in accordance with the comparison of step (g); and (i) re-encoding the image based on the adjusted quantization levels.
4. The invention of claim 3, wherein steps (g)-(i) are repeated until the number of bits used to encode the image is sufficiently close to the bit target.
5. The invention of claim 3, wherein: if the number of bits in step (g) is sufficiently low, then step (h) comprises the step of decreasing the first quantization level and, if appropriate, decreasing the second quantization level, and then, if appropriate, decreasing the third quantization level; and if the number of bits in step (g) is sufficiently high, then step (h) comprises the step of increasing the third quantization level and, if appropriate, increasing the second quantization level, and then, if appropriate, increasing the first quantization level.
6. The invention of claim 1, wherein the image has two or more regions of interest and each region of interest is assigned its own quantization level, which may differ between regions of interest.
7. The invention of claim 1, wherein magnitudes of differences between the first and second quantization levels and between the second and third quantization levels are within a specified limit.
8. The invention of claim 1, wherein the region of interest corresponds to a talking head and the least-important region corresponds to a relatively stationary background.
9. The invention of claim 1, wherein at least one of the first, second, and third quantization levels is selected based on modeling of rate-distortion curves at different quantization levels.
10. The invention of claim 9, wherein a number of bits used to encode a macroblock is modeled as follows: $$R_q = \frac{X_q \, S^{(1 + Q/Q_d)}}{Q}$$ where: R_q is a number of bits used to code a macroblock using quantization parameter Q; X_q is a model constant at Q; S is a distortion of the macroblock; and Q_d is a model coefficient in exponent of S.
PCT/US1999/021834 1998-09-18 1999-09-20 Intra-frame quantizer selection for video compression WO2000018131A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP99948365A EP1114558A1 (en) 1998-09-18 1999-09-20 Intra-frame quantizer selection for video compression
JP2000571666A JP2002525989A (en) 1998-09-18 1999-09-20 Intra-frame quantizer selection for video compression

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US10093998P 1998-09-18 1998-09-18
US60/100,939 1998-09-18
US09/212,025 US6256423B1 (en) 1998-09-18 1998-12-15 Intra-frame quantizer selection for video compression
US09/212,025 1998-12-15

Publications (1)

Publication Number Publication Date
WO2000018131A1 true WO2000018131A1 (en) 2000-03-30

Family

ID=26797722

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/021834 WO2000018131A1 (en) 1998-09-18 1999-09-20 Intra-frame quantizer selection for video compression

Country Status (5)

Country Link
US (1) US6256423B1 (en)
EP (1) EP1114558A1 (en)
JP (1) JP2002525989A (en)
KR (2) KR20000023278A (en)
WO (1) WO2000018131A1 (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000046999A1 (en) * 1999-02-03 2000-08-10 Sarnoff Corporation Quantizer selection based on region complexities derived using a rate distortion model
GB2371434A (en) * 2001-01-19 2002-07-24 Motorola Inc Encoding and transmitting video data
JP2002238060A (en) * 2001-02-07 2002-08-23 Sony Corp Image-coding method, image coder, program and recording medium
EP1248470A2 (en) * 2001-03-29 2002-10-09 Matsushita Electric Industrial Co., Ltd. Image coding equipment and image coding program
EP1250012A2 (en) * 2001-03-23 2002-10-16 Sharp Kabushiki Kaisha Adaptive quantization based on bit rate prediction and prediction error energy
JP2003116007A (en) * 2001-07-03 2003-04-18 Eastman Kodak Co Method for applying object content analysis for producing compressed bit stream from digital image
EP1313322A2 (en) * 2001-11-17 2003-05-21 LG Electronics Inc. Object-based bit rate control method and system thereof
JP2005515719A (en) * 2001-12-28 2005-05-26 ノキア コーポレイション Method and apparatus for selecting macroblock quantization parameters in a video encoder
EP1546994A2 (en) * 2002-07-29 2005-06-29 QUALCOMM Incorporated Digital image encoding
EP1711017A2 (en) * 2005-04-07 2006-10-11 British Broadcasting Corporation Compression encoding
EP1995967A1 (en) * 2006-03-16 2008-11-26 Huawei Technologies Co., Ltd. Method and apparatus for realizing adaptive quantization in encoding process
EP2127110A2 (en) * 2006-12-27 2009-12-02 General instrument Corporation Method and apparatus for bit rate reduction in video telephony
US7724972B2 (en) 2005-03-01 2010-05-25 Qualcomm Incorporated Quality metric-biased region-of-interest coding for video telephony
US7983497B2 (en) 2004-04-23 2011-07-19 Sumitomo Electric Industries, Ltd. Coding method for motion-image data, decoding method, terminal equipment executing these, and two-way interactive system
CN102170552A (en) * 2010-02-25 2011-08-31 株式会社理光 Video conference system and processing method used therein
EP2339851A3 (en) * 2005-09-22 2012-04-25 Qualcomm Incorporated Two pass rate control techniques for video coding using rate-distortion characteristics
US8379721B2 (en) 2005-09-22 2013-02-19 Qualcomm Incorported Two pass rate control techniques for video coding using a min-max approach
US8693537B2 (en) 2005-03-01 2014-04-08 Qualcomm Incorporated Region-of-interest coding with background skipping for video telephony
US8768084B2 (en) 2005-03-01 2014-07-01 Qualcomm Incorporated Region-of-interest coding in video telephony using RHO domain bit allocation
WO2015006176A1 (en) * 2013-07-10 2015-01-15 Microsoft Corporation Region-of-interest aware video coding
EP3192262A4 (en) * 2014-09-12 2018-08-01 TMM Inc. Systems and methods for subject-oriented compression

Families Citing this family (114)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001223941A (en) * 1999-12-01 2001-08-17 Ricoh Co Ltd Image pickup device and image pickup method
US6597736B1 (en) * 2000-03-29 2003-07-22 Cisco Technology, Inc. Throughput enhanced video communication
KR100353851B1 (en) * 2000-07-07 2002-09-28 한국전자통신연구원 Water ring scan apparatus and method, video coding/decoding apparatus and method using that
US8290062B1 (en) * 2000-09-27 2012-10-16 Intel Corporation Method and apparatus for manipulating MPEG video
KR20020032862A (en) * 2000-10-27 2002-05-04 신재섭 An object-based multimedia service system and a service method using a moving picture encoding
US7027655B2 (en) * 2001-03-29 2006-04-11 Electronics For Imaging, Inc. Digital image compression with spatially varying quality levels determined by identifying areas of interest
US6970513B1 (en) * 2001-06-05 2005-11-29 At&T Corp. System for content adaptive video decoding
US7773670B1 (en) 2001-06-05 2010-08-10 At+T Intellectual Property Ii, L.P. Method of content adaptive video encoding
US6909745B1 (en) 2001-06-05 2005-06-21 At&T Corp. Content adaptive video encoder
US6968006B1 (en) 2001-06-05 2005-11-22 At&T Corp. Method of content adaptive video decoding
US6810086B1 (en) 2001-06-05 2004-10-26 At&T Corp. System and method of filtering noise
KR100643454B1 (en) * 2001-11-17 2006-11-10 엘지전자 주식회사 Method for video data transmission control
KR100603592B1 (en) * 2001-11-26 2006-07-24 학교법인 고황재단 Intelligent Water ring scan apparatus and method using Quality Factor, video coding/decoding apparatus and method using that
US7167519B2 (en) * 2001-12-20 2007-01-23 Siemens Corporate Research, Inc. Real-time video object generation for smart cameras
KR100528324B1 (en) * 2002-08-30 2005-11-15 삼성전자주식회사 Apparatus and method for coding and decoding image using overlapped rectangular slices
KR100624404B1 (en) * 2002-01-05 2006-09-18 삼성전자주식회사 Adaptive coding method and apparatus considering human visual characteristics
DE10300048B4 (en) * 2002-01-05 2005-05-12 Samsung Electronics Co., Ltd., Suwon Image coding method for motion picture expert groups, involves image quantizing data in accordance with quantization parameter, and coding entropy of quantized image data using entropy coding unit
KR20030085336A (en) * 2002-04-30 2003-11-05 삼성전자주식회사 Image coding method and apparatus using chroma quantization considering human visual characteristics
US7936818B2 (en) 2002-07-01 2011-05-03 Arris Group, Inc. Efficient compression and transport of video over a network
KR100488019B1 (en) * 2002-12-12 2005-05-06 엘지전자 주식회사 Mobile station and operating method for thereof
KR100508119B1 (en) * 2003-03-14 2005-08-10 엘지전자 주식회사 Device and the Method for processing image data
US7580584B2 (en) * 2003-07-18 2009-08-25 Microsoft Corporation Adaptive multiple quantization
US10554985B2 (en) 2003-07-18 2020-02-04 Microsoft Technology Licensing, Llc DC coefficient signaling at small quantization step sizes
US8218624B2 (en) * 2003-07-18 2012-07-10 Microsoft Corporation Fractional quantization step sizes for high bit rates
US7738554B2 (en) 2003-07-18 2010-06-15 Microsoft Corporation DC coefficient signaling at small quantization step sizes
US7602851B2 (en) * 2003-07-18 2009-10-13 Microsoft Corporation Intelligent differential quantization of video coding
CN1655620B (en) * 2004-02-09 2010-09-22 三洋电机株式会社 Image display apparatus
US7801383B2 (en) * 2004-05-15 2010-09-21 Microsoft Corporation Embedded scalar quantizers with arbitrary dead-zone ratios
US8902971B2 (en) 2004-07-30 2014-12-02 Euclid Discoveries, Llc Video compression repository and model reuse
US9743078B2 (en) 2004-07-30 2017-08-22 Euclid Discoveries, Llc Standards-compliant model-based video encoding and decoding
US9532069B2 (en) 2004-07-30 2016-12-27 Euclid Discoveries, Llc Video compression repository and model reuse
US9578345B2 (en) 2005-03-31 2017-02-21 Euclid Discoveries, Llc Model-based video encoding and decoding
WO2008091483A2 (en) * 2007-01-23 2008-07-31 Euclid Discoveries, Llc Computer method and apparatus for processing image data
US20060062478A1 (en) * 2004-08-16 2006-03-23 Grandeye, Ltd., Region-sensitive compression of digital video
US8977063B2 (en) 2005-03-09 2015-03-10 Qualcomm Incorporated Region-of-interest extraction for video telephony
US8019175B2 (en) 2005-03-09 2011-09-13 Qualcomm Incorporated Region-of-interest processing for video telephony
WO2006106030A1 (en) * 2005-04-04 2006-10-12 Thomson Licensing Method for locally adjusting a quantization step
US8224102B2 (en) * 2005-04-08 2012-07-17 Agency For Science, Technology And Research Method for encoding a picture, computer program product and encoder
US7974193B2 (en) 2005-04-08 2011-07-05 Qualcomm Incorporated Methods and systems for resizing multimedia content based on quality and rate information
US8422546B2 (en) 2005-05-25 2013-04-16 Microsoft Corporation Adaptive video encoding using a perceptual model
US8345768B1 (en) * 2005-07-28 2013-01-01 Teradici Corporation Progressive block encoding using region analysis
US8208758B2 (en) * 2005-10-05 2012-06-26 Qualcomm Incorporated Video sensor-based automatic region-of-interest detection
US8019170B2 (en) * 2005-10-05 2011-09-13 Qualcomm, Incorporated Video frame motion-based automatic region-of-interest detection
US7577302B2 (en) * 2005-12-21 2009-08-18 Xerox Corporation Compressed image data enhancement
US8582905B2 (en) * 2006-01-31 2013-11-12 Qualcomm Incorporated Methods and systems for rate control within an encoding device
US20070201388A1 (en) * 2006-01-31 2007-08-30 Qualcomm Incorporated Methods and systems for resizing multimedia content based on quality and rate information
GB2435140B (en) * 2006-02-13 2011-04-06 Snell & Wilcox Ltd Sport action coding
US7974340B2 (en) 2006-04-07 2011-07-05 Microsoft Corporation Adaptive B-picture quantization control
US8059721B2 (en) 2006-04-07 2011-11-15 Microsoft Corporation Estimating sample-domain distortion in the transform domain with rounding compensation
US8503536B2 (en) 2006-04-07 2013-08-06 Microsoft Corporation Quantization adjustments for DC shift artifacts
US7995649B2 (en) 2006-04-07 2011-08-09 Microsoft Corporation Quantization adjustment based on texture level
US8130828B2 (en) 2006-04-07 2012-03-06 Microsoft Corporation Adjusting quantization to preserve non-zero AC coefficients
WO2007130425A2 (en) * 2006-05-01 2007-11-15 Georgia Tech Research Corporation Expert system and method for elastic encoding of video according to regions of interest
US8711925B2 (en) 2006-05-05 2014-04-29 Microsoft Corporation Flexible quantization
CN101507281B (en) * 2006-07-12 2013-06-05 诺基亚公司 Signaling of region-of-interest scalability information in media files
US7826671B2 (en) * 2006-11-21 2010-11-02 Samsung Electronics Co., Ltd. Method and system for quantization layer reduction in digital image processing
US8315466B2 (en) * 2006-12-22 2012-11-20 Qualcomm Incorporated Decoder-side region of interest video processing
US8553782B2 (en) * 2007-01-23 2013-10-08 Euclid Discoveries, Llc Object archival systems and methods
CN101622876B (en) 2007-01-23 2012-05-30 欧几里得发现有限责任公司 Systems and methods for providing personal video services
JP2008193410A (en) * 2007-02-05 2008-08-21 Matsushita Electric Ind Co Ltd Image encoding device, recording device, moving image encoding method, and moving image encoding program
US8238424B2 (en) 2007-02-09 2012-08-07 Microsoft Corporation Complexity-based adaptive preprocessing for multiple-pass video compression
US8498335B2 (en) 2007-03-26 2013-07-30 Microsoft Corporation Adaptive deadzone size adjustment in quantization
US8243797B2 (en) 2007-03-30 2012-08-14 Microsoft Corporation Regions of interest for quality adjustments
US8442337B2 (en) 2007-04-18 2013-05-14 Microsoft Corporation Encoding adjustments for animation content
US8331438B2 (en) 2007-06-05 2012-12-11 Microsoft Corporation Adaptive selection of picture-level quantization parameters for predicted video pictures
US8798148B2 (en) * 2007-06-15 2014-08-05 Physical Optics Corporation Apparatus and method employing pre-ATR-based real-time compression and video frame segmentation
JP4817260B2 (en) * 2007-07-18 2011-11-16 富士フイルム株式会社 Image processing apparatus, image processing method, and program
CN101102495B (en) * 2007-07-26 2010-04-07 武汉大学 A video image decoding and encoding method and device based on area
US8189933B2 (en) 2008-03-31 2012-05-29 Microsoft Corporation Classifying and controlling encoding quality for textured, dark smooth and smooth video content
US8897359B2 (en) 2008-06-03 2014-11-25 Microsoft Corporation Adaptive quantization for enhancement layer video coding
RU2518435C2 (en) 2008-07-20 2014-06-10 Долби Лэборетериз Лайсенсинг Корпорейшн Encoder optimisation in stereoscopic video delivery systems
US8570359B2 (en) * 2008-08-04 2013-10-29 Microsoft Corporation Video region of interest features
KR100930666B1 (en) * 2008-08-11 2009-12-09 삼성전자주식회사 Image decoding apparatus
US8325796B2 (en) * 2008-09-11 2012-12-04 Google Inc. System and method for video coding using adaptive segmentation
EP2345256B1 (en) * 2008-10-07 2018-03-14 Euclid Discoveries, LLC Feature-based video compression
KR100968266B1 (en) * 2009-10-28 2010-07-06 주식회사 인비전트 Controlling system for transmitting data of real time and method for transmitting data of real time
US20110122224A1 (en) * 2009-11-20 2011-05-26 Wang-He Lou Adaptive compression of background image (acbi) based on segmentation of three dimentional objects
US8356114B2 (en) 2010-04-15 2013-01-15 Canon Kabushiki Kaisha Region of interest-based image transfer
US8755441B2 (en) 2010-05-10 2014-06-17 Canon Kabushiki Kaisha Region of interest-based video transfer
WO2012050832A1 (en) 2010-09-28 2012-04-19 Google Inc. Systems and methods utilizing efficient video compression techniques for providing static image data
US8938001B1 (en) 2011-04-05 2015-01-20 Google Inc. Apparatus and method for coding using combinations
US9154799B2 (en) 2011-04-07 2015-10-06 Google Inc. Encoding and decoding motion via image segmentation
US8989256B2 (en) 2011-05-25 2015-03-24 Google Inc. Method and apparatus for using segmentation-based coding of prediction information
US8891616B1 (en) 2011-07-27 2014-11-18 Google Inc. Method and apparatus for entropy encoding based on encoding cost
US9247257B1 (en) 2011-11-30 2016-01-26 Google Inc. Segmentation based entropy encoding and decoding
JP5727398B2 (en) * 2012-01-26 2015-06-03 日本電信電話株式会社 Moving picture coding method, moving picture coding apparatus, and moving picture coding program
US9262670B2 (en) 2012-02-10 2016-02-16 Google Inc. Adaptive region of interest
US9094681B1 (en) 2012-02-28 2015-07-28 Google Inc. Adaptive segmentation
US11039138B1 (en) 2012-03-08 2021-06-15 Google Llc Adaptive coding of prediction modes using probability distributions
US9300984B1 (en) 2012-04-18 2016-03-29 Matrox Graphics Inc. Independent processing of data streams in codec
US10003802B1 (en) 2012-04-18 2018-06-19 Matrox Graphics Inc. Motion-based adaptive quantization
US10003803B1 (en) 2012-04-18 2018-06-19 Matrox Graphics Inc. Motion-based adaptive quantization
US9774856B1 (en) 2012-07-02 2017-09-26 Google Inc. Adaptive stochastic entropy coding
US9380298B1 (en) 2012-08-10 2016-06-28 Google Inc. Object-based intra-prediction
US9826229B2 (en) 2012-09-29 2017-11-21 Google Technology Holdings LLC Scan pattern determination from base layer pixel information for scalable extension
US9350988B1 (en) 2012-11-20 2016-05-24 Google Inc. Prediction mode-based block ordering in video coding
US9681128B1 (en) 2013-01-31 2017-06-13 Google Inc. Adaptive pre-transform scanning patterns for video and image compression
KR102088801B1 (en) 2013-03-07 2020-03-13 삼성전자주식회사 Method and apparatus for ROI coding using variable block size coding information
US9509998B1 (en) 2013-04-04 2016-11-29 Google Inc. Conditional predictive multi-symbol run-length coding
US10230950B2 (en) 2013-05-30 2019-03-12 Intel Corporation Bit-rate control for video coding using object-of-interest data
US9392288B2 (en) 2013-10-17 2016-07-12 Google Inc. Video coding using scatter-based scan tables
US9179151B2 (en) 2013-10-18 2015-11-03 Google Inc. Spatial proximity context entropy coding
US10091507B2 (en) 2014-03-10 2018-10-02 Euclid Discoveries, Llc Perceptual optimization for model-based video encoding
CA2942336A1 (en) 2014-03-10 2015-09-17 Euclid Discoveries, Llc Continuous block tracking for temporal prediction in video encoding
US10097851B2 (en) 2014-03-10 2018-10-09 Euclid Discoveries, Llc Perceptual optimization for model-based video encoding
US9392272B1 (en) 2014-06-02 2016-07-12 Google Inc. Video coding using adaptive source variance based partitioning
US9578324B1 (en) 2014-06-27 2017-02-21 Google Inc. Video coding using statistical-based spatially differentiated partitioning
US9402054B2 (en) * 2014-12-08 2016-07-26 Blue Jeans Network Provision of video conference services
JP6537396B2 (en) * 2015-08-03 2019-07-03 キヤノン株式会社 IMAGE PROCESSING APPARATUS, IMAGING APPARATUS, AND IMAGE PROCESSING METHOD
US10360695B1 (en) 2017-06-01 2019-07-23 Matrox Graphics Inc. Method and an apparatus for enabling ultra-low latency compression of a stream of pictures
US10904528B2 (en) * 2018-09-28 2021-01-26 Tencent America LLC Techniques for QP selection for 360 image and video coding
CN112929668A (en) * 2021-04-07 2021-06-08 百果园技术(新加坡)有限公司 Video coding method, device, equipment and storage medium
US20230110569A1 (en) * 2021-10-07 2023-04-13 Google Llc QP Range Specification For External Video Rate Control
CN114531599B (en) * 2022-04-25 2022-06-21 中国医学科学院阜外医院深圳医院(深圳市孙逸仙心血管医院) Image compression method for medical image storage

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0676899A2 (en) * 1994-04-06 1995-10-11 AT&T Corp. Audio-visual communication system having integrated perceptual speech and video coding
EP0703711A2 (en) * 1994-09-22 1996-03-27 Philips Patentverwaltung GmbH Video signal segmentation coder
EP0785689A2 (en) * 1996-01-22 1997-07-23 Lucent Technologies Inc. Global rate control for model-assisted video encoder
US5764803A (en) * 1996-04-03 1998-06-09 Lucent Technologies Inc. Motion-adaptive modelling of scene content for very low bit rate model-assisted coding of video sequences

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
DE3854846T2 (en) * 1987-06-25 1996-11-14 Nec Corp Coding an image signal taking into account the contrast in each image and decoding analogous to the coding
JPH0256187A (en) * 1988-08-22 1990-02-26 Matsushita Electric Ind Co Ltd Moving picture encoder
JP3805393B2 (en) * 1994-11-10 2006-08-02 株式会社東芝 Image encoding device
JPH08336135A (en) * 1995-04-07 1996-12-17 Sony Corp Device and method for image compression
KR100493854B1 (en) * 1995-10-26 2005-08-04 주식회사 팬택앤큐리텔 Quantization Method in Object-Centered Encoding
US5650860A (en) * 1995-12-26 1997-07-22 C-Cube Microsystems, Inc. Adaptive quantization
JPH09200764A (en) * 1996-01-16 1997-07-31 Canon Inc Encoder
US5969764A (en) * 1997-02-14 1999-10-19 Mitsubishi Electric Information Technology Center America, Inc. Adaptive video coding method

Non-Patent Citations (1)

Title
COENE W ET AL: "A FAST ROUTE FOR APPLICATION OF RATE-DISTORTION OPTIMAL QUANTIZATION IN AN MPEG VIDEO ENCODER", 16 September 1996, PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), US, NEW YORK, IEEE, PAGE(S) 825-828, ISBN: 0-7803-3259-8, XP000733350 *

Cited By (39)

Publication number Priority date Publication date Assignee Title
WO2000046999A1 (en) * 1999-02-03 2000-08-10 Sarnoff Corporation Quantizer selection based on region complexities derived using a rate distortion model
US6539124B2 (en) 1999-02-03 2003-03-25 Sarnoff Corporation Quantizer selection based on region complexities derived using a rate distortion model
GB2371434A (en) * 2001-01-19 2002-07-24 Motorola Inc Encoding and transmitting video data
EP1227684A3 (en) * 2001-01-19 2003-06-04 Motorola, Inc. Encoding of video signals
EP1227684A2 (en) * 2001-01-19 2002-07-31 Motorola, Inc. Encoding of video signals
JP2002238060A (en) * 2001-02-07 2002-08-23 Sony Corp Image-coding method, image coder, program and recording medium
EP1250012A3 (en) * 2001-03-23 2004-10-06 Sharp Kabushiki Kaisha Adaptive quantization based on bit rate prediction and prediction error energy
EP1250012A2 (en) * 2001-03-23 2002-10-16 Sharp Kabushiki Kaisha Adaptive quantization based on bit rate prediction and prediction error energy
EP1248470A3 (en) * 2001-03-29 2004-11-17 Matsushita Electric Industrial Co., Ltd. Image coding equipment and image coding program
EP1248470A2 (en) * 2001-03-29 2002-10-09 Matsushita Electric Industrial Co., Ltd. Image coding equipment and image coding program
US7305135B2 (en) 2001-03-29 2007-12-04 Matsushita Electric Industrial Co., Ltd. Image coding equipment and image coding program
JP2003116007A (en) * 2001-07-03 2003-04-18 Eastman Kodak Co Method for applying object content analysis for producing compressed bit stream from digital image
EP1313322A3 (en) * 2001-11-17 2004-09-01 LG Electronics Inc. Object-based bit rate control method and system thereof
EP1313322A2 (en) * 2001-11-17 2003-05-21 LG Electronics Inc. Object-based bit rate control method and system thereof
JP2005515719A (en) * 2001-12-28 2005-05-26 ノキア コーポレイション Method and apparatus for selecting macroblock quantization parameters in a video encoder
US8194987B2 (en) 2002-07-29 2012-06-05 Qualcomm Incorporated Digital image encoding
EP1546994A4 (en) * 2002-07-29 2009-08-12 Qualcomm Inc Digital image encoding
EP1546994A2 (en) * 2002-07-29 2005-06-29 QUALCOMM Incorporated Digital image encoding
US7856149B2 (en) 2002-07-29 2010-12-21 Qualcomm Incorporated Digital image encoding
US7983497B2 (en) 2004-04-23 2011-07-19 Sumitomo Electric Industries, Ltd. Coding method for motion-image data, decoding method, terminal equipment executing these, and two-way interactive system
US8768084B2 (en) 2005-03-01 2014-07-01 Qualcomm Incorporated Region-of-interest coding in video telephony using RHO domain bit allocation
US7724972B2 (en) 2005-03-01 2010-05-25 Qualcomm Incorporated Quality metric-biased region-of-interest coding for video telephony
US8693537B2 (en) 2005-03-01 2014-04-08 Qualcomm Incorporated Region-of-interest coding with background skipping for video telephony
EP1711017A2 (en) * 2005-04-07 2006-10-11 British Broadcasting Corporation Compression encoding
EP1711017A3 (en) * 2005-04-07 2008-06-25 British Broadcasting Corporation Compression encoding
EP2339851A3 (en) * 2005-09-22 2012-04-25 Qualcomm Incorporated Two pass rate control techniques for video coding using rate-distortion characteristics
US8379721B2 (en) 2005-09-22 2013-02-19 Qualcomm Incorporated Two pass rate control techniques for video coding using a min-max approach
EP1995967A1 (en) * 2006-03-16 2008-11-26 Huawei Technologies Co., Ltd. Method and apparatus for realizing adaptive quantization in encoding process
US8160374B2 (en) 2006-03-16 2012-04-17 Huawei Technologies Co., Ltd. Method and apparatus for realizing adaptive quantization in process of image coding
US8625917B2 (en) 2006-03-16 2014-01-07 Huawei Technologies Co., Ltd. Method and apparatus for realizing adaptive quantization in process of image coding
EP1995967A4 (en) * 2006-03-16 2009-11-11 Huawei Tech Co Ltd Method and apparatus for realizing adaptive quantization in encoding process
US9277215B2 (en) 2006-03-16 2016-03-01 Tsinghua University Method and apparatus for realizing adaptive quantization in process of image coding
EP2127110A4 (en) * 2006-12-27 2011-07-06 Gen Instrument Corp Method and apparatus for bit rate reduction in video telephony
EP2127110A2 (en) * 2006-12-27 2009-12-02 General Instrument Corporation Method and apparatus for bit rate reduction in video telephony
CN102170552A (en) * 2010-02-25 2011-08-31 株式会社理光 Video conference system and processing method used therein
WO2015006176A1 (en) * 2013-07-10 2015-01-15 Microsoft Corporation Region-of-interest aware video coding
US9167255B2 (en) 2013-07-10 2015-10-20 Microsoft Technology Licensing, Llc Region-of-interest aware video coding
US9516325B2 (en) 2013-07-10 2016-12-06 Microsoft Technology Licensing, Llc Region-of-interest aware video coding
EP3192262A4 (en) * 2014-09-12 2018-08-01 TMM Inc. Systems and methods for subject-oriented compression

Also Published As

Publication number Publication date
JP2002525989A (en) 2002-08-13
KR20000023278A (en) 2000-04-25
US6256423B1 (en) 2001-07-03
KR20060131699A (en) 2006-12-20
EP1114558A1 (en) 2001-07-11

Similar Documents

Publication Publication Date Title
US6256423B1 (en) Intra-frame quantizer selection for video compression
US6539124B2 (en) Quantizer selection based on region complexities derived using a rate distortion model
Ribas-Corbera et al. Rate control in DCT video coding for low-delay communications
JP4187405B2 (en) Object-based rate control apparatus and method in coding system
Sullivan et al. Rate-distortion optimization for video compression
US6125147A (en) Method and apparatus for reducing breathing artifacts in compressed video
JP4391809B2 (en) System and method for adaptively encoding a sequence of images
EP1675402A1 (en) Optimisation of a quantisation matrix for image and video coding
US7095784B2 (en) Method and apparatus for moving picture compression rate control using bit allocation with initial quantization step size estimation at picture level
US20060227868A1 (en) System and method of reduced-temporal-resolution update for video coding and quality control
JP4410245B2 (en) How to transcode video
WO1998037701A1 (en) Apparatus and method for optimizing the rate control in a coding system
JP4391810B2 (en) System and method for adaptively encoding a sequence of images
US7373004B2 (en) Apparatus for constant quality rate control in video compression and target bit allocator thereof
Ngan et al. Improved single-video-object rate control for MPEG-4
US7133448B2 (en) Method and apparatus for rate control in moving picture video compression
Yin et al. A rate control scheme for H.264 video under low bandwidth channel
EP1675405A1 (en) Optimisation of a quantisation matrix for image and video coding
EP1639832A1 (en) Method for preventing noise when coding macroblocks
Chen et al. Encoder Control Enhancement in HEVC Based on R-Lambda Coefficient Distribution
Jang et al. Adaptive rate control algorithm for DPCM/DCT hybrid video codec adopting bidirectional prediction
Choi et al. Adaptive image quantization using total variation classification
JP4292658B2 (en) Image information conversion apparatus and image information conversion method
Jiang Adaptive rate control for advanced video coding
Dong et al. Enhanced linear RQ model based rate control for H.264/AVC using context-adaptive parameter estimation

Legal Events

Date Code Title Description
AK Designated states. Kind code of ref document: A1; Designated state(s): BR CA CN IN JP
AL Designated countries for regional patents. Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE
121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
WWE WIPO information: entry into national phase. Ref document number: 1999948365; Country of ref document: EP
ENP Entry into the national phase. Ref country code: JP; Ref document number: 2000 571666; Kind code of ref document: A; Format of ref document f/p: F
WWP WIPO information: published in national office. Ref document number: 1999948365; Country of ref document: EP
WWW WIPO information: withdrawn in national office. Ref document number: 1999948365; Country of ref document: EP