US20060013317A1 - Method for encoding and decoding video information, a motion compensated video encoder and a coresponding decoder - Google Patents

Info

Publication number
US20060013317A1
Authority
US
United States
Prior art keywords
prediction
macroblock
motion
segmentation
blocks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/219,917
Inventor
Jani Lainema
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Oyj
Original Assignee
Nokia Oyj
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=24261115&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=US20060013317(A1) ("Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.)
Application filed by Nokia Oyj
Priority to US11/219,917
Publication of US20060013317A1
Assigned to NOKIA CORPORATION by merger (see document for details). Assignors: NOKIA MOBILE PHONES LTD.
Legal status: Abandoned

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
    • H04N19/19 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding, using optimisation based on Lagrange multipliers
    • H04N19/20 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/517 Processing of motion vectors by encoding
    • H04N19/537 Motion estimation other than block-based
    • H04N19/56 Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
    • H04N19/567 Motion estimation based on rate distortion criteria

Definitions

  • The present invention relates to video coding.
  • It relates to compression of video information using motion compensated prediction.
  • A video sequence typically consists of a large number of video frames, which are formed of a large number of pixels, each of which is represented by a set of digital bits. Because of the large number of pixels in a video frame and the large number of video frames even in a typical video sequence, the amount of data required to represent the video sequence quickly becomes large.
  • A video frame may include an array of 640 by 480 pixels, each pixel having an RGB (red, green, blue) color representation of eight bits per color component, totaling 7,372,800 bits per frame.
  • Another example is a QCIF (quarter common intermediate format) video frame including 176 × 144 pixels. QCIF provides an acceptably sharp image on small (a few square centimeters) LCD displays, which are typically available in mobile communication devices. Again, if the color of each pixel is represented using eight bits per color component, the total number of bits per frame is 608,256.
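  • As a non-limiting illustration, the two frame sizes quoted above can be checked with a few lines of Python. The function name `bits_per_frame` and its parameters are illustrative and do not appear in the specification.

```python
def bits_per_frame(width, height, components=3, bits_per_component=8):
    """Raw size of one frame when every pixel stores all color components."""
    return width * height * components * bits_per_component


# 640 x 480 RGB frame, eight bits per color component.
vga_bits = bits_per_frame(640, 480)

# 176 x 144 QCIF frame, eight bits per color component.
qcif_bits = bits_per_frame(176, 144)

print(vga_bits, qcif_bits)
```

  • Both values follow directly from width × height × 3 components × 8 bits.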
  • A video frame can be presented using a related luminance/chrominance model, known as the YUV color model.
  • The human visual system is more sensitive to intensity (luminance) variations than it is to color (chrominance) variations.
  • The YUV color model exploits this property by representing an image in terms of a luminance component Y and two chrominance components U, V, and by using a lower resolution for the chrominance components than for the luminance component. In this way the amount of information needed to code the color information in an image can be reduced with an acceptable reduction in image quality.
  • The lower resolution of the chrominance components is usually attained by spatial sub-sampling.
  • A block of 16 × 16 pixels in the image is coded by one block of 16 × 16 pixels representing the luminance information and by one block of 8 × 8 pixels for each chrominance component.
  • The chrominance components are thus sub-sampled by a factor of 2 in the x and y directions.
  • The resulting assembly of one 16 × 16 pixel luminance block and two 8 × 8 pixel chrominance blocks is here referred to as a YUV macroblock.
  • A QCIF image comprises 11 × 9 YUV macroblocks.
  • The number of bits needed to represent a video frame is thus 99 × 3072 = 304,128 bits.
  • The amount of data needed to transmit information about each pixel in each frame separately would thus be more than 4 Mbps (million bits per second).
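  • The macroblock arithmetic above can be reproduced as follows. This is only a sketch: eight bits per sample is taken from the earlier RGB examples, and the frame rate of 15 frames per second is an assumption made here to arrive at a bit rate, since this text does not state one.

```python
LUMA_SAMPLES = 16 * 16      # one 16 x 16 luminance block
CHROMA_SAMPLES = 8 * 8      # one 8 x 8 block per chrominance component
BITS_PER_SAMPLE = 8         # assumed, as in the RGB examples

# One YUV macroblock: Y block plus U and V blocks.
bits_per_macroblock = (LUMA_SAMPLES + 2 * CHROMA_SAMPLES) * BITS_PER_SAMPLE

# A 176 x 144 QCIF frame holds 11 x 9 macroblocks of 16 x 16 pixels.
macroblocks_per_frame = (176 // 16) * (144 // 16)
bits_per_frame = macroblocks_per_frame * bits_per_macroblock

# Assumed frame rate (not stated in the text).
FRAMES_PER_SECOND = 15
bits_per_second = bits_per_frame * FRAMES_PER_SECOND

print(bits_per_macroblock, macroblocks_per_frame, bits_per_frame, bits_per_second)
```

  • Under the assumed frame rate the uncompressed stream exceeds 4 Mbps, which is consistent with the figure given above.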
  • The transmission bit rates are typically multiples of 64 kilobits/s.
  • The available transmission bit rates can be as low as 20 kilobits/s. It is therefore evident that methods are required whereby the amount of information used to represent a video sequence can be reduced. Video coding tackles the problem of reducing the amount of information that needs to be transmitted in order to present the video sequence with an acceptable image quality.
  • In typical video sequences the change of the content of successive frames is to a great extent the result of the motion in the scene. This motion may be due to camera motion or due to motion of the objects present in the scene. Therefore, typical video sequences are characterized by significant temporal correlation, which is highest along the trajectory of the motion. Efficient compression of video sequences usually takes advantage of this property of video sequences.
  • Motion compensated prediction is a widely recognized technique for compression of video. It utilizes the fact that in a typical video sequence, image intensity/chrominance values in a particular frame segment can be predicted using image intensity/chrominance values of a segment in some other already coded and transmitted frame, given the motion trajectory between these two frames. Occasionally, it is advisable to transmit a frame that is coded without reference to any other frames, to prevent deterioration of image quality due to accumulation of errors and to provide additional functionality such as random access to the video sequence. Such a frame is called an INTRA frame.
  • A schematic diagram of an example video coding system using motion compensated prediction is shown in FIGS. 1 and 2 of the accompanying drawings.
  • FIG. 1 illustrates an encoder 10 employing motion compensation.
  • FIG. 2 illustrates a corresponding decoder 20.
  • The operating principle of video coders using motion compensation is to minimize the prediction error frame E_n(x,y), which is the difference between the current frame I_n(x,y) being coded and a prediction frame P_n(x,y).
  • The prediction frame P_n(x,y) is built using pixel values of a reference frame R_n(x,y), which is one of the previously coded and transmitted frames (for example, a frame preceding the current frame), and the motion of pixels between the current frame and the reference frame. More precisely, the prediction frame is constructed by finding the prediction pixels in the reference frame R_n(x,y) and moving the prediction pixels as the motion information specifies.
  • The motion of the pixels may be presented as the values of horizontal and vertical displacements Δx(x,y) and Δy(x,y) of a pixel at location (x,y) in the current frame I_n(x,y).
  • The pair of numbers [Δx(x,y), Δy(x,y)] is called the motion vector of this pixel.
  • The motion vectors [Δx(x,y), Δy(x,y)] are calculated in the Motion Field Estimation block 11 in the encoder 10.
  • The set of motion vectors of all pixels of the current frame [Δx(·), Δy(·)] is called the motion vector field. Due to the very large number of pixels in a frame it is not efficient to transmit a separate motion vector for each pixel to the decoder. Instead, in most video coding schemes the current frame is divided into larger image segments S_k and information about the segments is transmitted to the decoder.
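  • A minimal sketch of how a prediction frame and a prediction error frame might be formed from a reference frame and a motion vector field is given below. It assumes the convention P_n(x,y) = R_n(x + Δx(x,y), y + Δy(x,y)) with integer displacements and clamps coordinates at the frame border; the sign convention, the clamping and the function names are assumptions of this sketch, not statements of the specification.

```python
def predict_frame(reference, motion_field):
    """Build P_n by fetching, for each pixel, the displaced reference pixel."""
    height, width = len(reference), len(reference[0])
    prediction = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = motion_field[y][x]
            # Clamp to the frame so that vectors pointing outside stay valid.
            sx = min(max(x + dx, 0), width - 1)
            sy = min(max(y + dy, 0), height - 1)
            prediction[y][x] = reference[sy][sx]
    return prediction


def prediction_error(current, prediction):
    """E_n(x,y) = I_n(x,y) - P_n(x,y)."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(current, prediction)]


reference = [[10, 20, 30, 40],
             [50, 60, 70, 80],
             [90, 100, 110, 120]]

# The current frame is the reference shifted one pixel to the right.
current = [[row[max(x - 1, 0)] for x in range(4)] for row in reference]

# Every pixel therefore points one pixel to the left in the reference.
motion_field = [[(-1, 0)] * 4 for _ in range(3)]

prediction = predict_frame(reference, motion_field)
error = prediction_error(current, prediction)
print(error)
```

  • When the motion field describes the true motion, as in this toy case, the prediction error frame is zero everywhere, so only the motion information would need to be transmitted.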
  • Motion Field Coding refers to the process of representing the motion in a frame using some predetermined functions or, in other words, representing it with a model. Almost all of the motion vector field models commonly used are additive motion models.
  • Functions f_i and g_i are called motion field basis functions, and they are known both to the encoder and decoder.
  • An approximate motion vector field (Δ̃x(x,y), Δ̃y(x,y)) can be constructed using the coefficients and the basis functions.
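  • The formula of the additive motion model is not reproduced in this text. The form below is the usual way of writing such a model and is given only as a reconstruction; the summation limits N and M are notation assumed here, while a_i, b_i, f_i and g_i are the motion coefficients and basis functions named above.

```latex
\Delta x(x,y) = \sum_{i=0}^{N-1} a_i \, f_i(x,y), \qquad
\Delta y(x,y) = \sum_{i=0}^{M-1} b_i \, g_i(x,y)
```

  • Only the coefficients a_i and b_i need to be transmitted, since the basis functions are known to both the encoder and the decoder.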
  • The prediction error frame E_n(x,y) is typically compressed by representing it as a finite series (transform) of some 2-dimensional functions.
  • A 2-dimensional Discrete Cosine Transform can be used.
  • The transform coefficients related to each function are quantized and entropy coded before they are transmitted to the decoder (information stream 1 in FIGS. 1 and 2). Because of the error introduced by quantization, this operation usually produces some degradation in the prediction error frame E_n(x,y).
  • A motion compensated encoder comprises a Prediction Error Decoding block 15, where a decoded prediction error frame Ẽ_n(x,y) is constructed using the transform coefficients.
  • This decoded prediction error frame is added to the prediction frame P_n(x,y) and the resulting decoded current frame Ĩ_n(x,y) is stored in the Frame Memory 17 for further use as the next reference frame R_{n+1}(x,y).
  • The information stream 2 carrying information about the motion vectors is combined with information about the prediction error in the multiplexer 16, and an information stream (3) typically containing at least those two types of information is sent to the decoder 20.
  • The prediction frame P_n(x,y) is constructed in the Motion Compensated Prediction block 21 in the decoder 20 in the same way as in the Motion Compensated Prediction block 13 in the encoder 10.
  • The transmitted transform coefficients of the prediction error frame E_n(x,y) are used in the Prediction Error Decoding block 22 to construct the decoded prediction error frame Ẽ_n(x,y).
  • This decoded current frame may be stored in the Frame Memory 24 as the next reference frame R_{n+1}(x,y).
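  • The reconstruction loop described above may be sketched as follows. For brevity the sketch quantizes the prediction error samples directly with a uniform step instead of applying a 2-dimensional DCT and entropy coding, so it is a simplified stand-in for blocks 14, 15 and 22 rather than an implementation of them. All names and the step size are illustrative.

```python
def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def quantize(frame, step):
    """Map each sample to the nearest quantizer level (the lossy step)."""
    return [[round(v / step) for v in row] for row in frame]


def dequantize(levels, step):
    return [[level * step for level in row] for row in levels]


STEP = 4  # hypothetical quantizer step size

current = [[52, 55, 61], [63, 59, 55]]      # I_n
prediction = [[50, 54, 60], [60, 60, 60]]   # P_n

error = subtract(current, prediction)        # E_n
levels = quantize(error, STEP)               # what would be transmitted

# Encoder side: decode its own output to obtain the next reference frame.
encoder_reconstruction = add(prediction, dequantize(levels, STEP))

# Decoder side: the same operations on the received levels.
decoder_reconstruction = add(prediction, dequantize(levels, STEP))

print(encoder_reconstruction)
```

  • Because the encoder decodes its own quantized output, both sides hold the same reference frame for the next frame, and the reconstruction differs from the original only by the quantization error.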
  • The motion field is expressed as a sum of a prediction motion field and a refinement motion field.
  • The prediction motion field is constructed using the motion vectors associated with neighboring segments of the current frame.
  • The prediction is performed using the same set of rules and possibly some auxiliary information in both encoder and decoder.
  • The refinement motion field is coded, and the motion coefficients related to this refinement motion field are transmitted to the decoder. This approach typically results in savings in transmission bit rate.
  • The dashed lines in FIG. 1 illustrate some examples of the possible information some motion estimation and coding schemes may require in the Motion Field Estimation block 11 and in the Motion Field Coding block 12.
  • Polynomial motion models are a widely used family of models. (See, for example, H. Nguyen and E. Dubois, "Representation of motion information for image coding," in Proc. Picture Coding Symposium '90, Cambridge, Massachusetts, Mar. 26-18, 1990, pp. 841-845, and Centre de Morphologie Mathematique (CMM), "Segmentation algorithm by multicriteria region merging," Document SIM(95)19, COST 211ter Project Meeting, May 1995.)
  • Motion vectors are described by functions which are linear combinations of two-dimensional polynomial functions.
  • The translational motion model is the simplest model and requires only two coefficients to describe the motion vectors of each segment.
  • This model is widely used in various international standards (ISO MPEG-1, MPEG-2, MPEG-4, ITU-T Recommendations H.261 and H.263) to describe motion of 16 × 16 and 8 × 8 pixel blocks.
  • Systems utilizing a translational motion model typically perform motion estimation at full pixel resolution or some integer fraction of full pixel resolution, for example with an accuracy of 1/2 or 1/3 pixel resolution.
  • The affine motion model presents a very convenient trade-off between the number of motion coefficients and prediction performance. It is capable of representing some of the common real-life motion types such as translation, rotation, zoom and shear with only a few coefficients.
  • The quadratic motion model provides good prediction performance, but it is less popular in coding than the affine model, since it uses more motion coefficients, while the prediction performance is not substantially better than, for example, that of the affine motion model. Furthermore, it is computationally more costly to estimate the quadratic motion than to estimate the affine motion.
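  • The model formulas themselves are not reproduced in this text. The sketch below assumes the customary polynomial forms: one coefficient per component for the translational model, three per component for the affine model (basis 1, x, y) and six per component for the quadratic model (basis 1, x, y, xy, x², y²). The ordering of the basis functions is an assumption of this sketch.

```python
# Polynomial basis functions, ordered so that a prefix of the list gives
# the translational (1 term), affine (3 terms) and quadratic (6 terms) models.
BASIS = [
    lambda x, y: 1,
    lambda x, y: x,
    lambda x, y: y,
    lambda x, y: x * y,
    lambda x, y: x * x,
    lambda x, y: y * y,
]


def motion_vector(a, b, x, y):
    """Evaluate (dx, dy) at (x, y) for coefficient lists a and b."""
    dx = sum(coeff * f(x, y) for coeff, f in zip(a, BASIS))
    dy = sum(coeff * f(x, y) for coeff, f in zip(b, BASIS))
    return dx, dy


# Translational model: the same displacement everywhere in the segment.
translational = motion_vector([2], [-1], x=5, y=7)

# Affine model describing a zoom about the origin.
affine_zoom = motion_vector([0, 0.5, 0], [0, 0, 0.5], x=10, y=20)

print(translational, affine_zoom)
```

  • The number of coefficients passed in selects the model, which makes the trade-off between coefficient count and expressiveness described above easy to see.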
  • The Motion Field Estimation block 11 calculates initial motion coefficients a_0^i, ..., a_n^i, b_0^i, ..., b_n^i of [Δx(x,y), Δy(x,y)] for a given segment S_k, which initial motion coefficients minimize some measure of prediction error in the segment.
  • The motion field estimation uses the current frame I_n(x,y) and the reference frame R_n(x,y) as input values.
  • The Motion Field Estimation block outputs the initial motion coefficients of [Δx(x,y), Δy(x,y)] to the Motion Field Coding block 12.
  • The segmentation of the current frame into segments S_k can, for example, be carried out in such a way that each segment corresponds to a certain object moving in the video sequence, but this kind of segmentation is a very complex procedure.
  • A typical and computationally less complex way to segment a video frame is to divide it into macroblocks and to further divide the macroblocks into rectangular blocks.
  • The term macroblock refers generally to a part of a video frame.
  • An example of a macroblock is the previously described YUV macroblock.
  • FIG. 3 presents an example, where a video frame 30 is divided into macroblocks 31 having a certain number of pixels. Depending on the encoding method, there may be many possible macroblock segmentations.
  • In FIG. 3, macroblock 31A is segmented into blocks 32, macroblock 31B is segmented with a horizontal dividing line into blocks 33, and macroblock 31C is segmented with a vertical dividing line into blocks 34.
  • The fourth possible segmentation is to treat a macroblock as a single block.
  • The macroblock segmentations presented in FIG. 3 are given as examples; they are by no means an exhaustive listing of possible or feasible macroblock segmentations.
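  • By way of example only, the four segmentations mentioned above could be represented as lists of rectangles. The sketch assumes a 16 × 16 macroblock and assumes that blocks 32 are four equal quadrants; neither the size nor the number of blocks 32 is stated in this text, and the mode names are hypothetical.

```python
def segment_macroblock(mode, size=16):
    """Return the blocks of one macroblock as (x, y, width, height) tuples."""
    half = size // 2
    layouts = {
        # Whole macroblock treated as a single block.
        "single": [(0, 0, size, size)],
        # Horizontal dividing line: upper and lower block.
        "horizontal": [(0, 0, size, half), (0, half, size, half)],
        # Vertical dividing line: left and right block.
        "vertical": [(0, 0, half, size), (half, 0, half, size)],
        # Four quadrants (assumed layout of blocks 32).
        "quad": [(0, 0, half, half), (half, 0, half, half),
                 (0, half, half, half), (half, half, half, half)],
    }
    return layouts[mode]


for mode in ("single", "horizontal", "vertical", "quad"):
    print(mode, segment_macroblock(mode))
```

  • Each layout tiles the macroblock exactly once, so every pixel belongs to exactly one block regardless of the segmentation chosen.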
  • The Motion Field Coding block 12 makes the final decisions on what kind of motion vector field is transmitted to the decoder and how the motion vector field is coded. It can modify the segmentation of the current frame, the motion model and motion coefficients in order to minimize the amount of information needed to describe a satisfactory motion vector field.
  • The decision on segmentation is typically carried out by estimating a cost of each alternative macroblock segmentation and by choosing the one yielding the smallest cost.
  • The Lagrangian cost represents a trade-off between the quality of transmitted video information and the bandwidth needed in transmission.
  • A better image quality, i.e. small D(S_k), requires a larger amount of transmitted information, i.e. large R(S_k).
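  • The cost formula is not reproduced in this text. A Lagrangian cost is conventionally written in the form below, which is given here as a reconstruction: D(S_k) denotes the distortion of segment S_k, R(S_k) the number of bits needed to code it, and λ the Lagrange multiplier that weights rate against distortion.

```latex
L(S_k) = D(S_k) + \lambda \, R(S_k)
```

  • A larger λ penalizes rate more heavily and therefore favors segmentations that need fewer bits.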
  • Prediction motion coefficients are typically formed by calculating the median of surrounding, already transmitted motion coefficients. This method achieves fairly good performance in terms of efficient use of transmission bandwidth and image quality.
  • The main advantage of this method is that the prediction of motion coefficients is straightforward.
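  • A minimal sketch of median prediction for a translational model follows. It takes the component-wise median of the motion coefficients of three neighboring blocks; which neighbors are used is not stated here, so the choice of three neighbors and the sample values are assumptions.

```python
from statistics import median


def median_prediction(neighbor_coefficients):
    """Component-wise median of already transmitted (a0, b0) pairs."""
    a_values = [a for a, _ in neighbor_coefficients]
    b_values = [b for _, b in neighbor_coefficients]
    return median(a_values), median(b_values)


# Hypothetical coefficients of three already transmitted neighboring blocks.
neighbors = [(4, -2), (6, 0), (5, -3)]

predicted = median_prediction(neighbors)
print(predicted)
```

  • The decoder can repeat the same calculation from the coefficients it has already received, so no side information is needed to signal the prediction.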
  • The segment selected for the prediction is signaled to the decoder.
  • The main drawback of this method is that finding the best prediction candidate among the already transmitted image segments is a complex task: the encoder has to perform exhaustive calculations to evaluate all the possible prediction candidates and then select the best prediction block. This procedure has to be carried out separately for each block.
  • An object of the present invention is to provide a method that provides a flexible and versatile motion coefficient prediction for encoding/decoding video information using motion compensation.
  • A further object of the invention is to provide a motion compensated method for coding/decoding video information that provides good performance in terms of transmission bandwidth and image quality while being computationally fairly simple.
  • A further object is to present a method for encoding/decoding video information that provides satisfactory results when a comparatively simple motion model, such as the translational motion model, is used.
  • A method for encoding video information according to the invention comprises the steps of:
  • A piece of current video information is segmented into macroblocks.
  • These macroblocks can have any predetermined shape, but typically they are quadrilateral. Furthermore, a certain number of possible segmentations of the macroblocks into blocks is defined, and these are called the available macroblock segmentations.
  • The segmentation of a macroblock into blocks is in this description called macroblock segmentation.
  • The blocks are also typically quadrilateral.
  • The motion of a block within a piece of current video information is typically estimated using a piece of reference video information (typically a reference frame), and the motion of the block is usually modeled using a set of basis functions and motion coefficients.
  • The motion model used in a method according to the invention is advantageously a translational motion model, but there are no restrictions on the use of any other motion model.
  • At least some motion coefficients are represented as sums of prediction motion coefficients and difference motion coefficients, and a certain prediction method is used to determine the prediction motion coefficients.
  • A piece of current video information is encoded by segmenting a frame into macroblocks and then processing the macroblocks in a certain scanning order, for example one by one from left to right and top to bottom throughout the frame. In other words, in this example the encoding process is performed in rows, progressing from top to bottom.
  • The way in which the macroblocks are scanned is not restricted by the invention.
  • A macroblock may be segmented, and the motion field of blocks within a macroblock is estimated. Prediction motion coefficients for a certain block are produced using the motion coefficients of some of the blocks in the already processed neighboring macroblocks or the motion coefficients of some of the already processed blocks within the same macroblock.
  • The segmentation of the already processed macroblocks and the motion coefficients of the blocks relating to these macroblocks are already known.
  • A distinctive feature in encoding and decoding methods according to the invention is that for each macroblock segmentation there is a finite number of prediction methods. Certain predetermined allowable pairs of the macroblock segmentations and prediction methods are thus formed.
  • The term prediction method refers to two issues: firstly, it defines which blocks are used in producing the prediction motion coefficients for a certain block within a current macroblock and, secondly, it defines how the motion coefficients related to these prediction blocks are used in producing the prediction motion coefficients for said block.
  • A macroblock-segmentation—prediction-method pair indicates unambiguously both the segmentation of a macroblock and how the prediction motion coefficients for the blocks within the macroblock are produced.
  • The prediction method may specify, for example, that prediction motion coefficients for a block are derived from an average calculated using motion coefficients of certain specific prediction blocks, or that prediction motion coefficients for a block are derived from the motion coefficient of one particular prediction block.
  • The word average here refers to a characteristic describing a certain set of numbers; it may be, for example, an arithmetic mean, a geometric mean, a weighted mean, a median or a mode.
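  • The kinds of average listed above can be illustrated with Python's standard `statistics` module. The sample values and weights are hypothetical; note that a geometric mean is only defined for positive values, so it would not apply directly to motion coefficients of mixed sign.

```python
from statistics import geometric_mean, mean, median, mode


def weighted_mean(values, weights):
    """Mean in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical motion coefficients of four prediction blocks.
values = [2, 4, 4, 8]

averages = {
    "arithmetic mean": mean(values),
    "geometric mean": geometric_mean(values),
    "weighted mean": weighted_mean(values, [1, 2, 2, 1]),
    "median": median(values),
    "mode": mode(values),
}

for name, value in averages.items():
    print(name, value)
```

  • Any of these could serve as the average in a prediction method, provided the encoder and the decoder apply the same one.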
  • The prediction coefficients of a block are obtained by projecting motion coefficients or average motion coefficients from one block to another.
  • The complexity of the encoding process is reduced compared, for example, to an encoding process where the best prediction motion coefficient candidate is determined freely using any neighboring blocks or combinations thereof. In such a case, there is a large number of prediction motion coefficient candidates.
  • Because the prediction blocks are defined beforehand for each prediction method and there is a limited number of prediction methods per macroblock segmentation, it is possible to estimate the cost of each macroblock-segmentation—prediction-method pair. The pair minimizing the cost can then be selected.
  • By selecting the available prediction blocks and defining the macroblock-segmentation-specific prediction methods suitably, it is possible to implement a high performance video encoding method using at most three predetermined prediction blocks to produce prediction motion coefficients and allowing only one prediction method per macroblock segmentation. For each macroblock, the macroblock-segmentation—prediction-method pair minimizing a cost function is selected.
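  • The selection of the cost-minimizing pair may be sketched as below. The sketch assumes a Lagrangian cost of the form D + λR; the pair names, distortion values, bit counts and λ values are invented purely for illustration.

```python
def lagrangian_cost(distortion, rate, lam):
    """Cost that trades distortion against the number of bits spent."""
    return distortion + lam * rate


def select_pair(candidates, lam):
    """Return the candidate pair with the smallest Lagrangian cost."""
    return min(
        candidates,
        key=lambda c: lagrangian_cost(c["distortion"], c["rate"], lam),
    )


# Hypothetical measurements for three allowable pairs of one macroblock.
candidates = [
    {"segmentation": "single", "prediction": "median",
     "distortion": 900, "rate": 10},
    {"segmentation": "horizontal", "prediction": "left_and_up",
     "distortion": 600, "rate": 22},
    {"segmentation": "quad", "prediction": "median",
     "distortion": 400, "rate": 48},
]

for lam in (1, 10, 50):
    best = select_pair(candidates, lam)
    print(lam, best["segmentation"], best["prediction"])
```

  • Because the set of allowable pairs is small and fixed in advance, an exhaustive evaluation of this kind stays cheap, and a larger λ shifts the choice toward coarser segmentations.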
  • The simple adaptive encoding of motion information provided by the invention is efficient in terms of computation and in terms of the amount of transmitted information, and furthermore yields good image quality.
  • A macroblock which is processed in a method according to the invention may be, for example, the luminance component of a YUV macroblock.
  • A method according to the invention may also be applied, for example, to the luminance component and to one or both of the chrominance components of a YUV macroblock.
  • A method for decoding encoded video information according to the invention comprises the steps of:
  • A decoder for performing the decoding of encoded video information according to the invention comprises:
  • The invention also relates to a storage device and a network element comprising an encoder according to the invention and to a mobile station comprising an encoder or a decoder according to the invention.
  • FIG. 1 illustrates an encoder for motion compensated encoding of video according to prior art
  • FIG. 2 illustrates a decoder for motion compensated decoding of video according to prior art
  • FIG. 3 illustrates a segmentation of a video frame into macroblocks and blocks according to prior art
  • FIG. 4 illustrates a flowchart of a motion compensated video encoding method according to the invention
  • FIG. 5 illustrates a flowchart of a motion compensated video decoding method according to the invention
  • FIG. 6 illustrates various prediction methods that involve different prediction blocks and that can be used to provide prediction motion coefficients for a current block C in a method according to the invention
  • FIG. 7 illustrates a plurality of macroblock-segmentation—prediction-method pairs that can be used in a method according to a first preferred embodiment of the invention
  • FIG. 8 illustrates a plurality of macroblock-segmentation—prediction-method pairs that can be used in a method according to a second preferred embodiment of the invention
  • FIG. 9 illustrates a motion field estimation block and a motion field coding block according to the invention.
  • FIG. 10 illustrates a motion compensated prediction block according to the invention
  • FIG. 11 illustrates a mobile station according to the invention
  • FIG. 12 illustrates schematically a mobile telecommunication network comprising a network element according to the invention.
  • FIGS. 1-3 are discussed in detail in the description of motion compensated video encoding and decoding according to prior art.
  • FIG. 4 presents a flowchart of a method for encoding video information according to the invention. Only features related to motion encoding are presented in FIG. 4; it does not present, for example, the formation or coding of the prediction error frame. Typically these features are included in encoding methods according to the invention and, of course, may be implemented in any appropriate manner.
  • In step 401 the available macroblock segmentations are defined.
  • The available macroblock segmentations can comprise, for example, such macroblock segmentations as presented in FIG. 3.
  • In step 402 at least one prediction method for predicting motion coefficients is defined for each available macroblock segmentation, resulting in a certain number of available macroblock-segmentation—prediction-method pairs.
  • For some macroblock segmentations an average prediction method is used, and for other macroblock segmentations the prediction motion coefficients are derived from the motion coefficients of a single already processed block which is located either in the current macroblock or in one of the neighboring macroblocks.
  • Advantageous prediction methods related to each macroblock segmentation can be found, for example, by testing various prediction methods beforehand.
  • The motion model used to represent the motion field may affect the selection of the prediction methods.
  • It is possible that a suitable motion model is selected during the encoding.
  • Steps 401 and 402 are carried out offline, before encoding video streams. Usually they are carried out already when, for example, an encoder is designed and implemented.
  • Steps 403 - 413 are carried out for each frame of a video stream.
  • In step 403 a current video frame is segmented into macroblocks, and in step 404 encoding of a current macroblock, which is the macroblock currently undergoing motion compensated encoding, starts.
  • In step 405 the current macroblock is segmented into blocks using one of the available macroblock segmentations. At this point it is not necessarily known which macroblock segmentation is the most appropriate for the current macroblock, so one way to select the best macroblock segmentation is to investigate them all and then select the most appropriate according to some criterion.
  • In step 406 the motion vector fields of the blocks within the current macroblock are estimated and the motion fields are coded. This results in motion coefficients a i and b i for each of said blocks.
  • In step 407 prediction motion coefficients a ip and b ip for at least one of the blocks within the current macroblock are produced. If there is only one prediction method per macroblock segmentation, this is a straightforward task. Otherwise one of the prediction methods available for the current macroblock segmentation is selected and the prediction motion coefficients are derived according to this prediction method.
  • In step 408 the motion coefficients of the blocks within the current macroblock are represented as sums of the prediction motion coefficients and difference motion coefficients a id and b id .
  • In step 409 the cost L(S k ) related to the current macroblock-segmentation—prediction-method pair is calculated.
  • This cost represents the trade-off between the reconstruction error of the decoded image and the number of bits needed to transmit the encoded image, and it links a measure of the reconstruction error D(S k ) with a measure of bits needed for transmission R(S k ) using a Lagrangian multiplier λ.
  • the transmission R(S k ) refers to bits required to represent at least the difference motion coefficients and bits required to represent the associated prediction error. It may also involve some signaling information.
  • In step 410 the macroblock-segmentation—prediction-method pair yielding the smallest cost is selected.
  • In step 412 information indicating the selected macroblock-segmentation—prediction-method pair for the current macroblock and the difference motion coefficients a id and b id of at least one of the blocks within the current macroblock are transmitted to a receiver or stored into a storage medium.
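  • The exhaustive selection described in the steps above can be sketched as follows. This is only a minimal illustration: it assumes the customary additive form L = D + λ·R for the Lagrangian cost, and the names `PairResult`, `lagrangian_cost` and `select_pair`, as well as all distortion, rate and multiplier values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairResult:
    """Outcome of trial-encoding one macroblock with one candidate pair."""
    pair_id: str       # label of the segmentation and prediction-method pair
    distortion: float  # D(S_k): measure of the reconstruction error
    rate_bits: float   # R(S_k): bits for difference coefficients, prediction error, signaling


def lagrangian_cost(result, lam):
    # Assumed additive form: L(S_k) = D(S_k) + lambda * R(S_k)
    return result.distortion + lam * result.rate_bits


def select_pair(results, lam):
    """Return the candidate with the smallest Lagrangian cost."""
    return min(results, key=lambda r: lagrangian_cost(r, lam))


# Hypothetical trial results for three candidate pairs.
candidates = [
    PairResult("70A", distortion=900.0, rate_bits=20.0),
    PairResult("71B", distortion=600.0, rate_bits=45.0),
    PairResult("72B", distortion=650.0, rate_bits=38.0),
]
best = select_pair(candidates, lam=10.0)
```

  • With these made-up numbers a larger multiplier penalizes the bit count more heavily, so the unsegmented candidate wins once λ is raised far enough.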
  • the information indicating the selected macroblock-segmentation—prediction-method pair may, for example, indicate explicitly both the macroblock segmentation and the prediction method. If there is only one possible prediction method per macroblock segmentation, it can be enough to transmit information indicating only the macroblock segmentation of the current block.
  • In step 413 it is checked whether all the macroblocks within the current frame have been processed. If they have not, the processing of the next macroblock is started in step 404.
  • FIG. 5 presents a flowchart of a method for decoding an encoded video stream according to the invention.
  • In step 501 information about the available macroblock segmentations is specified, for example by retrieving the information from a memory element where it has been previously stored. The decoding method needs to know which kinds of macroblock segmentations a received encoded video stream can comprise.
  • In step 502 information about the available macroblock-segmentation—prediction-method pairs is specified. Steps 501 and 502 are typically carried out off-line, before receiving an encoded video stream. They may be carried out, for example, during the design or implementation of a decoder.
  • Steps 503 - 507 are carried out during decoding.
  • In step 503 information indicating the segmentation of a current macroblock and the prediction method is received. If there is only one available prediction method per macroblock segmentation, information indicating the prediction method is not needed, as previously explained.
  • In step 504 information indicating difference motion coefficients a id and b id for at least one of the blocks within the current macroblock is received.
  • In step 505 the decoding entity determines, using the information received in step 503 , the prediction method by which the prediction motion coefficients for blocks within the current macroblock are to be produced. The prediction method indicates the prediction blocks related to a certain block and how prediction coefficients for the current block are produced using the motion coefficients of the prediction blocks.
  • In step 506 the prediction motion coefficients a ip and b ip are produced, and in step 507 the motion coefficients a i and b i are produced using the difference motion coefficients and the prediction motion coefficients.
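  • The relationship between motion coefficients, prediction motion coefficients and difference motion coefficients can be illustrated with a short sketch. The function names and the coefficient values are hypothetical; the sketch only shows that the sum formed in the decoder recovers what the encoder started from.

```python
def difference_coefficients(coeffs, predicted):
    """Encoder side: difference coefficient = coefficient - prediction."""
    if len(coeffs) != len(predicted):
        raise ValueError("coefficient lists must have equal length")
    return [c - p for c, p in zip(coeffs, predicted)]


def reconstruct_coefficients(predicted, differences):
    """Decoder side: coefficient = prediction + difference coefficient."""
    if len(predicted) != len(differences):
        raise ValueError("coefficient lists must have equal length")
    return [p + d for p, d in zip(predicted, differences)]


a = [3, -2]       # hypothetical motion coefficients of one block
a_pred = [2, -2]  # hypothetical prediction motion coefficients
a_diff = difference_coefficients(a, a_pred)
a_decoded = reconstruct_coefficients(a_pred, a_diff)
```

  • When the prediction is good, the difference coefficients are small or zero, which is what makes them cheaper to transmit than the coefficients themselves.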
  • FIG. 6 presents schematically four different prediction methods 60 A, 60 B, 60 C and 60 D for providing prediction motion coefficients for a current block C.
  • These four prediction methods are given as examples of prediction methods that may be used in a method according to the invention, and the prediction blocks (i.e. those blocks that are used to form prediction motion coefficients for the current block) are defined according to their spatial relationship with the current block C.
  • the prediction blocks are dictated by certain pixel locations. These pixel locations are just one way of specifying the prediction blocks for a current block, and they are described here to aid the understanding of how the prediction blocks are selected in certain prediction methods. In the methods which are presented in FIG. 6 , the pixel locations are the same for all the methods.
  • Prediction block L is defined as the block which comprises the pixel location 61 .
  • Pixel location 61 is the uppermost pixel adjacent to block C from the left-hand side.
  • prediction block U is defined as the block comprising pixel location 62 , which is the leftmost pixel superjacent to block C.
  • prediction block UR is defined as the block comprising the pixel location 63 , which is the pixel corner to corner with the top right corner pixel of block C.
  • the prediction motion coefficients a ip , b ip provided for block C may be derived from an average of the motion coefficients of the L, U and UR prediction blocks. The average may be, for example, the median of the motion coefficient values of blocks L, U and UR.
  • the prediction motion coefficients are derived from the motion coefficients of prediction block L.
  • the prediction motion coefficients are derived from the motion coefficients of prediction block U and in the fourth prediction method they are derived from the motion coefficients of prediction block UR.
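  • The four example prediction methods can be sketched as below. The sketch assumes that the median is taken separately for each coefficient, and it assumes that labels 60B, 60C and 60D correspond to prediction blocks L, U and UR in that order; the function names and coefficient values are hypothetical.

```python
from statistics import median


def predict_median(l, u, ur):
    """Coefficient-wise median of the L, U and UR motion coefficients."""
    return [median(values) for values in zip(l, u, ur)]


def prediction_coefficients(method, l, u, ur):
    """Return prediction motion coefficients for the given method label."""
    if method == "60A":
        return predict_median(l, u, ur)
    single_block = {"60B": l, "60C": u, "60D": ur}  # assumed label order
    return list(single_block[method])


# Hypothetical coefficients (a, b) of the three prediction blocks.
L_coeffs, U_coeffs, UR_coeffs = [4, -1], [2, 0], [3, 5]
median_pred = prediction_coefficients("60A", L_coeffs, U_coeffs, UR_coeffs)
ur_pred = prediction_coefficients("60D", L_coeffs, U_coeffs, UR_coeffs)
```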
  • the segmentation of the neighboring macroblocks presented in FIG. 6 for prediction method 60 A is just an example.
  • the prediction blocks are defined by pixel locations as presented in FIG. 6
  • the prediction blocks can be determined unambiguously in spite of the macroblock segmentation of the neighboring macroblocks or of the current macroblock.
  • the three pixel locations in FIG. 6 are an example; the number of pixels can be different and they can be located at other places.
  • the pixel locations specifying the prediction blocks are associated with a current block C and they are at the edge of the current block C.
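  • How pixel locations identify the prediction blocks regardless of the neighboring segmentation can be sketched as follows. The sketch assumes a coordinate system with x growing to the right and y growing downwards, and blocks given as (x, y, width, height) rectangles; the function names and the example layout are hypothetical.

```python
def prediction_pixels(x0, y0, width):
    """Pixel locations 61, 62 and 63 for a block whose top-left pixel is (x0, y0)."""
    return {
        "L": (x0 - 1, y0),           # uppermost pixel adjacent from the left
        "U": (x0, y0 - 1),           # leftmost pixel directly above the block
        "UR": (x0 + width, y0 - 1),  # diagonal neighbor of the top right corner pixel
    }


def block_containing(pixel, blocks):
    """Return the name of the block that contains the pixel, or None."""
    px, py = pixel
    for name, (bx, by, bw, bh) in blocks.items():
        if bx <= px < bx + bw and by <= py < by + bh:
            return name
    return None


# Hypothetical neighborhood of a 16x16 current block at (16, 16).
# The macroblock above it is split by a vertical line into two blocks.
blocks = {
    "left": (0, 16, 16, 16),
    "up_a": (16, 0, 8, 16),
    "up_b": (24, 0, 8, 16),
    "up_right": (32, 0, 16, 16),
}
pixels = prediction_pixels(16, 16, 16)
found = {key: block_containing(p, blocks) for key, p in pixels.items()}
```

  • Because the lookup only asks which block covers a given pixel, the result is unambiguous however the neighboring macroblocks happen to be segmented.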
  • FIG. 7 illustrates schematically three macroblock segmentations 70 , 71 and 72 , which are an example of the available macroblock segmentations in a first preferred embodiment of the invention.
  • In macroblock segmentation 70 the rectangular macroblock is actually not segmented, but is treated as a single block.
  • In macroblock segmentation 71 the macroblock is divided with one vertical line into two rectangular blocks.
  • In macroblock segmentation 72 the macroblock is divided with one horizontal line into two rectangular blocks.
  • the macroblock size may be 16×16 pixels and a translational motion model, for example, may be used.
  • FIG. 7 furthermore illustrates some examples of prediction method alternatives related to the macroblock segmentations in a method according to the first preferred embodiment.
  • the prediction blocks for blocks within a current macroblock are specified using certain pixel locations which bear a spatial relationship to the blocks within the current macroblock.
  • the pixel locations in FIG. 7 are the same as in FIG. 6 .
  • the prediction coefficients for the single block that comprises the current macroblock can be derived using an average of the motion coefficients of the L, U and UR prediction blocks (macroblock-segmentation—prediction-method pair 70 A), or they can be derived from the motion coefficients of prediction block L (pair 70 B), prediction block U (pair 70 C) or prediction block UR (pair 70 D).
  • FIG. 7 also presents some prediction method alternatives for example macroblock segmentations 71 and 72 .
  • each block within a macroblock preferably has its own associated prediction blocks.
  • the blocks within the current macroblock, which are already processed, may themselves act as prediction blocks for other blocks within the same macroblock.
  • prediction motion coefficients for each block C 1 and C 2 within the current macroblock are derived from an average of the motion coefficients of the block-specific prediction blocks.
  • In this prediction method block C 1 acts as a prediction block for the block C 2 .
  • the macroblock-segmentation—prediction-method pairs 71 B, 71 C, 71 D and 71 E are further examples of possible prediction methods related to the macroblock segmentation 71 .
  • various prediction method alternatives are presented for macroblock segmentation 72 .
  • the Lagrangian cost function for each of the macroblock-segmentation—prediction-method pairs 70 A, 70 B, 70 C, 70 D, 71 A, 71 B, 71 C, 71 D, 71 E, 72 A, 72 B, 72 C and 72 D is evaluated and then the pair minimizing the cost function is chosen as the actual macroblock segmentation used in encoding the macroblock, as described above in connection with an encoding method according to the invention.
  • the segmentation of the neighboring macroblocks affects the number of the macroblock-segmentation—prediction-method pairs available for the current macroblock.
  • the segmentation of the neighboring macroblocks may lead to a situation in which some of the pairs illustrated in FIG. 7 cannot be used for a current macroblock or where some extra macroblock-segmentation—prediction-method pairs are available for the current macroblock.
  • the decoding entity can conclude the prediction method from the segmentation of the previously received macroblocks when, for example, a method according to the first preferred embodiment of the invention is used.
  • FIG. 8 illustrates an example of a plurality of macroblock-segmentation—prediction-method pairs that can be used in a method according to the second preferred embodiment.
  • FIG. 8 illustrates six possible macroblock segmentations: single block (macroblock segmentation 70 ), macroblock is divided once with a vertical dividing line ( 71 ) or with a horizontal dividing line ( 72 ), macroblock is divided once with a vertical dividing line and once with a horizontal dividing line ( 83 ), macroblock is divided once with a vertical dividing line and thrice with a horizontal dividing line ( 84 ) and thrice with a vertical dividing line and once with a horizontal dividing line ( 85 ).
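  • The six segmentations can be generated from the number of vertical and horizontal dividing lines, as in the sketch below. It assumes a 16×16 macroblock, evenly spaced dividing lines and a row-by-row block order; none of these is stated for FIG. 8, and the names `segment_macroblock` and `SEGMENTATIONS` are hypothetical.

```python
def segment_macroblock(n_vertical_lines, n_horizontal_lines, size=16):
    """Return blocks as (x, y, width, height), listed row by row."""
    cols = n_vertical_lines + 1
    rows = n_horizontal_lines + 1
    if size % cols or size % rows:
        raise ValueError("dividing lines do not split the macroblock evenly")
    bw, bh = size // cols, size // rows
    return [(c * bw, r * bh, bw, bh) for r in range(rows) for c in range(cols)]


# Segmentation label -> (vertical dividing lines, horizontal dividing lines)
SEGMENTATIONS = {
    70: (0, 0),
    71: (1, 0),
    72: (0, 1),
    83: (1, 1),
    84: (1, 3),
    85: (3, 1),
}

block_counts = {
    label: len(segment_macroblock(*lines)) for label, lines in SEGMENTATIONS.items()
}
```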
  • the small black squares in FIG. 8 illustrate schematically the prediction methods.
  • prediction method 70 A is associated with macroblock segmentation 70
  • prediction method 71 B is used with macroblock segmentation 71
  • prediction method 72 B is used with macroblock segmentation 72 .
  • the selection of these macroblock-segmentation—prediction method pairs is quite intuitive. When the current macroblock is segmented using macroblock segmentation 71 , it is reasonable to expect that the left block C 1 and the right block C 2 of the macroblock move somehow differently. It is quite natural to assume that the left block C 1 would move in a similar way to the prediction block L and to derive the prediction motion coefficients for block C 1 from the motion coefficients of prediction block L of block C 1 .
  • the prediction motion coefficients for each block within the current macroblock are derived as average values using three prediction blocks.
  • the prediction motion coefficients for block C 4 are derived using blocks C 1 , C 2 and C 3 within the current macroblock.
  • the prediction motion coefficients for blocks C 1 , C 3 , C 5 and C 7 related to macroblock segmentation 84 are derived as averages of the prediction blocks, as specified in FIG. 8 .
  • prediction motion coefficients are derived from the motion coefficients of the block on the left hand side of each block, i.e. blocks C 1 , C 3 , C 5 and C 7 of the current macroblock, respectively.
  • the prediction motion coefficients for the blocks relating to macroblock segmentation 85 are produced as averages, as specified in FIG. 8 . Again, there is no UR prediction block available for block C 8 in macroblock segmentation 85 , and therefore blocks C 3 , C 4 and C 7 within the same macroblock are used in producing prediction motion coefficients for that block.
  • a second sensible alternative for the prediction method related to macroblock segmentation 85 is, for example, median prediction for the blocks in the upper row of the macroblock 85 and subsequent use of the motion coefficients of these blocks to derive prediction motion coefficients for the blocks in the lower row.
  • the number of prediction blocks and the choice of blocks to be used as prediction blocks may further depend on the position of the current macroblock in the frame and on the scanning order of the blocks/macroblocks within the frame. For example, if the encoding process starts from the top left-hand corner of the frame, the block in the top left-hand corner of the frame has no available prediction blocks. Therefore the prediction motion coefficients for this block are usually zero. For the blocks in the upper frame boundary, prediction using a prediction block to the left (prediction block L) is usually applied. For the blocks in the left-hand frame boundary, there are no left (L) prediction blocks available. The motion coefficients of these blocks may be assumed to be zero, if an average prediction is used for the blocks at the left frame boundary.
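  • The frame-boundary rules described above can be sketched as follows. A missing prediction block is represented by `None`; the function name is hypothetical, the median is assumed as the average prediction, and the case where only the UR block is missing is deliberately left to the caller because it is not covered by these rules.

```python
from statistics import median


def predict_at_boundary(l, u, ur, n_coeffs=2):
    """Prediction motion coefficients with frame-boundary handling."""
    zero = [0] * n_coeffs
    if l is None and u is None:
        return zero                # top left-hand corner: nothing to predict from
    if u is None:
        return list(l)             # upper frame boundary: use prediction block L
    if l is None:
        l = zero                   # left frame boundary: assume zero coefficients for L
    if ur is None:
        raise ValueError("UR block missing: caller must supply a substitute block")
    return [median(values) for values in zip(l, u, ur)]


corner = predict_at_boundary(None, None, None)
top_row = predict_at_boundary([3, 1], None, None)
left_col = predict_at_boundary(None, [4, 2], [6, -2])
```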
  • the upper right (UR) prediction block is missing.
  • the prediction motion coefficients for these blocks can be derived, for example, in a manner similar to that described in connection with block C 4 of macroblock segmentation 83 in FIG. 8 .
  • prediction methods used in a method according to the invention are not restricted to median prediction or single block predictions. They are presented in the foregoing description as examples. Furthermore, any of the already processed blocks can be used in constructing the prediction motion field/coefficients for a certain block.
  • the macroblock-segmentation—prediction-method pairs discussed above are also presented as examples of feasible pairs. In a method according to the invention the macroblock segmentations, prediction methods and mapping between the macroblock segmentations and prediction methods may be different from those described above.
  • FIG. 9 illustrates an example of a Motion Field Estimation block 11 ′ and a Motion Field Coding block 12 ′ according to the invention.
  • FIG. 10 illustrates an example of a Motion Compensated Prediction block 13 ′/ 21 ′ according to the invention.
  • An encoder according to the invention typically comprises all these blocks, and a decoder according to the invention typically comprises a Motion Compensated Prediction block 21 ′.
  • In the Motion Field Estimation block 11 ′ there is a Macroblock Segmentation block 111 , which segments an incoming macroblock into blocks.
  • the Available Macroblock Segmentations block 112 comprises information about the possible macroblock segmentations S k .
  • In FIG. 9 the number of possible macroblock segmentations is illustrated by presenting each segmentation as an arrow heading away from the Macroblock Segmentation block 111 .
  • the various macroblock segmentations are processed in a Motion Vector Field Estimation block 113 , and the initial motion coefficients a 0 i , . . . , a n i , b 0 i , . . .
  • the Motion Vector Field Coding block 121 codes the estimated motion fields relating to each segmentation.
  • the Segmentation—Prediction Method Mapping block 122 is responsible for indicating to the Prediction Motion Field block 123 the correct prediction method related to each macroblock segmentation.
  • In the Difference Motion Coefficient Construction block 124 the motion fields of the blocks are presented as difference motion coefficients.
  • the costs of the macroblock-segmentation—prediction-method pairs are calculated in the Macroblock Segmentation Selection block 125 , and the most appropriate macroblock-segmentation—prediction-method pair is selected.
  • the difference motion coefficients and some information indicating the selected segmentation are transmitted further.
  • the information indicating the selected segmentation may also be implicit. For example, if there is only one macroblock segmentation producing four blocks and the format of the transmitted data reveals to the receiver that it is receiving four pairs of difference motion coefficients relating to a certain macroblock, it can determine the correct segmentation. If there are various available prediction methods per macroblock segmentation, there may be a need to transmit some information that also indicates the selected prediction method. Information about the prediction error frame is typically also transmitted to the decoder, to enable an accurate reconstruction of the image.
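  • The implicit indication of the segmentation can be sketched as a lookup from the number of received coefficient pairs to a segmentation. The sketch assumes one pair of difference motion coefficients per block and uses hypothetical block counts for six segmentations labeled 70 to 85; the function name is also hypothetical.

```python
def infer_segmentation(n_coefficient_pairs, blocks_per_segmentation):
    """Return the only segmentation with this many blocks, or None if ambiguous."""
    matches = [
        label
        for label, n_blocks in blocks_per_segmentation.items()
        if n_blocks == n_coefficient_pairs
    ]
    return matches[0] if len(matches) == 1 else None


# Hypothetical block counts per available segmentation.
BLOCKS_PER_SEGMENTATION = {70: 1, 71: 2, 72: 2, 83: 4, 84: 8, 85: 8}

four_pairs = infer_segmentation(4, BLOCKS_PER_SEGMENTATION)
two_pairs = infer_segmentation(2, BLOCKS_PER_SEGMENTATION)
```

  • In this example four pairs identify the segmentation uniquely, whereas two pairs do not, so explicit signaling would still be needed in the latter case.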
  • the Motion Compensated Prediction block 13 ′/ 21 ′ receives information about difference motion coefficients and (implicit or explicit) information about the segmentation of a macroblock. It may also receive information about the selected prediction method if there is more than one prediction method available per macroblock segmentation.
  • the segmentation information is used to produce correct prediction motion coefficients in the Prediction Motion Coefficient Construction block 131 .
  • the Segmentation—Prediction Method Mapping block 132 is used to store information about the allowed pairs of macroblock segmentations and prediction methods.
  • the constructed prediction motion coefficients and received difference motion coefficients are used to construct the motion coefficients in the Motion Coefficient Construction block 133 .
  • the motion coefficients are transmitted further to a Motion Vector Field Decoding block 134 .
  • An encoder or a decoder according to the invention can be realized using hardware or software, or using a suitable combination of both.
  • An encoder or decoder implemented in software may be, for example, a separate program or a software building block that can be used by various programs.
  • the functional blocks are represented as separate units, but the functionality of these blocks can be implemented, for example, in one software program unit.
  • a codec may be a computer program or a computer program element, or it may be implemented at least partly using hardware.
  • FIG. 11 shows a mobile station MS according to an embodiment of the invention.
  • A central processing unit, microprocessor μP, controls the blocks responsible for different functions of the mobile station: a random access memory RAM, a radio frequency block RF, a read only memory ROM, a user interface UI having a display DPL and a keyboard KBD, and a digital camera block CAM.
  • The microprocessor's operating instructions, that is, program code, and the mobile station's basic functions have been stored in the mobile station in advance, for example during the manufacturing process, in the ROM.
  • The microprocessor uses the RF block for transmitting and receiving messages on a radio path.
  • The microprocessor monitors the state of the user interface UI and controls the digital camera block CAM.
  • In response to a user command, the microprocessor instructs the camera block CAM to record a digital image into the RAM. Once the image is captured or alternatively during the capturing process, the microprocessor segments the image into image segments and performs motion compensated encoding for the segments in order to generate a compressed image as explained in the foregoing description.
  • a user may command the mobile station to display the image on its display or to send the compressed image using the RF block to another mobile station, a wired telephone or another telecommunications device. In a preferred embodiment, such transmission of image data is started as soon as the first segment is encoded so that the recipient can start a corresponding decoding process with a minimum delay.
  • the mobile station comprises an encoder block ENC dedicated for encoding and possibly also for decoding of digital video data.
  • FIG. 12 is a schematic diagram of a mobile telecommunications network according to an embodiment of the invention.
  • Mobile stations MS are in communication with base stations BTS by means of a radio link.
  • the base stations BTS are further connected, through a so-called Abis interface, to a base station controller BSC, which controls and manages several base stations.
  • the entity formed by a number of base stations BTS (typically, by a few dozen base stations) and a single base station controller BSC, controlling the base stations, is called a base station subsystem BSS.
  • the base station controller BSC manages radio communication channels and handovers.
  • the base station controller BSC is connected, through a so-called A interface, to a mobile services switching centre MSC, which co-ordinates the formation of connections to and from mobile stations.
  • a further connection is made, through the mobile service switching centre MSC, to outside the mobile communications network.
  • Outside the mobile communications network there may further reside other network(s) connected to the mobile communications network by gateway(s) GTW, for example the Internet or a Public Switched Telephone Network (PSTN).
  • the mobile telecommunications network comprises a video server VSRVR to provide video data to a MS subscribing to such a service. This video data is compressed using the motion compensated video compression method as described earlier in this document.
  • the video server may function as a gateway to an online video source or it may comprise previously recorded video clips.
  • Typical videotelephony applications may involve, for example, two mobile stations or one mobile station MS and a videotelephone connected to the PSTN, a PC connected to the Internet or an H.261 compatible terminal connected either to the Internet or to the PSTN.

Abstract

A method for encoding video information is presented, where a piece of current video information is segmented into macroblocks and a certain number of available macroblock segmentations for segmenting a macroblock into blocks is defined. Furthermore, for each available macroblock segmentation at least one available prediction method is defined, each of which prediction methods produces prediction motion coefficients for blocks within said macroblock resulting in a certain finite number of available macroblock-segmentation—prediction-method pairs. For a macroblock, one of the available macroblock-segmentation—prediction-method pairs is selected, and thereafter the macroblock is segmented into blocks and prediction motion coefficients for the blocks within said macroblock are produced using the selected macroblock-segmentation—prediction-method pair. A corresponding decoding method, an encoder and a decoder are also presented.

Description

  • The present invention relates to video coding. In particular, it relates to compression of video information using motion compensated prediction.
  • BACKGROUND OF THE INVENTION
  • A video sequence typically consists of a large number of video frames, which are formed of a large number of pixels, each of which is represented by a set of digital bits. Because of the large number of pixels in a video frame and the large number of video frames even in a typical video sequence, the amount of data required to represent the video sequence quickly becomes large. For instance, a video frame may include an array of 640 by 480 pixels, each pixel having an RGB (red, green, blue) color representation of eight bits per color component, totaling 7,372,800 bits per frame. Another example is a QCIF (quarter common intermediate format) video frame including 176×144 pixels. QCIF provides an acceptably sharp image on small (a few square centimeters) LCD displays, which are typically available in mobile communication devices. Again, if the color of each pixel is represented using eight bits per color component, the total number of bits per frame is 608,256.
  • Alternatively, a video frame can be presented using a related luminance/chrominance model, known as the YUV color model. The human visual system is more sensitive to intensity (luminance) variations than it is to color (chrominance) variations. The YUV color model exploits this property by representing an image in terms of a luminance component Y and two chrominance components U, V, and by using a lower resolution for the chrominance components than for the luminance component. In this way the amount of information needed to code the color information in an image can be reduced with an acceptable reduction in image quality. The lower resolution of the chrominance components is usually attained by spatial sub-sampling. Typically a block of 16×16 pixels in the image is coded by one block of 16×16 pixels representing the luminance information and by one block of 8×8 pixels for each chrominance component. The chrominance components are thus sub-sampled by a factor of 2 in the x and y directions. The resulting assembly of one 16×16 pixel luminance block and two 8×8 pixel chrominance blocks is here referred to as a YUV macroblock. A QCIF image comprises 11×9 YUV macroblocks. The luminance blocks and chrominance blocks are represented with 8 bit resolution, and the total number of bits required per YUV macroblock is (16×16×8)+2×(8×8×8)=3072 bits. The number of bits needed to represent a video frame is thus 99×3072=304,128 bits.
  • In a video sequence comprising a sequence of frames in YUV coded QCIF format recorded/displayed at a rate of 15-30 frames per second, the amount of data needed to transmit information about each pixel in each frame separately would thus be more than 4 Mbps (million bits per second). In conventional videotelephony, where the encoded video information is transmitted using fixed-line telephone networks, the transmission bit rates are typically multiples of 64 kilobits/s. In mobile videotelephony, where transmission takes place at least in part over a radio communications link, the available transmission bit rates can be as low as 20 kilobits/s. It is therefore evident that methods are required whereby the amount of information used to represent a video sequence can be reduced. Video coding tackles the problem of reducing the amount of information that needs to be transmitted in order to present the video sequence with an acceptable image quality.
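  • The figures quoted in the preceding paragraphs can be checked with a few lines of arithmetic. The function names below are hypothetical; the frame sizes, bit depths and frame rates are the ones given in the text.

```python
def rgb_bits_per_frame(width, height, bits_per_component=8):
    """Three color components per pixel at full resolution."""
    return width * height * 3 * bits_per_component


def yuv_macroblock_bits(bits_per_sample=8):
    """One 16x16 luminance block plus two 8x8 chrominance blocks."""
    return 16 * 16 * bits_per_sample + 2 * (8 * 8 * bits_per_sample)


def qcif_yuv_bits_per_frame():
    """QCIF frame of 176x144 pixels covered by 16x16 macroblocks."""
    macroblocks = (176 // 16) * (144 // 16)  # 11 x 9 = 99
    return macroblocks * yuv_macroblock_bits()


vga_rgb_bits = rgb_bits_per_frame(640, 480)
qcif_rgb_bits = rgb_bits_per_frame(176, 144)
qcif_yuv_bits = qcif_yuv_bits_per_frame()
rate_at_15_fps = qcif_yuv_bits * 15  # uncompressed bits per second
rate_at_30_fps = qcif_yuv_bits * 30
```

  • The lower frame rate already gives about 4.56 million bits per second, consistent with the statement that the raw data rate exceeds 4 Mbps.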
  • In typical video sequences the change of the content of successive frames is to a great extent the result of the motion in the scene. This motion may be due to camera motion or due to motion of the objects present in the scene. Therefore, typical video sequences are characterized by significant temporal correlation, which is highest along the trajectory of the motion. Efficient compression of video sequences usually takes advantage of this property of video sequences. Motion compensated prediction is a widely recognized technique for compression of video. It utilizes the fact that in a typical video sequence, image intensity/chrominance values in a particular frame segment can be predicted using image intensity/chrominance values of a segment in some other already coded and transmitted frame, given the motion trajectory between these two frames. Occasionally, it is advisable to transmit a frame that is coded without reference to any other frames, to prevent deterioration of image quality due to accumulation of errors and to provide additional functionality such as random access to the video sequence. Such a frame is called an INTRA frame.
  • A schematic diagram of an example video coding system using motion compensated prediction is shown in FIGS. 1 and 2 of the accompanying drawings. FIG. 1 illustrates an encoder 10 employing motion compensation and FIG. 2 illustrates a corresponding decoder 20. The operating principle of video coders using motion compensation is to minimize the prediction error frame En(x,y), which is the difference between the current frame In(x,y) being coded and a prediction frame Pn(x,y). The prediction error frame is thus
    En(x, y)=In(x, y)−Pn(x, y).  (1)
  • The prediction frame Pn(x,y) is built using pixel values of a reference frame Rn(x,y), which is one of the previously coded and transmitted frames (for example, a frame preceding the current frame), and the motion of pixels between the current frame and the reference frame. More precisely, the prediction frame is constructed by finding the prediction pixels in the reference frame Rn(x,y) and moving the prediction pixels as the motion information specifies. The motion of the pixels may be presented as the values of horizontal and vertical displacements Δx(x,y) and Δy(x,y) of a pixel at location (x,y) in the current frame In(x,y). The pair of numbers [Δx(x,y),Δy(x,y)] is called the motion vector of this pixel.
  • The motion vectors [Δx(x,y), Δy(x,y)] are calculated in the Motion Field Estimation block 11 in the encoder 10. The set of motion vectors of all pixels of the current frame [Δx(·), Δy(·)] is called the motion vector field. Due to the very large number of pixels in a frame it is not efficient to transmit a separate motion vector for each pixel to the decoder. Instead, in most video coding schemes the current frame is divided into larger image segments Sk and information about the segments is transmitted to the decoder.
  • The motion vector field is coded in the Motion Field Coding block 12 of the encoder 10. Motion Field Coding refers to the process of representing the motion in a frame using some predetermined functions or, in other words, representing it with a model. Almost all of the motion vector field models commonly used are additive motion models. Motion compensated video coding schemes may define the motion vectors of image segments by the following general formula:
    $\Delta x(x, y) = \sum_{i=0}^{N-1} a_i f_i(x, y)$  (2)
    $\Delta y(x, y) = \sum_{i=0}^{M-1} b_i g_i(x, y)$  (3)
    where coefficients ai and bi are called motion coefficients. They are transmitted to the decoder (information stream 2 in FIGS. 1 and 2). Functions ƒi and gi are called motion field basis functions, and they are known both to the encoder and decoder. An approximate motion vector field (Δ̃x(x,y), Δ̃y(x,y)) can be constructed using the coefficients and the basis functions.
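  • How a motion vector is evaluated from motion coefficients and basis functions can be sketched as below. The two sets of basis functions shown (a constant for a translational model, and 1, x, y for an affine model) are common choices assumed here for illustration; the function name and the coefficient values are hypothetical.

```python
def motion_vector(x, y, a, b, f_basis, g_basis):
    """Evaluate dx = sum(a_i * f_i(x, y)) and dy = sum(b_i * g_i(x, y))."""
    if len(a) != len(f_basis) or len(b) != len(g_basis):
        raise ValueError("one coefficient is needed per basis function")
    dx = sum(ai * fi(x, y) for ai, fi in zip(a, f_basis))
    dy = sum(bi * gi(x, y) for bi, gi in zip(b, g_basis))
    return dx, dy


# Assumed basis functions: constant only (translational) and 1, x, y (affine).
TRANSLATIONAL = [lambda x, y: 1]
AFFINE = [lambda x, y: 1, lambda x, y: x, lambda x, y: y]

translational_mv = motion_vector(5, 7, [2], [-1], TRANSLATIONAL, TRANSLATIONAL)
affine_mv = motion_vector(2, 4, [1, 2, 0], [0, 0, 3], AFFINE, AFFINE)
```

  • With the translational model every pixel of a segment gets the same vector, so a segment is described by a single pair of coefficients.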
  • The prediction frame Pn(x,y) is constructed in the Motion Compensated Prediction block 13 in the encoder 10, and it is given by
    $$P_n(x,y) = R_n[x + \tilde{\Delta}x(x,y),\; y + \tilde{\Delta}y(x,y)], \tag{4}$$
    where the reference frame Rn(x,y) is available in the Frame Memory 17 of the encoder 10 at a given instant.
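  • Equation (4) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: frames are lists of rows indexed as `ref[y][x]`, displacements are whole pixels, and references that fall outside the frame are clamped to the nearest border pixel. The border handling and the name `predict_frame` are hypothetical.

```python
def predict_frame(ref, dx, dy):
    """Build P_n from R_n by fetching, for each pixel, the displaced reference pixel."""
    height, width = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Assumption: clamp out-of-frame coordinates to the frame border.
            src_x = min(max(x + dx[y][x], 0), width - 1)
            src_y = min(max(y + dy[y][x], 0), height - 1)
            pred[y][x] = ref[src_y][src_x]
    return pred


ref = [[1, 2, 3],
       [4, 5, 6]]
one_right = [[1, 1, 1], [1, 1, 1]]   # every pixel is predicted from its right neighbour
no_vertical = [[0, 0, 0], [0, 0, 0]]
pred = predict_frame(ref, one_right, no_vertical)
print(pred)
```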
  • In the Prediction Error Coding block 14, the prediction error frame En(x,y) is typically compressed by representing it as a finite series (transform) of some 2-dimensional functions. For example, a 2-dimensional Discrete Cosine Transform (DCT) can be used. The transform coefficients related to each function are quantized and entropy coded before they are transmitted to the decoder (information stream 1 in FIGS. 1 and 2). Because of the error introduced by quantization, this operation usually produces some degradation in the prediction error frame En(x,y). To account for this degradation, a motion compensated encoder comprises a Prediction Error Decoding block 15, where a decoded prediction error frame Ẽn(x,y) is constructed using the transform coefficients. This decoded prediction error frame is added to the prediction frame Pn(x,y) and the resulting decoded current frame Ĩn(x,y) is stored in the Frame Memory 17 for further use as the next reference frame Rn+1(x,y).
  • The information stream 2 carrying information about the motion vectors is combined with information about the prediction error in the multiplexer 16 and an information stream (3) containing typically at least those two types of information is sent to the decoder 20.
  • In the Frame Memory 24 of the decoder 20 there is a previously reconstructed reference frame Rn(x,y). The prediction frame Pn(x,y) is constructed in the Motion Compensated Prediction block 21 in the decoder 20 in the same way as in the Motion Compensated Prediction block 13 in the encoder 10. The transmitted transform coefficients of the prediction error frame En(x,y) are used in the Prediction Error Decoding block 22 to construct the decoded prediction error frame Ẽn(x,y). The pixels of the decoded current frame Ĩn(x,y) are reconstructed by adding the prediction frame Pn(x,y) and the decoded prediction error frame Ẽn(x,y):
    $$\tilde{I}_n(x,y) = P_n(x,y) + \tilde{E}_n(x,y) = R_n[x + \tilde{\Delta}x(x,y),\; y + \tilde{\Delta}y(x,y)] + \tilde{E}_n(x,y). \tag{5}$$
  • This decoded current frame may be stored in the Frame Memory 24 as the next reference frame Rn+1(x,y).
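  • The reconstruction in equation (5) amounts to a per-pixel addition. The short Python sketch below shows it for frames stored as lists of rows. Clipping the result to the range 0 to 255 is an assumption for 8-bit samples and is not stated in the text; the name `reconstruct` is likewise hypothetical.

```python
def reconstruct(pred, decoded_error, max_value=255):
    """Add the decoded prediction error to the prediction frame, pixel by pixel."""
    return [
        [min(max(p + e, 0), max_value) for p, e in zip(pred_row, err_row)]
        for pred_row, err_row in zip(pred, decoded_error)
    ]


decoded = reconstruct([[10, 250], [100, 7]], [[-20, 10], [5, -7]])
print(decoded)
```

    The returned frame is what would be written to the frame memory for use as the next reference frame.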
  • Let us next discuss in more detail the motion compensation and transmission of motion information. In order to minimize the amount of information needed in sending the motion coefficients to the decoder, coefficients can be predicted from the coefficients of the neighboring segments. When this kind of motion field prediction is used, the motion field is expressed as a sum of a prediction motion field and a refinement motion field. The prediction motion field is constructed using the motion vectors associated with neighboring segments of the current frame. The prediction is performed using the same set of rules and possibly some auxiliary information in both encoder and decoder. The refinement motion field is coded, and the motion coefficients related to this refinement motion field are transmitted to the decoder. This approach typically results in savings in transmission bit rate. The dashed lines in FIG. 1 illustrate some examples of the possible information some motion estimation and coding schemes may require in the Motion Field Estimation block 11 and in the Motion Field Coding block 12.
  • Polynomial motion models are a widely used family of models. (See, for example, H. Nguyen and E. Dubois, “Representation of motion information for image coding,” in Proc. Picture Coding Symposium '90, Cambridge, Massachusetts, Mar. 26-28, 1990, pp. 841-845 and Centre de Morphologie Mathematique (CMM), “Segmentation algorithm by multicriteria region merging,” Document SIM(95)19, COST 211ter Project Meeting, May 1995).
  • The values of motion vectors are described by functions which are linear combinations of two dimensional polynomial functions. The translational motion model is the simplest model and requires only two coefficients to describe the motion vectors of each segment. The values of motion vectors are given by the formulae:
    $$\Delta x(x,y) = a_0$$
    $$\Delta y(x,y) = b_0 \tag{6}$$
  • This model is widely used in various international standards (ISO MPEG-1, MPEG-2, MPEG-4, ITU-T Recommendations H.261 and H.263) to describe motion of 16×16 and 8×8 pixel blocks. Systems utilizing a translational motion model typically perform motion estimation at full pixel resolution or some integer fraction of full pixel resolution, for example with an accuracy of ½ or ⅓ pixel resolution.
  • Two other widely used models are the affine motion model given by the equation:
    $$\Delta x(x,y) = a_0 + a_1 x + a_2 y$$
    $$\Delta y(x,y) = b_0 + b_1 x + b_2 y \tag{7}$$
    and the quadratic motion model given by the equation:
    $$\Delta x(x,y) = a_0 + a_1 x + a_2 y + a_3 xy + a_4 x^2 + a_5 y^2$$
    $$\Delta y(x,y) = b_0 + b_1 x + b_2 y + b_3 xy + b_4 x^2 + b_5 y^2$$
  • The affine motion model presents a very convenient trade-off between the number of motion coefficients and prediction performance. It is capable of representing some of the common real-life motion types such as translation, rotation, zoom and shear with only a few coefficients. The quadratic motion model provides good prediction performance, but it is less popular in coding than the affine model, since it uses more motion coefficients, while the prediction performance is not substantially better than, for example, that of the affine motion model. Furthermore, it is computationally more costly to estimate the quadratic motion than to estimate the affine motion.
  • The Motion Field Estimation block 11 calculates initial motion coefficients $a_0^i, \ldots, a_n^i, b_0^i, \ldots, b_n^i$ for the motion vectors [Δx(x,y), Δy(x,y)] of a given segment Sk, which initial motion coefficients minimize some measure of prediction error in the segment. In the simplest case, the motion field estimation uses the current frame In(x,y) and the reference frame Rn(x,y) as input values. Typically the Motion Field Estimation block outputs the initial motion coefficients for [Δx(x,y), Δy(x,y)] to the Motion Field Coding block 12.
  • The segmentation of the current frame into segments Sk can, for example, be carried out in such a way that each segment corresponds to a certain object moving in the video sequence, but this kind of segmentation is a very complex procedure. A typical and computationally less complex way to segment a video frame is to divide it into macroblocks and to further divide the macroblocks into rectangular blocks. In this description the term macroblock refers generally to a part of a video frame. An example of a macroblock is the previously described YUV macroblock. FIG. 3 presents an example, where a video frame 30 is divided into macroblocks 31 having a certain number of pixels. Depending on the encoding method, there may be many possible macroblock segmentations. FIG. 3 presents a case, where there are four possible ways to segment a macroblock: macroblock 31A is segmented into blocks 32, macroblock 31B is segmented with a horizontal dividing line into blocks 33, and macroblock 31C is segmented with a vertical dividing line into blocks 34. The fourth possible segmentation is to treat a macroblock as a single block. The macroblock segmentations presented in FIG. 3 are given as examples; they are by no means an exhaustive listing of possible or feasible macroblock segmentations.
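  • The four segmentations described for FIG. 3 can be sketched in Python as lists of rectangles. This is an illustration only: the mode names, the 16×16 macroblock size and the `(x, y, width, height)` layout are assumptions, and the "quad" mode assumes that macroblock 31A is split into four equal blocks, which the text does not state explicitly.

```python
def segment_macroblock(mode, size=16):
    """Return the blocks of one macroblock as (x, y, width, height) tuples."""
    half = size // 2
    if mode == "single":        # the macroblock is treated as one block
        return [(0, 0, size, size)]
    if mode == "horizontal":    # one horizontal dividing line: top and bottom blocks
        return [(0, 0, size, half), (0, half, size, half)]
    if mode == "vertical":      # one vertical dividing line: left and right blocks
        return [(0, 0, half, size), (half, 0, half, size)]
    if mode == "quad":          # assumption: four equal square blocks
        return [(x, y, half, half) for y in (0, half) for x in (0, half)]
    raise ValueError(f"unknown segmentation mode: {mode}")


for mode in ("single", "horizontal", "vertical", "quad"):
    print(mode, segment_macroblock(mode))
```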
  • The Motion Field Coding block 12 makes the final decisions on what kind of motion vector field is transmitted to the decoder and how the motion vector field is coded. It can modify the segmentation of the current frame, the motion model and motion coefficients in order to minimize the amount of information needed to describe a satisfactory motion vector field. The decision on segmentation is typically carried out by estimating a cost of each alternative macroblock segmentation and by choosing the one yielding the smallest cost. As a measure of cost, the most commonly used is a Lagrangian cost function
    L(S k)=D(S k)+λR(S k),
    which links a measure of the reconstruction error D(Sk) with a measure of bits needed for transmission R(Sk) using a Lagrangian multiplier λ. The Lagrangian cost represents a trade-off between the quality of transmitted video information and the bandwidth needed in transmission. In general a better image quality, i.e. small D(Sk), requires a larger amount of transmitted information, i.e. large R(Sk).
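  • To make the trade-off concrete, the Python sketch below evaluates L = D + λR for three candidate segmentations and shows how the choice moves as λ grows. The distortion and rate figures are invented for the example and do not come from any measurement; the function and candidate names are hypothetical.

```python
def lagrangian_cost(distortion, rate_bits, lam):
    """L(S_k) = D(S_k) + lambda * R(S_k)."""
    return distortion + lam * rate_bits


def cheapest(candidates, lam):
    """Return the name of the candidate with the smallest Lagrangian cost."""
    return min(candidates, key=lambda name: lagrangian_cost(*candidates[name], lam))


# Hypothetical (distortion, rate in bits) per segmentation: finer segmentations
# reconstruct better but cost more bits.
candidates = {
    "single": (900.0, 12),
    "two_blocks": (600.0, 30),
    "four_blocks": (400.0, 70),
}

for lam in (1, 10, 50):
    print(lam, cheapest(candidates, lam))
```

    With these figures a small λ favours the finest segmentation, while a large λ, which penalizes bits more heavily, favours the single block.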
  • In present systems, which utilize a translational motion model, prediction motion coefficients are typically formed by calculating the median of surrounding, already transmitted motion coefficients. This method achieves fairly good performance in terms of efficient use of transmission bandwidth and image quality. The main advantage of this method is that the prediction of motion coefficients is straightforward.
  • The more accurately the prediction motion coefficients correspond to the motion coefficients of the segment being predicted, the fewer bits are needed to transmit information about the refinement motion field. It is possible to select, for example among the neighboring blocks, the block whose motion coefficients are closest to the motion coefficients of the block being predicted. The segment selected for the prediction is signaled to the decoder. The main drawback of this method is that finding the best prediction candidate among the already transmitted image segments is a complex task: the encoder has to perform exhaustive calculations to evaluate all the possible prediction candidates and then select the best prediction block. This procedure has to be carried out separately for each block.
  • There are systems where the transmission capacity for the compressed video stream is very limited and where the encoding of video information should not be too complicated. For example, wireless mobile terminals have limited space for additional components and, as they are battery powered, they typically cannot provide computing capacity comparable to that of desktop computers. In radio access networks of cellular systems, the available transmission capacity for a video stream can be as low as 20 kbps. Consequently, there is a need for a video encoding method which is computationally simple, provides good image quality and achieves good performance in terms of required transmission bandwidth. Furthermore, to keep the encoding method computationally simple, the encoding method should provide satisfactory results using simple motion models.
  • SUMMARY OF THE INVENTION
  • An object of the present invention is to provide a method that provides a flexible and versatile motion coefficient prediction for encoding/decoding video information using motion compensation. A further object of the invention is to provide a motion compensated method for coding/decoding video information that provides good performance in terms of transmission bandwidth and image quality while being computationally fairly simple. A further object is to present a method for encoding/decoding video information that provides satisfactory results when a comparatively simple motion model, such as the translational motion model, is used.
  • These and other objects of the invention are achieved by associating the motion coefficient prediction method used for a certain macroblock with the segmentation of the macroblock.
  • A method for encoding video information according to the invention comprises the steps of:
      • segmenting a piece of current video information into macroblocks,
      • defining a certain number of available macroblock segmentations for segmenting a macroblock into blocks,
      • defining for each available macroblock segmentation at least one available prediction method, each of which prediction methods produces prediction motion coefficients for blocks within said macroblock, resulting in a certain finite number of available macroblock-segmentation—prediction-method pairs,
      • selecting for a macroblock one of the available macroblock-segmentation—prediction-method pairs, and
      • segmenting the macroblock into blocks and producing prediction motion coefficients for the blocks within said macroblock using the selected macroblock-segmentation—prediction-method pair.
  • In a method according to the invention, a piece of current video information, typically a current frame, is segmented into macroblocks. These macroblocks can have any predetermined shape, but typically they are quadrilateral. Furthermore, a certain number of possible segmentations of the macroblocks into blocks is defined, and these are called the available macroblock segmentations. The segmentation of a macroblock into blocks is in this description called macroblock segmentation. The blocks are also typically quadrilateral. The motion of a block within a piece of current video information is typically estimated using a piece of reference video information (typically a reference frame), and the motion of the block is usually modeled using a set of basis functions and motion coefficients. The motion model used in a method according to the invention is advantageously a translational motion model, but there are no restrictions on the use of any other motion model. In a method according to the invention, at least some motion coefficients are represented as sums of prediction motion coefficients and difference motion coefficients and a certain prediction method is used to determine the prediction motion coefficients.
  • Typically a piece of current video information, for example a current frame, is encoded by segmenting a frame into macroblocks and then processing the macroblocks in a certain scanning order, for example one by one from left-to-right and top-to-bottom throughout the frame. In other words, in this example the encoding process is performed in rows, progressing from top to bottom. The way in which the macroblocks are scanned is not restricted by the invention. A macroblock may be segmented, and the motion field of blocks within a macroblock is estimated. Prediction motion coefficients for a certain block are produced using the motion coefficients of some of the blocks in the already processed neighboring macroblocks or the motion coefficients of some of the already processed blocks within the same macroblock. The segmentation of the already processed macroblocks and the motion coefficients of the blocks relating to these macroblocks are already known.
  • A distinctive feature in encoding and decoding methods according to the invention is that for each macroblock segmentation there is a finite number of prediction methods. Certain predetermined allowable pairs of the macroblock segmentations and prediction methods are thus formed. Here the term prediction method refers to two issues: firstly, it defines which blocks are used in producing the prediction motion coefficients for a certain block within a current macroblock and, secondly, it defines how the motion coefficients related to these prediction blocks are used in producing the prediction motion coefficients for said block. Thus, a macroblock-segmentation—prediction-method pair indicates unambiguously both the segmentation of a macroblock and how the prediction motion coefficients for the blocks within the macroblock are produced. The prediction method may specify, for example, that prediction motion coefficients for a block are derived from an average calculated using motion coefficients of certain specific prediction blocks, or that prediction motion coefficients for a block are derived from the motion coefficient of one particular prediction block. The word average here refers to a characteristic describing a certain set of numbers; it may be, for example, an arithmetic mean, a geometric mean, a weighted mean, a median or a mode. Furthermore, it is possible that the prediction coefficients of a block are obtained by projecting motion coefficients or average motion coefficients from one block to another.
  • By restricting the number of possible prediction methods per macroblock segmentation, the complexity of the encoding process is reduced compared, for example, to an encoding process where the best prediction motion coefficient candidate is determined freely using any neighboring blocks or combinations thereof. In such a case, there is a large number of prediction motion coefficient candidates. When the prediction blocks are defined beforehand for each prediction method and there is a limited number of prediction methods per macroblock segmentation, it is possible to estimate the cost of each macroblock-segmentation—prediction-method pair. The pair minimizing the cost can then be selected.
  • Advantageously, there is only one available prediction method per macroblock segmentation. This reduces the complexity of the encoding method even further. Furthermore, in this situation it is possible to conclude the prediction method of a block directly from the selected macroblock segmentation. There is thus necessarily no need to transmit information about the prediction method to the decoding entity. Thus, in this case the amount of transmitted information is not increased by adding adaptive features, i.e. various prediction methods used within a frame, to the encoded information.
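  • The consequence described above, that the prediction method need not be signaled when each segmentation has exactly one method, can be sketched in Python. The table below is hypothetical: the segmentation names and the per-block method labels are assumptions chosen for illustration, not the assignments of any particular embodiment.

```python
# Hypothetical table: each segmentation maps to its list of allowed alternatives,
# and each alternative names one prediction method per block of the macroblock.
AVAILABLE = {
    "single": [("median",)],
    "vertical_split": [("L", "UR")],
    "horizontal_split": [("U", "L")],
}


def needs_method_signalling(available):
    """True if any segmentation offers more than one prediction alternative."""
    return any(len(alternatives) > 1 for alternatives in available.values())


def infer_methods(available, segmentation):
    """Derive the prediction methods from the segmentation alone."""
    alternatives = available[segmentation]
    if len(alternatives) != 1:
        raise ValueError("prediction method must be signalled explicitly")
    return alternatives[0]


print(needs_method_signalling(AVAILABLE))
print(infer_methods(AVAILABLE, "vertical_split"))
```

    If a second alternative were added for any segmentation, `needs_method_signalling` would return True and the encoder would have to transmit which alternative it chose.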
  • By selecting the available prediction blocks and defining the macroblock-segmentation-specific prediction methods suitably, it is possible to implement a high performance video encoding method using at most three predetermined prediction blocks to produce prediction motion coefficients and allowing only one prediction method per macroblock segmentation. For each macroblock, the macroblock-segmentation—prediction-method pair minimizing a cost function is selected. The simple adaptive encoding of motion information provided by the invention is efficient in terms of computation and in terms of the amount of transmitted information and furthermore yields good image quality.
  • A macroblock, which is processed in a method according to the invention, may be, for example, the luminance component of a YUV macroblock. A method according to the invention may also be applied, for example, to the luminance component and to one or both of the chrominance components of a YUV macroblock.
  • A method for decoding encoded video information according to the invention comprises the steps of:
      • specifying information about available macroblock-segmentation—prediction method pairs for producing prediction motion coefficients for blocks within a macroblock,
      • receiving information indicating the macroblock-segmentation—prediction-method pair selected for a macroblock, and
      • determining the prediction method relating to a macroblock segmentation of said macroblock and producing prediction motion coefficients for blocks within said macroblock using the indicated prediction method.
  • The invention relates also to an encoder for performing motion compensated encoding of video information, which comprises:
      • means for receiving a piece of current video information,
      • means for segmenting a piece of current video information into macroblocks,
      • means for specifying available macroblock segmentations,
      • means for specifying at least one available prediction method for each macroblock segmentation, resulting in a certain finite number of available macroblock-segmentation—prediction-method pairs,
      • means for selecting one macroblock-segmentation—prediction method pair among the available macroblock-segmentation—prediction method pairs,
      • means for segmenting a macroblock using the selected macroblock segmentation, and
      • means for producing macroblock-segmentation-specific prediction motion coefficients for blocks within said macroblock using the selected prediction method.
  • A decoder for performing the decoding of encoded video information according to the invention comprises:
      • input means for receiving encoded video information, which comprises information indicating a macroblock-segmentation—prediction-method pair relating to a macroblock and about difference motion coefficients of blocks within the macroblock,
      • means for determining the macroblock-segmentation—prediction-method pair of the macroblock based on the received encoded video information, and
      • means for producing prediction motion coefficients for blocks within said macroblock using a prediction method indicated by the macroblock-segmentation—prediction-method pair.
  • The invention also relates to a storage device and a network element comprising an encoder according to the invention and to a mobile station comprising an encoder or a decoder according to the invention.
  • The novel features which are considered as characteristic of the invention are set forth in particular in the appended claims. The invention itself, however, both as to its construction and its method of operation, together with additional objects and advantages thereof, will be best understood from the following description of specific embodiments when read in connection with the accompanying drawings.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates an encoder for motion compensated encoding of video according to prior art,
  • FIG. 2 illustrates a decoder for motion compensated decoding of video according to prior art,
  • FIG. 3 illustrates a segmentation of a video frame into macroblocks and blocks according to prior art,
  • FIG. 4 illustrates a flowchart of a motion compensated video encoding method according to the invention,
  • FIG. 5 illustrates a flowchart of a motion compensated video decoding method according to the invention,
  • FIG. 6 illustrates various prediction methods that involve different prediction blocks and that can be used to provide prediction motion coefficients for a current block C in a method according to the invention,
  • FIG. 7 illustrates a plurality of macroblock-segmentation—prediction-method pairs that can be used in a method according to a first preferred embodiment of the invention,
  • FIG. 8 illustrates a plurality of macroblock-segmentation—prediction-method pairs that can be used in a method according to a second preferred embodiment of the invention,
  • FIG. 9 illustrates a motion field estimation block and a motion field coding block according to the invention,
  • FIG. 10 illustrates a motion compensated prediction block according to the invention,
  • FIG. 11 illustrates a mobile station according to the invention, and
  • FIG. 12 illustrates schematically a mobile telecommunication network comprising a network element according to the invention.
  • DETAILED DESCRIPTION
  • FIGS. 1-3 are discussed in detail in the description of motion compensated video encoding and decoding according to prior art.
  • FIG. 4 presents a flowchart of a method for encoding video information according to the invention. Only features related to motion encoding are presented in FIG. 4; it does not present, for example, the formation or coding of the prediction error frame. Typically these features are included in encoding methods according to the invention and, of course, may be implemented in any appropriate manner.
  • In step 401 the available macroblock segmentations are defined. The available macroblock segmentations can comprise, for example, such macroblock segmentations as presented in FIG. 3. In step 402 at least one prediction method for predicting motion coefficients is defined for each available macroblock segmentation, resulting in a certain number of available macroblock-segmentation—prediction-method pairs. Typically, for certain macroblock segmentations an average prediction method is used and for other macroblock segmentations the prediction motion coefficients are derived from the motion coefficients of a single already processed block which is located either in the current macroblock or in one of the neighboring macroblocks. Advantageous prediction methods related to each macroblock segmentation can be found, for example, by testing various prediction methods beforehand. The motion model used to represent the motion field may affect the selection of the prediction methods. Furthermore, it is possible that a suitable motion model is selected during the encoding. Typically steps 401 and 402 are carried out offline, before encoding video streams. Usually they are carried out already when, for example, an encoder is designed and implemented.
  • Steps 403-413 are carried out for each frame of a video stream. In step 403 a current video frame is segmented into macroblocks, and in step 404 encoding of a current macroblock, which is the macroblock currently undergoing motion compensated encoding, starts. In step 405 the current macroblock is segmented into blocks using one of the available macroblock segmentations. At this point it is not necessarily known which macroblock segmentation is the most appropriate for the current macroblock, so one way to select the best macroblock segmentation is to investigate them all and then select the most appropriate according to some criterion.
  • In step 406 the motion vector fields of the blocks within the current macroblock are estimated and the motion fields are coded. This results in motion coefficients ai and bi for each of said blocks. In step 407 prediction motion coefficients aip and bip for at least one of the blocks within the current macroblock are produced. If there is only one prediction method per macroblock segmentation, this is a straightforward task. Otherwise one of the prediction methods available for the current macroblock segmentation is selected and the prediction motion coefficients are derived according to this prediction method. In step 408 the motion coefficients of the blocks within the current macroblock are represented as sums of the prediction motion coefficients and difference motion coefficients aid and bid.
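  • Step 408 and its inverse at the decoder (step 507) can be sketched in Python as below. The sketch assumes a translational model, so each block carries a pair (a0, b0), and uses whole-number coefficient values; the function names and the example values are hypothetical.

```python
def difference_coefficients(coefficients, predicted):
    """Encoder side: difference = motion coefficient - prediction motion coefficient."""
    return tuple(c - p for c, p in zip(coefficients, predicted))


def restore_coefficients(predicted, differences):
    """Decoder side: motion coefficient = prediction + difference."""
    return tuple(p + d for p, d in zip(predicted, differences))


actual = (5, -3)       # hypothetical (a0, b0) estimated for a block
predicted = (4, -1)    # hypothetical (a0p, b0p) produced by the prediction method
diff = difference_coefficients(actual, predicted)
print(diff)
print(restore_coefficients(predicted, diff))
```

    The closer the prediction is to the estimated coefficients, the smaller the differences become, and small values generally take fewer bits to encode.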
  • A simple way to search for the best macroblock-segmentation—prediction-method pair is presented in steps 409-411. In step 409 the cost L(Sk) related to the current macroblock-segmentation—prediction-method pair is calculated. This cost represents the trade-off between the reconstruction error of the decoded image and the number of bits needed to transmit the encoded image, and it links a measure of the reconstruction error D(Sk) with a measure of bits needed for transmission R(Sk) using a Lagrangian multiplier λ. Typically the transmission R(Sk) refers to bits required to represent at least the difference motion coefficients and bits required to represent the associated prediction error. It may also involve some signaling information.
  • Each possible macroblock-segmentation—prediction-method pair is checked, as the loop of steps 405-409 is repeated until prediction motion coefficients and cost function corresponding to all available macroblock-segmentation—prediction-method pairs are evaluated (step 410). In step 411 the macroblock-segmentation—prediction-method pair yielding the smallest cost is selected.
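  • The loop of steps 405-411 can be sketched in Python as an exhaustive search over the predefined pairs. The callback `evaluate`, which is assumed to return a (distortion, rate) tuple for a pair, stands in for steps 405-408 and is hypothetical, as are the pair labels and the figures in the example table.

```python
def select_pair(pairs, evaluate, lam):
    """Return the (pair, cost) with the smallest Lagrangian cost D + lam * R."""
    best_pair, best_cost = None, float("inf")
    for pair in pairs:
        distortion, rate_bits = evaluate(pair)   # stands in for steps 405-408
        cost = distortion + lam * rate_bits      # step 409
        if cost < best_cost:                     # track the minimum for step 411
            best_pair, best_cost = pair, cost
    return best_pair, best_cost


# Hypothetical measurements for three pairs (segmentation label, method label).
measured = {
    ("70", "A"): (100, 10),
    ("70", "B"): (100, 6),
    ("71", "A"): (80, 20),
}
best, cost = select_pair(list(measured), measured.__getitem__, lam=2)
print(best, cost)
```

    Because the set of pairs is fixed in advance and small, the number of evaluations per macroblock is bounded, which is the source of the complexity reduction described above.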
  • In step 412 information indicating the selected macroblock-segmentation—prediction-method pair for the current macroblock and the difference motion coefficients aid and bid of at least one of the blocks within the current macroblock are transmitted to a receiver or stored into a storage medium. The information indicating the selected macroblock-segmentation—prediction-method pair may, for example, indicate explicitly both the macroblock segmentation and the prediction method. If there is only one possible prediction method per macroblock segmentation, it can be enough to transmit information indicating only the macroblock segmentation of the current macroblock. In step 413 it is checked whether all the macroblocks within the current frame have been processed. If they have not, the processing of the next macroblock is started in step 404.
  • In a method according to the invention, it is possible that for some macroblocks or for some blocks within a frame the motion coefficients are transmitted as such. It is also possible that for some macroblocks or blocks prediction methods are used, where macroblock-segmentation—prediction-method pairs are not defined.
  • FIG. 5 presents a flowchart of a method for decoding an encoded video stream according to the invention. In step 501 information about the available macroblock segmentations is specified, for example by retrieving the information from a memory element where it has been previously stored. The decoding method needs to know which kind of macroblock segmentations a received encoded video stream can comprise. In step 502 information about the available macroblock-segmentation—prediction-method pairs is specified. Steps 501 and 502 are typically carried out off-line, before receiving an encoded video stream. They may be carried out, for example, during the design or implementation of a decoder.
  • Steps 503-507 are carried out during decoding. In step 503 information indicating the segmentation of a current macroblock and prediction method is received. If there is only one available prediction method per macroblock segmentation, information indicating the prediction method is not needed as previously explained. In step 504 information indicating difference motion coefficients aid and bid for at least one of the blocks within the current macroblock is received. In step 505 the decoding entity determines, using the information received in step 503, the prediction method with which the prediction motion coefficients for blocks within the current macroblock are to be produced. The prediction method indicates the prediction blocks related to a certain block and how prediction coefficients for the current block are produced using the motion coefficients of the prediction blocks. There is no need to transmit information about the values of the prediction motion coefficients related to the current block within the current macroblock, because they can be determined in the decoder based on the information received concerning the selected segmentation and prediction method for the current macroblock. In step 506 the prediction motion coefficients aip and bip are produced, and in step 507 the motion coefficients ai and bi are produced using the difference motion coefficients and the prediction motion coefficients.
  • FIG. 6 presents schematically four different prediction methods 60A, 60B, 60C and 60D for providing prediction motion coefficients for a current block C. These four prediction methods are given as examples of prediction methods that may be used in a method according to the invention, and the prediction blocks (i.e. those blocks that are used to form prediction motion coefficients for the current block) are defined according to their spatial relationship with the current block C. In these prediction methods, the prediction blocks are dictated by certain pixel locations. These pixel locations are just one way of specifying the prediction blocks for a current block, and they are described here to aid the understanding of how the prediction blocks are selected in certain prediction methods. In the methods which are presented in FIG. 6, the pixel locations are the same for all the methods. Prediction block L is defined as the block which comprises the pixel location 61. Pixel location 61 is the uppermost pixel adjacent to block C from the left-hand side. Similarly, prediction block U is defined as the block comprising pixel location 62, which is the leftmost pixel superjacent to block C. Furthermore, prediction block UR is defined as the block comprising the pixel location 63, which is the pixel corner to corner with the top right corner pixel of block C.
  • In the first prediction method 60A, three prediction blocks L, U and UR are used. The prediction motion coefficients aip, bip provided for block C may be derived from an average of the motion coefficients of the L, U and UR prediction blocks. The average may be, for example, the median of the motion coefficient values of blocks L, U and UR. In the second prediction method 60B, the prediction motion coefficients are derived from the motion coefficients of prediction block L. Similarly, in the third prediction method 60C the prediction motion coefficients are derived from the motion coefficients of prediction block U and in the fourth prediction method 60D they are derived from the motion coefficients of prediction block UR. The convention of presenting only one pixel location relating to a certain block when only one prediction block is used in producing prediction motion coefficients for said block, and presenting more than one pixel location when more than one prediction block is used, is also used in FIGS. 7 and 8.
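  • The four prediction methods of FIG. 6 can be sketched in Python as a single dispatch function. The sketch assumes a translational model, with each block carrying a pair (a0, b0), and takes the median separately for each coefficient; the component-wise median, the function name and the example values are assumptions made for the illustration.

```python
from statistics import median


def predict_coefficients(method, left, up, up_right):
    """Produce prediction motion coefficients for block C from blocks L, U and UR."""
    if method == "60A":
        # Assumption: the median is taken separately for each coefficient.
        return tuple(median(values) for values in zip(left, up, up_right))
    single_block = {"60B": left, "60C": up, "60D": up_right}
    return tuple(single_block[method])


L_coeffs, U_coeffs, UR_coeffs = (4, -2), (1, 0), (3, 5)   # hypothetical (a0, b0) values
for method in ("60A", "60B", "60C", "60D"):
    print(method, predict_coefficients(method, L_coeffs, U_coeffs, UR_coeffs))
```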
  • The segmentation of the neighboring macroblocks presented in FIG. 6 for prediction method 60A is just an example. When the prediction blocks are defined by pixel locations as presented in FIG. 6, the prediction blocks can be determined unambiguously regardless of the macroblock segmentation of the neighboring macroblocks or of the current macroblock. The three pixel locations in FIG. 6 are an example; the number of pixel locations can be different and they can be located at other places. Typically the pixel locations specifying the prediction blocks are associated with a current block C and they are at the edge of the current block C.
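As a sketch of how pixel locations can identify prediction blocks unambiguously, the following example computes locations corresponding to 61, 62 and 63 for a block whose top left pixel is at (x, y) and looks up which already coded block contains each of them. The coordinate convention (x to the right, y downwards), the rectangle representation and the block names are assumptions made for illustration.

```python
def prediction_pixels(x, y, width):
    """Pixel locations for a block with top left pixel (x, y) and the given width."""
    return {
        "L": (x - 1, y),           # uppermost pixel adjacent to the block on the left
        "U": (x, y - 1),           # leftmost pixel directly above the block
        "UR": (x + width, y - 1),  # pixel diagonal to the block's top right corner pixel
    }

def block_containing(blocks, pixel):
    """Return the name of the block (x, y, width, height) that contains the pixel, or None."""
    px, py = pixel
    for name, (bx, by, bw, bh) in blocks.items():
        if bx <= px < bx + bw and by <= py < by + bh:
            return name
    return None

# Hypothetical already coded neighbours of a 16x16 current block at (16, 16).
# The left macroblock happens to be split horizontally; the lookup still works.
coded = {
    "left_top": (0, 16, 16, 8),
    "left_bottom": (0, 24, 16, 8),
    "above": (16, 0, 16, 16),
    "above_right": (32, 0, 16, 16),
}
pixels = prediction_pixels(16, 16, 16)
chosen = {label: block_containing(coded, p) for label, p in pixels.items()}
```

Because each location is a single pixel, exactly one coded block can contain it, whatever the segmentation of the neighbouring macroblocks.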
  • In a method according to a first preferred embodiment of the invention, there is a certain number of available macroblock segmentations and at least one prediction method relates to each macroblock segmentation. FIG. 7 illustrates schematically three macroblock segmentations 70, 71 and 72, which are an example of the available macroblock segmentations in a first preferred embodiment of the invention. In macroblock segmentation 70, the rectangular macroblock is actually not segmented, but is treated as a single block. In macroblock segmentation 71, the macroblock is divided with one vertical line into two rectangular blocks. Similarly, in macroblock segmentation 72 the macroblock is divided with one horizontal line into two rectangular blocks. The macroblock size may be 16×16 pixels and a translational motion model, for example, may be used.
  • FIG. 7 furthermore illustrates some examples of prediction method alternatives related to the macroblock segmentations in a method according to the first preferred embodiment. As in FIG. 6, the prediction blocks for blocks within a current macroblock are specified using certain pixel locations which bear a spatial relationship to the blocks within the current macroblock. As an example, the pixel locations in FIG. 7 are the same as in FIG. 6. When the current macroblock is segmented according to example 70, the prediction coefficients for the single block that comprises the current macroblock can be derived using an average of the motion coefficients of the L, U and UR prediction blocks (macroblock-segmentation—prediction-method pair 70A), or they can be derived from the motion coefficients of prediction block L (pair 70B), prediction block U (pair 70C) or prediction block UR (pair 70D).
  • FIG. 7 also presents some prediction method alternatives for example macroblock segmentations 71 and 72. As can be seen in FIG. 7, each block within a macroblock preferably has its own associated prediction blocks. The blocks within the current macroblock, which are already processed, may themselves act as prediction blocks for other blocks within the same macroblock. As an example, consider the macroblock-segmentation—prediction-method pair 71A, where prediction motion coefficients for each block C1 and C2 within the current macroblock are derived from an average of the motion coefficients of the block-specific prediction blocks. In this prediction method block C1 acts as a prediction block for the block C2. The macroblock-segmentation—prediction-method pairs 71B, 71C, 71D and 71E are further examples of possible prediction methods related to the macroblock segmentation 71. Similarly, various prediction method alternatives are presented for macroblock segmentation 72.
  • In a method according to the first preferred embodiment of the invention, usually the Lagrangian cost function for each of the macroblock-segmentation—prediction-method pairs 70A, 70B, 70C, 70D, 71A, 71B, 71C, 71D, 71E, 72A, 72B, 72C and 72D is evaluated and then the pair minimizing the cost function is chosen as the actual macroblock segmentation used in encoding the macroblock, as described above in connection with an encoding method according to the invention.
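The selection step can be sketched as follows, assuming the cost takes the common Lagrangian form of distortion plus a multiplier times the bit rate. The distortion and rate figures and the value of the multiplier are invented for illustration, and only a few of the pairs are listed.

```python
def lagrangian_cost(distortion, rate, lam):
    """Assumed cost form: distortion + lambda * rate."""
    return distortion + lam * rate

# Hypothetical (distortion, rate in bits) measured for some candidate pairs.
candidates = {
    "70A": (1200.0, 10),
    "70B": (1200.0, 14),
    "71B": (900.0, 22),
    "72B": (1000.0, 30),
}
lam = 20.0  # hypothetical Lagrangian multiplier

costs = {pair: lagrangian_cost(d, r, lam) for pair, (d, r) in candidates.items()}
best_pair = min(costs, key=costs.get)  # pair with the smallest cost
```

Here pair 71B wins: its lower distortion outweighs the extra bits needed for two blocks at this value of the multiplier. A larger multiplier would favour the cheaper single-block pairs.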
  • Furthermore, it is possible that the segmentation of the neighboring macroblocks affects the number of the macroblock-segmentation—prediction-method pairs available for the current macroblock. In other words, the segmentation of the neighboring macroblocks may lead to a situation in which some of the pairs illustrated in FIG. 7 cannot be used for a current macroblock or where some extra macroblock-segmentation—prediction-method pairs are available for the current macroblock. If the macroblock segmentation of neighboring macroblocks limits the selection of the macroblock-segmentation—prediction-method pairs available for a certain macroblock segmentation to, for example, only one macroblock-segmentation—prediction-method pair, it may be unnecessary to transmit information indicating the selected prediction method in addition to the information indicating the segmentation of the current macroblock. The decoding entity can conclude the prediction method from the segmentation of the previously received macroblocks when, for example, a method according to the first preferred embodiment of the invention is used.
  • In a method according to a second preferred embodiment of the invention, there is only one available prediction method per macroblock segmentation. In this case, the information indicating a selected macroblock segmentation can be used to indicate implicitly the selected prediction method (cf. step 412 in FIG. 4). Typically in this case the cost function is evaluated in the encoding process for each available macroblock-segmentation—prediction-method pair, and the pair minimizing the cost function is selected for use in encoding the current macroblock. FIG. 8 illustrates an example of a plurality of macroblock-segmentation—prediction-method pairs that can be used in a method according to the second preferred embodiment.
  • FIG. 8 illustrates six possible macroblock segmentations: single block (macroblock segmentation 70), macroblock is divided once with a vertical dividing line (71) or with a horizontal dividing line (72), macroblock is divided once with a vertical dividing line and once with a horizontal dividing line (83), macroblock is divided once with a vertical dividing line and thrice with a horizontal dividing line (84), and macroblock is divided thrice with a vertical dividing line and once with a horizontal dividing line (85). As in FIGS. 6 and 7, the small black squares in FIG. 8 illustrate schematically the prediction methods.
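The six segmentations can be written down as grids of blocks. The sketch below assumes a 16×16 macroblock, equally spaced dividing lines and raster-order numbering of the blocks (C1 first); these are illustrative assumptions rather than requirements stated in the text.

```python
# (columns, rows) produced by the dividing lines of each segmentation in FIG. 8.
GRID = {70: (1, 1), 71: (2, 1), 72: (1, 2), 83: (2, 2), 84: (2, 4), 85: (4, 2)}

def blocks_of(segmentation, size=16):
    """Return the blocks as (x, y, width, height) in raster order, assuming equal splits."""
    cols, rows = GRID[segmentation]
    w, h = size // cols, size // rows
    return [(c * w, r * h, w, h) for r in range(rows) for c in range(cols)]

two_columns = blocks_of(71)   # left block C1 and right block C2
eight_blocks = blocks_of(84)  # two columns and four rows, C1..C8
```

For example, `blocks_of(71)` yields two 8×16 blocks, and `blocks_of(84)` yields eight 8×4 blocks.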
  • In this embodiment of the invention, prediction method 70A is associated with macroblock segmentation 70, prediction method 71B is used with macroblock segmentation 71 and prediction method 72B is used with macroblock segmentation 72. The selection of these macroblock-segmentation—prediction-method pairs is quite intuitive. When the current macroblock is segmented using macroblock segmentation 71, it is reasonable to expect that the left block C1 and the right block C2 of the macroblock move somehow differently. It is quite natural to assume that the left block C1 would move in a similar way to the prediction block L and to derive the prediction motion coefficients for block C1 from the motion coefficients of prediction block L of block C1. Similarly, it makes sense to use the motion coefficients of prediction block UR of block C2 in deriving the prediction motion coefficients for the right block C2. Similar reasoning applies to the prediction method associated with macroblock segmentation 72. When the current macroblock is not segmented into smaller blocks (macroblock segmentation 70), it is not clear which of the neighboring blocks would provide good prediction motion coefficients, and the prediction motion coefficients are calculated as an average using the three prediction blocks L, U and UR in prediction method 70A.
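The difference in signalling between the two embodiments can be sketched as follows. The tables list only the pairs named in the text for segmentations 70, 71 and 72; the field names and function names are hypothetical.

```python
# Allowed prediction methods per segmentation (pairs named in the text).
FIRST_EMBODIMENT = {
    70: ["70A", "70B", "70C", "70D"],
    71: ["71A", "71B", "71C", "71D", "71E"],
    72: ["72A", "72B", "72C", "72D"],
}
SECOND_EMBODIMENT = {70: ["70A"], 71: ["71B"], 72: ["72B"]}

def fields_to_transmit(segmentation, method, available):
    """Send the method only when the segmentation does not already imply it."""
    methods = available[segmentation]
    if method not in methods:
        raise ValueError("method is not allowed for this segmentation")
    fields = {"segmentation": segmentation}
    if len(methods) > 1:
        fields["method"] = method
    return fields

def decode_method(fields, available):
    """Recover the prediction method at the receiving end."""
    methods = available[fields["segmentation"]]
    return fields["method"] if "method" in fields else methods[0]

explicit = fields_to_transmit(71, "71C", FIRST_EMBODIMENT)
implicit = fields_to_transmit(71, "71B", SECOND_EMBODIMENT)
```

With one method per segmentation, the segmentation alone is transmitted and the decoder looks the method up in the same table.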
  • In the prediction method related to macroblock segmentation 83, the prediction motion coefficients for each block within the current macroblock are derived as average values using three prediction blocks. For block C4 within the current macroblock, there is no available UR prediction block because that block is not yet processed. Therefore, the prediction motion coefficients for block C4 are derived using blocks C1, C2 and C3 within the current macroblock. The prediction motion coefficients for blocks C1, C3, C5 and C7 related to macroblock segmentation 84 are derived as averages of the prediction blocks, as specified in FIG. 8. For blocks C2, C4, C6 and C8 related to macroblock segmentation 84, prediction motion coefficients are derived from the motion coefficients of the block on the left-hand side of each block, i.e. blocks C1, C3, C5 and C7 of the current macroblock, respectively. The prediction motion coefficients for the blocks relating to macroblock segmentation 85 are produced as averages, as specified in FIG. 8. Again, there is no UR prediction block available for block C8 in macroblock segmentation 85, and therefore blocks C3, C4 and C7 within the same macroblock are used in producing prediction motion coefficients for that block. A second sensible alternative for the prediction method related to macroblock segmentation 85 is, for example, median prediction for the blocks in the upper row of the macroblock 85 and subsequent use of the motion coefficients of these blocks to derive prediction motion coefficients for the blocks in the lower row.
  • The number of prediction blocks and the choice of blocks to be used as prediction blocks may further depend on the position of the current macroblock in the frame and on the scanning order of the blocks/macroblocks within the frame. For example, if the encoding process starts from the top left-hand corner of the frame, the block in the top left-hand corner of the frame has no available prediction blocks. Therefore the prediction motion coefficients for this block are usually zero. For the blocks in the upper frame boundary, prediction using a prediction block to the left (prediction block L) is usually applied. For the blocks in the left-hand frame boundary, there are no left (L) prediction blocks available. The motion coefficients of these blocks may be assumed to be zero, if an average prediction is used for the blocks at the left frame boundary. Similarly, for the blocks at the right-hand frame boundary the upper right (UR) prediction block is missing. The prediction motion coefficients for these blocks can be derived, for example, in a manner similar to that described in connection with block C4 of macroblock segmentation 83 in FIG. 8.
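The boundary rules above can be sketched as a single function. It assumes median prediction over translational coefficients and, for a missing UR block, substitutes the upper left neighbour (by analogy with block C4 of segmentation 83); the zero fallback when that neighbour is also missing is an added assumption, as are all names and values.

```python
from statistics import median

ZERO = (0.0, 0.0)

def boundary_prediction(left, up, up_right, up_left=None):
    """Predict motion coefficients; a neighbour passed as None is unavailable."""
    if left is None and up is None:
        return ZERO                # top left corner of the frame
    if up is None:
        return left                # upper frame boundary: use prediction block L
    if left is None:
        left = ZERO                # left frame boundary: missing L treated as zero
    if up_right is None:
        # Right frame boundary: fall back to the upper left block (assumption),
        # or to zero if that is unavailable too (assumption).
        up_right = up_left if up_left is not None else ZERO
    return tuple(median(values) for values in zip(left, up, up_right))

corner = boundary_prediction(None, None, None)
top_row = boundary_prediction((1.0, 2.0), None, None)
left_edge = boundary_prediction(None, (4.0, 2.0), (6.0, -2.0))
right_edge = boundary_prediction((1.0, 1.0), (3.0, 3.0), None, up_left=(2.0, 5.0))
```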
  • The details of prediction methods used in a method according to the invention are not restricted to median prediction or single block predictions. They are presented in the foregoing description as examples. Furthermore, any of the already processed blocks can be used in constructing the prediction motion field/coefficients for a certain block. The macroblock-segmentation—prediction-method pairs discussed above are also presented as examples of feasible pairs. In a method according to the invention the macroblock segmentations, prediction methods and mapping between the macroblock segmentations and prediction methods may be different from those described above.
  • FIG. 9 illustrates an example of a Motion Field Estimation block 11′ and a Motion Field Coding block 12′ according to the invention. FIG. 10 illustrates an example of a Motion Compensated Prediction block 13′/21′ according to the invention. An encoder according to the invention typically comprises all these blocks, and a decoder according to the invention typically comprises a Motion Compensated Prediction block 21′.
  • In the Motion Field Estimation block 11′ there is a Macroblock Segmentation block 111, which segments an incoming macroblock into blocks. The Available Macroblock Segmentations block 112 comprises information about the possible macroblock segmentations S_k. In FIG. 9 the number of possible macroblock segmentations is illustrated by presenting each segmentation as an arrow heading away from the Macroblock Segmentation block 111. The various macroblock segmentations are processed in a Motion Vector Field Estimation block 113, and the initial motion coefficients a_0^i, ..., a_n^i, b_0^i, ..., b_n^i corresponding to each macroblock segmentation are further transmitted to the Motion Field Coding block 12′. There the Motion Vector Field Coding block 121 codes the estimated motion fields relating to each segmentation. The Segmentation—Prediction Method Mapping block 122 is responsible for indicating to the Prediction Motion Field block 123 the correct prediction method related to each macroblock segmentation. In the Difference Motion Coefficient Construction block 124 the motion fields of the blocks are presented as difference motion coefficients. The costs of the macroblock-segmentation—prediction-method pairs are calculated in the Macroblock Segmentation Selection block 125, and the most appropriate macroblock-segmentation—prediction-method pair is selected. The difference motion coefficients and some information indicating the selected segmentation are transmitted further. The information indicating the selected segmentation may also be implicit. For example, if there is only one macroblock segmentation producing four blocks and the format of the transmitted data reveals to the receiver that it is receiving four pairs of difference motion coefficients relating to a certain macroblock, it can determine the correct segmentation.
If there are various available prediction methods per macroblock segmentation, there may be a need to transmit some information that also indicates the selected prediction method. Information about the prediction error frame is typically also transmitted to the decoder, to enable an accurate reconstruction of the image.
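The implicit indication of the segmentation can be sketched as a lookup on the number of received coefficient pairs. The block counts follow from the segmentations described for FIG. 8; the function name and the choice to return None for an ambiguous count are illustrative assumptions.

```python
# Number of blocks produced by each segmentation of FIG. 8.
BLOCK_COUNT = {70: 1, 71: 2, 72: 2, 83: 4, 84: 8, 85: 8}

def infer_segmentation(num_pairs, block_count=BLOCK_COUNT):
    """Return the segmentation implied by the number of difference-coefficient pairs,
    or None when the count does not identify a single segmentation."""
    matches = [seg for seg, n in block_count.items() if n == num_pairs]
    return matches[0] if len(matches) == 1 else None

four_pairs = infer_segmentation(4)  # only segmentation 83 produces four blocks
two_pairs = infer_segmentation(2)   # 71 and 72 both produce two blocks
```

In this set, four pairs identify segmentation 83, whereas two or eight pairs would still need explicit segmentation information.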
  • The Motion Compensated Prediction block 13′/21′ receives information about difference motion coefficients and (implicit or explicit) information about the segmentation of a macroblock. It may also receive information about the selected prediction method if there is more than one prediction method available per macroblock segmentation. The segmentation information is used to produce correct prediction motion coefficients in the Prediction Motion Coefficient Construction block 131. The Segmentation—Prediction Method Mapping block 132 is used to store information about the allowed pairs of macroblock segmentations and prediction methods. The constructed prediction motion coefficients and received difference motion coefficients are used to construct the motion coefficients in the Motion Coefficient Construction block 133. The motion coefficients are transmitted further to a Motion Vector Field Decoding block 134.
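The relationship between the encoder's difference coefficients and the decoder's reconstruction can be sketched as a round trip, assuming that the difference is a plain coefficient-wise subtraction of the prediction. The function names and values are hypothetical.

```python
def difference_coefficients(motion, prediction):
    """Encoder side: motion coefficients minus prediction motion coefficients."""
    return tuple(m - p for m, p in zip(motion, prediction))

def reconstruct_coefficients(prediction, difference):
    """Decoder side: prediction motion coefficients plus received differences."""
    return tuple(p + d for p, d in zip(prediction, difference))

motion = (3.5, -1.25)     # hypothetical estimated motion coefficients of a block
prediction = (3.0, -1.0)  # hypothetical prediction from neighbouring blocks

difference = difference_coefficients(motion, prediction)
restored = reconstruct_coefficients(prediction, difference)
```

The round trip only works if encoder and decoder build the same prediction, which is why both sides need the same mapping between segmentations and prediction methods.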
  • An encoder or a decoder according to the invention can be realized using hardware or software, or using a suitable combination of both. An encoder or decoder implemented in software may be, for example, a separate program or a software building block that can be used by various programs. In the above description and in the drawings the functional blocks are represented as separate units, but the functionality of these blocks can be implemented, for example, in one software program unit.
  • It is also possible to implement an encoder according to the invention and a decoder according to the invention in one functional unit. Such a unit is called a codec. A codec according to the invention may be a computer program or a computer program element, or it may be implemented at least partly using hardware.
  • FIG. 11 shows a mobile station MS according to an embodiment of the invention. A central processing unit, microprocessor μP, controls the blocks responsible for different functions of the mobile station: a random access memory RAM, a radio frequency block RF, a read only memory ROM, a user interface UI having a display DPL and a keyboard KBD, and a digital camera block CAM. The microprocessor's operating instructions, that is, the program code and the mobile station's basic functions, have been stored in the mobile station in advance, for example during the manufacturing process, in the ROM. In accordance with its program, the microprocessor uses the RF block for transmitting and receiving messages on a radio path. The microprocessor monitors the state of the user interface UI and controls the digital camera block CAM. In response to a user command, the microprocessor instructs the camera block CAM to record a digital image into the RAM. Once the image is captured or alternatively during the capturing process, the microprocessor segments the image into image segments and performs motion compensated encoding for the segments in order to generate a compressed image as explained in the foregoing description. A user may command the mobile station to display the image on its display or to send the compressed image using the RF block to another mobile station, a wired telephone or another telecommunications device. In a preferred embodiment, such transmission of image data is started as soon as the first segment is encoded so that the recipient can start a corresponding decoding process with a minimum delay. In an alternative embodiment, the mobile station comprises an encoder block ENC dedicated to encoding and possibly also to decoding digital video data.
  • FIG. 12 is a schematic diagram of a mobile telecommunications network according to an embodiment of the invention. Mobile stations MS are in communication with base stations BTS by means of a radio link. The base stations BTS are further connected, through a so-called Abis interface, to a base station controller BSC, which controls and manages several base stations. The entity formed by a number of base stations BTS (typically, by a few dozen base stations) and a single base station controller BSC, controlling the base stations, is called a base station subsystem BSS. Particularly, the base station controller BSC manages radio communication channels and handovers. On the other hand, the base station controller BSC is connected, through a so-called A interface, to a mobile services switching centre MSC, which co-ordinates the formation of connections to and from mobile stations. A further connection is made, through the mobile services switching centre MSC, to outside the mobile communications network. Outside the mobile communications network there may further reside other network(s) connected to the mobile communications network by gateway(s) GTW, for example the Internet or a Public Switched Telephone Network (PSTN). In such an external network, or in the telecommunications network, there may be located other video decoding or encoding stations, such as computers PC. In an embodiment of the invention, the mobile telecommunications network comprises a video server VSRVR to provide video data to an MS subscribing to such a service. This video data is compressed using the motion compensated video compression method as described earlier in this document. The video server may function as a gateway to an online video source or it may comprise previously recorded video clips.
Typical videotelephony applications may involve, for example, two mobile stations or one mobile station MS and a videotelephone connected to the PSTN, a PC connected to the Internet or an H.261-compatible terminal connected either to the Internet or to the PSTN.
  • In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention. While a number of preferred embodiments of the invention have been described in detail, it should be apparent that many modifications and variations thereto are possible, all of which fall within the true spirit and scope of the invention.

Claims (1)

1. A method for encoding video information, comprising the steps of:
segmenting a piece of current video information into macroblocks,
defining a certain number of available macroblock segmentations for segmenting a macroblock into blocks,
defining for each available macroblock segmentation at least one available prediction method, each of which prediction methods produces prediction motion coefficients for blocks within said macroblock, resulting in a certain finite number of available macroblock-segmentation—prediction-method pairs,
selecting for a macroblock one of the available macroblock-segmentation—prediction-method pairs, and
segmenting the macroblock into blocks and producing prediction motion coefficients for the blocks within said macroblock using the selected macroblock-segmentation—prediction-method pair.
US11/219,917 2000-05-08 2005-09-07 Method for encoding and decoding video information, a motion compensated video encoder and a coresponding decoder Abandoned US20060013317A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/219,917 US20060013317A1 (en) 2000-05-08 2005-09-07 Method for encoding and decoding video information, a motion compensated video encoder and a coresponding decoder

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US09/566,020 US6711211B1 (en) 2000-05-08 2000-05-08 Method for encoding and decoding video information, a motion compensated video encoder and a corresponding decoder
US10/770,986 US6954502B2 (en) 2000-05-08 2004-02-03 Method for encoding and decoding video information, a motion compensated video encoder and a corresponding decoder
US11/219,917 US20060013317A1 (en) 2000-05-08 2005-09-07 Method for encoding and decoding video information, a motion compensated video encoder and a coresponding decoder

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US10/770,986 Continuation US6954502B2 (en) 2000-05-08 2004-02-03 Method for encoding and decoding video information, a motion compensated video encoder and a corresponding decoder

Publications (1)

Publication Number Publication Date
US20060013317A1 (en) 2006-01-19

Family

ID=24261115

Family Applications (3)

Application Number Title Priority Date Filing Date
US09/566,020 Expired - Lifetime US6711211B1 (en) 2000-05-08 2000-05-08 Method for encoding and decoding video information, a motion compensated video encoder and a corresponding decoder
US10/770,986 Expired - Lifetime US6954502B2 (en) 2000-05-08 2004-02-03 Method for encoding and decoding video information, a motion compensated video encoder and a corresponding decoder
US11/219,917 Abandoned US20060013317A1 (en) 2000-05-08 2005-09-07 Method for encoding and decoding video information, a motion compensated video encoder and a coresponding decoder

Family Applications Before (2)

Application Number Title Priority Date Filing Date
US09/566,020 Expired - Lifetime US6711211B1 (en) 2000-05-08 2000-05-08 Method for encoding and decoding video information, a motion compensated video encoder and a corresponding decoder
US10/770,986 Expired - Lifetime US6954502B2 (en) 2000-05-08 2004-02-03 Method for encoding and decoding video information, a motion compensated video encoder and a corresponding decoder

Country Status (13)

Country Link
US (3) US6711211B1 (en)
EP (1) EP1282982B1 (en)
JP (1) JP4369090B2 (en)
KR (1) KR100772576B1 (en)
CN (1) CN100581266C (en)
AU (1) AU2001258472A1 (en)
BR (1) BR0110627A (en)
CA (1) CA2408364C (en)
EE (1) EE05487B1 (en)
HU (1) HU229589B1 (en)
MX (1) MXPA02010964A (en)
WO (1) WO2001086962A1 (en)
ZA (1) ZA200208767B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080272994A1 (en) * 2007-05-03 2008-11-06 Novatek Microelectronics Corp. Apparatus for controlling the liquid crystal display
WO2010018916A1 (en) * 2008-08-11 2010-02-18 에스케이 텔레콤주식회사 Moving image coding device and method
US20110026845A1 (en) * 2008-04-15 2011-02-03 France Telecom Prediction of images by prior determination of a family of reference pixels, coding and decoding using such a prediction
US20120219232A1 (en) * 2009-10-20 2012-08-30 Tomoyuki Yamamoto Image encoding apparatus, image decoding apparatus, and data structure of encoded data
US20140169444A1 (en) * 2012-12-14 2014-06-19 Microsoft Corporation Image sequence encoding/decoding using motion fields
US11671603B2 (en) 2018-12-28 2023-06-06 Electronics Telecommunications Research Institute Image encoding/decoding method and apparatus, and recording medium storing bitstream

Families Citing this family (105)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7139317B2 (en) * 1999-04-17 2006-11-21 Altera Corporation Segment-based encoding system using exposed area filling performed by an encoder and a decoder
US8406301B2 (en) 2002-07-15 2013-03-26 Thomson Licensing Adaptive weighting of reference pictures in video encoding
EP1547264A4 (en) 2002-10-01 2010-11-03 Thomson Licensing Implicit weighting of reference pictures in a video decoder
US7801217B2 (en) 2002-10-01 2010-09-21 Thomson Licensing Implicit weighting of reference pictures in a video encoder
JP4240283B2 (en) 2002-10-10 2009-03-18 ソニー株式会社 Decoding device and decoding method
US8824553B2 (en) * 2003-05-12 2014-09-02 Google Inc. Video compression method
CN100344163C (en) * 2004-06-16 2007-10-17 华为技术有限公司 Video coding-decoding processing method
FR2872975A1 (en) 2004-07-06 2006-01-13 Thomson Licensing Sa METHOD AND DEVICE FOR CHOOSING AN ENCODING MODE
CN100359953C (en) * 2004-09-08 2008-01-02 华为技术有限公司 Image chroma prediction based on code in frame
JP4736456B2 (en) * 2005-02-15 2011-07-27 株式会社日立製作所 Scanning line interpolation device, video display device, video signal processing device
KR100746022B1 (en) * 2005-06-14 2007-08-06 삼성전자주식회사 Method and apparatus for encoding video signal with improved compression efficiency using model switching of sub pixel's motion estimation
JP2007116351A (en) 2005-10-19 2007-05-10 Ntt Docomo Inc Image prediction coding apparatus, image prediction decoding apparatus, image prediction coding method, image prediction decoding method, image prediction coding program, and image prediction decoding program
KR100750137B1 (en) * 2005-11-02 2007-08-21 삼성전자주식회사 Method and apparatus for encoding and decoding image
KR101365570B1 (en) 2007-01-18 2014-02-21 삼성전자주식회사 Method and apparatus for encoding and decoding based on intra prediction
KR101366241B1 (en) * 2007-03-28 2014-02-21 삼성전자주식회사 Method and apparatus for video encoding and decoding
JP4461165B2 (en) * 2007-09-26 2010-05-12 株式会社東芝 Image processing apparatus, method, and program
KR101337206B1 (en) 2007-10-12 2013-12-06 삼성전자주식회사 System and method for mostion estimation of image using block sampling
US8908765B2 (en) * 2007-11-15 2014-12-09 General Instrument Corporation Method and apparatus for performing motion estimation
BRPI0907748A2 (en) * 2008-02-05 2015-07-21 Thomson Licensing Methods and apparatus for implicit block segmentation in video encoding and decoding
US8259794B2 (en) * 2008-08-27 2012-09-04 Alexander Bronstein Method and system for encoding order and frame type selection optimization
US8385404B2 (en) 2008-09-11 2013-02-26 Google Inc. System and method for video encoding using constructed reference frame
US8325796B2 (en) 2008-09-11 2012-12-04 Google Inc. System and method for video coding using adaptive segmentation
US8326075B2 (en) 2008-09-11 2012-12-04 Google Inc. System and method for video encoding using adaptive loop filter
JP4957780B2 (en) * 2009-11-20 2012-06-20 カシオ計算機株式会社 Motion compensated predictive coding apparatus, motion compensated predictive coding method, and program
CN102447895B (en) 2010-09-30 2013-10-02 华为技术有限公司 Scanning method, scanning device, anti-scanning method and anti-scanning device
US9532059B2 (en) 2010-10-05 2016-12-27 Google Technology Holdings LLC Method and apparatus for spatial scalability for video coding
JP5594841B2 (en) * 2011-01-06 2014-09-24 Kddi株式会社 Image encoding apparatus and image decoding apparatus
CN102611884B (en) * 2011-01-19 2014-07-09 华为技术有限公司 Image encoding and decoding method and encoding and decoding device
US8891626B1 (en) 2011-04-05 2014-11-18 Google Inc. Center of motion for encoding motion fields
US8693547B2 (en) 2011-04-06 2014-04-08 Google Inc. Apparatus and method for coding using motion vector segmentation
US9154799B2 (en) 2011-04-07 2015-10-06 Google Inc. Encoding and decoding motion via image segmentation
US8638854B1 (en) 2011-04-07 2014-01-28 Google Inc. Apparatus and method for creating an alternate reference frame for video compression using maximal differences
US8989256B2 (en) * 2011-05-25 2015-03-24 Google Inc. Method and apparatus for using segmentation-based coding of prediction information
US9094689B2 (en) 2011-07-01 2015-07-28 Google Technology Holdings LLC Motion vector prediction design simplification
US8885706B2 (en) 2011-09-16 2014-11-11 Google Inc. Apparatus and methodology for a video codec system with noise reduction capability
US9185428B2 (en) 2011-11-04 2015-11-10 Google Technology Holdings LLC Motion vector scaling for non-uniform motion vector grid
US9247257B1 (en) 2011-11-30 2016-01-26 Google Inc. Segmentation based entropy encoding and decoding
US9014265B1 (en) 2011-12-29 2015-04-21 Google Inc. Video coding using edge detection and block partitioning for intra prediction
US8908767B1 (en) 2012-02-09 2014-12-09 Google Inc. Temporal motion vector prediction
US9262670B2 (en) 2012-02-10 2016-02-16 Google Inc. Adaptive region of interest
US9094681B1 (en) 2012-02-28 2015-07-28 Google Inc. Adaptive segmentation
US9131073B1 (en) 2012-03-02 2015-09-08 Google Inc. Motion estimation aided noise reduction
US9609341B1 (en) 2012-04-23 2017-03-28 Google Inc. Video data encoding and decoding using reference picture lists
US9426459B2 (en) 2012-04-23 2016-08-23 Google Inc. Managing multi-reference picture buffers and identifiers to facilitate video data coding
US9172970B1 (en) 2012-05-29 2015-10-27 Google Inc. Inter frame candidate selection for a video encoder
US9014266B1 (en) 2012-06-05 2015-04-21 Google Inc. Decimated sliding windows for multi-reference prediction in video coding
US11317101B2 (en) 2012-06-12 2022-04-26 Google Inc. Inter frame candidate selection for a video encoder
US9344729B1 (en) 2012-07-11 2016-05-17 Google Inc. Selective prediction signal filtering
US9332276B1 (en) 2012-08-09 2016-05-03 Google Inc. Variable-sized super block based direct prediction mode
US9380298B1 (en) 2012-08-10 2016-06-28 Google Inc. Object-based intra-prediction
US9288484B1 (en) 2012-08-30 2016-03-15 Google Inc. Sparse coding dictionary priming
US9407915B2 (en) 2012-10-08 2016-08-02 Google Inc. Lossless video coding with sub-frame level optimal quantization values
US9756346B2 (en) 2012-10-08 2017-09-05 Google Inc. Edge-selective intra coding
US9503746B2 (en) 2012-10-08 2016-11-22 Google Inc. Determine reference motion vectors
US9485515B2 (en) 2013-08-23 2016-11-01 Google Inc. Video coding using reference motion vectors
US9210432B2 (en) 2012-10-08 2015-12-08 Google Inc. Lossless inter-frame video coding
US9369732B2 (en) 2012-10-08 2016-06-14 Google Inc. Lossless intra-prediction video coding
US9210424B1 (en) 2013-02-28 2015-12-08 Google Inc. Adaptive prediction block size in video coding
US9300906B2 (en) 2013-03-29 2016-03-29 Google Inc. Pull frame interpolation
US9756331B1 (en) 2013-06-17 2017-09-05 Google Inc. Advance coded reference prediction
US9313493B1 (en) 2013-06-27 2016-04-12 Google Inc. Advanced motion estimation
US9438910B1 (en) 2014-03-11 2016-09-06 Google Inc. Affine motion prediction in video coding
US9392272B1 (en) 2014-06-02 2016-07-12 Google Inc. Video coding using adaptive source variance based partitioning
US9578324B1 (en) 2014-06-27 2017-02-21 Google Inc. Video coding using statistical-based spatially differentiated partitioning
US9153017B1 (en) 2014-08-15 2015-10-06 Google Inc. System and method for optimized chroma subsampling
US10102613B2 (en) 2014-09-25 2018-10-16 Google Llc Frequency-domain denoising
US9807416B2 (en) 2015-09-21 2017-10-31 Google Inc. Low-latency two-pass video coding
US10560712B2 (en) 2016-05-16 2020-02-11 Qualcomm Incorporated Affine motion prediction for video coding
CN109196864B (en) 2016-05-24 2023-07-11 Electronics And Telecommunications Research Institute Image encoding/decoding method and recording medium therefor
KR20180001485A (en) 2016-06-24 2018-01-04 Electronics And Telecommunications Research Institute Method and apparatus for encoding and decoding a video image based on transform
CN116614639A (en) 2016-07-12 2023-08-18 Electronics And Telecommunications Research Institute Image encoding/decoding method and recording medium therefor
CN116708785A (en) 2016-07-12 2023-09-05 Electronics And Telecommunications Research Institute Image encoding/decoding method and recording medium therefor
CN116567226A (en) 2016-08-11 2023-08-08 LX Semicon Co., Ltd. Image encoding/decoding apparatus and image data transmitting apparatus
CN109804626B (en) 2016-10-04 2023-10-10 Intellectual Discovery Co., Ltd. Method and apparatus for encoding and decoding image and recording medium for storing bit stream
US10448010B2 (en) 2016-10-05 2019-10-15 Qualcomm Incorporated Motion vector prediction for affine motion models in video coding
US11575885B2 (en) 2016-10-11 2023-02-07 Electronics And Telecommunications Research Institute Image encoding/decoding method and apparatus and recording medium for storing bitstream
WO2018097607A1 (en) 2016-11-22 2018-05-31 Electronics And Telecommunications Research Institute Image encoding/decoding image method and device, and recording medium storing bit stream
KR102283517B1 (en) 2016-11-28 2021-07-29 Electronics And Telecommunications Research Institute Method and apparatus for encoding/decoding image and recording medium for storing bitstream
US10462482B2 (en) 2017-01-31 2019-10-29 Google Llc Multi-reference compound prediction of a block using a mask mode
EP4351139A2 (en) 2017-06-09 2024-04-10 Electronics and Telecommunications Research Institute Video encoding/decoding method and device, and recording medium storing bit stream
WO2019022568A1 (en) 2017-07-28 2019-01-31 Electronics And Telecommunications Research Institute Image processing method, and image encoding/decoding method and device which use same
CN115442607A (en) 2017-07-31 2022-12-06 Electronics And Telecommunications Research Institute Method for encoding and decoding image and computer readable medium storing bit stream
CN117499683A (en) 2017-09-20 2024-02-02 Electronics And Telecommunications Research Institute Method and apparatus for encoding/decoding image
US11877001B2 (en) 2017-10-10 2024-01-16 Qualcomm Incorporated Affine prediction in video coding
CN111279695B (en) 2017-10-26 2024-03-29 Electronics And Telecommunications Research Institute Method and apparatus for asymmetric subblock-based image encoding/decoding
EP3737093A4 (en) 2017-11-28 2022-02-09 Electronics and Telecommunications Research Institute Image encoding/decoding method and device, and recording medium stored with bitstream
CN115802034A (en) 2017-11-29 2023-03-14 Electronics And Telecommunications Research Institute Image encoding/decoding method and apparatus using in-loop filtering
KR20190080805A (en) 2017-12-28 2019-07-08 Electronics And Telecommunications Research Institute Method and apparatus for encoding/decoding image and recording medium for storing bitstream
WO2019147067A1 (en) 2018-01-26 2019-08-01 Electronics And Telecommunications Research Institute Method and apparatus for image encoding and image decoding using temporal motion information
US11425390B2 (en) 2018-01-26 2022-08-23 Electronics And Telecommunications Research Institute Method and apparatus for image encoding and image decoding using temporal motion information
CA3194780A1 (en) 2018-03-19 2019-09-26 University-Industry Cooperation Group Of Kyung Hee University Method and apparatus for encoding/decoding image using geometrically modified reference picture
CN116866563A (en) 2018-03-21 2023-10-10 LX Semicon Co., Ltd. Image encoding/decoding method, storage medium, and image data transmission method
WO2019190224A1 (en) 2018-03-30 2019-10-03 Electronics And Telecommunications Research Institute Image encoding/decoding method and device, and recording medium in which bitstream is stored
WO2020004912A1 (en) 2018-06-25 2020-01-02 Electronics And Telecommunications Research Institute Method and apparatus for encoding/decoding image using quantization parameter, and recording medium storing bitstream
CA3108468A1 (en) 2018-08-06 2020-02-13 Electronics And Telecommunications Research Institute Image encoding/decoding method and device, and recording medium storing bitstream
KR20200033211A (en) 2018-09-19 2020-03-27 Electronics And Telecommunications Research Institute Method and apparatus for image encoding/decoding using boundary handling and recording medium for storing bitstream
US11729383B2 (en) 2018-09-19 2023-08-15 Electronics And Telecommunications Research Institute Image encoding/decoding method and apparatus, and recording medium storing bitstream
EP3855747A4 (en) 2018-09-20 2022-06-15 Electronics and Telecommunications Research Institute Image encoding/decoding method and device, and recording medium storing bitstream
WO2020060261A1 (en) 2018-09-20 2020-03-26 Electronics And Telecommunications Research Institute Method and device for encoding/decoding image, and recording medium for storing bitstream
CN112740694A (en) 2018-09-21 2021-04-30 Electronics And Telecommunications Research Institute Method and apparatus for encoding/decoding image and recording medium for storing bitstream
US11616946B2 (en) 2018-09-21 2023-03-28 Electronics And Telecommunications Research Institute Image encoding/decoding method, device, and recording medium having bitstream stored therein
US11496737B2 (en) 2018-10-05 2022-11-08 Electronics And Telecommunications Research Institute Image encoding/decoding method and apparatus, and recording medium storing bitstream
US11838540B2 (en) 2018-12-21 2023-12-05 Electronics And Telecommunications Research Institute Image encoding/decoding method and device, and recording medium in which bitstream is stored
WO2020139008A1 (en) 2018-12-28 2020-07-02 Electronics And Telecommunications Research Institute Video encoding/decoding method, apparatus, and recording medium having bitstream stored thereon
KR20200083339A (en) 2018-12-31 2020-07-08 Electronics And Telecommunications Research Institute Method and apparatus for encoding/decoding image and recording medium for storing bitstream

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5412435A (en) * 1992-07-03 1995-05-02 Kokusai Denshin Denwa Kabushiki Kaisha Interlaced video signal motion compensation prediction system
US5453799A (en) * 1993-11-05 1995-09-26 Comsat Corporation Unified motion estimation architecture
US5563895A (en) * 1992-10-30 1996-10-08 Nokia Mobile Phones Ltd. Digital mobile radio communication system
US6031826A (en) * 1996-08-27 2000-02-29 Ericsson Inc. Fast associated control channel technique for satellite communications
US6323914B1 (en) * 1999-04-20 2001-11-27 Lsi Logic Corporation Compressed video recording device with integrated effects processing
US6526096B2 (en) * 1996-09-20 2003-02-25 Nokia Mobile Phones Limited Video coding system for estimating a motion vector field by using a series of motion estimators of varying complexity
US6563872B2 (en) * 1996-10-30 2003-05-13 Hitachi, Ltd. Method and apparatus for image coding
US6847684B1 (en) * 2000-06-01 2005-01-25 Hewlett-Packard Development Company, L.P. Zero-block encoding

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3169783B2 (en) * 1995-02-15 2001-05-28 NEC Corporation Video encoding / decoding system
US6128047A (en) 1998-05-20 2000-10-03 Sony Corporation Motion estimation process and system using sparse search block-matching and integral projection

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080272994A1 (en) * 2007-05-03 2008-11-06 Novatek Microelectronics Corp. Apparatus for controlling the liquid crystal display
US20110026845A1 (en) * 2008-04-15 2011-02-03 France Telecom Prediction of images by prior determination of a family of reference pixels, coding and decoding using such a prediction
US8787693B2 (en) * 2008-04-15 2014-07-22 Orange Prediction of images by prior determination of a family of reference pixels, coding and decoding using such a prediction
WO2010018916A1 (en) * 2008-08-11 2010-02-18 SK Telecom Co., Ltd. Moving image coding device and method
KR101085963B1 (en) 2008-08-11 2011-11-22 Sk Planet Co., Ltd. Apparatus and Method for encoding video
US8705609B2 (en) 2008-08-11 2014-04-22 Sk Planet Co., Ltd. Moving image coding device and method
USRE47004E1 (en) 2008-08-11 2018-08-21 Sk Planet Co., Ltd. Moving image coding device and method
USRE48451E1 (en) 2008-08-11 2021-02-23 Sk Planet Co., Ltd. Moving image coding device and method
US20120219232A1 (en) * 2009-10-20 2012-08-30 Tomoyuki Yamamoto Image encoding apparatus, image decoding apparatus, and data structure of encoded data
US20140169444A1 (en) * 2012-12-14 2014-06-19 Microsoft Corporation Image sequence encoding/decoding using motion fields
US11671603B2 (en) 2018-12-28 2023-06-06 Electronics And Telecommunications Research Institute Image encoding/decoding method and apparatus, and recording medium storing bitstream

Also Published As

Publication number Publication date
CN100581266C (en) 2010-01-13
ZA200208767B (en) 2003-10-30
CA2408364A1 (en) 2001-11-15
US20040156437A1 (en) 2004-08-12
EE200200627A (en) 2004-04-15
JP2003533142A (en) 2003-11-05
HU229589B1 (en) 2014-02-28
JP4369090B2 (en) 2009-11-18
US6954502B2 (en) 2005-10-11
EP1282982B1 (en) 2020-01-29
KR100772576B1 (en) 2007-11-02
HUP0302617A3 (en) 2005-08-29
EE05487B1 (en) 2011-10-17
WO2001086962A1 (en) 2001-11-15
HUP0302617A2 (en) 2003-11-28
KR20030011325A (en) 2003-02-07
EP1282982A1 (en) 2003-02-12
US6711211B1 (en) 2004-03-23
CA2408364C (en) 2008-07-15
CN1457606A (en) 2003-11-19
AU2001258472A1 (en) 2001-11-20
BR0110627A (en) 2003-03-18
MXPA02010964A (en) 2003-03-27

Similar Documents

Publication Publication Date Title
US6954502B2 (en) Method for encoding and decoding video information, a motion compensated video encoder and a corresponding decoder
EP1206881B1 (en) Apparatus and method for compressing a motion vector field
US8036273B2 (en) Method for sub-pixel value interpolation
EP1228645B1 (en) Adaptive motion vector field coding
US8630340B2 (en) Method for encoding and decoding video information, a motion compensated video encoder and a corresponding decoder
US8428136B2 (en) Dynamic image encoding method and device and program using the same
US20060120455A1 (en) Apparatus for motion estimation of video data
Heising et al. Video coding using spatial extrapolation based motion field segmentation
Matsuda et al. Block-based spatio-temporal prediction for video coding
GB2379820A (en) Interpolating values for sub-pixels
AU2007237319B2 (en) Method for sub-pixel value interpolation

Legal Events

Date Code Title Description
AS Assignment Owner name: NOKIA CORPORATION, FINLAND. Free format text: MERGER; ASSIGNOR: NOKIA MOBILE PHONES LTD.; REEL/FRAME: 019399/0264. Effective date: 2001-10-01
STCB Information on status: application discontinuation Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION