WO2007092193A2 - Method and apparatus for adaptive group of pictures (gop) structure selection - Google Patents

Method and apparatus for adaptive group of pictures (GOP) structure selection

Info

Publication number
WO2007092193A2
Authority
WO
WIPO (PCT)
Prior art keywords
picture
pictures
group
selection
video sequence
Prior art date
Application number
PCT/US2007/002387
Other languages
French (fr)
Other versions
WO2007092193A3 (en)
Inventor
Peng Yin
Jill Macdonald Boyce
Alexandros Tourapis
Original Assignee
Thomson Licensing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US12/087,637 (patent US9602840B2)
Application filed by Thomson Licensing
Priority to EP07763138A (patent EP1982528A2)
Priority to CN2007800043664A (patent CN101379828B)
Priority to BRPI0707419-0A (patent BRPI0707419A2)
Priority to JP2008553288A (patent JP5415084B2)
Publication of WO2007092193A2
Publication of WO2007092193A3

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/87 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving scene cut or scene change detection in combination with video compression
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/114 Adapting the group of pictures [GOP] structure, e.g. number of B-frames between two anchor frames
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/14 Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/142 Detection of scene cut or scene change
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/177 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a group of pictures [GOP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • the present principles relate generally to video encoding and, more particularly, to a method and apparatus for adaptive Group of Pictures (GOP) structure selection.
  • GOP Group of Pictures
  • a Group of Pictures (GOP) structure only involves GOP length (N) and picture type (i.e., P-picture interval M) selection.
  • Such older video coding standards and recommendations include, for example, the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-1 (MPEG-1) standard, the ISO/IEC MPEG-2 standard, the International Telecommunication Union, Telecommunication Sector (ITU-T) H.263 recommendation.
  • a new video compression standard/recommendation the ISO/IEC MPEG-4 Part 10 Advanced Video Coding (AVC) standard/ITU-T H.264 recommendation (hereinafter the "MPEG-4 AVC standard"), provides several new tools to improve coding efficiency.
  • the MPEG-4 AVC standard uses/supports three different picture (slice) types (I, P and B pictures (slices)). Moreover, the MPEG-4 AVC standard includes new tools/features to improve coding efficiency.
  • the MPEG-4 AVC standard decouples the order of reference pictures from the display order.
  • in prior video coding standards and recommendations, there was a strict dependency between the ordering of pictures for motion compensation purposes and the ordering of pictures for display purposes.
  • these restrictions are largely removed, allowing the encoder to choose the reference order and display order with more flexibility.
  • the MPEG-4 AVC standard decouples picture representation methods from picture referencing capability.
  • B pictures cannot be used as references for the prediction of other pictures in the video sequence.
  • the MPEG-4 AVC standard allows multiple reference pictures for motion compensation.
  • Most previous work related to the GOP structure has been concentrated on GOP length and picture type selection.
  • The GOP length is, in general, fixed by the application.
  • when dynamic GOP length is allowed, the first picture after the scene change is coded as an I-picture, and the next GOP is merged to the current GOP.
  • a method is disclosed in which the GOP structure is adapted by taking into account temporal segmentation. That is, picture types are adjusted according to the temporal variation of the input video.
  • the optimal picture type in the GOP may be selected from possible candidates by solving a minimization problem with the Lagrange multiplier method.
  • a system is disclosed wherein macroblock activity information is used to decide picture type.
  • an apparatus includes an encoder for encoding a video sequence using a Group of Pictures structure by performing, for each Group of Pictures for the video sequence, picture coding order selection, picture type selection, and reference picture selection. The selections are based upon a Group of Pictures length.
  • a video encoding method includes encoding a video sequence using a Group of Pictures structure by performing, for each Group of Pictures for the video sequence, picture coding order selection, picture type selection, and reference picture selection. The selections are based upon a Group of Pictures length.
  • FIG. 1 shows a block diagram for an exemplary video encoder to which the present principles may be applied, in accordance with an embodiment of the present principles
  • FIG. 2 shows a flow diagram for an exemplary method for an adaptive Group of Picture (GOP) structure decision, in accordance with an embodiment of the present principles
  • FIG. 3 shows a flow diagram for an exemplary method for performing a Group of Pictures (GOP) length decision, in accordance with an embodiment of the present principles
  • FIG. 4 shows a flow diagram for an exemplary method for determining picture coding order, in accordance with an embodiment of the present principles
  • FIG. 5 shows a flow diagram for an exemplary method for selecting picture type, in accordance with an embodiment of the present principles.
  • the present principles are directed to a method and apparatus for adaptive Group of Pictures (GOP) structure selection.
  • the present description illustrates the present principles. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the present principles and are included within its spirit and scope.
  • processor or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (“DSP”) hardware, read-only memory (“ROM”) for storing software, random access memory (“RAM”), and non-volatile storage.
  • DSP digital signal processor
  • ROM read-only memory
  • RAM random access memory
  • any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
  • any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function.
  • the present principles as defined by such claims reside in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
  • FIG. 1 an exemplary video encoder to which the present principles may be applied is indicated generally by the reference numeral 100.
  • a non-inverting input of a summing junction 110 and a first input of a motion estimator 180 are available as inputs to the video encoder 100.
  • An output of the summing junction 110 is connected in signal communication with an input of a transformer 115.
  • An output of the transformer 115 is connected in signal communication with an input of a quantizer 120.
  • An output of the quantizer 120 is connected in signal communication with an input of a variable length coder (VLC) 140.
  • VLC 140 variable length coder
  • the output of the quantizer 120 is further connected in signal communication with an input of an inverse quantizer 150.
  • An output of the inverse quantizer 150 is connected in signal communication with an input of an inverse transformer.
  • An output of the inverse transformer is connected in signal communication with an input of a reference picture store 170.
  • a first output of the reference picture store 170 is connected in signal communication with a second input of a motion estimator 180.
  • An output of the motion estimator 180 is connected in signal communication with a first input of a motion compensator 190.
  • a second output of the reference picture store 170 is connected in signal communication with a second input of the motion compensator 190.
  • An output of the motion compensator 190 is connected in signal communication with an inverting input of the summing junction 110.
  • a method and apparatus are provided for Group of Pictures (GOP) structure selection.
  • the method and apparatus for GOP structure selection may encompass within the decision process an analysis of GOP length, coding order of picture, picture type selection and picture reference decision. That is, the method and apparatus may jointly consider GOP length, picture coding order, picture type selection and reference picture selection for the GOP structure selection.
  • a shot detection is first performed. Then the GOP length is decided based on the temporal segmentation. Within each GOP, the picture coding order combined with the picture type and reference picture selection is then decided.
  • an exemplary method for an adaptive Group of Picture (GOP) structure decision is indicated generally by the reference numeral 200.
  • GOP Group of Picture
  • the method 200 includes an initialization block 205 that passes control to a function block 210.
  • the function block 210 performs a shot detection, and passes control to a function block 215.
  • the function block 215 decides the GOP length N, and passes control to a function block 220.
  • the function block 220 determines the picture coding order, performs a picture type selection, and passes control to a function block 225.
  • the function block 225 performs reference picture selection (e.g., based on Picture Order Count (POC) and/or correlation), and passes control to a function block 230.
  • the function block 230 encodes the pictures in the GOP, and passes control to a decision block 235.
  • the decision block 235 determines whether or not the sequence has ended. If so, then control is passed to an end block 240. Otherwise, control is returned to the function block 210.
  • GOP length is selected dynamically based on shot detection. Unlike prior methods, where only scene cut is detected, we also detect slow transitions, such as fade and dissolve.
  • the GOP length N is generally fixed by a pre-defined value. If a scene cut is detected, then a new GOP restarts from the first picture after the scene cut with length N. If a slow transition is detected, then a new GOP restarts from the starting point of transition and ends at the ending point of transition.
  • an exemplary method for performing a Group of Pictures (GOP) length decision is indicated generally by the reference numeral 300.
  • the method 300 relates to the function block 215 of the method 200 of FIG. 2.
  • the method 300 includes an initialization block 305 that passes control to a function block 310.
  • the function block 310 performs a shot detection, and passes control to a function block 315.
  • the function block 315 determines whether or not a scene cut has been detected. If so, then control is passed to a function block 335. Otherwise, control is passed to a decision block 320.
  • the function block 335 restarts a new GOP with a pre-defined length N, and passes control to an end block 330.
  • the decision block 320 determines whether or not a slow transition has been detected. If so, then control is passed to a function block 325. Otherwise, control is passed to a function block 340.
  • the function block 325 restarts a new GOP from the starting point of transition and ends at the end point of the transition, and passes control to the end block 330.
  • the function block 340 sets the GOP length to N, and passes control to the end block 330.
  • the picture coding order in each GOP is decided based on the characteristics of the content. For some specific feature, like cross-fades, a reverse coding of the fade-in sequence has higher coding efficiency. The detection of the switching point, from which reverse coding can happen, is considered in two cases.
  • the switching point is set to the minimum of the maximal picture number that can be reversed while satisfying the delay constraint, a Decoded Picture Buffer (DPB) size, and the end picture of the fade-in sequence. Since we code the fade-in as a single GOP, we can reversely code the picture at the beginning of the GOP and at the end of the GOP.
  • DPB Decoded Picture Buffer
  • $\mathrm{distortion}_{start} = \sum_{x}\sum_{y} \left| Y_{F_{cur}}[x,y] - Y_{F_{start}}[x,y] \right|$
  • $\mathrm{distortion}_{end} = \sum_{x}\sum_{y} \left| Y_{F_{cur}}[x,y] - Y_{F_{end}}[x,y] \right|$
  • Y denotes the luminance value of the picture
  • x specifies the column indices of the image
  • y specifies the row indices of the image
  • $Y_{F_{cur}}$ denotes the luminance value of the current frame
  • $Y_{F_{start}}$ denotes the luminance value of the start frame
  • $Y_{F_{end}}$ denotes the luminance value of the end frame.
  • a switching point is flagged as soon as $\mathrm{distortion}_{start} > \mathrm{distortion}_{end}$. Reverse coding is limited by application delay constraints and, in the most open case, by the Decoded Picture Buffer constraints specified in the MPEG-4 AVC standard.
  • an exemplary method for determining picture coding order is indicated generally by the reference numeral 400.
  • the method 400 relates to the function block 220 of the method 200 of FIG. 2.
  • the method 400 includes an initialization block 405 that passes control to a function block 410.
  • the function block 410 performs a shot detection, and passes control to a decision block 415.
  • the decision block 415 determines whether or not a fade-in or dissolve has been detected. If so, then control is passed to a function block 420. Otherwise, control is passed to a function block 425.
  • the function block 420 finds the switching point, and passes control to a function block 425.
  • the function block 425 decides the picture coding order, and passes control to an end block 430.
  • Table 1 illustrates picture type and coding order, where "BS" denotes a stored B picture and "B" denotes a disposable B picture.
  • the normalized distance between two consecutive P pictures in a GOP is used to decide M.
  • M is selected as the value which has the smallest distance.
  • Many distance metrics can be used, such as absolute difference of image, difference of histogram, histogram of difference, block histogram of difference, block variance difference, motion compensation error, and so forth.
  • the present principles are not limited to the use of any particular distance metric and, thus, any distance metric as readily contemplated by one of ordinary skill in this and related arts may be used in accordance with the teachings of the present principles, while maintaining the scope of the present principles.
  • histogram of difference, i.e., the histogram of $Y_n - Y_m$, is denoted by hod(i) where $i \in [-q+1, q-1]$.
  • the distance measure is defined as follows:
  • a is a threshold for determining the closeness of the position to zero.
  • an exemplary method for selecting picture type is indicated generally by the reference numeral 500.
  • the method 500 relates to the function block 220 of the method 200 of FIG. 2.
  • the method 500 includes an initialization block 505 that initializes a variable min_dist to be equal to 0xFFFF, and passes control to a loop limit block 510.
  • the function block 515 calculates the normalized distance norm_dist, and passes control to a decision block 520.
  • the decision block 520 determines whether or not norm_dist < min_dist. If so, then control is passed to a function block 525. Otherwise, control is passed to a loop limit block 530 that ends the loop.
  • the performing of reference picture selection for example, as performed by function block 225 of FIG.
  • Reference picture selection may be performed in two steps.
  • the first step involves deciding if the current encoded picture will be stored as a possible reference picture and which previously stored picture will be removed from the reference buffer.
  • the second step involves selecting the L reference pictures (L is a value pre-defined by the encoder) from the reference list and deciding the order of the reference pictures, which will be used for each P/B picture encoding.
  • the first algorithm is based on the picture order count (POC) and is hereinafter referred to as the "POC algorithm”.
  • the second algorithm is based on a correlation metric and hereinafter referred to as the "CORRELATION algorithm”.
  • the removal of reference pictures is based on the order of POC: the picture with smallest POC number is removed first.
  • the reference list is first reordered and then we select the first L pictures as our reference pictures.
  • the reference list is the same as the initialization list.
  • the reference list is reordered according to the POC order, the same way as the initialization list0 for B pictures.
  • step 1 is the same as POC algorithm.
  • step 2 a correlation metric is adopted for reference picture selection and reordering. The L reference pictures which have the highest correlation with the current picture are used.
  • $YHistoDiff_{ref(j)} = \sum_{i=0}^{nb\_bins-1} \left| YHisto_{cur}(i) - YHisto_{ref(j)}(i) \right|$
  • YHistoDiff is the difference of luminance histogram
  • nb_bins means the number of bins
  • YHisto denotes histogram of luminance
  • ref denotes reference picture
  • cur denotes current picture.
  • a linear weight can be adopted as follows:
  • max_ref_distance denotes the maximum distance from the reference picture in the reference picture buffer to the current picture.
  • d(j) is the distance of reference picture j to current picture I, as defined earlier.
  • one advantage/feature is an apparatus that includes an encoder for encoding a video sequence using a Group of Pictures structure by performing, for each Group of Pictures for the video sequence, picture coding order selection, picture type selection, and reference picture selection. The selections are based upon a Group of Pictures length.
  • Another advantage/feature is the apparatus having the encoder as described above, wherein the encoder performs a shot detection to determine a temporal segmentation of the video sequence, decides the Group of Pictures length based on the temporal segmentation, and, within each of the Group of Pictures for the video sequence, performs the picture coding order selection, the picture type selection, and the reference picture selection.
  • Yet another advantage/feature is the apparatus having the encoder as described above, wherein the encoder sets the Group of Pictures length to a pre-defined value based on the temporal segmentation and absent any of a scene cut or a slow transition in the video sequence, restarts a new Group of Pictures for the video sequence from a first picture after the scene cut with the Group of Pictures length when the scene cut is detected, and restarts the new Group of Pictures from a starting point of the slow transition and ending at an ending point of the slow transition when the slow transition is detected.
  • another advantage/feature is the apparatus having the encoder as described above, wherein the video sequence includes a fade-in sequence, and the encoder uses reverse coding for fades and dissolves in the fade-in sequence. Further, another advantage/feature is the apparatus having the encoder that uses reverse coding as described above, wherein the encoder decides a switching point for the reverse coding based on a transition type. Also, another advantage/feature is the apparatus having the encoder that decides the switching point for the reverse coding as described above, wherein the encoder sets the switching point to a minimum of a maximal picture number that can be reversed while satisfying a delay constraint, a decoded picture buffer constraint, and an end picture of the fade-in sequence, when the transition type is pure fade-in. Additionally, another advantage/feature is the apparatus having the encoder that decides the switching point for the reverse coding as described above, wherein the encoder detects the switching point based on absolute differences of pictures, when the transition type is dissolve.
  • another advantage/feature is the apparatus having the encoder as described above, wherein the encoder selects a picture type from a pre-defined class of picture types, based on a normalized distance.
  • another advantage/feature is the apparatus having the encoder that selects the picture type from the pre-defined class of picture types as described above, wherein selection criteria for selecting the picture type from the pre-defined class of picture types includes at least one of absolute difference of image, difference of histogram, histogram of difference, block histogram of difference, block variance difference, or motion compensation error.
  • Another advantage/feature is the apparatus having the encoder as described above, wherein the encoder performs the reference picture selection based on at least one of Picture Order Count and correlation.
  • the teachings of the present principles are implemented as a combination of hardware and software.
  • the software may be implemented as an application program tangibly embodied on a program storage unit.
  • the application program may be uploaded to, and executed by, a machine comprising any suitable architecture.
  • the machine is implemented on a computer platform having hardware such as one or more central processing units (“CPU"), a random access memory (“RAM”), and input/output ("I/O") interfaces.
  • CPU central processing units
  • I/O input/output
  • the computer platform may also include an operating system and microinstruction code.
  • the various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU.
  • various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit.
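The switching-point test for dissolves listed above (flag the first picture whose distortion to the start frame exceeds its distortion to the end frame) can be sketched as follows. This is a minimal illustration only: the function names `sad` and `find_switching_point`, the representation of a frame as a list of rows of luminance values, and the use of a plain sum of absolute differences are assumptions made for this sketch. The delay and Decoded Picture Buffer limits on reverse coding are not modeled.

```python
def sad(frame_a, frame_b):
    """Sum of absolute luminance differences between two equally sized frames."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
    )


def find_switching_point(frames):
    """Return the index of the first frame that is closer to the end frame
    than to the start frame, or None if no frame satisfies the test.

    `frames` is a list of 2-D luminance arrays (lists of rows) covering
    one detected transition (hypothetical input layout).
    """
    start, end = frames[0], frames[-1]
    for index, current in enumerate(frames):
        distortion_start = sad(current, start)
        distortion_end = sad(current, end)
        # Strict inequality, as in the test stated in the text.
        if distortion_start > distortion_end:
            return index
    return None


# Toy dissolve: five 1x2 frames ramping linearly from 0 to 100.
ramp = [[[v, v]] for v in (0, 25, 50, 75, 100)]
print(find_switching_point(ramp))  # 3
```

In the toy ramp the middle frame is equidistant from both ends, so the strict comparison flags the following frame instead.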

Abstract

There are provided a method and apparatus for adaptive Group of Pictures structure selection. The apparatus includes an encoder (100) for encoding a video sequence using a Group of Pictures structure by performing, for each Group of Pictures for the video sequence, picture coding order selection, picture type selection, and reference picture selection. The selections are based upon a Group of Pictures length.
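Because every per-GOP selection depends on the GOP length, it may help to see how that length could be decided from shot-detection results, following the rules given in the definitions (a scene cut restarts a GOP of pre-defined length N; a slow transition becomes its own GOP spanning the transition; otherwise the length is N). The sketch below is illustrative only: the `ShotInfo` container, its field names, and the `decide_gop` function are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ShotInfo:
    """Hypothetical shot-detection result for the pictures ahead of the encoder."""
    scene_cut_at: Optional[int] = None      # index of the first picture after a cut
    transition_start: Optional[int] = None  # first picture of a fade or dissolve
    transition_end: Optional[int] = None    # last picture of a fade or dissolve


def decide_gop(shot: ShotInfo, position: int, default_length: int) -> Tuple[int, int]:
    """Return (gop_start, gop_length) for the next GOP."""
    # Scene cut: restart a GOP at the first picture after the cut, length N.
    if shot.scene_cut_at is not None:
        return shot.scene_cut_at, default_length
    # Slow transition: the GOP runs from the start to the end of the transition.
    if shot.transition_start is not None and shot.transition_end is not None:
        length = shot.transition_end - shot.transition_start + 1
        return shot.transition_start, length
    # No shot boundary: keep the pre-defined length N.
    return position, default_length


print(decide_gop(ShotInfo(scene_cut_at=7), position=0, default_length=12))
print(decide_gop(ShotInfo(transition_start=4, transition_end=9), position=0, default_length=12))
```

The scene-cut check comes first to mirror the order of the decision blocks described for the GOP length method.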

Description

METHOD AND APPARATUS FOR ADAPTIVE GROUP OF PICTURES (GOP) STRUCTURE SELECTION
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application Serial No. 60/765,552, filed 6 February, 2006, which is incorporated by reference herein in its entirety.
TECHNICAL FIELD
The present principles relate generally to video encoding and, more particularly, to a method and apparatus for adaptive Group of Pictures (GOP) structure selection.
BACKGROUND
In general, in older and current video coding standards and recommendations, a Group of Pictures (GOP) structure only involves GOP length (N) and picture type (i.e., P-picture interval M) selection. Such older video coding standards and recommendations include, for example, the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-1 (MPEG-1) standard, the ISO/IEC MPEG-2 standard, and the International Telecommunication Union, Telecommunication Sector (ITU-T) H.263 recommendation. A new video compression standard/recommendation, the ISO/IEC MPEG-4 Part 10 Advanced Video Coding (AVC) standard/ITU-T H.264 recommendation (hereinafter the "MPEG-4 AVC standard"), provides several new tools to improve coding efficiency.
Similar to older video coding standards and recommendations, the MPEG-4 AVC standard uses/supports three different picture (slice) types (I, P and B pictures (slices)). Moreover, the MPEG-4 AVC standard includes new tools/features to improve coding efficiency.
For example, the MPEG-4 AVC standard decouples the order of reference pictures from the display order. In prior video coding standards and recommendations, there was a strict dependency between the ordering of pictures for motion compensation purposes and the ordering of pictures for display purposes. In the MPEG-4 AVC standard, these restrictions are largely removed, allowing the encoder to choose the reference order and display order with more flexibility. Moreover, the MPEG-4 AVC standard decouples picture representation methods from picture referencing capability. In prior video coding standards and recommendations, B pictures cannot be used as references for the prediction of other pictures in the video sequence. In the MPEG-4 AVC standard, there is no such constraint. Any picture type can be used as a reference picture or a non-reference picture.
Further, the MPEG-4 AVC standard allows multiple reference pictures for motion compensation. With these new features, when a GOP structure is selected, we need to consider not only the GOP length and the picture type selection, but also the coding order of the picture and the reference picture selection.
Most previous work related to the GOP structure has been concentrated on GOP length and picture type selection. The GOP length is, in general, fixed by the application. When dynamic GOP length is allowed, the first picture after the scene change is coded as an I-picture, and the next GOP is merged to the current GOP.
In a first prior art approach, a method is disclosed in which the GOP structure is adapted by taking into account temporal segmentation. That is, picture types are adjusted according to the temporal variation of the input video.
In a second prior art approach, it is disclosed that the optimal picture type in the GOP may be selected from possible candidates by solving a minimization problem with the Lagrange multiplier method.
In a third prior art approach, a system is disclosed wherein macroblock activity information is used to decide picture type.
As noted above, most of the prior art related to the GOP structure has only concentrated on GOP length and picture type selection. However, the consideration of only GOP length and picture type selection disadvantageously limits the flexibility of the MPEG-4 AVC standard.
SUMMARY
These and other drawbacks and disadvantages of the prior art are addressed by the present principles, which are directed to a method and apparatus for adaptive Group of Pictures (GOP) structure selection.
According to an aspect of the present principles, there is provided an apparatus. The apparatus includes an encoder for encoding a video sequence using a Group of Pictures structure by performing, for each Group of Pictures for the video sequence, picture coding order selection, picture type selection, and reference picture selection. The selections are based upon a Group of Pictures length.
According to another aspect of the present principles, there is provided a video encoding method. The method includes encoding a video sequence using a Group of Pictures structure by performing, for each Group of Pictures for the video sequence, picture coding order selection, picture type selection, and reference picture selection. The selections are based upon a Group of Pictures length.
These and other aspects, features and advantages of the present principles will become apparent from the following detailed description of exemplary embodiments, which is to be read in connection with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The present principles may be better understood in accordance with the following exemplary figures, in which:
FIG. 1 shows a block diagram for an exemplary video encoder to which the present principles may be applied, in accordance with an embodiment of the present principles;
FIG. 2 shows a flow diagram for an exemplary method for an adaptive Group of Pictures (GOP) structure decision, in accordance with an embodiment of the present principles;
FIG. 3 shows a flow diagram for an exemplary method for performing a Group of Pictures (GOP) length decision, in accordance with an embodiment of the present principles;
FIG. 4 shows a flow diagram for an exemplary method for determining picture coding order, in accordance with an embodiment of the present principles; and
FIG. 5 shows a flow diagram for an exemplary method for selecting picture type, in accordance with an embodiment of the present principles.
DETAILED DESCRIPTION
The present principles are directed to a method and apparatus for adaptive Group of Pictures (GOP) structure selection. The present description illustrates the present principles. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the present principles and are included within its spirit and scope.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the present principles and the concepts contributed by the inventor(s) to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the present principles, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure. Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the present principles. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor ("DSP") hardware, read-only memory ("ROM") for storing software, random access memory ("RAM"), and non-volatile storage.
Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context. In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The present principles as defined by such claims reside in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
Reference in the specification to "one embodiment" or "an embodiment" of the present principles means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment of the present principles. Thus, the appearances of the phrase "in one embodiment" or "in an embodiment" appearing in various places throughout the specification are not necessarily all referring to the same embodiment.
Turning to FIG. 1, an exemplary video encoder to which the present principles may be applied is indicated generally by the reference numeral 100.
A non-inverting input of a summing junction 110 and a first input of a motion estimator 180 are available as inputs to the video encoder 100. An output of the summing junction 110 is connected in signal communication with an input of a transformer 115. An output of the transformer 115 is connected in signal communication with an input of a quantizer 120. An output of the quantizer 120 is connected in signal communication with an input of a variable length coder (VLC) 140. An output of the VLC 140 is available as an output of the encoder 100.
The output of the quantizer 120 is further connected in signal communication with an input of an inverse quantizer 150. An output of the inverse quantizer 150 is connected in signal communication with an input of an inverse transformer. An output of the inverse transformer is connected in signal communication with an input of a reference picture store 170. A first output of the reference picture store 170 is connected in signal communication with a second input of a motion estimator 180. An output of the motion estimator 180 is connected in signal communication with a first input of a motion compensator 190. A second output of the reference picture store 170 is connected in signal communication with a second input of the motion compensator 190. An output of the motion compensator 190 is connected in signal communication with an inverting input of the summing junction 110.
Advantageously, a method and apparatus are provided for Group of Pictures (GOP) structure selection. In an embodiment, the method and apparatus for GOP structure selection may encompass within the decision process an analysis of GOP length, coding order of picture, picture type selection and picture reference decision. That is, the method and apparatus may jointly consider GOP length, picture coding order, picture type selection and reference picture selection for the GOP structure selection.
Although described in terms of an MPEG-4 AVC standard encoding scheme with adaptive GOP structure, which jointly considers GOP length, picture coding order, picture type selection and reference picture decision, it is to be appreciated that the present invention is not limited to the preceding considerations and is also not limited to the MPEG-4 AVC standard. That is, given the teachings of the present principles provided herein, one of ordinary skill in this and related arts will contemplate these and various other considerations and video coding standards/recommendations to which the present principles may be applied, while maintaining the scope of the present principles.
In an embodiment, a shot detection is first performed. Then the GOP length is decided based on the temporal segmentation. Within each GOP, the picture coding order combined with the picture type and reference picture selection is then decided.
Turning to FIG. 2, an exemplary method for an adaptive Group of Picture (GOP) structure decision is indicated generally by the reference numeral 200.
The method 200 includes an initialization block 205 that passes control to a function block 210. The function block 210 performs a shot detection, and passes control to a function block 215. The function block 215 decides the GOP length N, and passes control to a function block 220. The function block 220 determines the picture coding order, performs a picture type selection, and passes control to a function block 225. The function block 225 performs reference picture selection (e.g., based on Picture Order Count (POC) and/or correlation), and passes control to a function block 230. The function block 230 encodes the pictures in the GOP, and passes control to a decision block 235. The decision block 235 determines whether or not the sequence is ended. If so, then control is passed to an end block 240. Otherwise, control is returned to the function block 210.
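As an illustration only, the control flow of method 200 can be sketched as a per-GOP driver loop in Python. The function name `adaptive_gop_encode` and its four callback parameters are hypothetical names introduced for this sketch; the actual shot detection, planning, and encoding steps are left to the caller.

```python
from typing import Callable, List, Sequence, Tuple

Plan = List[Tuple[int, str]]  # (index within GOP, picture type); assumed layout


def adaptive_gop_encode(
    frames: Sequence,
    decide_gop_length: Callable[[Sequence, int], int],
    plan_gop: Callable[[Sequence], Plan],
    select_references: Callable[[Sequence, Plan], dict],
    encode_gop: Callable[[Sequence, Plan, dict], bytes],
) -> List[bytes]:
    """Run the loop of blocks 210 to 235, one iteration per GOP."""
    encoded: List[bytes] = []
    pos = 0
    while pos < len(frames):                       # decision block 235
        n = decide_gop_length(frames, pos)         # blocks 210 and 215
        n = max(1, min(n, len(frames) - pos))      # keep the GOP inside the sequence
        gop = frames[pos:pos + n]
        plan = plan_gop(gop)                       # block 220: coding order and types
        refs = select_references(gop, plan)        # block 225
        encoded.append(encode_gop(gop, plan, refs))  # block 230
        pos += n
    return encoded
```

Passing the stages in as callbacks keeps the sketch independent of any particular shot detector or codec.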
The selection of GOP length, for example, as performed by function block 215 of the method 200 of FIG. 2, will now be further described in accordance with an embodiment of the present principles. The GOP length is selected dynamically based on shot detection. Unlike prior methods, where only scene cut is detected, we also detect slow transitions, such as fade and dissolve. The GOP length N is generally fixed by a pre-defined value. If a scene cut is detected, then a new GOP restarts from the first picture after the scene cut with length N. If a slow transition is detected, then a new GOP restarts from the starting point of transition and ends at the ending point of transition.
Turning to FIG. 3, an exemplary method for performing a Group of Pictures (GOP) length decision is indicated generally by the reference numeral 300. The method 300 relates to the function block 215 of the method 200 of FIG. 2.
The method 300 includes an initialization block 305 that passes control to a function block 310. The function block 310 performs a shot detection, and passes control to a function block 315. The function block 315 determines whether or not a scene cut has been detected. If so, then control is passed to a function block 335. Otherwise, control is passed to a decision block 320. The function block 335 restarts a new GOP with a pre-defined length N, and passes control to an end block 330.
The decision block 320 determines whether or not a slow transition has been detected. If so, then control is passed to a function block 325. Otherwise, control is passed to a function block 340.
The function block 325 restarts a new GOP from the starting point of transition and ends at the end point of the transition, and passes control to the end block 330.
The function block 340 sets the GOP length to N, and passes control to the end block 330.
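The three outcomes of method 300 can be sketched as follows. This is a minimal Python illustration, assuming shot detection has already produced either the index of the first picture after a scene cut or the inclusive start and end indices of a slow transition; `ShotInfo` and `decide_gop` are hypothetical names, and how pictures before a detected cut are grouped is not covered here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ShotInfo:
    """Assumed output of shot detection (block 310)."""
    scene_cut_at: Optional[int] = None            # first picture after a scene cut
    transition: Optional[Tuple[int, int]] = None  # (start, end) of a slow transition, inclusive


def decide_gop(pos: int, shot: ShotInfo, n: int) -> Tuple[int, int]:
    """Return (start index, length) of the next GOP."""
    if shot.scene_cut_at is not None:   # block 315 -> block 335
        return shot.scene_cut_at, n
    if shot.transition is not None:     # block 320 -> block 325
        start, end = shot.transition
        return start, end - start + 1
    return pos, n                       # block 340: default length N
```

For example, `decide_gop(0, ShotInfo(transition=(3, 10)), 12)` yields a GOP that spans exactly the eight pictures of the transition.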
The determination of picture coding order, for example, as performed by function block 220 of the method 200 of FIG. 2, will now be further described in accordance with an embodiment of the present principles.
The picture coding order in each GOP is decided based on the characteristics of the content. For some specific feature, like cross-fades, a reverse coding of the fade-in sequence has higher coding efficiency. The detection of the switching point, from which reverse coding can happen, is considered in two cases.
In the first case, if the sequence is pure fade-in, then the switching point is set to the minimum of the maximal picture number that can be reversed while satisfying the delay constraint, a Decoded Picture Buffer (DPB) size, and the end picture of the fade-in sequence. Since we code the fade-in as a single GOP, we can reverse-code the pictures at the beginning and at the end of the GOP.
In the second case, if the sequence is dissolve, then the detection of the switching point is based on simple absolute differences of pictures. Of course, it is to be appreciated that other distortion metrics may also be used to detect the switching point, while maintaining the scope of the present principles. Distortion of the current pictures from the start and from the end pictures are computed as follows:
\[ \mathrm{distortion}_{start} = \sum_{x,y} \left| YF_{cur}[x,y] - YF_{start}[x,y] \right| \]
\[ \mathrm{distortion}_{end} = \sum_{x,y} \left| YF_{cur}[x,y] - YF_{end}[x,y] \right| \]
where Y denotes the luminance value of the picture, x specifies the column indices of the image, y specifies the row indices of the image, YF_cur denotes the luminance value of the current frame, YF_start denotes the luminance value of the start frame, and YF_end denotes the luminance value of the end frame. A switching point is flagged as soon as distortion_start > distortion_end. Reverse coding is limited by application delay constraints and, in the most open case, by the Decoded Picture Buffer constraints specified in the MPEG-4 AVC standard.
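The two switching-point cases described above can be sketched in Python as follows. This is an illustration under stated assumptions: pictures are given as nested lists of luminance values, the first and last pictures of the list are taken as the start and end pictures of the dissolve, and the function names are hypothetical.

```python
from typing import Sequence

Picture = Sequence[Sequence[int]]  # luminance samples, indexed [row][column]


def sad(a: Picture, b: Picture) -> int:
    """Sum of absolute luminance differences between two pictures."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def dissolve_switching_point(pictures: Sequence[Picture]) -> int:
    """Index of the first picture that is closer to the end picture than to the start.

    Returns len(pictures) if no picture satisfies distortion_start > distortion_end.
    """
    start, end = pictures[0], pictures[-1]
    for idx, cur in enumerate(pictures):
        if sad(cur, start) > sad(cur, end):
            return idx
    return len(pictures)


def fade_in_switching_point(max_reversible: int, dpb_size: int, fade_in_end: int) -> int:
    """Pure fade-in case: the minimum of the three limits named in the text."""
    return min(max_reversible, dpb_size, fade_in_end)
```

Other distortion metrics could replace `sad` without changing the structure of the search.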
Turning to FIG. 4, an exemplary method for determining picture coding order is indicated generally by the reference numeral 400. The method 400 relates to the function block 220 of the method 200 of FIG. 2.
The method 400 includes an initialization block 405 that passes control to a function block 410. The function block 410 performs a shot detection, and passes control to a decision block 415. The decision block 415 determines whether or not a fade-in or dissolve has been detected. If so, then control is passed to a function block 420. Otherwise, control is passed to a function block 425.
The function block 420 finds the switching point, and passes control to a function block 425.
The function block 425 decides the picture coding order, and passes control to an end block 430.
The performing of picture type selection, for example, as performed by function block 220 of FIG. 2, will now be further described in accordance with an embodiment of the present principles.
We select the picture type from, for example, M = 1, 2, 3, 4, as shown in Table 1, with deterministic coding order. It is to be appreciated that the present principles may also be applied to other picture types including, but not limited to, hierarchical B structures, while maintaining the scope of the present principles. Table 1 illustrates picture type and coding order, where "BS" denotes a stored B picture and "B" denotes a disposable B picture.
The normalized distance between two consecutive P pictures in a GOP is used to decide M. M is selected as the value which has the smallest distance. Many distance metrics can be used, such as absolute difference of image, difference of histogram, histogram of difference, block histogram of difference, block variance difference, motion compensation error, and so forth. That is, the present principles are not limited to the use of any particular distance metric and, thus, any distance metric as readily contemplated by one of ordinary skill in this and related arts may be used in accordance with the teachings of the present principles, while maintaining the scope of the present principles. In an embodiment, we use the histogram of difference: the histogram of Yn − Ym is denoted by hod(i), where i ∈ [−q+1, q−1]. The distance measure is defined as follows:
\[ D(Y_n, Y_m) = 1 - \frac{\sum_{i=-\alpha}^{\alpha} hod(i)}{\sum_{i=-q+1}^{q-1} hod(i)} \]
where α is a threshold for determining the closeness of the position to zero.
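A histogram-of-difference distance and the selection of M can be sketched in Python as below. Two points are assumptions of this sketch rather than statements of the source: the distance is read as the fraction of pixels whose luminance difference lies outside the near-zero band set by the threshold (called `alpha` here), and the distance for a candidate M is normalized by dividing by M. The function names are hypothetical.

```python
from typing import Sequence

Picture = Sequence[Sequence[int]]  # luminance samples, indexed [row][column]


def hod_distance(y_n: Picture, y_m: Picture, alpha: int) -> float:
    """Fraction of pixels whose difference Yn - Ym falls outside [-alpha, alpha]."""
    total = 0
    near_zero = 0
    for row_n, row_m in zip(y_n, y_m):
        for a, b in zip(row_n, row_m):
            total += 1
            if abs(a - b) <= alpha:
                near_zero += 1
    if total == 0:
        raise ValueError("pictures must contain at least one pixel")
    return 1.0 - near_zero / total


def select_m(frames: Sequence[Picture], anchor: int, alpha: int,
             candidates: Sequence[int] = (1, 2, 3, 4)) -> int:
    """Pick the candidate M with the smallest normalized distance."""
    best_m = candidates[0]
    min_dist = float("inf")
    for m in candidates:
        if anchor + m >= len(frames):
            break  # not enough pictures left for this candidate
        # Normalizing by m is an assumption of this sketch.
        norm_dist = hod_distance(frames[anchor], frames[anchor + m], alpha) / m
        if norm_dist < min_dist:  # strict comparison keeps the smaller M on ties
            best_m, min_dist = m, norm_dist
    return best_m
```

Any of the other distance metrics listed above could be substituted for `hod_distance` in the same loop.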
TABLE 1: picture type and coding order for M = 1, 2, 3, 4 (table image not reproduced)
Turning to FIG. 5, an exemplary method for selecting picture type is indicated generally by the reference numeral 500. The method 500 relates to the function block 220 of the method 200 of FIG. 2.
The method 500 includes an initialization block 505 that initializes a variable min_dist to be equal to 0xFFFF, and passes control to a loop limit block 510. The loop limit block 510 begins a loop (i=1:4) that loops over each of the different values of M in a Group of Pictures (GOP), and passes control to a function block 515. The function block 515 calculates the normalized distance norm_dist, and passes control to a decision block 520. The decision block 520 determines whether or not norm_dist < min_dist. If so, then control is passed to a function block 525. Otherwise, control is passed to a loop limit block 530 that ends the loop.
The performing of reference picture selection, for example, as performed by function block 225 of FIG. 2, will now be further described in accordance with an embodiment of the present principles. Reference picture selection may be performed in two steps. The first step involves deciding if the current encoded picture will be stored as a possible reference picture and which previously stored picture will be removed from the reference buffer. The second step involves selecting the L reference pictures (L is a value pre-defined by the encoder) from the reference list and deciding the order of the reference pictures, which will be used for each P/B picture encoding.
For illustrative purposes, two exemplary algorithms are provided herein. The first algorithm is based on the picture order count (POC) and is hereinafter referred to as the "POC algorithm". The second algorithm is based on a correlation metric and hereinafter referred to as the "CORRELATION algorithm".
In the POC algorithm, the removal of reference pictures is based on the order of POC: the picture with the smallest POC number is removed first. For reference picture selection, the reference list is first reordered and then we select the first L pictures as our reference pictures. For B pictures, the reference list is the same as the initialization list. For P pictures, the reference list is reordered according to the POC order, in the same way as the initialization list 0 for B pictures.
In the CORRELATION algorithm, step 1 is the same as in the POC algorithm. In step 2, a correlation metric is adopted for reference picture selection and reordering. The L reference pictures which have the highest correlation with the current picture are used.
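The two steps of the POC algorithm can be sketched in Python as follows. The sketch identifies pictures by their POC values only and assumes the usual MPEG-4 AVC ordering for the B-picture list 0: pictures preceding the current one in descending POC order, followed by pictures after it in ascending POC order. The function names and the `capacity` parameter are hypothetical.

```python
from typing import List, Sequence


def store_reference(buffer_pocs: Sequence[int], new_poc: int, capacity: int) -> List[int]:
    """Step 1: store the current picture, removing the smallest POC first when full."""
    pocs = list(buffer_pocs)
    while pocs and len(pocs) >= capacity:
        pocs.remove(min(pocs))
    pocs.append(new_poc)
    return pocs


def select_references(buffer_pocs: Sequence[int], cur_poc: int, num_refs: int) -> List[int]:
    """Step 2: reorder by POC as for B-picture list 0, then keep the first L entries."""
    past = sorted((p for p in buffer_pocs if p < cur_poc), reverse=True)
    future = sorted(p for p in buffer_pocs if p > cur_poc)
    return (past + future)[:num_refs]
```

With a buffer holding POCs 0, 4, 8, and 12 and a current POC of 6, selecting L = 2 references yields the two preceding pictures nearest first, followed by later pictures only when L is larger.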
Hereinafter, an algorithm is provided that uses the difference of histogram. However, it is to be appreciated that the present invention is not limited to solely using the difference of histogram in providing an adaptive Group of Pictures (GOP) structure selection and, thus, other metrics including, but not limited to, absolute difference of pixel, can also be used while maintaining the scope of the present principles.
We first compute the luminance histogram difference of the reference picture j to the current picture i as follows:
\[ YHistoDiff_{ref}(j) = a(j) \sum_{k=0}^{nb\_bins-1} \left| YHisto_{cur(i)}(k) - YHisto_{ref(j)}(k) \right| \]
where YHistoDiff is the difference of luminance histogram, nb_bins means the number of bins, k indexes the histogram bins, and a(j) denotes the weight of the reference picture j, which has a distance d(j) = |POC(i) − POC(j)| to the current picture i. A smaller weight is assigned to a reference picture that is closer to the current picture. YHisto denotes histogram of luminance, ref denotes reference picture, and cur denotes current picture. A linear weight can be adopted as follows:
\[ a(j) = 1 - \left( max\_ref\_distance - d(j) \right) \times 0.1, \]
where max_ref_distance denotes the maximum distance from the reference picture in the reference picture buffer to the current picture, and d(j) is the distance of reference picture j to the current picture i, as defined earlier.
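Step 2 of the CORRELATION algorithm with a weighted luminance histogram difference can be sketched in Python as below. The sketch treats a smaller weighted difference as a higher correlation, applies the weight as a multiplier on the summed absolute bin differences, and uses a per-picture weight step of 0.1; all three are assumptions of this illustration, as are the function names and the choice of 16 bins.

```python
from typing import Dict, List, Sequence


def luma_histogram(picture: Sequence[Sequence[int]], nb_bins: int = 16,
                   max_value: int = 255) -> List[int]:
    """Histogram of luminance values with nb_bins equally sized bins."""
    hist = [0] * nb_bins
    for row in picture:
        for v in row:
            hist[min(v * nb_bins // (max_value + 1), nb_bins - 1)] += 1
    return hist


def linear_weight(distance: int, max_ref_distance: int, step: float = 0.1) -> float:
    """Linear weight: smaller for references closer to the current picture."""
    return 1.0 - (max_ref_distance - distance) * step


def rank_references(cur_poc: int, cur_hist: Sequence[int],
                    refs: Dict[int, Sequence[int]], num_refs: int) -> List[int]:
    """Return the POCs of the num_refs references with the smallest weighted difference.

    refs maps the POC of each buffered reference picture to its luminance histogram.
    """
    if not refs:
        return []
    max_ref_distance = max(abs(cur_poc - poc) for poc in refs)
    scored = []
    for poc, hist in refs.items():
        diff = sum(abs(c - r) for c, r in zip(cur_hist, hist))
        weight = linear_weight(abs(cur_poc - poc), max_ref_distance)
        scored.append((weight * diff, poc))
    scored.sort()
    return [poc for _, poc in scored[:num_refs]]
```

The returned list doubles as the reference order, since the best-matching picture comes first.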
A description will now be given of some of the many attendant advantages/features of the present invention, some of which have been mentioned above. For example, one advantage/feature is an apparatus that includes an encoder for encoding a video sequence using a Group of Pictures structure by performing, for each Group of Pictures for the video sequence, picture coding order selection, picture type selection, and reference picture selection. The selections are based upon a Group of Pictures length.
Another advantage/feature is the apparatus having the encoder as described above, wherein the encoder performs a shot detection to determine a temporal segmentation of the video sequence, decides the Group of Pictures length based on the temporal segmentation, and, within each of the Group of Pictures for the video sequence, performs the picture coding order selection, the picture type selection, and the reference picture selection. Yet another advantage/feature is the apparatus having the encoder as described above, wherein the encoder sets the Group of Pictures length to a pre-defined value based on the temporal segmentation and absent any of a scene cut or a slow transition in the video sequence, restarts a new Group of Pictures for the video sequence from a first picture after the scene cut with the Group of Pictures length when the scene cut is detected, and restarts the new Group of Pictures from a starting point of the slow transition and ending at an ending point of the slow transition when the slow transition is detected.
Moreover, another advantage/feature is the apparatus having the encoder as described above, wherein the video sequence includes a fade-in sequence, and the encoder uses reverse coding for fades and dissolves in the fade-in sequence. Further, another advantage/feature is the apparatus having the encoder that uses reverse coding as described above, wherein the encoder decides a switching point for the reverse coding based on a transition type. Also, another advantage/feature is the apparatus having the encoder that decides the switching point for the reverse coding as described above, wherein the encoder sets the switching point to a minimum of a maximal picture number that can be reversed while satisfying a delay constraint, a decoded picture buffer constraint, and an end picture of the fade-in sequence, when the transition type is pure fade-in. Additionally, another advantage/feature is the apparatus having the encoder that decides the switching point for the reverse coding as described above, wherein the encoder detects the switching point based on absolute differences of pictures, when the transition type is dissolve.
Further, another advantage/feature is the apparatus having the encoder as described above, wherein the encoder selects a picture type from a pre-defined class of picture types, based on a normalized distance. Moreover, another advantage/feature is the apparatus having the encoder that selects the picture type from the pre-defined class of picture types as described above, wherein selection criteria for selecting the picture type from the pre-defined class of picture types includes at least one of absolute difference of image, difference of histogram, histogram of difference, block histogram of difference, block variance difference, or motion compensation error.
Also, another advantage/feature is the apparatus having the encoder as described above, wherein the encoder performs the reference picture selection based on at least one of Picture Order Count and correlation.
These and other features and advantages of the present principles may be readily ascertained by one of ordinary skill in the pertinent art based on the teachings herein. It is to be understood that the teachings of the present principles may be implemented in various forms of hardware, software, firmware, special purpose processors, or combinations thereof.
Most preferably, the teachings of the present principles are implemented as a combination of hardware and software. Moreover, the software may be implemented as an application program tangibly embodied on a program storage unit. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units ("CPU"), a random access memory ("RAM"), and input/output ("I/O") interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU. In addition, various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit.
It is to be further understood that, because some of the constituent system components and methods depicted in the accompanying drawings are preferably implemented in software, the actual connections between the system components or the process function blocks may differ depending upon the manner in which the present principles are programmed. Given the teachings herein, one of ordinary skill in the pertinent art will be able to contemplate these and similar implementations or configurations of the present principles.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the present principles are not limited to those precise embodiments, and that various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present principles. All such changes and modifications are intended to be included within the scope of the present principles as set forth in the appended claims.

CLAIMS:
1. An apparatus, comprising: an encoder (100) for encoding a video sequence using a group of pictures structure by performing for a group of pictures for a video sequence: picture coding order selection, picture type selection, and reference picture selection, wherein the selections are based upon a group of pictures length.
2. The apparatus of claim 1, wherein said encoder (100) performs a shot detection to determine a temporal segmentation of the video sequence, decides the group of pictures length based on the temporal segmentation, and, within each of the group of pictures for the video sequence, performs the picture coding order selection, the picture type selection, and the reference picture selection.
3. The apparatus of claim 2, wherein said encoder (100) sets the group of pictures length to a pre-defined value based on the temporal segmentation and absent any of a scene cut or a slow transition in the video sequence, restarts a new group of pictures for the video sequence from a first picture after the scene cut with the group of pictures length when the scene cut is detected, and restarts the new group of pictures from a starting point of the slow transition and ending at an ending point of the slow transition when the slow transition is detected.
4. The apparatus of claim 1, wherein the video sequence includes a fade-in sequence, and said encoder uses reverse coding for fades and dissolves in the fade-in sequence.
5. The apparatus of claim 4, wherein said encoder (100) determines a switching point for the reverse coding based on a transition type.
6. The apparatus of claim 5, wherein said encoder (100) sets the switching point to a minimum of a maximal picture number that can be reversed while satisfying a delay constraint, a decoded picture buffer constraint, and an end picture of the fade-in sequence, when the transition type is pure fade-in.
7. The apparatus of claim 5, wherein said encoder (100) detects the switching point based on absolute differences of pictures, when the transition type is dissolve.
8. The apparatus of claim 1, wherein said encoder (100) selects a picture type from a pre-defined class of picture types, based on a normalized distance.
9. The apparatus of claim 8, wherein selection criteria for selecting the picture type from the pre-defined class of picture types includes at least one of absolute difference of image, difference of histogram, histogram of difference, block histogram of difference, block variance difference, or motion compensation error.
10. The apparatus of claim 1, wherein said encoder (100) performs the reference picture selection based on at least one of a picture order count and a correlation.
11. A video encoding method, comprising: encoding (200) a video sequence using a group of pictures structure by performing for a group of pictures for a video sequence: picture coding order selection, picture type selection, and reference picture selection, wherein the selections are based upon a group of pictures length.
12. The method of claim 11, wherein said encoding step comprises: performing (210) a shot detection to determine a temporal segmentation of the video sequence; deciding (215) the group of pictures length based on the temporal segmentation; and performing (220, 225), within each group of pictures for the video sequence, the picture coding order selection, the picture type selection, and the reference picture selection.
13. The method of claim 12, wherein said encoding step further comprises: setting the group of pictures length to a pre-defined value based on the temporal segmentation and absent any of a scene cut or a slow transition in the video sequence; restarting (335) a new group of pictures for the video sequence from a first picture after the scene cut with the group of pictures length, when the scene cut is detected; and restarting (325) the new group of pictures from a starting point of the slow transition and ending at an ending point of the slow transition, when the slow transition is detected.
14. The method of claim 11, wherein the video sequence includes a fade-in sequence, and said encoding step uses reverse coding for fades and dissolves in the fade-in sequence (400).
15. The method of claim 14, wherein said encoding step comprises deciding (420) a switching point for the reverse coding based on a transition type.
16. The method of claim 15, wherein said encoding step sets the switching point to a minimum of a maximal picture number that can be reversed while satisfying a delay constraint, a decoded picture buffer constraint, and an end picture of the fade-in sequence, when the transition type is pure fade-in (420).
17. The method of claim 15, wherein said encoding step detects the switching point based on absolute differences of pictures, when the transition type is dissolve (420).
18. The method of claim 11, wherein said encoding step selects a picture type from a pre-defined class of picture types (220, 500), based on a normalized distance (515).
19. The method of claim 18, wherein selection criteria for selecting the picture type from the pre-defined class of picture types includes at least one of absolute difference of image, difference of histogram, histogram of difference, block histogram of difference, block variance difference, or motion compensation error.
20. The method of claim 12, wherein said encoding step performs the reference picture selection based on at least one of a picture order count and a correlation (225).
PCT/US2007/002387 2006-02-06 2007-01-30 Method and apparatus for adaptive group of pictures (gop) structure selection WO2007092193A2 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
US12/087,637 US9602840B2 (en) 2006-02-06 2007-01-07 Method and apparatus for adaptive group of pictures (GOP) structure selection
EP07763138A EP1982528A2 (en) 2006-02-06 2007-01-30 Method and apparatus for adaptive group of pictures (gop) structure selection
CN2007800043664A CN101379828B (en) 2006-02-06 2007-01-30 Method and apparatus for adaptive group of pictures (GOP) structure selection
BRPI0707419-0A BRPI0707419A2 (en) 2006-02-06 2007-01-30 method and apparatus for adaptive image group structure (gop) selection
JP2008553288A JP5415084B2 (en) 2006-02-06 2007-01-30 Method and apparatus for adaptive picture group (GOP) structure selection

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US76555206P 2006-02-06 2006-02-06
US60/765,552 2006-02-06

Publications (2)

Publication Number Publication Date
WO2007092193A2 (en) 2007-08-16
WO2007092193A3 (en) 2007-10-04

Family

ID=38283710

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/002387 WO2007092193A2 (en) 2006-02-06 2007-01-30 Method and apparatus for adaptive group of pictures (gop) structure selection

Country Status (6)

Country Link
US (1) US9602840B2 (en)
EP (1) EP1982528A2 (en)
JP (1) JP5415084B2 (en)
CN (1) CN101379828B (en)
BR (1) BRPI0707419A2 (en)
WO (1) WO2007092193A2 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9426477B2 (en) * 2010-02-25 2016-08-23 International Business Machines Corporation Method and apparatus for encoding surveillance video
FR2966679A1 (en) * 2010-10-25 2012-04-27 France Telecom METHODS AND DEVICES FOR ENCODING AND DECODING AT LEAST ONE IMAGE FROM A CORRESPONDING EPITOME, SIGNAL AND COMPUTER PROGRAM
GB2488816A (en) * 2011-03-09 2012-09-12 Canon Kk Mapping motion vectors from a plurality of reference frames to a single reference frame
CN102223535A (en) * 2011-06-07 2011-10-19 东莞电子科技大学电子信息工程研究院 Adaptive GOP (Group of Image) structure selection method based on SVC (Scalable Video Coding)
US20130094774A1 (en) * 2011-10-13 2013-04-18 Sharp Laboratories Of America, Inc. Tracking a reference picture based on a designated picture on an electronic device
US8768079B2 (en) 2011-10-13 2014-07-01 Sharp Laboratories Of America, Inc. Tracking a reference picture on an electronic device
US9866851B2 (en) * 2014-06-20 2018-01-09 Qualcomm Incorporated Full picture order count reset for multi-layer codecs
CN104506870B (en) * 2014-11-28 2018-02-09 北京奇艺世纪科技有限公司 A kind of video coding processing method and device suitable for more code streams
US10542283B2 (en) * 2016-02-24 2020-01-21 Wipro Limited Distributed video encoding/decoding apparatus and method to achieve improved rate distortion performance
KR20180076591A (en) * 2016-12-28 2018-07-06 삼성전자주식회사 Method of encoding video data, video encoder performing the same and electronic system including the same

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5592226A (en) * 1994-01-26 1997-01-07 Btg Usa Inc. Method and apparatus for video data compression using temporally adaptive motion interpolation
US20020028023A1 (en) * 2000-09-06 2002-03-07 Masahiro Kazayama Moving image encoding apparatus and moving image encoding method
US6771825B1 (en) * 2000-03-06 2004-08-03 Sarnoff Corporation Coding video dissolves using predictive encoders
US6959044B1 (en) * 2001-08-21 2005-10-25 Cisco Systems Canada Co. Dynamic GOP system and method for digital video encoding

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3221785B2 (en) * 1993-10-07 2001-10-22 株式会社日立製作所 Imaging device
JP3954656B2 (en) * 1994-09-29 2007-08-08 ソニー株式会社 Image coding apparatus and method
JP3384910B2 (en) * 1995-05-30 2003-03-10 株式会社日立製作所 Imaging device and image reproducing device
FR2764156B1 (en) 1997-05-27 1999-11-05 Thomson Broadcast Systems PRETREATMENT DEVICE FOR MPEG II CODING
US6195458B1 (en) * 1997-07-29 2001-02-27 Eastman Kodak Company Method for content-based temporal segmentation of video
JPH1175189A (en) * 1997-08-27 1999-03-16 Mitsubishi Electric Corp Image coding method
US6307886B1 (en) * 1998-01-20 2001-10-23 International Business Machines Corp. Dynamically determining group of picture size during encoding of video sequence
KR100571307B1 (en) * 1999-02-09 2006-04-17 소니 가부시끼 가이샤 Coding system and its method, coding device and its method, decoding device and its method, recording device and its method, and reproducing device and its method
JP2002010270A (en) 2000-06-27 2002-01-11 Mitsubishi Electric Corp Device and method for coding images
JP3815665B2 (en) 2000-12-27 2006-08-30 Kddi株式会社 Variable bit rate video encoding apparatus and recording medium
JP3907996B2 (en) * 2001-10-15 2007-04-18 日本電信電話株式会社 Image encoding apparatus, image decoding apparatus and method, image encoding program, and image decoding program
JP3888533B2 (en) 2002-05-20 2007-03-07 Kddi株式会社 Image coding apparatus according to image characteristics
US20040146108A1 (en) 2003-01-23 2004-07-29 Shih-Chang Hsia MPEG-II video encoder chip design
KR100596706B1 (en) 2003-12-01 2006-07-04 삼성전자주식회사 Method for scalable video coding and decoding, and apparatus for the same
WO2005055608A1 (en) 2003-12-01 2005-06-16 Samsung Electronics Co., Ltd. Method and apparatus for scalable video encoding and decoding
WO2005055606A1 (en) 2003-12-01 2005-06-16 Samsung Electronics Co., Ltd. Method and apparatus for scalable video encoding and decoding
KR100597402B1 (en) 2003-12-01 2006-07-06 삼성전자주식회사 Method for scalable video coding and decoding, and apparatus for the same

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BJONTEGAARD G ET AL: "Overview of the H.264/AVC video coding standard" IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 13, no. 7, July 2003 (2003-07), pages 560-576, XP011099249 ISSN: 1051-8215 *
DUMITRAS A ET AL: "I/P/B frame type decision by collinearity of displacements" IMAGE PROCESSING, 2004. ICIP '04. 2004 INTERNATIONAL CONFERENCE ON SINGAPORE 24-27 OCT. 2004, PISCATAWAY, NJ, USA,IEEE, 24 October 2004 (2004-10-24), pages 2769-2772, XP010786362 ISBN: 0-7803-8554-3 *
OZBEK N ET AL: "Fast H. 264/AVC Video Encoding with Multiple Frame References" IMAGE PROCESSING, 2005. ICIP 2005. IEEE INTERNATIONAL CONFERENCE ON GENOVA, ITALY 11-14 SEPT. 2005, PISCATAWAY, NJ, USA,IEEE, 11 September 2005 (2005-09-11), pages 597-600, XP010850820 ISBN: 0-7803-9134-9 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2347518A1 (en) * 2008-11-12 2011-07-27 Thomson Licensing Light change coding
JP2012509011A (en) * 2008-11-12 2012-04-12 トムソン ライセンシング Brightness change coding
EP2347518A4 (en) * 2008-11-12 2012-10-17 Thomson Licensing Light change coding
US9210431B2 (en) 2008-11-13 2015-12-08 Thomson Licensing Multiple thread video encoding using GOP merging and bit allocation

Also Published As

Publication number Publication date
JP2009526435A (en) 2009-07-16
JP5415084B2 (en) 2014-02-12
US20090122860A1 (en) 2009-05-14
US9602840B2 (en) 2017-03-21
WO2007092193A3 (en) 2007-10-04
BRPI0707419A2 (en) 2011-05-03
EP1982528A2 (en) 2008-10-22
CN101379828B (en) 2011-07-06
CN101379828A (en) 2009-03-04

Similar Documents

Publication Publication Date Title
US9602840B2 (en) Method and apparatus for adaptive group of pictures (GOP) structure selection
US11729415B2 (en) Method and device for encoding a sequence of images and method and device for decoding a sequence of images
EP1568222B1 (en) Encoding of video cross-fades using weighted prediction
US8498336B2 (en) Method and apparatus for adaptive weight selection for motion compensated prediction
JP2005191706A (en) Moving picture coding method and apparatus adopting the same
KR20080068716A (en) Method and apparatus for shot detection in video streaming
US20120207219A1 (en) Picture encoding apparatus, picture encoding method, and program
US9253493B2 (en) Fast motion estimation for multiple reference pictures
US20140169476A1 (en) Method and Device for Encoding a Sequence of Images and Method and Device for Decoding a Sequence of Image
WO2011019384A1 (en) Methods and apparatus for generalized weighted prediction
JP2007251996A (en) Moving picture coding method, and apparatus adopting same
JP2002051340A (en) Moving picture compression apparatus and method of the same
JP2004147305A (en) Image encoding apparatus, image decoding apparatus, and methods thereof

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 12087637

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 6142/DELNP/2008

Country of ref document: IN

WWE Wipo information: entry into national phase

Ref document number: 200780004366.4

Country of ref document: CN

Ref document number: 2007763138

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2008553288

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: PI0707419

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20080801