US20140072027A1 - System for video compression - Google Patents
- Publication number
- US20140072027A1 US13/611,959
- Authority
- US
- United States
- Prior art keywords
- encoding
- value
- parallel
- yuv
- color
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/436—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/15—Data rate or code amount at the encoder output by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/423—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/182—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/184—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
Definitions
- the present invention relates to scalable video applications and more specifically to improving compression in scalable video applications.
- a video bit stream is called scalable when parts of the stream can be removed in a way that the resulting substream forms another valid bit stream for some target decoder, and the substream represents the source content with a reconstruction quality that is less than that of the complete original bit stream but is high when considering the lower quantity of remaining data.
- a system and method for providing video compression that includes encoding using an encoding engine a YUV stream wherein Y, U and V color values are encoded in parallel and patching together the Y, U and V color streams to form a compressed YUV output stream.
- the encoding engine further includes encoding each color value of the YUV stream in parallel using parallel encoding engines and a control engine for controlling operation of all of the encoding engines in parallel.
- the YUV stream has an average bits per pixel value that varies from a first value to a second value that is larger than (e.g., double) the first value.
- the encoding engine includes encoding the YUV stream in generally the same amount of time regardless of the average bits per pixel value.
- the encoding engine includes determining color values while avoiding null value registers and storing the determined color values in at least one buffer.
- the encoding engines further include compressing level and register location of the stored determined color values from the at least one buffer in parallel.
- FIG. 1 is a block diagram of a computing system according to an embodiment of the present invention
- FIG. 2 is a block diagram of an entropy encoding engine according to an embodiment of the present invention
- FIG. 3 is a block diagram of an encoding engine according to an embodiment of the present invention.
- FIG. 4 is a diagram of collecting and buffering YUV color values according to an embodiment of the present invention.
- FIG. 5 is a diagrammatic view of a MB residual compress engine according to an embodiment of the present invention.
- Embodiments of the invention as described herein provide a solution to the problems of conventional methods.
- various examples are given for illustration, but none are intended to be limiting.
- Embodiments include implementing a remote display system (either wired or wireless) using a standard, non-custom codec.
- H.264 refers to the standard for video compression that is also known as MPEG-4 Part 10, or MPEG-4 AVC (Advanced Video Coding).
- H.264 is one of the block-oriented motion-estimation-based codecs developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG).
- VCEG Video Coding Experts Group
- MPEG Moving Picture Experts Group
- other video formats could also be employed in alternative embodiments.
- Scalable Video Coding (SVC) that is gaining popularity for video conferencing type applications.
- SVC Scalable Video Coding
- a number of industry leading companies have standardized (or support the standard) using SVC in the UCIF (Universal Communications Interop Forum) for video conferencing.
- UCIF Universal Communications Interop Forum
- the H.264 standard supports the transmission of color video in the ‘YUV’ color format.
- in ‘YUV,’ ‘Y’ represents the ‘luma’ value, or brightness
- ‘UV’ represents the color, or ‘chroma’ values.
- Each unique Y, U and V value comprises 8 bits, or one byte, of data.
- YUV standards support a 24 bit per pixel (bpp) format for the YUV444 standard, a 16 bit per pixel (bpp) format for the YUV422 standard, and a 12 bit per pixel (bpp) format for the YUV411 standard and the YUV420 standard.
- in the YUV422 standard, the U and V color values are shared between every other pixel, which results in an average of 16 bits per pixel.
- in the YUV411 standard, the U and V color values are shared between every four pixels, which results in an average of 12 bits per pixel.
- in the YUV420 standard, the U and V color values are shared between every four pixels, which results in an average of 12 bits per pixel, but the YUV values are distributed in a reordered format. These bandwidth saving techniques take into account the human eye's lesser sensitivity to variations in color than in brightness.
- YUV444 format video is up to twice the size of the space saving YUV420 format. Even so, it is desirable to achieve compression speeds close to those of the YUV420 standard.
- some embodiments of the invention give a solution to this by compressing Y/U/V color values at a MacroBlock (MB) level in parallel, and doing reordering by concatenating the MB of the Y color value with its UV color values. It will be appreciated by those skilled in the art that this embodiment is especially useful in large bit rate applications such as giga-bit wireless displays and avoids memory bandwidth consumption.
- MB MacroBlock
- Computers and other such data processing devices have at least one control processor that is generally known as a central processing unit (CPU). Such computers and processing devices operate in environments which can typically have memory, storage, input devices and output devices. Such computers and processing devices can also have other processors such as graphics processing units (GPU) that are used for specialized processing of various types and may be located with the processing devices or externally, such as included in the output device. For example, GPUs are designed to be particularly suited for graphics processing operations. GPUs generally comprise multiple processing elements that are ideally suited for executing the same instruction on parallel data streams, such as in data-parallel processing. In general, a CPU functions as the host or controlling processor and hands off specialized functions such as graphics processing to other processors such as GPUs.
- CPU functions as the host or controlling processor and hands-off specialized functions such as graphics processing to other processors such as GPUs.
- multi-core CPUs where each CPU has multiple processing cores
- substantial processing capabilities that can also be used for specialized functions are available in CPUs.
- One or more of the computation cores of multi-core CPUs or GPUs can be part of the same die (e.g., AMD Fusion™) or in different dies (e.g., Intel Xeon™ with NVIDIA GPU).
- hybrid cores having characteristics of both CPU and GPU, e.g., Cell SPE™, Intel Larrabee™
- GPGPU style of computing advocates using the CPU to primarily execute control code and to offload performance critical data-parallel code to the GPU.
- the GPU is primarily used as an accelerator.
- the combination of multi-core CPUs and GPGPU computing model encompasses both CPU cores and GPU cores as accelerator targets.
- Many of the multi-core CPU cores have performance that is comparable to GPUs in many areas.
- the floating point operations per second (FLOPS) of many CPU cores are now comparable to that of some GPU cores.
- Embodiments of the present invention may yield substantial advantages by enabling the use of the same or similar code base on CPU and GPU processors and also by facilitating the debugging of such code bases. While the present invention is described herein with illustrative embodiments for particular applications, it should be understood that the invention is not limited thereto. Those skilled in the art with access to the teachings provided herein will recognize additional modifications, applications, and embodiments within the scope thereof and additional fields in which the invention would be of significant utility.
- Embodiments of the present invention may be used in any computer system, computing device, entertainment system, media system, game systems, communication device, personal digital assistant, or any system using one or more processors. Such embodiments may be particularly useful where the system comprises a heterogeneous computing system.
- a “heterogeneous computing system,” as the term is used herein, is a computing system in which multiple kinds of processors are available.
- Embodiments of the present invention enable the same code base to be executed on different processors, such as GPUs and CPUs.
- Embodiments of the present invention can be particularly advantageous in processing systems having multi-core CPUs, and/or GPUs, because code developed for one type of processor can be deployed on another type of processor with little or no additional effort.
- code developed for execution on a GPU also known as GPU-kernels, can be deployed to be executed on a CPU, using embodiments of the present invention.
- Heterogeneous computing system 100 can include one or more processing units, such as processor 102 .
- Heterogeneous computing system 100 can also include at least one system memory 104 , at least one persistent storage device 106 , at least one system bus 108 , at least one input device 110 and output device 112 .
- a processing unit of the type suitable for heterogeneous computing are the accelerated processing units (APUs) sold under various brand names by Advanced Micro Devices of Sunnyvale, Calif. according to an embodiment of the present invention as illustrated by FIG. 2 .
- a heterogeneous processing unit includes one or more CPUs and one or more GPUs, such as a wide single instruction, multiple data (SIMD) processor and unified video decoder that perform functions previously handled by a discrete GPU. It will be understood that when referring to the GPU structure and function, such functions are carried out by the SIMD.
- Heterogeneous processing units can also include at least one memory controller for accessing system memory and that also provides memory shared between the GPU and CPU and a platform interface for handling communication with input and output devices through, for example, a controller hub.
- a wide single instruction, multiple data (SIMD) processor for carrying out graphics processing instructions may be included to provide a heterogeneous GPU capability in accordance with an embodiment of the present invention, or a discrete GPU may be included separate from the CPU to implement the embodiment; however, as will be understood by those skilled in the art, additional latency may be experienced in an implementation of the present invention using a discrete GPU.
- SIMD single instruction, multiple data
- architecture of the types described above are well suited to provide a solution for implementing hardware encoding and/or decoding in higher resolution YUV standards, such as YUV444.
- for YUV444 video streams there are two types supported, namely, a separate-color-plane YUV444 and a non-separate-color-plane YUV444, where color is used in this context to also refer to chroma and color plane is used in this context to also refer to Y/U/V color values.
- a separate-color-plane stream the 3 color values of YUV have no dependency and compress independently, and the 3 color values are joined together into one whole video stream at the end of each slice of video data, where typically, a slice is a frame.
- a MB level represents a compression unit in the H.264 specification and typically refers to a 16×16 pixel block in one frame, whose pixels share the same prediction-mode.
- the average pixel size of YUV444 format video at 24 bits per pixel is 2 times of the average pixel size of YUV420 format at 12 bits per pixel.
- the Y/U/V color values are encoded and decoded in a sequential process.
- an embodiment of the present invention includes a hardware configuration to compress Y/U/V color values in parallel using 3 encode engines. Each encoder is dedicated to encode one of the Y, U or V color values. For a separate-color-plane stream, this embodiment concatenates the Y/U/V color values at the end of each slice. For a non-separate-color-plane stream, the embodiment concatenates the Y/U/V color values at the end of each MB, where for each MB the Y color value is concatenated with corresponding UV color values.
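- The two concatenation orders described above can be illustrated with a minimal software sketch. It only models the data flow: the function `encode_mb` is a hypothetical stand-in that tags each macroblock instead of entropy coding it, and the three worker threads merely mirror the three dedicated encode engines (they are not a claim about how the hardware is built or how fast it runs).

```python
from concurrent.futures import ThreadPoolExecutor

PLANES = ("Y", "U", "V")


def encode_mb(plane, mb_index, samples):
    """Hypothetical stand-in for one encode engine.

    A real engine would entropy-code the samples; here each macroblock is
    replaced by a readable tag so the output order is easy to inspect.
    """
    return f"[{plane}{mb_index}]".encode()


def encode_planes_parallel(planes):
    """Encode the Y, U and V planes concurrently, one worker per plane."""
    def run(plane):
        return [encode_mb(plane, i, mb) for i, mb in enumerate(planes[plane])]

    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {plane: pool.submit(run, plane) for plane in PLANES}
        return {plane: future.result() for plane, future in futures.items()}


def concatenate(encoded, separate_color_plane):
    """Join the three encoded planes into one output stream."""
    if separate_color_plane:
        # Separate-color-plane: whole planes are joined at the end of the slice.
        return b"".join(b"".join(encoded[plane]) for plane in PLANES)
    # Non-separate-color-plane: each Y macroblock is followed by its U and V.
    return b"".join(
        y + u + v for y, u, v in zip(encoded["Y"], encoded["U"], encoded["V"])
    )


# Two macroblocks per plane; the sample payloads are arbitrary placeholders.
planes = {plane: [b"\x00" * 4, b"\x01" * 4] for plane in PLANES}
encoded = encode_planes_parallel(planes)
per_mb_stream = concatenate(encoded, separate_color_plane=False)
per_slice_stream = concatenate(encoded, separate_color_plane=True)
```

With the tagging stand-in, the non-separate case yields `[Y0][U0][V0][Y1][U1][V1]` and the separate case yields `[Y0][Y1][U0][U1][V0][V1]`.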
- CAVLC context-adaptive variable-length coding
- each Y/U/V color value may be compressed using a base encoding unit, such as a 4×4 pixel block.
- the entropy encoder includes two data-paths to compress each 4×4 block in parallel.
- FIG. 2 shows the block diagram of Y/U/V color values concatenating at the top level in which an exemplary YUV stream is described in connection with the entropy encoding engine 200 .
- the entropy encoding engine includes a top control (topctrl) engine 202 and three encoding engines 204 , 206 and 208 connected via a bus 209 to the topctrl engine 202 .
- Each of the encoding engines 204 , 206 and 208 receives respective Y, U and V data from a local memory 210 and outputs encoded respective Y, U and V values to respective local buffers 212 , 214 and 216 .
- the buffer 212 associated with the Y color value encoder 204 connects directly to the system memory 218 for outputting the final YUV compressed stream.
- the exemplary YUV stream is a non-separate-color-plane stream; however, it will be appreciated by those skilled in the art that the same features of the entropy encoding engine 200 may be implemented to process a separate-color-plane stream.
- the entropy encoder's firmware first checks the status of topctrl engine 202 and the 3 encoding engines 204 , 206 and 208 to confirm that they are ready to accept new YUV data, and then the topctrl engine 202 signals the encoding engines 204 , 206 and 208 to begin processing new YUV data.
- the encoding engines 204 , 206 and 208 begin to encode simultaneously. Each Y/U/V color value will go into each encoding engine 204 , 206 and 208 .
- Each Y/U/V color's output will be written into temporary local memory 212 , 214 , 216 .
- U and V color values have the same type of local memory 214 and 216 , but for the Y color value, the local memory 212 is connected to system memory 218 , and the local memory 212 content can be written into system memory 218 automatically.
- Monitoring and control of the three encoding engines 204, 206 and 208 at the same time is accomplished by the topctrl engine 202 using the following engines:
- an internal buffer may be added as local memory to eliminate data exchanges with external memory. This is also feasible when the hardware is configured with a fast processor or as a heterogeneous computing platform as described above.
- the data-flow for the Y color value encoding engine 300 is shown.
- the non-separate color stream is used to exemplify the data flow in which a compress unit is one MB in the form of a 16×16 block.
- After reading the MB header from local memory 302, the header information will be stored into local flops/buffer 304, which then triggers the MB header compress 306, part of the compressing engine 308, to begin compression of the header.
- the beginning of header compression is a trigger signal that will also trigger a residual buffer 310 to read residual 4×4 blocks from local memory and store them into the residual buffer.
- a Residual-pre-process engine 312 monitors the status of the residual-buffer 310; once one 4×4 block of coefficients is available, the Residual-pre-process engine 312 will read out the 4×4 block, pre-process the data, and store the result into a First-In, First-Out (FIFO) buffer 314.
- FIFO First-In, First-Out
- a MB-residual-compress engine 316 within the compressing engine 308 monitors both the MB-header-compress 306 and the FIFO buffer 314 status. When the MB-header-compress 306 is done and there are valid data in the FIFO buffer 314 , the residual-compress engine 316 will begin to compress the residual.
- the Probability Interval Partitioning Entropy (PIPE) coding engine 318 is an inserted pipe-stage in order to break the big pipe delay in the data-flow from conventional data flow scenarios.
- a stream packer engine 322 has two tasks: one is to do some regular processing to conform the encoded YUV stream to the H.264 standard, and the other is to sequentially read back the U and then V color values and patch them into the output after the Y plane at the MB level, with the result written to the local memory 320.
- an improved process provided by the residual-pre-process engine 312 of FIG. 3 is shown operating on a unit having a 4×4 block of residual data.
- the residual-pre-process engine 400 first scans the 4×4 2D array into a 1D array 402 as described in the H.264 standard, and then begins to parse the 16 residuals.
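- The 2D-to-1D scan step can be sketched as follows. The index table is the commonly cited zigzag order for 4×4 blocks in frame-coded H.264; the function name and the list-of-rows block layout are assumptions made for illustration, and field-coded content would use a different scan order.

```python
# Commonly cited 4x4 zigzag order: entry k is the raster index
# (row * 4 + column) of the coefficient emitted at scan position k.
ZIGZAG_4X4 = (0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15)


def zigzag_scan(block):
    """Flatten a 4x4 block (a list of four rows) into a 16-entry 1D list."""
    flat = [value for row in block for value in row]
    if len(flat) != 16:
        raise ValueError("expected a 4x4 block")
    return [flat[i] for i in ZIGZAG_4X4]


# Residuals concentrated in the low-frequency corner, as is typical after
# transform and quantization (the values themselves are made up).
block = [
    [7, 3, 0, 0],
    [-2, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
scanned = zigzag_scan(block)
```

The scan moves the non-zero residuals to the front of the 1D array and leaves the zeros in a single run at the end, which is what makes skipping zeros during parsing worthwhile.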
- parsing the 16 residuals in the 1D array 402 one by one needs at least 16 cycles to complete one 4×4 block.
- a fast parse process is used, which only parses the non-zero-residuals.
- a 1D array 404 having four coefficients with 11 zeros and one trailing zero requires 5 cycles to complete parsing of the 1D array.
- the FIFO buffer 406 stores only the data relevant to the residual information including the coefficient value 408 and location 410 based upon intervening zeros.
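- A minimal sketch of what the fast parse stores is given below, assuming each FIFO entry pairs a non-zero coefficient with the number of zeros that precede it. The function name and the tuple layout are illustrative, not taken from the source. The software loop still visits all 16 positions; the point of the sketch is the reduced set of entries that reaches the FIFO, not the hardware cycle count.

```python
from collections import deque


def fast_parse(coefficients):
    """Keep only non-zero residuals, each with its count of preceding zeros.

    Returns the FIFO of (value, zeros_before) entries and the number of
    trailing zeros that follow the last non-zero residual.
    """
    fifo = deque()
    zeros = 0
    for value in coefficients:
        if value == 0:
            zeros += 1
        else:
            fifo.append((value, zeros))
            zeros = 0
    return fifo, zeros


# Four coefficients, 11 intervening zeros and one trailing zero, matching
# the shape of the example above (the coefficient values are made up).
array_1d = [5, 0, 0, 0, -3, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1, 0]
fifo, trailing_zeros = fast_parse(array_1d)
```

For this input the FIFO holds four entries, `(5, 0)`, `(-3, 3)`, `(2, 4)` and `(1, 4)`, instead of 16 array positions.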
- a MB residual compress engine 500 is shown.
- the level steps 502 to 506 and run_before steps 508 to 510 are compressed sequentially.
- An embodiment using the improved FIFO buffer 406 ( FIG. 4 ), which includes two FIFO buffers, one for the coefficient value 408 and one for the location 410 based upon intervening zeros, includes level steps 512 to 516 ( FIG. 5 ) and run_before steps 518 to 520, and compresses the level and run_before in a parallel process.
- the run_before compress result will be stored into a local memory; once all the elements before run_before are compressed, the data in local memory will be read out and patched into the stream. It will be appreciated that in this implementation the residual-pre-process engine 400 ( FIG. 4 ) and the MB residual compress engine 500 ( FIG. 5 ) will have similar process times, making the pipeline delay more balanced.
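- The parallel level and run_before paths can be sketched as two workers whose results are joined in stream order. Both coder functions below are hypothetical stand-ins that emit readable tags rather than the CAVLC code tables, and the (level, run_before) entry layout is an assumption; the sketch shows only that the run_before result is buffered and appended after the level output.

```python
from concurrent.futures import ThreadPoolExecutor


def code_levels(levels):
    """Hypothetical stand-in for the level path (not the CAVLC tables)."""
    return "".join(f"L({value})" for value in levels)


def code_run_before(runs):
    """Hypothetical stand-in for the run_before path."""
    return "".join(f"R({run})" for run in runs)


def compress_residual(entries):
    """Code levels and run_before values concurrently, then join them.

    `entries` is a sequence of (level, run_before) pairs, such as the
    contents of the two FIFO buffers read side by side.
    """
    levels = [level for level, _ in entries]
    runs = [run for _, run in entries]
    with ThreadPoolExecutor(max_workers=2) as pool:
        level_future = pool.submit(code_levels, levels)
        run_future = pool.submit(code_run_before, runs)
        level_part = level_future.result()
        # The run_before output waits in a local buffer until everything
        # that precedes it in the stream has been written.
        run_buffer = run_future.result()
    return level_part + run_buffer


stream = compress_residual([(5, 0), (-3, 3), (2, 4), (1, 4)])
```

Buffering the run_before output lets both paths start at the same time even though the stream order places run_before last.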
- the entropy encoding speed will be generally totally determined by the kernel engine speed.
- the hardware described above can be implemented using a processor executing instructions from a non-transitory storage medium.
- a hardware description language that is a code for describing a circuit.
- An exemplary use of HDLs is the simulation of designs before the designer must commit to fabrication.
- the two most popular HDLs are VHSIC Hardware Description Language (VHDL) and VERILOG.
- VHDL was developed by the U.S. Department of Defense and is an open standard.
- VERILOG also called Open VERILOG International (OVI)
- OVI Open VERILOG International
- VHDL is an HDL defined by IEEE standard 1076.1.
- Boundary Scan Description Language (BSDL) is a subset of VHDL, and provides a standard machine- and human readable data format for describing how an IEEE Std 1149.1 boundary-scan architecture is implemented and operates in a device. Any HDL of the types described can be used to create instructions representative of the hardware description.
Abstract
A system and method for providing video compression that includes encoding using an encoding engine a YUV stream wherein Y, U and V color values are encoded in parallel and patching together the Y, U and V color streams to form a compressed YUV output stream. The encoding engine further includes encoding each color value of the YUV stream in parallel using parallel encoding engines and a control engine for controlling operation of all of the encoding engines in parallel. The YUV stream has an average bits per pixel value that varies from a first value to a second value that is double the first value. The encoding engine includes encoding the YUV stream in generally the same amount of time regardless of the average bits per pixel value.
Description
- The present invention relates to scalable video applications and more specifically to improving compression in scalable video applications.
- The remote transfer and display of video data using consumer electronics devices has become a field of significant development. Generally, it is desirable to permit such streaming between devices with different display capabilities. With the advent of video devices having different video resolutions, it is desirable to compress the video stream, thereby increasing the amount of video data that can be transmitted, in order to communicate the highest video resolution that can be transferred. Yet it is also desirable to permit viewing of such video streams on devices that may only permit lower resolution video streams, or that have limited throughput or slow processing capabilities that render such higher resolution video signals impracticable. These issues have become particularly pronounced with the advent of high definition (HD) video, although the problem should not be construed as being limited to HD video. Thus, scalable video streams are increasing in popularity. In general, a video bit stream is called scalable when parts of the stream can be removed in a way that the resulting substream forms another valid bit stream for some target decoder, and the substream represents the source content with a reconstruction quality that is less than that of the complete original bit stream but is high when considering the lower quantity of remaining data.
- The usual modes of compression can result in differences in the amount of time required to encode/decode higher resolution video (which may or may not conform to known “high definition” formats) in comparison to a lower resolution. In systems that support scalable video, delays in processing the video stream for higher resolution video can become a limiting factor in the overall system performance. Thus, the need exists for a way to reduce or eliminate the effects of delays due to compression of video.
- A system and method for providing video compression that includes encoding using an encoding engine a YUV stream wherein Y, U and V color values are encoded in parallel and patching together the Y, U and V color streams to form a compressed YUV output stream.
- In some embodiments, the encoding engine further includes encoding each color value of the YUV stream in parallel using parallel encoding engines and a control engine for controlling operation of all of the encoding engines in parallel.
- The YUV stream has an average bits per pixel value that varies from a first value to a second value that is larger than (e.g., double) the first value. The encoding engine includes encoding the YUV stream in generally the same amount of time regardless of the average bits per pixel value.
- In some embodiments the encoding engine includes determining color values while avoiding null value registers and storing the determined color values in at least one buffer.
- In some embodiments the encoding engines further include compressing level and register location of the stored determined color values from the at least one buffer in parallel.
- Other aspects, advantages and novel features of embodiments of the invention will become more apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings wherein:
- FIG. 1 is a block diagram of a computing system according to an embodiment of the present invention;
- FIG. 2 is a block diagram of an entropy encoding engine according to an embodiment of the present invention;
- FIG. 3 is a block diagram of an encoding engine according to an embodiment of the present invention;
- FIG. 4 is a diagram of collecting and buffering YUV color values according to an embodiment of the present invention; and
- FIG. 5 is a diagrammatic view of a MB residual compress engine according to an embodiment of the present invention.
- Embodiments of the invention as described herein provide a solution to the problems of conventional methods. In the following description, various examples are given for illustration, but none are intended to be limiting. Embodiments include implementing a remote display system (either wired or wireless) using a standard, non-custom codec.
- For purposes of this description, “H.264” refers to the standard for video compression that is also known as MPEG-4 Part 10, or MPEG-4 AVC (Advanced Video Coding). H.264 is one of the block-oriented motion-estimation-based codecs developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG). However, other video formats could also be employed in alternative embodiments.
- Included within the features of H.264 is Scalable Video Coding (SVC), which is gaining popularity for video conferencing type applications. A number of industry leading companies have standardized (or support the standard) using SVC in the UCIF (Universal Communications Interop Forum) for video conferencing.
- The H.264 standard supports the transmission of color video in the ‘YUV’ color format. In ‘YUV,’ ‘Y’ represents the ‘luma’ value, or brightness, and ‘UV’ represents the color, or ‘chroma’ values.
- Each unique Y, U and V value comprises 8 bits, or one byte, of data. YUV standards support a 24 bit per pixel (bpp) format for the YUV444 standard, a 16 bit per pixel (bpp) format for the YUV422 standard, and a 12 bit per pixel (bpp) format for the YUV411 standard and the YUV420 standard. In the YUV422 standard, the U and V color values are shared between every other pixel, which results in an average of 16 bits per pixel. In the YUV411 standard, the U and V color values are shared between every four pixels, which results in an average of 12 bits per pixel. In the YUV420 standard, the U and V color values are shared between every four pixels, which results in an average of 12 bits per pixel, but the YUV values are distributed in a reordered format. These bandwidth saving techniques take into account the human eye's lesser sensitivity to variations in color than in brightness.
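- The per-pixel averages quoted above follow from simple arithmetic: every pixel carries its own 8-bit Y sample, while one U sample and one V sample are shared by a group of pixels. The short sketch below reproduces the figures; the function and table names are illustrative, and the 1920×1080 frame size is an assumed example rather than a value from the source.

```python
from fractions import Fraction

BITS_PER_SAMPLE = 8  # each Y, U or V value is one byte

# Number of pixels that share one U sample and one V sample.
PIXELS_PER_CHROMA_PAIR = {
    "YUV444": 1,
    "YUV422": 2,
    "YUV411": 4,
    "YUV420": 4,
}


def average_bits_per_pixel(fmt):
    """Average bits per pixel: one Y sample plus a share of one U and one V."""
    shared = PIXELS_PER_CHROMA_PAIR[fmt]
    return BITS_PER_SAMPLE + Fraction(2 * BITS_PER_SAMPLE, shared)


def frame_bytes(fmt, width, height):
    """Uncompressed size in bytes of one frame in the given format."""
    return int(average_bits_per_pixel(fmt) * width * height / 8)


bpp = {fmt: int(average_bits_per_pixel(fmt)) for fmt in PIXELS_PER_CHROMA_PAIR}
size_444 = frame_bytes("YUV444", 1920, 1080)
size_420 = frame_bytes("YUV420", 1920, 1080)
```

For the assumed 1920×1080 frame this gives 6,220,800 bytes for YUV444 and 3,110,400 bytes for YUV420, a ratio of two.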
- It will be appreciated by those skilled in the art that the size of YUV444 format video is up to 2 times of the size of the space saving YUV420 format. Even so it is desirable to achieve compression speeds close to the YUV420 standard. Advantageously, some embodiments of the invention give a solution to this by compressing Y/U/V color values at a MacroBlock (MB) level in parallel, and doing reordering by concatenating the MB of the Y color value with its LTV color values. It will be appreciated by those skilled in the art that this embodiment is especially useful in large bit rate applications such as giga-bit wireless displays and avoids memory bandwidth consumption.
- In the following description, numerous specific details are introduced to provide a thorough understanding of, and enabling description for, embodiments implementing low-latency applications. One skilled in the relevant art, however, will recognize that these embodiments can be practiced without one or more of the specific details, or with other components, systems, etc. In other instances, well-known structures or operations are not shown, or are not described in detail, to avoid obscuring aspects of the disclosed embodiments.
- Computers and other such data processing devices have at least one control processor that is generally known as a central processing unit (CPU). Such computers and processing devices operate in environments which can typically have memory, storage, input devices and output devices. Such computers and processing devices can also have other processors, such as graphics processing units (GPUs), that are used for specialized processing of various types and may be located with the processing devices or externally, for example in the output device. For example, GPUs are designed to be particularly suited for graphics processing operations. GPUs generally comprise multiple processing elements that are ideally suited for executing the same instruction on parallel data streams, such as in data-parallel processing. In general, a CPU functions as the host or controlling processor and hands off specialized functions such as graphics processing to other processors such as GPUs.
- With the availability of multi-core CPUs where each CPU has multiple processing cores, substantial processing capabilities that can also be used for specialized functions are available in CPUs. One or more of the computation cores of multi-core CPUs or GPUs can be part of the same die (e.g., AMD Fusion™) or in different dies (e.g., Intel Xeon™ with NVIDIA GPU). Recently, hybrid cores having characteristics of both CPU and GPU (e.g., CellSPE™, Intel Larrabee™) have been generally proposed for General Purpose GPU (GPGPU) style computing. The GPGPU style of computing advocates using the CPU to primarily execute control code and to offload performance critical data-parallel code to the GPU. The GPU is primarily used as an accelerator. The combination of multi-core CPUs and GPGPU computing model encompasses both CPU cores and GPU cores as accelerator targets. Many of the multi-core CPU cores have performance that is comparable to GPUs in many areas. For example, the floating point operations per second (FLOPS) of many CPU cores are now comparable to that of some GPU cores.
- Embodiments of the present invention may yield substantial advantages by enabling the use of the same or similar code base on CPU and GPU processors and also by facilitating the debugging of such code bases. While the present invention is described herein with illustrative embodiments for particular applications, it should be understood that the invention is not limited thereto. Those skilled in the art with access to the teachings provided herein will recognize additional modifications, applications, and embodiments within the scope thereof and additional fields in which the invention would be of significant utility.
- Embodiments of the present invention may be used in any computer system, computing device, entertainment system, media system, game systems, communication device, personal digital assistant, or any system using one or more processors. Such embodiments may be particularly useful where the system comprises a heterogeneous computing system. A “heterogeneous computing system,” as the term is used herein, is a computing system in which multiple kinds of processors are available.
- Embodiments of the present invention enable the same code base to be executed on different processors, such as GPUs and CPUs. Embodiments of the present invention, for example, can be particularly advantageous in processing systems having multi-core CPUs, and/or GPUs, because code developed for one type of processor can be deployed on another type of processor with little or no additional effort. For example, code developed for execution on a GPU, also known as GPU-kernels, can be deployed to be executed on a CPU, using embodiments of the present invention.
- An example heterogeneous computing system 100, according to an embodiment of the present invention, is shown in
FIG. 1. Heterogeneous computing system 100 can include one or more processing units, such as processor 102. Heterogeneous computing system 100 can also include at least one system memory 104, at least one persistent storage device 106, at least one system bus 108, at least one input device 110 and at least one output device 112. - Processing units of a type suitable for heterogeneous computing include the accelerated processing units (APUs) sold under various brand names by Advanced Micro Devices of Sunnyvale, Calif., according to an embodiment of the present invention as illustrated by
FIG. 2. A heterogeneous processing unit includes one or more CPUs and one or more GPUs, such as a wide single instruction, multiple data (SIMD) processor and unified video decoder that perform functions previously handled by a discrete GPU. It will be understood that when referring to the GPU structure and function, such functions are carried out by the SIMD. Heterogeneous processing units can also include at least one memory controller for accessing system memory, which also provides memory shared between the GPU and CPU, and a platform interface for handling communication with input and output devices through, for example, a controller hub. - A wide single instruction, multiple data (SIMD) processor for carrying out graphics processing instructions may be included to provide a heterogeneous GPU capability in accordance with an embodiment of the present invention, or a discrete GPU may be included separate from the CPU to implement the embodiment; however, as will be understood by those skilled in the art, additional latency may be experienced in an implementation of the present invention using a discrete GPU.
- Advantageously, architectures of the types described above are well suited to provide a solution for implementing hardware encoding and/or decoding in higher resolution YUV standards, such as YUV444.
- In the H.264 specification, two types of YUV444 video streams are supported, namely separate-color-plane YUV444 and non-separate-color-plane YUV444, where color is used in this context to also refer to chroma, and color plane is used in this context to also refer to the Y/U/V color values. In a separate-color-plane stream, the 3 color values of YUV have no dependency on one another and are compressed independently, and the 3 color values are joined together into one whole video stream at the end of each slice of video data, where a slice is typically a frame. In a non-separate-color-plane stream, the 3 color values of Y/U/V are integrated together at each MB level, where an MB represents a compression unit in the H.264 specification and typically refers to a 16×16 pixel block in one frame, and they share the same prediction mode.
- As described above, the average pixel size of YUV444 format video at 24 bits per pixel is twice the average pixel size of the YUV420 format at 12 bits per pixel. Conventionally, the Y/U/V color values are encoded and decoded in a sequential process. To achieve compression speeds close to those of YUV420, an embodiment of the present invention includes a hardware configuration that compresses the Y/U/V color values in parallel using 3 encode engines. Each encoder is dedicated to encoding one of the Y, U or V color values. For a separate-color-plane stream, this embodiment concatenates the Y/U/V color values at the end of each slice. For a non-separate-color-plane stream, the embodiment concatenates the Y/U/V color values at the end of each MB, where for each MB the Y color value is concatenated with the corresponding UV color values.
- It will be appreciated that achieving parallel compression of each color value in YUV requires a re-design of the data-path and pipeline, as well as parallelizing the entropy encoding process as much as possible, to improve performance.
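The two concatenation orders described above can be illustrated in software. The following Python sketch is a rough model only, not the hardware of the embodiment: three worker threads stand in for the three encode engines, and `zlib.compress` is a placeholder for the entropy coder (it is not CAVLC and produces no valid H.264 bitstream). The names `encode_block`, `encode_plane` and `encode_slice` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib


def encode_block(samples: bytes) -> bytes:
    # Stand-in only: zlib is NOT the H.264 entropy coder.
    return zlib.compress(samples)


def encode_plane(blocks) -> bytes:
    # Encode all macroblocks of a single color plane, in order.
    return b"".join(encode_block(b) for b in blocks)


def encode_slice(macroblocks, separate_color_plane=False) -> bytes:
    """macroblocks is a list of (y, u, v) byte strings, one tuple per MB."""
    with ThreadPoolExecutor(max_workers=3) as pool:  # one worker per color value
        if separate_color_plane:
            # Planes are independent: join Y, U and V at the end of the slice.
            planes = list(zip(*macroblocks))
            return b"".join(pool.map(encode_plane, planes))
        out = []
        for planes in macroblocks:
            # Encode Y, U and V of this MB in parallel, then append as Y|U|V.
            out.extend(pool.map(encode_block, planes))
        return b"".join(out)
```

In the non-separate case the output interleaves the planes per macroblock (Y0 U0 V0 Y1 U1 V1 ...), whereas in the separate case each plane is contiguous (Y0 Y1 ... U0 U1 ... V0 V1 ...).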
- Furthermore, it has been found that parallel encoding is especially useful in large bit rate applications such as, but not limited to, giga-bit wireless displays. Additionally, it has been found that this solution adapts well with context-adaptive variable-length coding (CAVLC), which is a form of entropy coding used in the H.264 video encoding standard.
- In this embodiment of the invention, each Y/U/V color value may be compressed using a base encoding unit, such as a 4×4 pixel block. The entropy encoder includes two data-paths to compress each 4×4 block in parallel.
-
FIG. 2 shows the block diagram of Y/U/V color values concatenating at the top level, in which an exemplary YUV stream is described in connection with the entropy encoding engine 200. The entropy encoding engine includes a top control (topctrl) engine 202 and three encoding engines 204, 206 and 208 connected by a bus 209 to the topctrl engine 202. Each of the encoding engines 204, 206 and 208 reads from local memory 210 and outputs the encoded respective Y, U and V values to respective local buffers 212, 214 and 216. The buffer 212 associated with the Y color value encoder 204 connects directly to the system memory 218 for outputting the final YUV compressed stream. The buffers 214 and 216 are read by the encoder 204 for the Y color value. As the entropy encoding engine 200 will be further described, the exemplary YUV stream is a non-separate-color-plane stream; however, it will be appreciated by those skilled in the art that the same features of the entropy encoding engine 200 may be implemented to process a separate-color-plane stream. In operation, as each MB in the non-separate-color-plane stream becomes available in local memory 210 for processing, the entropy encoder's firmware first checks the status of topctrl engine 202 and the 3 encoding engines engine 202 signals the encoding engines encoding engines encoding engines encoding engine local memory local memory local memory 212 is connected to system memory 218, and the local memory 212 content can be written into system memory 218 automatically. - Monitoring and control of the three
encoding engines 204, 206 and 208 is provided by the topctrl engine 202 using the following engines:
- a. An Idle Ready engine 220 determines when the entropy encoder 200 is ready to accept new data.
- b. A busy encoding engine 222 then checks that all three encoding engines are busy.
- c. An encoding complete engine 224 then waits and identifies when all three cores are idle.
- d. A U color value patching engine 226 then triggers the Y encoding engine 204 to fetch the U-color output from U's local memory 214, write the encoded U color value into Y's local memory 212, and wait for the Y encoding engine 204 to finish.
- e. A V color value patching engine 228 then triggers the Y encoding engine 204 to fetch the V-color output from V's local memory 216, write the encoded V color value into Y's local memory 212, and wait for the Y encoding engine 204 to finish.
- f. Upon completion of the V color value patching engine 228, the encoded YUV data is written out to the system memory 218 and the topctrl engine 202 returns to the Idle Ready engine 220 to await the availability of additional YUV color values to begin another MB encoding loop.
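The control sequence in steps a through f can be read as a small state machine. The following Python sketch is a simplified software model under stated assumptions: the already-encoded Y, U and V outputs are passed in as byte strings, the local memories are modeled as byte arrays, and the names `State` and `run_mb_loop` are hypothetical rather than taken from the embodiment.

```python
from enum import Enum, auto


class State(Enum):
    IDLE_READY = auto()       # step a: ready to accept a new MB
    BUSY = auto()             # step b: all three encoding engines running
    ENCODE_COMPLETE = auto()  # step c: all three engines idle again
    PATCH_U = auto()          # step d: U output patched after Y
    PATCH_V = auto()          # step e: V output patched after U


def run_mb_loop(enc_y, enc_u, enc_v, system_memory):
    """Walk one MB through the control states and return the states visited."""
    y_mem, u_mem, v_mem = bytearray(), bytearray(), bytearray()
    visited = []
    state = State.IDLE_READY
    while True:
        visited.append(state)
        if state is State.IDLE_READY:
            state = State.BUSY
        elif state is State.BUSY:
            # Each engine writes its result to its own local memory.
            y_mem += enc_y
            u_mem += enc_u
            v_mem += enc_v
            state = State.ENCODE_COMPLETE
        elif state is State.ENCODE_COMPLETE:
            state = State.PATCH_U
        elif state is State.PATCH_U:
            y_mem += u_mem  # copy U's local memory into Y's local memory
            state = State.PATCH_V
        else:  # State.PATCH_V
            y_mem += v_mem  # copy V's local memory into Y's local memory
            system_memory.extend(y_mem)  # step f: write out the patched MB
            return visited
```

Calling `run_mb_loop` once per macroblock with the same `system_memory` byte array appends each patched Y|U|V macroblock to the output in order.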
- It will be appreciated by those skilled in the art that, if the
patch engines patch engines - Finally, for the best performance, an internal buffer may be added as local memory to eliminate data exchanges with external memory. This is also feasible when the hardware is configured with a fast processor or as a heterogeneous computing platform as described above.
- With reference to
FIG. 3, the data flow for the Y color value encoding engine 300 is shown. Once again, the non-separate-color-plane stream is used to exemplify the data flow, in which a compression unit is one MB in the form of a 16×16 block. - It will be appreciated that, in order to speed up the compression of each color plane as much as possible, this solution also pipelines the data-path and balances the delay of each pipe-stage.
- After reading the MB header from
local memory 302, the header information is stored into local flops/buffer 304, which then triggers the MB header compress 306, part of the compressing engine 308, to begin compression of the header. At the same time, the start of header compression serves as a trigger signal for a residual buffer 310 to read residual 4×4 blocks from local memory and store them. - A Residual-
pre-process engine 312 monitors the status of the residual buffer 310; once one 4×4 block of coefficients is available, the Residual pre-process engine 312 reads out the 4×4 block, pre-processes the data, and stores the result into a First-In, First-Out (FIFO) buffer 314. - A MB-residual-
compress engine 316 within the compressing engine 308 monitors the status of both the MB-header-compress 306 and the FIFO buffer 314. When the MB-header-compress 306 is done and there is valid data in the FIFO buffer 314, the residual-compress engine 316 begins to compress the residual. - The Probability Interval Partitioning Entropy (PIPE)
coding engine 318 is a pipe-stage inserted to break up the long pipe delay found in the data flow of conventional scenarios. - It will be appreciated by those skilled in the art that the functionality of the U and
V encoding engines 206 and 208 (FIG. 2) has also now been described, where the data from the PIPE is written to the local memory 320. The remaining features described in FIG. 3 are unique to the Y color value encoding engine. - A
stream packer engine 322 has two tasks: one is to perform regular processing to conform the encoded YUV stream to the H.264 standard, and the other is to sequentially read back the U and then the V color values and patch them into the output after the Y plane at the MB level, with the result written to the local memory 320. - With reference to
FIG. 4, an improved process provided by the residual-pre-process engine 312 of FIG. 3 is shown operating on a unit having a 4×4 block of residual data. The residual-pre-process engine 400 first scans the 4×4 2D array into a 1D array 402 as described in the H.264 standard, and then begins to parse the 16 residuals. In a conventional parsing process, the 16 residuals in the 1D array 402 are parsed one by one, which needs at least 16 cycles to complete one 4×4 block. In an embodiment, a fast parse process is used, which parses only the non-zero residuals. By way of example, but not by limitation, a 1D array 404 having four coefficients with 11 zeros and one trailing zero requires 5 cycles to complete parsing of the 1D array. The FIFO buffer 406 stores only the data relevant to the residual information, including the coefficient value 408 and the location 410 based upon intervening zeros. - With reference to
FIG. 5, a MB residual compress engine 500 is shown. In a conventional embodiment, the level steps 502 to 506 and run_before steps 508 to 510 are compressed sequentially. An embodiment using the improved FIFO buffer 406 (FIG. 4), which includes two FIFO buffers for the coefficient value 408 and the location 410 based upon intervening zeros, includes level steps 512 to 516 (FIG. 5) and run_before steps 518 to 520, and compresses the level and run_before in a parallel process. The run_before compress result is stored into a local memory; once all the elements before run_before are compressed, the data in local memory is read out and patched into the stream. It will be appreciated that, in this implementation, the residual-pre-process engine 400 (FIG. 4) and the MB residual compress 500 (FIG. 5) have similar process times, which makes the pipeline delay more balanced. - 3 Result for Speed
- With the improvements described above, and excluding the local memory bandwidth, the entropy encoding speed is generally determined entirely by the kernel engine speed.
- Without considering local memory bandwidth, the analysis gives the following result:
- cycles/MB = (nzc + 6) × (num_4×4 + 1) × 1.15 + 100 cycles/header + UVbits/10, where “nzc” is the number of non-zero transform coefficients.
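To make the estimate above concrete, the following Python sketch evaluates it for given inputs. It is a direct transcription of the formula as printed; the text defines only “nzc”, so reading the second term as a count of 4×4 blocks and the last term as a count of U/V bits is an assumption, and the function name `cycles_per_mb` and its parameter names are hypothetical.

```python
def cycles_per_mb(nzc, num_4x4, uv_bits, header_cycles=100):
    """Estimated entropy-encoding cycles for one macroblock.

    nzc           -- number of non-zero transform coefficients (from the text)
    num_4x4       -- assumed: number of 4x4 blocks counted by the formula
    uv_bits       -- assumed: number of encoded U/V bits to be patched
    header_cycles -- the constant 100 cycles/header term of the formula
    """
    return (nzc + 6) * (num_4x4 + 1) * 1.15 + header_cycles + uv_bits / 10


# Example inputs chosen purely for illustration.
print(cycles_per_mb(nzc=4, num_4x4=16, uv_bits=200))
```

The estimate grows linearly in each input, so in this model the cost is driven mainly by the product of the coefficient and block terms.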
- Furthermore, it will be appreciated that, with this configuration, encoding of a YUV stream takes approximately the same processing time regardless of whether the format is YUV444 or YUV420, due to the parallel entropy encoding of the Y, U and V color values.
- In another exemplary embodiment, the hardware described above can be implemented using a processor executing instructions from a non-transitory storage medium. Those skilled in the art can appreciate that the instructions are created using a hardware description language (HDL), which is a code for describing a circuit. An exemplary use of HDLs is the simulation of designs before the designer must commit to fabrication. The two most popular HDLs are VHSIC Hardware Description Language (VHDL) and VERILOG. VHDL was developed by the U.S. Department of Defense and is an open standard. VERILOG, also called Open VERILOG International (OVI), is an industry standard developed by a private entity, and is now an open standard referred to as IEEE Standard 1364. A file written in VERILOG code that describes a Joint Test Action Group (JTAG) compliant device is called a VERILOG netlist. VHDL is an HDL defined by IEEE Standard 1076. Boundary Scan Description Language (BSDL) is a subset of VHDL, and provides a standard machine- and human-readable data format for describing how an IEEE Std 1149.1 boundary-scan architecture is implemented and operates in a device. Any HDL of the types described can be used to create instructions representative of the hardware description.
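As a recap of the fast parse described in connection with FIG. 4, the following Python sketch shows the data that would end up in the FIFO: each non-zero coefficient together with the number of zeros that precede it. It is a software illustration only; a Python loop still visits all 16 positions, whereas the embodiment skips zeros in hardware. The scan table is the standard H.264 4×4 zig-zag order for frame coding, and the names `zigzag_scan` and `fast_parse` are hypothetical.

```python
# H.264 4x4 zig-zag scan (frame coding), expressed as raster indices.
ZIGZAG_4X4 = (0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15)


def zigzag_scan(block):
    """Flatten a 4x4 block (list of 4 rows) into a 16-entry 1D array."""
    flat = [c for row in block for c in row]
    return [flat[i] for i in ZIGZAG_4X4]


def fast_parse(coeffs):
    """Keep only non-zero coefficients, each paired with its preceding zero run."""
    fifo = []
    zeros = 0
    for c in coeffs:
        if c == 0:
            zeros += 1
        else:
            fifo.append((c, zeros))  # (coefficient value, intervening zeros)
            zeros = 0
    return fifo  # trailing zeros produce no entry


# Four coefficients, 11 intervening zeros and one trailing zero, as in the example.
coeffs = [7, 0, 0, -3, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 1, 0]
print(fast_parse(coeffs))
```

Only four entries are stored for this array, which is consistent with the text's point that parsing work scales with the number of non-zero residuals rather than with all 16 positions.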
- Although the invention has been described in terms of exemplary embodiments, it is not limited thereto. Rather, the appended claims should be construed broadly, to include other variants and embodiments of the invention, which may be made by those skilled in the art without departing from the scope and range of equivalents of the invention.
Claims (20)
1. A system for video compression comprising:
an encoding engine for encoding a YUV stream wherein Y, U and V color values are encoded in parallel and for patching together said Y, U and V color streams to form a compressed YUV output stream.
2. The system of claim 1 wherein:
said YUV stream has an average bits per pixel value that varies from a first value to a second value that is double the first value; and
said encoding engine encodes said YUV stream in generally the same amount of time regardless of the average bits per pixel value.
3. The system of claim 2 wherein:
said encoding engine encodes said YUV stream at a rate generally determined by kernel engine speed.
4. The system of claim 1 wherein said encoding engine includes:
one encoding engine for each color value of said YUV stream; and
a control engine for operating all of said encoding engines in parallel.
5. The system of claim 4 wherein:
said control engine controls the patching of said encoded U and V color values with said encoded Y color value.
6. The system of claim 5 wherein:
said patching of said encoded U and V color values with said encoded Y color value are completed sequentially after parallel encoding of said Y, U and V color values.
7. The system of claim 1 said encoding engine including:
a residual pre-processing engine for determine color values while avoiding null value registers; and
at least one buffer for storing said determined color values.
8. The system of claim 7 said encoding engine including:
compressing level and register location of said stored determined color values from said at least one buffer in parallel.
9. A method for video compression comprising:
encoding using an encoding engine a YUV stream wherein Y, U and V color values are encoded in parallel; and
patching together said Y, U and V color streams to form a compressed YUV output stream.
10. The method of claim 9 wherein:
said YUV stream has an average bits per pixel value that varies from a first value to a second value that is double the first value; and
encoding said YUV stream in generally the same amount of time regardless of the average bits per pixel value.
11. The method of claim 10 wherein:
encoding said YUV stream at a rate generally determined by kernel engine speed.
12. The method of claim 9 wherein said encoding includes:
encoding each color value of said YUV stream in parallel using parallel encoding engines; and
controlling operation all of said encoding engines in parallel.
13. The method of claim 12 wherein:
controlling operation includes controlling the patching of said encoded U and V color values with said encoded Y color value.
14. The method of claim 13 wherein:
said patching of said encoded U and V color values with said encoded Y color value are completed sequentially after parallel encoding of said Y, U and V color values.
15. The method of claim 9 said encoding includes:
determining color values while avoiding null value registers; and
storing said determined color values in at least one buffer.
16. The method of claim 15 said encoding includes:
compressing level and register location of said stored determined color values from said at least one buffer in parallel.
17. A computer readable non-transitory medium including instructions which when executed in a processing system cause the system to provide video compression comprising:
encoding using an encoding engine a YUV stream wherein Y, U and V color values are encoded in parallel;
patching together said Y, U and V color streams to form a compressed YUV output stream; and
said encoding includes:
encoding each color value of said YUV stream in parallel using parallel encoding engines; and
controlling operation all of said encoding engines in parallel.
18. The computer readable non-transitory medium of claim 17 wherein:
said YUV stream has an average bits per pixel value that varies from a first value to a second value that is double the first value; and
encoding said YUV stream in generally the same amount of time regardless of the average bits per pixel value.
19. The computer readable non-transitory medium of claim 17 said encoding includes:
determining color values while avoiding null value registers; and
storing said determined color values in at least one buffer.
20. The computer readable non-transitory medium of claim 19 said encoding includes:
compressing level and register location of said stored determined color values from said at least one buffer in parallel.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/611,959 US20140072027A1 (en) | 2012-09-12 | 2012-09-12 | System for video compression |
US15/491,887 US10542268B2 (en) | 2012-09-12 | 2017-04-19 | System for video compression |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/611,959 US20140072027A1 (en) | 2012-09-12 | 2012-09-12 | System for video compression |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/491,887 Continuation US10542268B2 (en) | 2012-09-12 | 2017-04-19 | System for video compression |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140072027A1 true US20140072027A1 (en) | 2014-03-13 |
Family
ID=50233258
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/611,959 Abandoned US20140072027A1 (en) | 2012-09-12 | 2012-09-12 | System for video compression |
US15/491,887 Active 2033-04-19 US10542268B2 (en) | 2012-09-12 | 2017-04-19 | System for video compression |
Country Status (1)
Country | Link |
---|---|
US (2) | US20140072027A1 (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9661340B2 (en) | 2012-10-22 | 2017-05-23 | Microsoft Technology Licensing, Llc | Band separation filtering / inverse filtering for frame packing / unpacking higher resolution chroma sampling formats |
US9749646B2 (en) | 2015-01-16 | 2017-08-29 | Microsoft Technology Licensing, Llc | Encoding/decoding of high chroma resolution details |
US9854201B2 (en) | 2015-01-16 | 2017-12-26 | Microsoft Technology Licensing, Llc | Dynamically updating quality to higher chroma sampling rate |
CN107948652A (en) * | 2017-11-21 | 2018-04-20 | 青岛海信电器股份有限公司 | A kind of method and apparatus for carrying out image conversion |
CN108053452A (en) * | 2017-12-08 | 2018-05-18 | 浙江理工大学 | A kind of digital image colors extracting method based on mixed model |
US9979960B2 (en) | 2012-10-01 | 2018-05-22 | Microsoft Technology Licensing, Llc | Frame packing and unpacking between frames of chroma sampling formats with different chroma resolutions |
CN109120938A (en) * | 2017-06-26 | 2019-01-01 | 深圳市中兴微电子技术有限公司 | A kind of Camera middle layer image processing method and system on chip |
WO2019041222A1 (en) * | 2017-08-31 | 2019-03-07 | 深圳市大疆创新科技有限公司 | Encoding method, decoding method, encoding apparatus and decoding apparatus |
US20190222623A1 (en) * | 2017-04-08 | 2019-07-18 | Tencent Technology (Shenzhen) Company Limited | Picture file processing method, picture file processing device, and storage medium |
US10368080B2 (en) | 2016-10-21 | 2019-07-30 | Microsoft Technology Licensing, Llc | Selective upsampling or refresh of chroma sample values |
US11394396B2 (en) * | 2020-09-25 | 2022-07-19 | Advanced Micro Devices, Inc. | Lossless machine learning activation value compression |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020094031A1 (en) * | 1998-05-29 | 2002-07-18 | International Business Machines Corporation | Distributed control strategy for dynamically encoding multiple streams of video data in parallel for multiplexing onto a constant bit rate channel |
US20060256854A1 (en) * | 2005-05-16 | 2006-11-16 | Hong Jiang | Parallel execution of media encoding using multi-threaded single instruction multiple data processing |
US20080137731A1 (en) * | 2005-09-20 | 2008-06-12 | Mitsubishi Electric Corporation | Image encoding method and image decoding method, image encoder and image decoder, and image encoded bit stream and recording medium |
US20080253461A1 (en) * | 2007-04-13 | 2008-10-16 | Apple Inc. | Method and system for video encoding and decoding |
US7460725B2 (en) * | 2006-11-09 | 2008-12-02 | Calista Technologies, Inc. | System and method for effectively encoding and decoding electronic information |
US20090175548A1 (en) * | 2007-05-17 | 2009-07-09 | Sony Corporation | Information processing device and method |
US20100119167A1 (en) * | 2008-11-11 | 2010-05-13 | Sony Corporation | Image decoding apparatus, image decoding method and computer program |
US20100225655A1 (en) * | 2009-03-06 | 2010-09-09 | Microsoft Corporation | Concurrent Encoding/Decoding of Tiled Data |
US20110235699A1 (en) * | 2010-03-24 | 2011-09-29 | Sony Computer Entertainment Inc. | Parallel entropy coding |
US20120093234A1 (en) * | 2007-11-13 | 2012-04-19 | Elemental Technologies, Inc. | Video encoding and decoding using parallel processors |
US20120230598A1 (en) * | 2009-09-24 | 2012-09-13 | Sony Corporation | Image processing apparatus and image processing method |
US20140023286A1 (en) * | 2012-07-19 | 2014-01-23 | Xuanming Du | Decoder performance through quantization control |
Family Cites Families (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS59126368A (en) * | 1983-01-10 | 1984-07-20 | Hitachi Ltd | Coder and encoder |
JPH1013858A (en) * | 1996-06-27 | 1998-01-16 | Sony Corp | Picture encoding method, picture decoding method and picture signal recording medium |
JP4427827B2 (en) * | 1998-07-15 | 2010-03-10 | ソニー株式会社 | Data processing method, data processing apparatus, and recording medium |
US7233622B2 (en) * | 2003-08-12 | 2007-06-19 | Lsi Corporation | Reduced complexity efficient binarization method and/or circuit for motion vector residuals |
BRPI0609281A2 (en) * | 2005-04-13 | 2010-03-09 | Thomson Licensing | method and apparatus for video decoding |
US20080137744A1 (en) * | 2005-07-22 | 2008-06-12 | Mitsubishi Electric Corporation | Image encoder and image decoder, image encoding method and image decoding method, image encoding program and image decoding program, and computer readable recording medium recorded with image encoding program and computer readable recording medium recorded with image decoding program |
CN101783943B (en) * | 2005-09-20 | 2013-03-06 | 三菱电机株式会社 | Image encoder and image encoding method |
US8300694B2 (en) * | 2005-09-20 | 2012-10-30 | Mitsubishi Electric Corporation | Image encoding method and image decoding method, image encoder and image decoder, and image encoded bit stream and recording medium |
US8306112B2 (en) * | 2005-09-20 | 2012-11-06 | Mitsubishi Electric Corporation | Image encoding method and image decoding method, image encoder and image decoder, and image encoded bit stream and recording medium |
US8036517B2 (en) * | 2006-01-25 | 2011-10-11 | Qualcomm Incorporated | Parallel decoding of intra-encoded video |
US20080170793A1 (en) * | 2007-01-12 | 2008-07-17 | Mitsubishi Electric Corporation | Image encoding device and image encoding method |
JP2008193627A (en) * | 2007-01-12 | 2008-08-21 | Mitsubishi Electric Corp | Image encoding device, image decoding device, image encoding method, and image decoding method |
US8542748B2 (en) * | 2008-03-28 | 2013-09-24 | Sharp Laboratories Of America, Inc. | Methods and systems for parallel video encoding and decoding |
US8194736B2 (en) * | 2008-04-15 | 2012-06-05 | Sony Corporation | Video data compression with integrated lossy and lossless compression |
BR112012008770A2 (en) * | 2009-10-14 | 2018-11-06 | Sharp Kk | methods for parallel video encoding and decoding. |
WO2011088594A1 (en) * | 2010-01-25 | 2011-07-28 | Thomson Licensing | Video encoder, video decoder, method for video encoding and method for video decoding, separately for each colour plane |
US20120014431A1 (en) * | 2010-07-14 | 2012-01-19 | Jie Zhao | Methods and Systems for Parallel Video Encoding and Parallel Video Decoding |
US20120014429A1 (en) * | 2010-07-15 | 2012-01-19 | Jie Zhao | Methods and Systems for Parallel Video Encoding and Parallel Video Decoding |
US8344917B2 (en) * | 2010-09-30 | 2013-01-01 | Sharp Laboratories Of America, Inc. | Methods and systems for context initialization in video coding and decoding |
US9060173B2 (en) * | 2011-06-30 | 2015-06-16 | Sharp Kabushiki Kaisha | Context initialization based on decoder picture buffer |
US20130003823A1 (en) * | 2011-07-01 | 2013-01-03 | Kiran Misra | System for initializing an arithmetic coder |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9979960B2 (en) | 2012-10-01 | 2018-05-22 | Microsoft Technology Licensing, Llc | Frame packing and unpacking between frames of chroma sampling formats with different chroma resolutions |
US9661340B2 (en) | 2012-10-22 | 2017-05-23 | Microsoft Technology Licensing, Llc | Band separation filtering / inverse filtering for frame packing / unpacking higher resolution chroma sampling formats |
US10044974B2 (en) | 2015-01-16 | 2018-08-07 | Microsoft Technology Licensing, Llc | Dynamically updating quality to higher chroma sampling rate |
US9749646B2 (en) | 2015-01-16 | 2017-08-29 | Microsoft Technology Licensing, Llc | Encoding/decoding of high chroma resolution details |
US9854201B2 (en) | 2015-01-16 | 2017-12-26 | Microsoft Technology Licensing, Llc | Dynamically updating quality to higher chroma sampling rate |
US10368080B2 (en) | 2016-10-21 | 2019-07-30 | Microsoft Technology Licensing, Llc | Selective upsampling or refresh of chroma sample values |
US20190222623A1 (en) * | 2017-04-08 | 2019-07-18 | Tencent Technology (Shenzhen) Company Limited | Picture file processing method, picture file processing device, and storage medium |
US11012489B2 (en) * | 2017-04-08 | 2021-05-18 | Tencent Technology (Shenzhen) Company Limited | Picture file processing method, picture file processing device, and storage medium |
CN109120938A (en) * | 2017-06-26 | 2019-01-01 | 深圳市中兴微电子技术有限公司 | Camera middle-layer image processing method and system on chip |
WO2019041222A1 (en) * | 2017-08-31 | 2019-03-07 | 深圳市大疆创新科技有限公司 | Encoding method, decoding method, encoding apparatus and decoding apparatus |
CN107948652A (en) * | 2017-11-21 | 2018-04-20 | 青岛海信电器股份有限公司 | Method and apparatus for image conversion |
CN108053452A (en) * | 2017-12-08 | 2018-05-18 | 浙江理工大学 | Digital image color extraction method based on a mixed model |
US11394396B2 (en) * | 2020-09-25 | 2022-07-19 | Advanced Micro Devices, Inc. | Lossless machine learning activation value compression |
Also Published As
Publication number | Publication date |
---|---|
US10542268B2 (en) | 2020-01-21 |
US20170223370A1 (en) | 2017-08-03 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US10542268B2 (en) | System for video compression | |
US11057585B2 (en) | Image processing method and device using line input and output | |
US10659796B2 (en) | Bandwidth saving architecture for scalable video coding spatial mode | |
TW583883B (en) | System and method for multiple channel video transcoding | |
US5812791A (en) | Multiple sequence MPEG decoder | |
US7085320B2 (en) | Multiple format video compression | |
US9392292B2 (en) | Parallel encoding of bypass binary symbols in CABAC encoder | |
US20140153635A1 (en) | Method, computer program product, and system for multi-threaded video encoding | |
US20080170611A1 (en) | Configurable functional multi-processing architecture for video processing | |
US20060176955A1 (en) | Method and system for video compression and decompression (codec) in a microprocessor | |
US20230276023A1 (en) | Image processing method and device using a line-wise operation | |
US20060176960A1 (en) | Method and system for decoding variable length code (VLC) in a microprocessor | |
CN103986934A (en) | Video processor with random access to compressed frame buffer and methods for use therewith | |
CN103246499A (en) | Device and method for parallelly processing images | |
AU2019101272A4 (en) | Method and apparatus for super-resolution using line unit operation | |
US8427494B2 (en) | Variable-length coding data transfer interface | |
Kim et al. | A real-time MPEG encoder using a programmable processor | |
US7675972B1 (en) | System and method for multiple channel video transcoding | |
Shichao et al. | A scalable multi-pipeline JPEG encoding architecture | |
US9330060B1 (en) | Method and device for encoding and decoding video image data | |
US20070192393A1 (en) | Method and system for hardware and software shareable DCT/IDCT control interface | |
Zhu et al. | Hardware JPEG decoder and efficient post-processing functions for embedded application | |
WO1996036178A1 (en) | Multiple sequence mpeg decoder and process for controlling same | |
Rani et al. | Early Performance Analysis of Fully Pipelined JPEG Engine in the Simulation Environment | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: ADVANCED MICRO DEVICES, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LI, HAIBIN;CHEN, ROY;ZHOU, JI;AND OTHERS;REEL/FRAME:028970/0359 Effective date: 20120910 Owner name: ATI TECHNOLOGIES ULC, CANADA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ZHANG, LEI;REEL/FRAME:028970/0370 Effective date: 20120911 |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |