
// file is the input source for the MPEG streams
BitStream *inbs = BitStreamNew (65536);
BitParser *inbp = BitParserNew ();
MpegSeqHdr *seqhdr = MpegSeqHdrNew ();
MpegPicHdr *pichdr = MpegPicHdrNew ();
int w, h, vbvsize, done=0;

BitParserAttach (inbp, inbs);
BitStreamFileRead (inbs, file, 0);
MpegSeqHdrFind (inbp);
MpegSeqHdrParse (inbp, seqhdr);
w = seqhdr->width;
h = seqhdr->height;
vbvsize = seqhdr->vbv_buf_size;
r = ByteNew(w, h);
g = ByteNew(w, h);
b = ByteNew(w, h);
y = ByteNew(w, h);
u = ByteNew(w/2, h/2);
v = ByteNew(w/2, h/2);
dcty = DctNew(w/16, h/16);
dctu = DctNew(w/32, h/32);
dctv = DctNew(w/32, h/32);
while (!done) {
    int marker;
    mpeg_any_markerFind (inbp);
    marker = MpegGetCurrMarker (inbp);
    switch (marker) {
    case PIC_HDR_MARKER:
        MpegPicHdrParse (inbp, pichdr);
        if (pichdr->type == I_FRAME) {
            MpegPicIParse (inbp,dcty,dctu,dctv);
            DctToByte (dcty, y);
            DctToByte (dctu, u);
            DctToByte (dctv, v);
            YuvToRgb420 (y, u, v, r, g, b);
    [listing lines 35-46 are only partly legible in the source; the surviving fragment follows]
    if (!done) {
        UpdateIfUnderflow (inbp,inbs,file,vbvsize);

FIG. 6

#define SIZE (128*1024)
int len, offset, start = 0;
MpegPktHdr *hdr = MpegPktHdrNew();
BitStream *bs = BitStreamNew (SIZE);
BitParser *bp = BitParserNew ();
BitStreamFilter *filter = BitStreamFilterNew();

BitParserAttach (bp, bs);
BitStreamFileRead (bs, file);
offset = MpegPktHdrFind (bp);
while (!eof(file) && !EndOfBitstream(bp)) {
    MpegPktHdrParse (bp, hdr);
    if (hdr->id == 32) {
        len = hdr->len;
        BitStreamFilterAdd(filter, offset, len);
        start += UpdateIfUnderflow (bp,bs,file,SIZE);
        offset = start + MpegPktHdrFind(bp);
    }
}

FIG. 7


WEB-BASED VIDEO-EDITING METHOD AND SYSTEM USING A HIGH-PERFORMANCE MULTIMEDIA SOFTWARE LIBRARY

STATEMENT OF GOVERNMENT INTEREST

This invention was partially funded by the Government under a grant from DARPA. The Government has certain rights in portions of the invention.

FIELD OF THE INVENTION

This invention relates generally to multimedia software and more particularly to libraries for use in building processing-intensive multimedia software for Web-based video-editing applications.

BACKGROUND OF THE INVENTION

The multimedia research community has traditionally focused its efforts on the compression, transport, storage and display of multimedia data. These technologies are fundamentally important for applications such as video conferencing and video-on-demand. The results of these efforts have made their way into many commercial products. For example, JPEG and MPEG, described below, are ubiquitous standards for image and audio/video compression.

There are, however, problems in content-based retrieval and understanding, video production, and transcoding for heterogeneity and bandwidth adaptation. The lack of a high-performance library, or "toolkit", that can be used to build processing-intensive multimedia applications is hindering development in multimedia applications. In particular, in the area of video-editing, large volumes of data need to be stored, accessed and manipulated in an efficient manner. Also, special hardware, such as MPEG accelerators, is needed for video processing applications. Solutions to the problems of storing video data include client-server applications and editing over the World Wide Web (Web). Web-based video-editing is particularly desirable because it allows access to data stored in many different repositories, and special hardware may be distributed. With Web-based video-editing, any computer with Internet access may be used to do video-editing because no special storage capability or processing capability is needed at the local level. The existing multimedia toolkits, however, do not have sufficiently high performance to make Web-based applications practical.

The data standards GIF, JPEG and MPEG dominate image and video data in the current state of the art. GIF (Graphics Interchange Format) is a bit-mapped graphics file format used commonly on the Web. JPEG (Joint Photographic Experts Group) is the internationally accepted standard for image data. JPEG is designed for compressing full color or gray-scale still images. For video data, including audio data, the international standard is MPEG (Moving Picture Experts Group). MPEG is actually a general reference to an evolving series of standards. For the sake of simplicity, the various MPEG versions will be referred to as the "MPEG standard" or simply "MPEG". The MPEG standard achieves a high rate of data compression by storing only the changes from one frame to another instead of an entire image.

The MPEG standard has four types of image coding for processing, the I-frame, the P-frame, the B-frame and the D-frame (from an early version of MPEG, but absent in later standards).

The I-frame (Intra-coded image) is self-contained, i.e. coded without any reference to other images. The I-frame is treated as a still image, and MPEG uses the JPEG standard to encode it. Compression in MPEG is often executed in real time and the compression rate of I-frames is the lowest within the MPEG standard. I-frames are used as points for random access in MPEG streams.

The P-frame (Predictive-coded frame) requires information of the previous I-frame in an MPEG stream, and/or all of the previous P-frames, for encoding and decoding. Coding of P-frames is based on the principle that areas of the image shift instead of change in successive images.

The B-frame (Bi-directionally predictive-coded frame) requires information from both the previous and the following I-frame and/or P-frame in the MPEG stream for encoding and decoding. B-frames have the highest compression ratio within the MPEG standard.

The D-frame (DC-coded frame) is intra-frame encoded. The D-frame is absent in more recent versions of the MPEG standard; however, applications are still required to deal with D-frames when working with the older MPEG versions. D-frames consist only of the lowest frequencies of an image. D-frames are used for display in fast-forward and fast-rewind modes. These modes could also be accomplished using a suitable order of I-frames.
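The dependencies among the four frame types can be summarized in a short C sketch. The enum, the function names, and the seek helper are hypothetical and are not part of the library of the invention; the sketch only restates the rules given above, namely that I-frames and D-frames are intra-coded, P-frames and B-frames need an earlier reference, only B-frames need a later one, and I-frames serve as random access points.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the patent text does not define this enum. */
typedef enum { FRAME_I, FRAME_P, FRAME_B, FRAME_D } FrameType;

/* I-frames and D-frames are coded without reference to other images. */
bool frame_is_intra_coded(FrameType t)
{
    return t == FRAME_I || t == FRAME_D;
}

/* P-frames and B-frames need an earlier I-frame or P-frame. */
bool frame_needs_past_reference(FrameType t)
{
    return t == FRAME_P || t == FRAME_B;
}

/* Only B-frames also need a following I-frame or P-frame. */
bool frame_needs_future_reference(FrameType t)
{
    return t == FRAME_B;
}

/*
 * Random access: return the index of the nearest I-frame at or before
 * `target`, or -1 if there is none or the arguments are out of range.
 */
int nearest_random_access_point(const FrameType *types, int count, int target)
{
    if (types == NULL || target < 0 || target >= count)
        return -1;
    for (int i = target; i >= 0; i--) {
        if (types[i] == FRAME_I)
            return i;
    }
    return -1;
}
```

A decoder asked to seek to an arbitrary frame would, under these assumptions, start decoding at the index returned by nearest_random_access_point and work forward.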

Video information encoding is accomplished in the MPEG standard using DCT (discrete cosine transform). This technique represents wave form data as a weighted sum of cosines. DCT is also used for data compression in the JPEG standard.

Currently, there are several inadequate options from which to choose in order to make up for the lack of a high-performance multimedia toolkit. First, code could be developed from scratch as needed in order to solve a particular problem, but this is difficult given the complex multimedia standards such as JPEG and MPEG. Second, existing code could be modified, but this results in systems that are complex, unmanageable, and generally difficult to maintain, debug, and reuse. Third, existing standard libraries like ooMPEG of the MPEG standard, or Independent JPEG Group (IJG) of the JPEG standard could be used, but the details of the functions in these libraries are hidden, and only limited optimizations can be performed.

It remains desirable to have a high-performance toolkit for multimedia processing.

It is an object of the present invention to provide a method and apparatus to enable client-server video-editing.

It is another object of the present invention to provide a method and apparatus to enable Web-based video-editing.

SUMMARY OF THE INVENTION

The problems of Web-based video-editing are solved by the present invention of incorporating a high-performance library as part of the video processing application. The Web-based video editor has a graphical user interface (GUI), a GUI-to-backend interface, and a backend video-editing engine. The high performance library enables the interface and the engine to perform video-editing tasks with low latency over the Web. The high-performance library includes a set of simple, interoperable, high-performance primitives and abstractions that can be composed to create higher level operations and data types. The libraries of the present invention lie between a high level API and low level C code. The libraries expose some low level operations and data structures but provide a higher level of abstraction than C code.

The libraries give users full control over memory utilization and input/output (I/O) because none of the library routines implicitly allocate memory or perform I/O. The libraries provide thin primitives, and functions which expose the structure of the bitstream.

The present invention together with the above and other advantages may best be understood from the following detailed description of the embodiments of the invention illustrated in the drawings, wherein:


BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a Web-based video-editing system having high-performance libraries according to principles of the invention;

FIG. 2 is a schematic of memory clipping according to the principles of the invention;

FIG. 3 shows stereo samples interleaved in memory;

FIG. 4 shows a library function that performs the picture in picture operation according to principles of the invention;

FIG. 5 shows the format of an MPEG-1 video stream;

FIG. 6 shows a library function that decodes the I-frames in an MPEG video into RGB images according to principles of the invention; and,


FIG. 7 shows a library function which acts as a filter that can be used to copy the packets of a first video stream from a system stream stored in a first BitStream to a second Bitstream according to principles of the invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

FIG. 1 shows a client/server Web-based video-editing system 10. A client computer (client) 15 is able to connect to a plurality of servers 20, 22, 24 over the Web 30. The client 15 runs a video-editing graphical user interface (GUI) 32. In the present embodiment, the GUI 32 may be implemented in a language such as Java. Java is a cross-platform object-oriented language designed for secure execution of code across a network. The GUI 32 creates a workspace on the client computer 15 where video frames may be viewed, cut, copied and inserted in a video sequence. The processing and data storage, however, are done remotely over the Web 30 as will be described below.

In the present embodiment, the Web 30 is used to connect the client 15 and the servers 20, 22, 24; however, in alternative embodiments, other types of networks could be used to form the client-server connection. A GUI-to-backend interface 36 creates a connection between the GUI 32 on the client 15 and the video engines 38 on each of the Web servers 20, 22, 24. The interface 36 is enabled using the libraries 34 of the present invention. The interface 36 includes a buffer where video sequence data is stored for processing at the servers 20, 22, 24. In the present embodiment of the invention, the video sequence data is compressed video bitstream data in the MPEG format. The servers 20, 22, 24 store audio and visual video data, and the back-end video-processing engines 38 have multimedia processing applications to process that data.

In operation, the client 15 sends a request for a download from one of the Web servers, for example Server A 20. The Web server 20 transfers the Java code which implements the GUI 32 to the client 15 and the client 15 makes a TCP connection to the Web server 20. The server 20 listens on a port whose number is embedded in the Java program transmitted to the client 15. At the client's request for the TCP connection, the Web server 20 accepts the connection by creating a thread for a new client handler on the server 20. The client handler acts as a message passing entity between the client 15 and the video-processing engines 38 on the server 20. The video operations are performed on the server end to minimize the traffic in the TCP socket connection over the network, and the client 15 sends simple video editing messages to the back end video engines 38 to perform the video operations.

In the present embodiment of the invention, the process of creating the video-editing client/server relationship is implemented using various data objects. A server object on the server starts a thread object by passing in the port on which the server listens for incoming clients. The thread object then creates an activempegvector object to track all of the MPEG files that are opened by remote users. At this point of the process, the Web server is ready to accept clients. When a client request is received, the thread object forks out a new clienthandler instance which creates a protocol object at the back-end interface 36. The clienthandler object handles the sending and receiving operations between the GUI 32 and the Web server 20, 22, 24. The clienthandler object does not process the actual messages, but instead calls a method in the protocol object to do so. The protocol object does the actual message parsing and breaks up the messages into a generic format. The protocol object then calls the appropriate methods to handle the operation. After the server has finished the operation, it sends the image file location to the TCP socket for the client to receive at the GUI 32.
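The parsing step performed by the protocol object can be sketched as follows. The text does not specify the wire format of the editing messages or the generic format they are broken into, so everything here is an assumption made for illustration: a message is taken to be one line of text holding an operation name followed by up to four integer arguments (for example "CUT 12 48"), and the names EditMessage, parse_edit_message and classify_edit_message are hypothetical. The sketch is written in C, although the embodiment describes Java objects.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 4

/* Hypothetical generic message: an operation name plus integer arguments. */
typedef struct {
    char op[16];
    int  argc;
    long args[MAX_ARGS];
} EditMessage;

typedef enum { OP_UNKNOWN, OP_CUT, OP_CONCAT } EditOp;

/* Break one text line into the generic format. Returns 0 on success, -1 on error. */
int parse_edit_message(const char *line, EditMessage *msg)
{
    int consumed = 0;

    if (line == NULL || msg == NULL)
        return -1;
    memset(msg, 0, sizeof *msg);

    /* Read the operation name (at most 15 characters). */
    if (sscanf(line, "%15s%n", msg->op, &consumed) != 1)
        return -1;

    /* Read integer arguments until none remain or the array is full. */
    const char *p = line + consumed;
    while (msg->argc < MAX_ARGS) {
        long value;
        int n = 0;
        if (sscanf(p, "%ld%n", &value, &n) != 1)
            break;
        msg->args[msg->argc++] = value;
        p += n;
    }
    return 0;
}

/* Decide which handler would be called; the operation set is assumed. */
EditOp classify_edit_message(const EditMessage *msg)
{
    if (strcmp(msg->op, "CUT") == 0 && msg->argc == 2)
        return OP_CUT;
    if (strcmp(msg->op, "CONCAT") == 0 && msg->argc == 2)
        return OP_CONCAT;
    return OP_UNKNOWN;
}
```

Keeping parsing separate from dispatch mirrors the division described above, in which the clienthandler only moves bytes and the protocol object interprets them.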

The interface 36 and the servers 20, 22, 24 have, as part of the video processing application, a high-performance library (or "toolkit") 34 according to the principles of the present invention. The toolkit can be used to build customized commands for video-editing such as concatenation, zooming in, zooming out, cutting out a portion of the video sequence, and transitional effects. The present invention will be described in terms of the MPEG standard; however, the principles of the invention may apply to other data standards. In addition, the MPEG standard is an evolving standard and the principles of the present invention may apply to MPEG standards yet to be developed.

The high-performance toolkit 34 provides code with performance competitive with hand-tuned C code, which allows optimizations to be performed without breaking open abstractions and is able to be composed by users in unforeseen ways. In order to accomplish high performance multimedia data processing with predictable performance, resource control, and replaceability and extensibility (i.e. usable in many applications), the present invention provides a toolkit, or API, designed with the following properties.

The first property of the toolkit 34 is resource control. Resource control refers to control at the language level of I/O execution and memory allocation including reduction and/or elimination of unnecessary memory allocation. None of the toolkit routines of this invention implicitly allocate memory or perform I/O. The few primitives in the toolkit which do perform I/O are primitives that load or store Bitstream data. The Bitstream is the actual stream of multimedia data. The MPEG bitstream will be discussed below. All other toolkit primitives of the invention use Bitstream as a data source. Users have full control over memory utilization and I/O. This feature gives users tight control over performance-critical resources, an essential feature for writing applications with predictable performance. The toolkit also gives users mechanisms to optimize programs using techniques such as data copy avoidance and to structure programs for good cache behavior.
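The resource-control property can be illustrated with a minimal C sketch in which the caller supplies every buffer and the routines neither allocate memory nor perform I/O. The ByteView type and the three functions are hypothetical names introduced for this illustration and are not the toolkit's actual interface. The clip operation shows one form of data copy avoidance: the sub-image is a view that shares the parent's pixels instead of copying them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical image view over caller-owned memory. */
typedef struct {
    unsigned char *data;   /* first pixel of this view */
    int width;             /* pixels per row in the view */
    int height;            /* rows in the view */
    int stride;            /* bytes between rows in the parent buffer */
} ByteView;

/* Wrap a caller-supplied buffer; nothing is allocated here. */
int byte_view_init(ByteView *v, unsigned char *buf, size_t buf_len,
                   int width, int height)
{
    if (v == NULL || buf == NULL || width <= 0 || height <= 0)
        return -1;
    if ((size_t)width * (size_t)height > buf_len)
        return -1;
    v->data = buf;
    v->width = width;
    v->height = height;
    v->stride = width;
    return 0;
}

/* Create a sub-image that shares memory with src (no pixels are copied). */
int byte_view_clip(const ByteView *src, int x, int y, int width, int height,
                   ByteView *dst)
{
    if (src == NULL || dst == NULL || x < 0 || y < 0 || width <= 0 || height <= 0)
        return -1;
    if (x > src->width - width || y > src->height - height)
        return -1;
    dst->data = src->data + (size_t)y * (size_t)src->stride + (size_t)x;
    dst->width = width;
    dst->height = height;
    dst->stride = src->stride;
    return 0;
}

/* Write one value to every pixel of the view, row by row. */
void byte_view_fill(ByteView *v, unsigned char value)
{
    for (int row = 0; row < v->height; row++) {
        unsigned char *p = v->data + (size_t)row * (size_t)v->stride;
        for (int col = 0; col < v->width; col++)
            p[col] = value;
    }
}
```

Because the caller decides where the buffer lives (static storage, the stack, or a pool it manages), memory use stays predictable, and filling a clipped view modifies the parent image in place.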

The separation of I/O in the present invention has three advantages. First, it makes the I/O method used transparent
