ON-DEMAND DECOMPRESSION AND DECRYPTION WITH RANDOM ACCESS TO INFORMATION
BACKGROUND
1. Field of the Invention
This invention relates to decompression and decryption. In particular, the invention relates to on-demand data decompression and decryption.
2. Description of Related Art
Thanks to advances in computer and communication technologies, on-line information retrieval has become popular. A device connected to a communication network such as the Internet can download a large amount of data from a server. Two fundamental requirements for network transmission for information retrieval are transmission efficiency and information security. Transmission efficiency refers to the speed of transmitting the data over the network. Information security refers to the confidentiality and integrity of the transmitted data.
Transmission efficiency can be achieved by compressing the data to reduce the amount of transmitted data. After receiving the compressed data, the receiving device can decompress the data to recover the original information. Information security can be achieved by using encryption/decryption techniques. Before transmission, the data, whether in compressed or uncompressed form, is encrypted with a secret key. When the receiving device receives the encrypted data, the receiving device decrypts the encrypted data to recover the original information. A transmission of information using both compression/decompression and encryption/decryption techniques can achieve both transmission efficiency and information security.
When the receiving device requests and downloads a large amount of compressed and/or encrypted data over the network, the time needed to decompress and/or decrypt the data may be long. This processing time may inconvenience the user of the receiving device. The problem is even more significant when the user does not need to decompress and/or decrypt the entire received data for use. One particular example is the electronic book.
An electronic book is a viewing device that allows a user to download over a network electronic versions of copyrighted materials or on-line information for later viewing. The content of such materials is referred to as a digital content. When a user of an electronic book downloads the digital content of a book, he or she usually downloads the entire material into his or her electronic book for later reading or retrieval. The user usually does not need to read the entire digital content at one time. Rather, the user prefers to have the flexibility of accessing the digital content randomly, at any place in the digital content. If the amount of downloaded data is large, the user may have to wait for some time for decompression and/or decryption. In addition, the amount of on-board storage may be required to be excessively large in order to be able to hold the entire content of the book.
When viewing such digital content, the user experience should ideally resemble that of reading a book, i.e., the user can select what "pages" to view and may "jump" from one page/section to another with ease. In order to achieve this experience, the content should be randomly accessible by the viewer software. The combination of compact storage (compression), a secure encoding (encryption), and random access to the data presents serious implementation and performance problems.
Almost all competent lossless compression algorithms (e.g., LZ77, LZ78, LZSS, V.42bis) employ some sort of dictionary technique to achieve a compact data representation; lossless means that compression followed by decompression yields exactly the original data. The dictionary is usually built by analysis of previous data, thus allowing for more compact storage of subsequent data. A byte (8 bits), or a sequence of bytes, is then encoded into a smaller number of bits than contained in the original data. Because these compression dictionaries are built on the fly (during compression and decompression), accessing byte "N" of a data stream involves starting at the first byte and reading forward "N" bytes (and building the compression dictionary). From a performance perspective, it is not viable (when working with a large digital document) to read (and decompress) from the start of the document to access data near the end of the document.
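The sequential-access limitation described above can be illustrated with a short sketch. It assumes Python's standard zlib module (whose DEFLATE format is LZ77-based) as a stand-in for the algorithms listed; the function name read_byte_at and the chunk size are hypothetical and chosen only for illustration.

```python
import zlib

def read_byte_at(compressed, n, chunk=4096):
    """Return (value of decoded byte n, count of decoded bytes produced to reach it).

    A single dictionary-based stream has no entry point other than its start,
    so decoding always begins at compressed byte 0 and runs forward.
    """
    decoder = zlib.decompressobj()
    produced = 0
    for i in range(0, len(compressed), chunk):
        out = decoder.decompress(compressed[i:i + chunk])
        if produced + len(out) > n:
            # Byte n lies inside the output just produced.
            return out[n - produced], produced + len(out)
        produced += len(out)
    raise IndexError("offset is beyond the end of the decoded stream")
```

In this sketch, reaching a byte near the end of the stream forces the decoder to produce every byte before it, which is the cost the invention seeks to avoid.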
Therefore there is a need in the technology to provide an efficient method to decompress and/or decrypt data in a random manner without requiring long processing time.
SUMMARY
The present invention is a method and apparatus for storing and decoding a digital content in a storage device. The digital content is encoded based on a predetermined block size to generate an encoded digital content. An offset table is created to contain encoded offsets pointing to the digital content at predetermined block offsets. The encoded digital content and the offset table are stored in the storage device.
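A minimal sketch of the encoding step follows, under stated assumptions: compression with Python's zlib module stands in for the encoding, each block is compressed independently, and the offset table records byte offsets only. The name encode_with_offset_table and the 16384-byte default are illustrative and are not prescribed by the invention.

```python
import zlib

BLOCK_SIZE = 16384  # assumed predetermined block size, in bytes

def encode_with_offset_table(content, block_size=BLOCK_SIZE):
    """Encode content block by block and record where each encoded block starts."""
    encoded = bytearray()
    offset_table = []
    for start in range(0, len(content), block_size):
        # The encoded offset of this block is the current length of the output.
        offset_table.append(len(encoded))
        # Each block is compressed on its own, so it can later be decoded alone.
        encoded += zlib.compress(content[start:start + block_size])
    return bytes(encoded), offset_table
```

Entry i of the table then points at the encoded form of the original bytes beginning at i times the block size; both the encoded content and the table would be stored together.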
A page of information in an encoded content is decoded. The encoded content is encoded from an original content. An original offset of an original block in the original content is obtained. The original block contains the page of information. An encoded offset which corresponds to the original offset is obtained. The encoded offset corresponds to an encoded block in the encoded content. The encoded block is decoded to generate a decoded block which corresponds to the original block.
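The decoding step can be sketched in the same spirit. This sketch assumes that each block was compressed independently with zlib, that the offset table holds byte offsets, and that the requested page lies within a single block; decode_page and its parameters are hypothetical names.

```python
import zlib

def decode_page(encoded, offset_table, block_size, page_start, page_len):
    """Decode only the block holding the page, then cut the page out of it."""
    index = page_start // block_size          # which original block holds the page
    original_offset = index * block_size      # original offset of that block
    encoded_offset = offset_table[index]      # corresponding encoded offset
    if index + 1 < len(offset_table):
        encoded_end = offset_table[index + 1]
    else:
        encoded_end = len(encoded)            # last block runs to the end
    block = zlib.decompress(encoded[encoded_offset:encoded_end])
    skip = page_start - original_offset       # decoded bytes ahead of the page
    return block[skip:skip + page_len]
```

Only one block is decoded regardless of where the page sits in the content, so the work done is bounded by the block size rather than by the size of the whole content.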
BRIEF DESCRIPTION OF THE DRAWINGS
The features and advantages of the present invention will become apparent from the following detailed description of the present invention in which:
Figure 1 is a diagram illustrating a system in which one embodiment of the invention can be practiced.
Figure 2 is a diagram illustrating an electronic book according to one embodiment of the invention.
Figure 3 is a diagram illustrating a logical mapping between the compressed and decompressed data according to one embodiment of the invention.
Figure 4 is a flowchart illustrating a process for decompressing according to one embodiment of the invention.
Figure 5A is a flowchart illustrating a first part of a process for decrypting according to one embodiment of the invention.
Figure 5B is a flowchart illustrating a second part of a process for decrypting according to one embodiment of the invention.
DESCRIPTION
The present invention is a method and apparatus for decoding a digital content on demand and with random access. The technique stores the encoded data in the receiving device together with an offset table that contains pointers to the starts of blocks of encoded data. When requested by the user, the encoded data are decoded using the offset table to locate the blocks that contain the requested data. The technique saves storage space in the receiving device and allows the user to randomly access the stored information.
In the following description, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that these specific details are not required in order to practice the present invention. In other instances, well-known electrical structures and circuits are shown in block diagram form in order not to obscure the present invention. The encoding process may include any encoding technique. Examples of the encoding process include compression and encryption. The decoding process may include any decoding technique corresponding to the encoding technique. Examples of the decoding technique include decompression and decryption.
Figure 1 is a diagram illustrating a system in which one embodiment of the invention can be practiced.
Referring to Figure 1, the system 100 comprises: (a) at least one portable electronic book 10 operative to request a digital content from a catalog of distinct digital contents, to receive and display the requested digital content in readable form; (b) an information services system 20 which includes an authentication server 32 for authenticating the identity of the requesting portable electronic book 10 and a copyright protection server 22 for rendering the requested digital content sent to the requesting portable electronic book 10 readable only by the requesting portable electronic book 10; (c) at least one primary virtual bookstore 40 in electrical communication with the information services system 20, the primary virtual bookstore being a computer-based storefront accessible by the portable electronic book and including the catalog of distinct digital contents; and (d) a repository 50, in electrical communication with the primary virtual bookstore 40, for storing the distinct digital contents listed in the catalog.
The system 100 preferably includes more than one portable electronic book 10, to be commercially viable. This is illustrated in Figure 1 by including the portable electronic books 12 and 14. The system also preferably includes more than one primary virtual bookstore 40, each serving a different set of customers, each customer owning a portable electronic book.
In one embodiment of the invention, the system 100 further comprises a secondary virtual bookstore 60 in electrical communication with the information services system 20. In this case, the information services system 20 also includes a directory of virtual bookstores 26 in order to provide the portable electronic book 10 with access to the secondary virtual bookstore 60 and its catalog of digital contents.
The information services system 20 can optionally include a notice board server 28 for sending messages from one of the virtual bookstores, primary or secondary, to a portable electronic book in the system.
The information services system 20 also includes a registration server 24 for keeping track of the portable electronic books that are considered active accounts in the system and for ensuring that each portable electronic book is associated with a primary virtual bookstore in the system. In the case where the optional notice board server 28 is included in the information services system 20, the registration server 24 also allows each portable electronic book user to define his/her own notice board and document delivery address.
The information services system 20 preferably comprises a centralized bookshelf 30 associated with each portable electronic book 10 in the system. Each centralized bookshelf 30 contains all digital contents requested and owned by the associated portable electronic book 10. Each portable electronic book 10 user can permanently delete any of the owned digital contents from the associated centralized bookshelf 30. Since the centralized bookshelf 30 contains all the digital contents owned by the associated portable electronic book 10, these digital contents may have originated from different virtual bookstores. The centralized bookshelf 30 is a storage extension for the portable electronic book 10. Such storage extension is needed since the portable electronic book 10 has limited non-volatile memory capacity.
The user of the portable electronic book 10 can add marks, such as bookmarks, inking, highlighting and underlining, and annotations on a digital content displayed on the screen of the portable electronic book, and then store this marked digital content in the non-volatile memory of the electronic book 10. The user can also upload this marked digital content to the information services system 20 to store it in the centralized bookshelf 30 associated with the portable electronic book 10, for later retrieval. It is noted that there is no need to upload any unmarked digital content, since it was already stored in the centralized bookshelf 30 at the time it was first requested by the portable electronic book 10.
The information services system 20 further includes an Internet Services Provider (ISP) 34 for providing Internet network access to each portable electronic book in the system.
Figure 1 further illustrates that the electronic books 10, 12, and 14 are used in the field. The electronic book 10 has on-board storage devices to store encoded digital content 80 and an offset table 82. As is known by one skilled in the art, the technique in the present invention can be used in any other receiving device.
Figure 2 is a diagram illustrating an electronic book 10 according to one embodiment of the invention. The electronic book 10 includes a processor 210, a program code 220, the encoded digital content 80, an offset table 82, and a decoded digital content 250.
The processor 210 is any processor that can execute programs to perform specified functions. The processor 210 may be a general purpose microprocessor, a microcontroller, or a special-purpose processor such as a digital signal processor. The program code 220 stores the program, code, functions, subroutines, subprograms, or code segments executed by the processor 210 to perform all or part of the techniques in the present invention. The program code 220 may be implemented by read only memory (ROM), programmable ROM, flash memory, or any other non-volatile memory. The processor 210 performs decompression and/or decryption using several well known algorithms.
The encoded digital content 80 includes the data received or downloaded from the communication network in the compressed or encrypted form. The encoded digital content 80 may also represent data encoded in any other method. The encoded digital content 80 is implemented by random access memory (RAM) or any other convenient form of storage accessible to the processor 210.
The offset table 82 includes a table of offsets or pointers that maps between offsets in the uncompressed logical address space and the encoded address space. The offset table 82 is downloaded or transmitted to the electronic book together with the encoded digital content 80. The offset table 82 is stored in a RAM accessible to the processor 210.
The decoded digital content 250 includes the data decoded from the encoded digital content 80 using the offsets or pointers from the offset table 82. The decoded digital content 250 may represent the decompressed and/or decrypted data corresponding to the compressed and/or encrypted data in the encoded digital content 80. The decoded digital content 250 may be implemented in RAM or any other convenient form of storage accessible to the processor 210. In one embodiment, the decoded digital content 250 does not correspond to the entire digital content, but rather only a portion of the digital content that the user wants to retrieve. For example, the user may wish to view a particular page of the book whose digital content is encoded and stored in the electronic book.
Figure 3 is a diagram illustrating a logical mapping 300 between the compressed and decompressed data according to one embodiment of the invention. The logical mapping 300 shows an address mapping of an encoded logical address space 310 and a decoded logical address space 320.
In the example illustrated in Figure 3, the encoded logical address space 310 includes five encoded blocks having the byte/bit offsets at addresses 0/0, 10352/3, 21768/5, 32113/1, and 39007/2. Each of these encoded blocks corresponds to a decoded block of known size. In this example, the size of each decoded block is 16KB (or 16384 bytes). The decoded logical address space 320 includes five decoded blocks having the offsets at addresses 0, 16384, 32768, 49152, and 65536, corresponding to the encoded byte/bit offsets 0/0, 10352/3, 21768/5, 32113/1, and 39007/2, respectively. For example, the byte offset of the block at address location 32768 in the unencoded address space corresponds to the byte offset of 21768 and the bit offset of 5 in the encoded address space.
Suppose a user wants to access a page of information at address location 50000 in the unencoded address space 320. The closest boundary of the block in the unencoded address space 320 that contains the address location 50000 is 49152. This is the fourth block in the address space. Therefore, the corresponding byte/bit offset in the encoded address space is the fourth item in the offset table. This corresponds to the byte/bit offset of 32113/1. The block containing the encoded data having the byte/bit offset of 32113/1 is then decoded to provide the decoded block at address location 49152 in the unencoded address space. From this decoded block, the unnecessary decoded portion will be discarded. In this example, this unnecessary portion includes the 848 bytes between address location 49152 and address location 50000, which lie above the desired page. Similarly, the unnecessary portion may also include the portion in the unencoded block below the desired page.
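The lookup arithmetic of this example can be written out as a short sketch. The offset values and the 16384-byte block size are those of Figure 3 as described above; the names OFFSET_TABLE and locate are hypothetical, and representing each entry as a (byte, bit) pair is an assumption about how the table might be held in memory.

```python
BLOCK_SIZE = 16384  # decoded block size from the Figure 3 example

# (byte offset, bit offset) of each encoded block, as listed for Figure 3.
OFFSET_TABLE = [(0, 0), (10352, 3), (21768, 5), (32113, 1), (39007, 2)]

def locate(address):
    """Map an unencoded address to its block and the bytes to discard."""
    index = address // BLOCK_SIZE            # zero-based block index
    block_start = index * BLOCK_SIZE         # boundary at or below the address
    byte_offset, bit_offset = OFFSET_TABLE[index]
    discard = address - block_start          # decoded bytes above the page
    return index, block_start, byte_offset, bit_offset, discard

print(locate(50000))  # (3, 49152, 32113, 1, 848)
```

The zero-based index 3 is the fourth item of the table, matching the byte/bit offset 32113/1 and the 848 discarded bytes given in the text.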
Figure 4 is a flowchart illustrating a process 400 for decompressing according to one embodiment of the invention.
Upon START, the process 400 determines if the offset is different from the current logical position (Block 410). If not, the process 400 goes to block 455. If the offset is different from the current logical position, the process 400 finds the highest reset point before the requested offset (Block 415). Then the process 400 sets the position to the absolute byte/bit position as obtained from the offset table (Block 420). The process 400 then resets the decompression dictionary (Block 425). The process 400 reads the requested data for decompression from the current position (Block 430). After reading the data, the process 400 performs decompression (Block 435). Unnecessary data are then discarded (Block 440) and the position is updated (Block 445). Then the process 400 determines if the current position is less than the offset (Block 450). If yes, the process 400 goes back to block 430.
If the current position is not less than the offset, the process 400 goes to block 455 to read the data needed for the decompression at the current position. Then the process 400 decompresses the data (Block 460), resets the compression dictionary when crossing the block offsets (Block 465), and updates the current position (Block 470). Then the process 400 determines if the remaining size is greater than or equal to zero (Block 475). If not, the process 400 is terminated. If the remaining size is greater than or equal to zero, the process 400 returns to block 455.
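A simplified sketch of a read that spans several blocks is given below. It is not a block-for-block transcription of the process 400: it assumes every block was compressed independently with Python's zlib module, so that starting a new block plays the role of resetting the dictionary, and it assumes byte-only offsets. The name read_range and its parameters are hypothetical.

```python
import zlib

def read_range(encoded, offset_table, block_size, offset, size):
    """Return up to `size` decoded bytes starting at logical position `offset`."""
    result = bytearray()
    index = offset // block_size           # highest reset point at or before offset
    skip = offset - index * block_size     # decoded bytes to discard in first block
    while size > 0 and index < len(offset_table):
        start = offset_table[index]        # absolute position from the offset table
        if index + 1 < len(offset_table):
            end = offset_table[index + 1]
        else:
            end = len(encoded)
        # A fresh decompression per block stands in for the dictionary reset.
        block = zlib.decompress(encoded[start:end])
        piece = block[skip:skip + size]    # drop unnecessary data, keep the rest
        result += piece
        size -= len(piece)                 # update the remaining size
        skip = 0                           # later blocks are used from their start
        index += 1                         # cross into the next block
    return bytes(result)
```

If the request runs past the end of the content, the sketch simply returns the bytes that exist; how an actual embodiment handles that case is not specified here.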
Figure 5A is a flowchart illustrating a first part of a process 500 for decrypting according to one embodiment of the invention.
Upon START, the process 500 determines if the stream of data is decrypted (Block 510). If not, the process 500 is terminated. If the data are decrypted, the process 500 determines if the offset is different from an even multiple of the encryption block size (Block 515). If not, the process 500 goes to block 540. Otherwise, the process 500 reads an encryption block at the previous encryption block size multiple (Block 520). The process 500 then decrypts the block (Block 525). Then the process 500 copies the requested decrypted data to the result buffer. The process 500 next updates the remaining size and the position (Block 535) and sets the position to the offset (Block 540). The process 500 then proceeds to connector A.
Figure 5B is a flowchart illustrating a second part of a process for decrypting according to one embodiment of the invention.
From the connector A, the process 500 determines if the remaining size is greater than or equal to the encryption block size (Block 545). If yes, the process 500 reads N blocks of data from the current position (Block 550). The process 500 then decrypts the read N blocks (Block 555) and updates the remaining size (Block 560). The process 500 then goes back to block 545.
If the remaining size is not greater than or equal to the encryption block size, the process 500 determines if the remaining size is greater than zero (Block 565). If not, the process 500 is terminated. If the remaining size is greater than zero, the process 500 reads one block of data at the current position (Block 570). The process 500 then decrypts the read block (Block 575), copies the requested decrypted data to the result buffer (Block 580), and updates the current position (Block 585). The process 500 is then terminated.
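The block alignment handled by the process 500 can be sketched as follows. The sketch makes several assumptions: an 8-byte encryption block size, a toy XOR transform standing in for a real block cipher (it offers no security and is used only so the example is self-contained), and a single loop that covers the leading partial block, the full blocks, and the trailing partial block rather than three separate phases. All names are hypothetical.

```python
CIPHER_BLOCK = 8  # assumed encryption block size, in bytes

def toy_crypt_block(block, block_index, key):
    """Placeholder cipher: XOR with the key and block index. NOT secure.

    XOR is its own inverse, so the same function encrypts and decrypts.
    """
    return bytes(b ^ key[i % len(key)] ^ (block_index & 0xFF)
                 for i, b in enumerate(block))

def read_decrypted(encrypted, key, offset, size):
    """Return `size` decrypted bytes starting at plaintext position `offset`."""
    result = bytearray()
    first = offset // CIPHER_BLOCK               # block at the previous multiple
    last = (offset + size - 1) // CIPHER_BLOCK   # block holding the final byte
    for index in range(first, last + 1):
        start = index * CIPHER_BLOCK
        # Whole blocks are always read and decrypted, even if partly unwanted.
        plain = toy_crypt_block(encrypted[start:start + CIPHER_BLOCK], index, key)
        lo = max(offset, start) - start
        hi = min(offset + size, start + CIPHER_BLOCK) - start
        result += plain[lo:hi]                   # copy only the requested bytes
    return bytes(result)
```

A request that starts or ends in the middle of an encryption block still decrypts that whole block and copies only the requested portion to the result, which mirrors the partial-block handling described for Figures 5A and 5B.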
The present invention provides an efficient technique to decode digital content in a receiving device. The technique uses an offset table to point to blocks of unencoded information. The technique reduces on-board storage space and allows random access to the stored information.
While this invention has been described with reference to illustrative embodiments, this description is not intended to be construed in a limiting sense. Various modifications of the illustrative embodiments, as well as other embodiments of the invention, which are apparent to persons skilled in the art to which the invention pertains are deemed to lie within the spirit and scope of the invention.