EP1248996A1 - Internet-based archive service for electronic documents - Google Patents

Internet-based archive service for electronic documents

Info

Publication number
EP1248996A1
EP1248996A1 EP01906618A EP01906618A EP1248996A1 EP 1248996 A1 EP1248996 A1 EP 1248996A1 EP 01906618 A EP01906618 A EP 01906618A EP 01906618 A EP01906618 A EP 01906618A EP 1248996 A1 EP1248996 A1 EP 1248996A1
Authority
EP
European Patent Office
Prior art keywords
computer
electronic document
program code
code means
readable program
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP01906618A
Other languages
German (de)
French (fr)
Other versions
EP1248996A4 (en)
Inventor
William E. Bankert
David Lee Smith
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zantazcom
Original Assignee
Zantazcom
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zantazcom
Publication of EP1248996A1
Publication of EP1248996A4

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/93 Document management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Definitions

  • the present invention relates generally to methods for storage and retrieval of data, and more specifically to an Internet-based service for archiving electronic documents.
  • Electronic commerce is fast becoming the method of choice for doing business. This trend demands a way to document electronic business transactions.
  • IRS Internal Revenue Service
  • SEC Securities and Exchange Commission
  • the 1997 amendments to SEC Rule 17a-4 allow broker-dealers to use electronic storage media systems to store records they are required to retain.
  • SEC has imposed requirements to facilitate auditing of these electronic records.
  • the principal record keeping requirements under the Securities Exchange Act of 1934, Rules 17a-3 and 17a-4, require broker-dealers to preserve various records for a total of not less than six years (or not less than three years for some of the records), the first two years in an "easily accessible place."
  • Rule 17a-4 allows businesses to use digital storage media services provided by independent third-party providers to help them comply with the rules, as long as the storage systems and the use of the systems by the broker-dealers meet certain requirements.
  • the SEC regulations specify that the data must be stored in a non-tamperable form, such as a write-once, read-many (WORM) disk.
  • WORM write-once, read-many
  • the records must be made available promptly through a third party in case the regulated entity is unable or unwilling to produce the records.
  • the present invention is a method and computer program product for storing, archiving, and retrieving electronic documents over the Internet.
  • the method includes the steps of receiving an electronic document; storing the electronic document; associating the electronic document with a routing rule that specifies a meta variable and a bias variable, each defining a separate axis in a multi-dimensional data space; parsing the electronic document to obtain metadata corresponding to the meta variable and bias data corresponding to the bias variable; and associating the electronic document with a cell in the data space, the cell selected according to the metadata and the bias data.
  • the bias data includes temporal data indicating the time of receipt of the electronic document.
  • the method includes storing the electronic document on a storage device associated with the metadata. For example, if the metadata relates to a geographic area, then the electronic document is stored on the storage device located within that geographic area. In one aspect, the routing rule specifies a particular repository, and the electronic document is associated with a cell associated with that repository.
  • the routing rule is selected according to the identity of an entity associated with the electronic document.
  • the routing rule may be selected according to the sender or recipient of the electronic document.
  • the method includes the steps of receiving a query for an electronic document stored in a multi-dimensional data space having a metadata axis and a bias data axis, the query specifying a repository, a bias range and search criteria; selecting cells in the data space according to the repository and bias range; and searching for electronic documents associated with the selected cells using the search criteria.
  • the selecting step includes selecting one or more bias data axis bins based on the bias range; and selecting one or more metadata axis bins based on the repository.
  • the bias range specifies a time of arrival for the electronic document.
  • the method includes identifying one or more electronic documents as search results; and informing a sender of the query of the search results. In one aspect, the method includes receiving a selection identifying at least one of the search results as a target; and sending the electronic documents associated with the target to the sender of the selection.
  • FIG. 1 depicts a multi-dimensional storage space according to one implementation of the present invention.
  • FIG. 2 is a block diagram depicting a customer and an archive according to an implementation of the present invention.
  • FIG. 3 is a flow diagram depicting the electronic document storage operation according to an implementation of the present invention.
  • FIG. 4 is a flow diagram depicting an electronic document retrieval operation according to one implementation of the present invention.
  • FIG. 5 is a context diagram depicting two archives.
  • FIG. 6 is a block diagram depicting an implementation of an archive according to the present invention.
  • the present invention is a method and computer program product for storing, archiving, and retrieving electronic documents over the Internet.
  • implementations of the present invention employ a subdivided database, referred to as a multi-dimensional storage space.
  • a multi-dimensional storage space 100 is depicted in FIG. 1.
  • Storage space 100 is divided according to implementations of the present invention by two axes.
  • One axis is referred to as "metadata" axis 102, and the other axis is referred to as "bias data" axis 104.
  • Axes 102 and 104 define two dimensions of multi-dimensional storage space 100. Storage spaces including further dimensions and axes are within the scope of the present invention.
  • Each axis divides its associated dimension of the storage space into a plurality of "bins."
  • metadata axis 102 divides space 100 into a plurality of metadata bins 106.
  • metadata describes particular repositories.
  • a repository is a "slice" of the data space that can be assigned to a particular customer.
  • each metadata bin refers to a particular repository.
  • bias data axis 104 divides the bias data dimension of storage space 100 into a plurality of bias data bins 108. In the example of FIG. 1, each bias data bin corresponds to a month of the year.
  • storage space 100 refers to a physical storage space, and electronic documents are actually stored within the physical cells within that storage space.
  • storage space 100 refers to a logical storage space, or "storage map." While each electronic document is stored on a physical device, a pointer to that storage location is associated with the storage map cell or cells that were associated with that electronic document by the operation of the routing rule. In this manner, each electronic document is stored only once, but may be associated with two or more cells.
  • An advantage of this subdivided approach lies in the speed of retrieval of documents from the archive.
  • the user specifies one or more cells in the storage space to be searched.
  • Only the electronic documents associated with the selected cells need to be searched. This method increases the speed and efficiency of the retrieval.
  • each electronic document is assigned, on receipt by the archive, to one or more cells within the storage space.
  • This association between electronic documents and cells is implemented according to one or more routing rules.
  • a routing rule creates the association based on user-specified criteria. In one implementation, the user is permitted to edit the routing rule directly.
  • An Internet-based interface is provided to permit the definition of routing rules by users.
  • Each routing rule specifies at least one meta variable and at least one bias variable. For each variable, the routing rule specifies at least one value, and the action to be taken when an electronic document is received containing that value for that variable. For example, a routing rule for a particular customer may specify that each electronic document it archives is to be associated with both the month in which it is sent, and the month in which it is archived.
  • the routing rule for the customer may specify that each electronic document archived is to be associated with the repository corresponding to that customer. For example, a customer may specify repository A. Then when the document is archived, it is associated with the cells in the repository A metadata bin corresponding to January and March.
  • When a customer wishes to retrieve an archived document, the customer can specify one or more cells within the data space. For example, a customer may wish to retrieve the document archived in the above example. Therefore, the customer specifies the cell in repository A associated with the month of January. The customer then specifies certain search criteria. The archive then searches the specified cell using the user-specified search criteria to obtain search results. The search results are displayed to the user as an HTML document. The user can select a particular search result as a target. The archive then sends a copy of the target electronic document to the user. Routing rules are also useful in retrieving documents. For example, a routing rule can be written after certain electronic documents are stored to create a new repository containing a particular subset of those documents.
  • an SEC audit may require all of the emails that were sent to Joe Smith by the customer of repository A during January.
  • a new routing rule can be generated very easily that will quickly populate a repository (for example, repository B) with only these emails.
  • the routing rule specifies that all of the electronic documents associated with repository A for the month of January (bias data bin January) be searched for emails sent to Joe Smith.
  • the routing rule searches the cell associated with the January bias data bin and the repository A metadata bin for emails sent to Joe Smith by the customer.
  • the resulting search results are then associated with repository B.
  • a key benefit of this method of retrieval is that an audited company need only produce those documents that match the scope of the audit, rather than producing all of the documents that might match the scope of the audit (for example, producing all emails sent during the month of January).
  • Implementations of the present invention also take advantage of the fact that not all data is equally likely to be accessed. In particular, the longer an electronic document has been stored, the less likely it is to be retrieved. Therefore, in one implementation, data is moved to successively less expensive storage media as it ages. Such media reduce the cost of storage but increase the cost of retrieval. However, because the odds of retrieval are low, net savings are achieved.
  • FIG. 2 is a block diagram depicting a customer 104 and an archive 102 according to an implementation of the present invention.
  • the customer exchanges electronic documents with third parties using customer SMTP server 226.
  • Electronic documents to be archived are sent to archive 102 by customer SMTP server 226 and received by archive SMTP server 206.
  • Customer 104 uses a customer web browser 222 to retrieve archived documents. Interaction with browser 222 is handled by HTTP server 212 at the archive.
  • the electronic documents are physically stored by archiver 208. The storage and retrieval of these electronic documents is managed by object database 210.
  • FIG. 3 is a flow diagram depicting the electronic document storage operation according to an implementation of the present invention.
  • the process begins with a message setup exchange 302 between customer SMTP server 226 and archive SMTP server 206.
  • message setup exchanges are well-known in the relevant arts.
  • the electronic document to be archived is transferred from the customer SMTP server to the archive SMTP server as an email 304.
  • the archive SMTP server sends the electronic document to archiver 208 for storage as a bitfile at 306.
  • the bitfile is immediately written to a WORM disk to ensure non-tamperability.
  • the archive server also parses the electronic document to extract metadata and bias data values for the electronic document.
  • the metadata can be extracted from the header of the email. If the document is sent by electronic data interchange (EDI), the metadata can be extracted from specific fields within the EDI form. These values are sent to the object database 210 at 308.
  • the object database applies routing rules to the metadata and bias data to place pointers to the electronic document in particular cells within the storage space.
  • the metadata can be used with the account database to identify a repository associated with the customer associated with the electronic document, as described below.
  • FIG. 4 is a flow diagram depicting an electronic document retrieval operation according to one implementation of the present invention.
  • customer web browser 222 is pointed to the home page of HTTP server 212.
  • a login screen is sent to the customer browser at 402.
  • the customer browser provides a username and password to access the archive at 404.
  • Account database 214 conducts a user authentication process 406 to verify the identity of the customer.
  • the archive HTTP server contacts object database 210 to obtain a list of the repositories to which the customer has access at 408. The list is returned to the archive server at 410.
  • the archive HTTP server composes a search screen containing the list of authorized repositories, and sends the search screen to the customer browser at 412.
  • the customer uses the screen to compose a search request by selecting particular repositories and entering search criteria and optionally bias data.
  • the search request is sent to the archive at 414.
  • the archive HTTP server formulates a query based on the search request, and transmits this query to the object database at 416.
  • the object database confirms that the customer is authorized to access the repositories specified in the query at 418.
  • the object database executes the query to generate a set of query results, which are sent to the archive HTTP server at 420.
  • the archive HTTP server composes a search results screen, which is transmitted to the customer at 422.
  • the search results screen includes one or more results that match the customer's search request.
  • the customer selects one or more of these search results as "targets." These targets are sent to the archive HTTP server as a message request at 424.
  • the archive HTTP server composes a message retrieval request based on the message request, and forwards the message retrieval request to the object database at 426.
  • the object database again verifies repository access for the customer at 428. If access is authorized, the object database sends a bitfile ID for each target electronic document to archiver 208 at 430.
  • archiver 208 will accept a bitfile ID request only from object database 210. Because object database 210 verifies customer authorization before sending the bitfile ID to the archiver, security of the electronic documents is preserved. In response, the archiver returns a bitfile handle to the object database at 432.
  • the object database passes the bitfile handle to the archive HTTP server, which formulates a bitfile request.
  • the bitfile request is sent to the archiver at 436.
  • the archiver sends the bitfile for the target electronic document to the archive HTTP server at 438.
  • the archive HTTP server uses the bitfile to generate a formatted message, which is sent to the customer web browser at 440.
  • the formatted message presents the requested target electronic document to the user on the customer's web browser.
  • the electronic document includes an HTML representation of the document on the screen.
  • This representation can include hidden routing data, as specified by RFC 821.
  • the customer can obtain the original email used to archive the electronic document by emailing it to himself, downloading it to an application such as Notepad, or forwarding it to someone else. This is especially useful for delivering stop trade confirms to customers of on-line brokerages; in this application the timestamp of a confirm is crucial.
  • multiple interconnected archives are provided.
  • FIG. 5 is a context diagram depicting two archives 502A and 502B. The archives are connected by a private asynchronous transfer mode (ATM) backbone 512. Data is replicated between the two archives so that if one archive fails, the other can serve the customers.
  • ATM asynchronous transfer mode
  • For enhanced security, access to each archive is protected by firewalls 506.
  • customer 504A connects to the archives via Internet 502. This connection is secured by firewalls 506A and 506B.
  • Customer 504B accesses archive 502B using a virtual private network (VPN) 510. This connection is protected by firewall 506B.
  • Customer 504C accesses archive 502A directly by a private connection. This connection is protected by firewall 506C.
  • Customer 504D accesses the archives using a private frame relay network 508. This connection is protected by firewalls 506D and 506E.
  • VPN virtual private network
  • Implementations of the present invention provide other security features. Each customer is allowed to select a level of security based upon its own needs. For example, these security mechanisms can include a logon/password using secure sockets layer (SSL), validating certificates, using soft or hard tokens, encryption, and the like.
  • SSL secure sockets layer
  • the firewalls prevent hackers from exploiting holes in the operating system.
  • the authentication mechanisms described above prevent attacks at the application level, such as masquerade attacks.
  • separate front end servers are used.
  • the system back-end is insulated from the front-end by using separate servers.
  • data stored on WORM disks can be stored in an encrypted manner.
  • Non-tamperability of archive data can be demonstrated by various methods.
  • the electronic documents can be stored on WORM disks, as required by the SEC. However, these disks could be maliciously substituted.
  • electronic documents are signed electronically as they enter the archive. Thus, non-tamperability can be shown by demonstrating that the digital signature is still intact.
  • FIG. 6 is a block diagram depicting an implementation of an archive 600 according to the present invention.
  • Archive 600 includes a portal module 602, a migratory module 604, an archiver module 608, a billing server 610, and customer modules 606A and 606B.
  • the separation of portal module 602 and migratory module 604 permits greater scalability and also provides an additional ring for a "defense-in-depth" architecture for improved security.
  • portal modules are permitted to communicate only with migratory modules. This provides an additional ring in the security architecture.
  • Portal module 602 includes an SMTP front-end (FE) 612, an SMTP back-end (BE) 614, and an HTTP server 618.
  • the SMTP front-end handles incoming email, and the SMTP back-end handles outgoing email.
  • the HTTP server handles HTTP communications, as described above.
  • the present invention provides two methods for obtaining electronic documents via email for archiving. According to the first method described above, the customer merely sends a copy of each email to the archive. According to a second method, however, the customer uses the archive 600 as a store-and-forward site for all of the customer's email to be archived. Email that is inbound to the customer is rerouted to the archive SMTP front-end using the MX record according to well-known techniques.
  • the migratory module includes a buffer archiver 620, a queue manager
  • queue manager permits scalability of the SMTP front-ends and back-ends. For example, queue manager permits multiple portal modules to be used, or alternatively, permits multiple SMTP front-ends and back-ends to be used within each portal module.
  • Domain database 626 contains a listing of customers and their associated domains.
  • Object router 650 determines, using the domain database, which customer owns each incoming message. Based on this information, the object router causes the document to be transferred from buffer archiver 620 to the appropriate cache archiver 630 and the appropriate customer module 606. The document is eventually copied onto optical storage media and is also replicated to other archives within the system.
  • the log server stores events such as security events and errors.
  • the domain database uses the login information to direct the user to the appropriate account database.
  • the web server authenticates with the account database and so determines which repositories the user is authorized to access.
  • the account database performs user authentication and maintains an access control list for each repository.
  • the object database receives document descriptors, such as JavaBeans, XML DTDs (Document Type Definitions), and the like, which create and populate repositories.
  • the object database maps user queries to SQL queries to search the archiver.
  • customer module 606A includes a cache archiver 630A, an object database 632A, an account database 634A, a log server 636A, and a billing logger 638A.
  • customer module 606B contains a cache archiver 630B, an object database 632B, an account database 634B, a log server 636B, and a billing logger 638B.
  • the account database 634A performs functions similar to those of account database 624.
  • Log server 636A performs functions similar to those performed by log server 628.
  • Billing logger 638A collects billing events, which are fed to billing server 610.
  • Billing server 610 includes a log server 640 for logging events, and a master billing unit 642 for preparing customer bills.
  • Archiver 608 performs functions similar to those of archiver 208, as described above.
  • the above implementation is described in terms of archiving email messages.
  • the present invention is easily extended to implementations that accommodate other types of electronic documents, as would be apparent to one skilled in the relevant arts.
  • the present invention contemplates implementations for archiving electronic documents transmitted according to protocols including HTML, EDI, FTP, XML, and the like.
  • the invention can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them.
  • Apparatus of the invention can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and method steps of the invention can be performed by a programmable processor executing a program of instructions to perform functions of the invention by operating on input data and generating output.
  • the invention can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device.
  • Each computer program can be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language if desired; and in any case, the language can be a compiled or interpreted language.
  • Suitable processors include, by way of example, both general and special purpose microprocessors.
  • a processor will receive instructions and data from a read-only memory and/or a random access memory.
  • a computer will include one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks.
  • Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM disks. Any of the foregoing can be supplemented by, or incorporated in,
  • ASICs application-specific integrated circuits
  • the invention can be implemented on a computer system having a display device such as a monitor or LCD screen for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer system.
  • the computer system can be programmed to provide a graphical user interface through which computer programs interact with users.
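
The list above notes that documents are signed electronically as they enter the archive, so that an intact signature demonstrates non-tamperability even if a WORM disk were substituted. The source does not name a signing scheme. The following is only a minimal sketch that swaps in a keyed hash (HMAC-SHA256 from the Python standard library) as a stand-in for a true public-key digital signature; the function names `sign_document` and `is_intact` and the key handling are hypothetical, not taken from the patent.

```python
import hashlib
import hmac


def sign_document(key: bytes, document: bytes) -> str:
    """Compute a tag over the document bytes at ingest time.

    Assumption: HMAC-SHA256 stands in for the unspecified signature scheme.
    """
    return hmac.new(key, document, hashlib.sha256).hexdigest()


def is_intact(key: bytes, document: bytes, signature: str) -> bool:
    """Return True if the stored document still matches its ingest-time tag."""
    expected = sign_document(key, document)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)
```

A real deployment would more likely use an asymmetric signature so that an auditor can verify documents without holding the signing key; the keyed hash is used here only because it needs no third-party library.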

Abstract

A method and computer program product for storing, archiving, and retrieving electronic documents over the Internet. According to one embodiment, the method includes receiving an electronic document, storing the electronic document, associating the electronic document with a routing rule that specifies a meta variable defining a metadata axis (102) and a bias variable defining a bias axis (104) in a multi-dimensional data space, parsing the electronic document to obtain metadata corresponding to the meta variable and bias data corresponding to the bias variable, and associating the electronic document with a cell in the data space, the cell selected according to the metadata and the bias data.

Description

INTERNET-BASED ARCHIVE SERVICE FOR ELECTRONIC DOCUMENTS
BACKGROUND
The present invention relates generally to methods for storage and retrieval of data, and more specifically to an Internet-based service for archiving electronic documents. Electronic commerce is fast becoming the method of choice for doing business. This trend demands a way to document electronic business transactions.
Conventionally, commercial transactions are documented on paper records. To facilitate permanent storage and ready retrieval, businesses spend billions each year on filing, storing, and retrieving these paper documents.
One advantage of electronic commerce is that it greatly reduces reliance on such paper documents, and the accompanying costs of handling them. Instead, electronic documents are used to execute and record electronic transactions. However, the need for accurate, permanent, and available records of these electronic transactions remains.
One source of this need is simply the desire to keep good records as a fundamental aspect of good business practice. Another source of this need, however, arises from legal requirements imposed upon businesses by regulatory agencies such as the Internal Revenue Service (IRS) and the Securities and Exchange Commission (SEC).
For example, the 1997 amendments to SEC Rule 17a-4 allow broker-dealers to use electronic storage media systems to store records they are required to retain. However, the SEC has imposed requirements to facilitate auditing of these electronic records. The principal record keeping requirements under the Securities Exchange Act of 1934, Rules 17a-3 and 17a-4, require broker-dealers to preserve various records for a total of not less than six years (or not less than three years for some of the records), the first two years in an "easily accessible place." Rule 17a-4 allows businesses to use digital storage media services provided by independent third-party providers to help them comply with the rules, as long as the storage systems and the use of the systems by the broker-dealers meet certain requirements.
For example, the SEC regulations specify that the data must be stored in a non-tamperable form, such as a write-once, read-many (WORM) disk. In addition, the records must be made available promptly through a third party in case the regulated entity is unable or unwilling to produce the records.
In addition to these long-term storage requirements, huge volumes of data must be stored per day. It is estimated that three trillion emails were sent in 1998 alone. In addition, it is estimated that the average user of an archive service sends five million emails a day, each containing five to 50 kilobytes of data. Existing database structures simply cannot handle this volume of data.
SUMMARY
The present invention is a method and computer program product for storing, archiving, and retrieving electronic documents over the Internet. According to one implementation, the method includes the steps of receiving an electronic document; storing the electronic document; associating the electronic document with a routing rule that specifies a meta variable and a bias variable, each defining a separate axis in a multi-dimensional data space; parsing the electronic document to obtain metadata corresponding to the meta variable and bias data corresponding to the bias variable; and associating the electronic document with a cell in the data space, the cell selected according to the metadata and the bias data.
In one aspect, the bias data includes temporal data indicating the time of receipt of the electronic document.
In one aspect, the method includes storing the electronic document on a storage device associated with the metadata. For example, if the metadata relates to a geographic area, then the electronic document is stored on the storage device located within that geographic area. In one aspect, the routing rule specifies a particular repository, and the electronic document is associated with a cell associated with that repository.
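The storage steps summarized above (receive, store once, select a routing rule, parse out metadata and bias data, associate with cells) can be sketched roughly as follows. This is an illustrative sketch only: the class names `RoutingRule` and `StorageMap`, the function `archive_email`, the choice of sender domain as the rule-selection key, and the use of calendar months as bias bins are all assumptions made for the example, not details fixed by the source.

```python
from dataclasses import dataclass, field
from datetime import datetime
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


@dataclass
class RoutingRule:
    """Hypothetical rule: maps a sender domain to a repository (metadata bin)."""
    sender_domain: str
    repository: str


@dataclass
class StorageMap:
    """Logical storage space: (repository, month) cell -> document pointers."""
    cells: dict = field(default_factory=dict)

    def associate(self, repository, month, pointer):
        self.cells.setdefault((repository, month), []).append(pointer)


def month_bin(moment: datetime) -> str:
    """Name the bias-data bin (one per calendar month) for a timestamp."""
    return moment.strftime("%Y-%m")


def archive_email(raw, received_at, rules, storage_map, bitfiles):
    """Store one email and associate it with cells; returns the cells used."""
    message = message_from_string(raw)
    # The document is stored exactly once; cells hold only a pointer to it.
    bitfiles.append(raw)
    pointer = len(bitfiles) - 1
    # Metadata: the sender's domain selects the routing rule (an assumption).
    _, address = parseaddr(message.get("From", ""))
    domain = address.rpartition("@")[2].lower()
    rule = next((r for r in rules if r.sender_domain == domain), None)
    if rule is None:
        raise LookupError(f"no routing rule for sender domain {domain!r}")
    # Bias data: month of receipt, plus month sent when a Date header exists.
    months = {month_bin(received_at)}
    if message["Date"]:
        months.add(month_bin(parsedate_to_datetime(message["Date"])))
    cells = sorted((rule.repository, month) for month in months)
    for repository, month in cells:
        storage_map.associate(repository, month, pointer)
    return cells
```

Under these assumptions, an email sent in January and received by the archive in March ends up referenced from two cells of the same repository while being stored only once.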
In one aspect, the routing rule is selected according to the identity of an entity associated with the electronic document. For example, the routing rule may be selected according to the sender or recipient of the electronic document.
According to another implementation, the method includes the steps of receiving a query for an electronic document stored in a multi-dimensional data space having a metadata axis and a bias data axis, the query specifying a repository, a bias range and search criteria; selecting cells in the data space according to the repository and bias range; and searching for electronic documents associated with the selected cells using the search criteria.
In one aspect, the selecting step includes selecting one or more bias data axis bins based on the bias range; and selecting one or more metadata axis bins based on the repository. In one aspect, the bias range specifies a time of arrival for the electronic document.
In one aspect, the method includes identifying one or more electronic documents as search results; and informing a sender of the query of the search results. In one aspect, the method includes receiving a selection identifying at least one of the search results as a target; and sending the electronic documents associated with the target to the sender of the selection.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
DESCRIPTION OF DRAWINGS
FIG. 1 depicts a multi-dimensional storage space according to one implementation of the present invention.
FIG. 2 is a block diagram depicting a customer and an archive according to an implementation of the present invention.
FIG. 3 is a flow diagram depicting the electronic document storage operation according to an implementation of the present invention.
FIG. 4 is a flow diagram depicting an electronic document retrieval operation according to one implementation of the present invention.
FIG. 5 is a context diagram depicting two archives.
FIG. 6 is a block diagram depicting an implementation of an archive according to the present invention.
Like reference symbols in the various drawings indicate like elements.
DETAILED DESCRIPTION
The present invention is a method and computer program product for storing, archiving, and retrieving electronic documents over the Internet.
Conventional archiving solutions generally employ a single large database for storage and retrieval. Consequently, these conventional mechanisms must search the entire database whenever an electronic document must be retrieved. This factor limits the effective size and speed of conventional archiving mechanisms. In contrast, implementations of the present invention employ a subdivided database, referred to as a multi-dimensional storage space. One such multi-dimensional storage space 100 is depicted in FIG. 1. Storage space 100 is divided according to implementations of the present invention by two axes. One axis is referred to as "metadata" axis 102, and the other axis is referred to as "bias data" axis 104. Axes 102 and 104 define two dimensions of multi-dimensional storage space 100. Storage spaces including further dimensions and axes are within the scope of the present invention.
Each axis divides its associated dimension of the storage space into a plurality of "bins." For example, metadata axis 102 divides space 100 into a plurality of metadata bins 106. In the example of FIG. 1, metadata describes particular repositories. A repository is a "slice" of the data space that can be assigned to a particular customer. In the example of FIG. 1, each metadata bin refers to a particular repository. In FIG. 1, three repositories A, B, and C are depicted. Likewise, bias data axis 104 divides the bias data dimension of storage space 100 into a plurality of bias data bins 108. In the example of FIG. 1, each bias data bin corresponds to a month of the year.
The intersection of a bias data bin and a metadata bin defines a "cell" 110 within storage space 100. When an electronic document arrives at the archive, it is associated with one or more cells 110 according to one or more "routing rules," as discussed in detail below. According to one implementation, storage space 100 refers to a physical storage space, and electronic documents are actually stored within the physical cells within that storage space. In another implementation, storage space 100 refers to a logical storage space, or "storage map." While each electronic document is stored on a physical device, a pointer to that storage location is associated with the storage map cell or cells that were associated with that electronic document by the operation of the routing rule. In this manner, each electronic document is stored only once, but may be associated with two or more cells.
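The logical "storage map" variant can be illustrated with a minimal sketch. It is not the implementation of the invention; the class name `StorageMap`, the bitfile identifiers, and the month strings are assumptions chosen for illustration. Each document is held once, and each cell holds only pointers (here, bitfile IDs).

```python
from collections import defaultdict


class StorageMap:
    """Hypothetical logical storage space: cells hold pointers, documents are stored once."""

    def __init__(self):
        self.bitfiles = {}             # bitfile ID -> document bytes (one copy per document)
        self.cells = defaultdict(set)  # (metadata bin, bias data bin) -> set of bitfile IDs

    def store(self, bitfile_id, content, cell_keys):
        # Store the document once, then record a pointer in every associated cell.
        self.bitfiles[bitfile_id] = content
        for key in cell_keys:
            self.cells[key].add(bitfile_id)

    def documents_in(self, metadata_bin, bias_bin):
        # .get avoids creating an empty cell as a side effect of a lookup.
        return sorted(self.cells.get((metadata_bin, bias_bin), set()))


space = StorageMap()
space.store("bf-1", b"email sent in January", [("A", "2000-01"), ("A", "2000-03")])
space.store("bf-2", b"email sent in March", [("A", "2000-03")])
```

In this sketch the document `bf-1` appears in two cells of repository A while occupying storage only once.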
An advantage of this subdivided approach lies in the speed of retrieval of documents from the archive. When a user wishes to retrieve a document from the archive, the user specifies one or more cells in the storage space to be searched. In contrast to conventional database archive systems, only the electronic documents associated with the selected cells need to be searched. This method increases the speed and efficiency of the retrieval.
As mentioned above, each electronic document is assigned, on receipt by the archive, to one or more cells within the storage space. This association between electronic documents and cells is implemented according to one or more routing rules. A routing rule creates the association based on user-specified criteria. In one implementation, the user is permitted to edit the routing rule directly. An Internet-based interface is provided to permit the definition of routing rules by users. Each routing rule specifies at least one meta variable and at least one bias variable. For each variable, the routing rule specifies at least one value, and the action to be taken when an electronic document is received containing that value for that variable. For example, a routing rule for a particular customer may specify that each electronic document it archives is to be associated with both the month in which it is sent, and the month in which it is archived. Therefore, if the customer archives an electronic document that was sent in January and archived in March, it will be associated with the bias data bins for January and March. In addition, the routing rule for the customer may specify that each electronic document archived is to be associated with the repository corresponding to that customer. For example, a customer may specify repository A. Then when the document is archived, it is associated with the cells in the repository A metadata bin corresponding to January and March.
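The January/March example can be sketched as follows. This is an illustrative reading only: the `RoutingRule` class, the field names `sent` and `archived`, and the `YYYY-MM` bin labels are assumptions, not terms defined by the specification.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RoutingRule:
    """Hypothetical routing rule: one meta value plus the date fields used as bias variables."""
    repository: str     # value for the meta variable (the repository bin)
    bias_fields: tuple  # names of the document date fields that select bias data bins


def month_bin(d: date) -> str:
    # One bias data bin per calendar month, labeled YYYY-MM.
    return f"{d.year:04d}-{d.month:02d}"


def cells_for(rule: RoutingRule, document: dict) -> set:
    # A set is used so that a document sent and archived in the same month yields one cell.
    return {(rule.repository, month_bin(document[field])) for field in rule.bias_fields}


rule = RoutingRule(repository="A", bias_fields=("sent", "archived"))
doc = {"sent": date(2000, 1, 14), "archived": date(2000, 3, 2)}
cells = cells_for(rule, doc)
```

Here `cells` contains the repository A cells for January and March, matching the example in the text.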
When a customer wishes to retrieve an archived document, the customer can specify one or more cells within the data space. For example, a customer may wish to retrieve the document archived in the above example. Therefore, the customer specifies the cell in repository A associated with the month of January. The customer then specifies certain search criteria. The archive then searches the specified cell using the user-specified search criteria to obtain search results. The search results are displayed to the user as an HTML document. The user can select a particular search result as a target. The archive then sends a copy of the target electronic document to the user.
Routing rules are also useful in retrieving documents. For example, a routing rule can be written after certain electronic documents are stored to create a new repository containing a particular subset of those documents. For example, an SEC audit may require all of the emails that were sent to Joe Smith by the customer of repository A during January. A new routing rule can be generated very easily that will quickly populate a repository (for example, repository B) with only these emails. The routing rule specifies that all of the electronic documents associated with repository A for the month of January (bias data bin January) be searched for emails sent to Joe Smith. The routing rule then searches the cell associated with the January bias data bin and the repository A metadata bin for emails sent to Joe Smith by the customer. The resulting search results are then associated with repository B. A key benefit of this method of retrieval is that an audited company need only produce those documents that match the scope of the audit, rather than producing all of the documents that might match the scope of the audit (for example, producing all emails sent during the month of January).
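A minimal sketch of both uses follows: a search restricted to the selected cells, and a retrieval-time routing rule that populates a new repository. The function names, the dictionary layout, and the example addresses are hypothetical.

```python
def search_cells(cells, repository, months, predicate):
    """Search only the cells selected by repository and bias range."""
    hits = []
    for month in months:
        for doc in cells.get((repository, month), []):
            if predicate(doc):
                hits.append(doc)
    return hits


def populate_repository(cells, source, target, months, predicate):
    """Associate matching documents with the target repository by reference, not by copy.

    Assumes source and target are different repositories.
    """
    for month in months:
        for doc in cells.get((source, month), []):
            if predicate(doc):
                cells.setdefault((target, month), []).append(doc)


# Hypothetical index: (repository, month) -> documents associated with that cell.
cells = {
    ("A", "January"): [
        {"id": 1, "to": "joe.smith@example.com"},
        {"id": 2, "to": "ann@example.com"},
    ],
    ("A", "February"): [{"id": 3, "to": "joe.smith@example.com"}],
}


def sent_to_joe(doc):
    return doc["to"] == "joe.smith@example.com"


# Populate repository B with the January emails to Joe Smith from repository A.
populate_repository(cells, "A", "B", ["January"], sent_to_joe)
```

Only the January cell of repository A is scanned; the February cell is never touched, which is the source of the speed advantage described above.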
Implementations of the present invention also take advantage of the fact that not all data is equally likely to be accessed. In particular, the longer an electronic document has been stored, the less likely it is to be retrieved. Therefore, in one implementation, data is moved to successively less expensive storage media as it ages. Such media reduce the cost of storage but increase the cost of retrieval. However, because the odds of retrieval are low, savings are achieved.
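One way to picture age-based migration is a lookup from document age to storage tier. The tier names and age thresholds below are assumptions made for illustration; the specification does not name particular media or cut-off ages.

```python
from datetime import date

# Hypothetical tiers as (minimum age in days, tier name), oldest threshold first.
TIERS = [(730, "offline-optical"), (90, "nearline-optical"), (0, "online-disk")]


def tier_for(stored_on: date, today: date) -> str:
    """Pick the cheapest tier whose minimum age the document has reached."""
    age_days = (today - stored_on).days
    for min_age, name in TIERS:
        if age_days >= min_age:
            return name
    # A document dated in the future falls through; keep it on the fastest tier.
    return TIERS[-1][1]
```

A periodic job could call `tier_for` for each stored document and move any document whose tier has changed.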
One implementation of the present invention provides for multi-level indexing. Referring to FIG. 1, each cell within storage space 100 is further subdivided into multiple sub-cells. This allows for further data segmentation based on a hash function or other system-generated criteria.
FIG. 2 is a block diagram depicting a customer 104 and an archive 102 according to an implementation of the present invention. In the discussed implementation, the customer exchanges electronic documents with third parties using customer SMTP server 226. Electronic documents to be archived are sent to archive 102 by customer SMTP server 226 and received by archive SMTP server 206. Customer 104 uses a customer web browser 222 to retrieve archived documents. Interaction with browser 222 is handled by HTTP server 212 at the archive. The electronic documents are physically stored by archiver 208. The storage and retrieval of these electronic documents is managed by object database 210.
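Returning to the multi-level indexing described above, the assignment of a document to a sub-cell by a hash function can be sketched as follows. The choice of SHA-256, the use of a bitfile ID as the hash input, and the default of 16 sub-cells per cell are all assumptions; the specification does not name a hash function.

```python
import hashlib


def sub_cell(bitfile_id: str, sub_cells_per_cell: int = 16) -> int:
    """Map a document to one of the sub-cells of its cell using a stable hash."""
    digest = hashlib.sha256(bitfile_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce to a sub-cell index.
    return int.from_bytes(digest[:8], "big") % sub_cells_per_cell
```

Because the hash is deterministic, the same document always maps to the same sub-cell, so a lookup by identifier needs to examine only one sub-cell.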
FIG. 3 is a flow diagram depicting the electronic document storage operation according to an implementation of the present invention. The process begins with a message setup exchange 302 between customer SMTP server 226 and archive SMTP server 206. Such message setup exchanges are well-known in the relevant arts. Next, the electronic document to be archived is transferred from the customer SMTP server to the archive SMTP server as an email 304. The archive SMTP server sends the electronic document to archiver 208 for storage as a bitfile at 306. In one implementation, the bitfile is immediately written to a WORM disk to ensure non-tamperability.
The archive server also parses the electronic document to extract metadata and bias data values for the electronic document. The metadata can be extracted from the header of the email. If the document is sent by electronic data interchange (EDI), the metadata can be extracted from specific fields within the EDI form. These values are sent to the object database 210 at 308. The object database applies routing rules to the metadata and bias data to place pointers to the electronic document in particular cells within the storage space. The metadata can be used with the account database to identify a repository associated with the customer associated with the electronic document, as described below.
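For the email case, header parsing of this kind can be sketched with the Python standard library `email` package. The function name `extract_routing_values`, the returned field names, and the sample message are hypothetical; which header fields a given routing rule would actually use is not specified in the text.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr, parsedate_to_datetime


def extract_routing_values(raw_email: str) -> dict:
    """Pull candidate metadata (sender, recipients) and bias data (sent month) from headers.

    Assumes the message carries a Date header; a message without one raises an error here.
    """
    msg = message_from_string(raw_email)
    sender = parseaddr(msg.get("From", ""))[1]
    recipients = [addr for _, addr in getaddresses(msg.get_all("To", []))]
    sent = parsedate_to_datetime(msg["Date"])
    return {
        "sender": sender,
        "sender_domain": sender.rpartition("@")[2].lower(),
        "recipients": recipients,
        "sent_month": f"{sent.year:04d}-{sent.month:02d}",
    }


raw = (
    "From: Trader <trader@broker.example>\r\n"
    "To: Joe Smith <joe.smith@client.example>\r\n"
    "Date: Fri, 14 Jan 2000 09:30:00 -0500\r\n"
    "Subject: Trade confirm\r\n"
    "\r\n"
    "Body text.\r\n"
)
values = extract_routing_values(raw)
```

The sender domain could serve as metadata for identifying the customer's repository, and the sent month as bias data for selecting the bias data bin.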
When this process is complete, the object database sends an acknowledgment message to the archive SMTP server at 310. In turn, the archive SMTP server sends an OK message to the customer SMTP server at 312.
FIG. 4 is a flow diagram depicting an electronic document retrieval operation according to one implementation of the present invention. When customer web browser 222 is pointed to the home page of HTTP server 212, a login screen is sent to the customer browser at 402. The customer browser provides a username and password to access the archive at 404. Account database 214 conducts a user authentication process 406 to verify the identity of the customer. When the customer has been authenticated, the archive HTTP server contacts object database 210 to obtain a list of the repositories to which the customer has access at 408. The list is returned to the archive server at 410.
The archive HTTP server composes a search screen containing the list of authorized repositories, and sends the search screen to the customer browser at 412. The customer uses the screen to compose a search request by selecting particular repositories and entering search criteria and optionally bias data. The search request is sent to the archive at 414.
The archive HTTP server formulates a query based on the search request, and transmits this query to the object database at 416. For security reasons, the object database confirms that the customer is authorized to access the repositories specified in the query at 418. The object database executes the query to generate a set of query results, which are sent to the archive HTTP server at 420. The archive HTTP server composes a search results screen, which is transmitted to the customer at 422. The search results screen includes one or more results that match the customer's search request. The customer selects one or more of these search results as "targets." These targets are sent to the archive HTTP server as a message request at 424. The archive HTTP server composes a message retrieval request based on the message request, and forwards the message retrieval request to the object database at 426. The object database again verifies repository access for the customer at 428. If access is authorized, the object database sends a bitfile ID for each target electronic document to archiver 208 at 430. In one implementation, archiver 208 will accept a bitfile ID request only from object database 210. Because object database 210 verifies customer authorization before sending the bitfile ID to the archiver, security of the electronic documents is preserved. In response, the archiver returns a bitfile handle to the object database at 432.
The object database passes the bitfile handle to the archive HTTP server, which formulates a bitfile request. The bitfile request is sent to the archiver at 436. In response, the archiver sends the bitfile for the target electronic document to the archive HTTP server at 438. The archive HTTP server uses the bitfile to generate a formatted message, which is sent to the customer web browser at 440. The formatted message presents the requested target electronic document to the user on the customer's web browser.
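The security property of this exchange, that the archiver accepts bitfile ID requests only from the object database and that the object database checks repository access first, can be sketched as below. The class and method names are hypothetical, and real components would authenticate each other over a network rather than by object identity.

```python
class Archiver:
    """Hypothetical archiver: issues bitfile handles only to one trusted caller."""

    def __init__(self, bitfiles):
        self._bitfiles = bitfiles  # bitfile ID -> document bytes
        self._trusted = None
        self._handles = {}

    def trust(self, caller):
        self._trusted = caller

    def open_handle(self, caller, bitfile_id):
        if caller is not self._trusted:
            raise PermissionError("bitfile ID requests are accepted only from the object database")
        handle = f"h-{len(self._handles) + 1}"
        self._handles[handle] = bitfile_id
        return handle

    def read(self, handle):
        return self._bitfiles[self._handles[handle]]


class ObjectDatabase:
    """Hypothetical object database: verifies repository access before contacting the archiver."""

    def __init__(self, archiver, acl, locations):
        self.archiver = archiver
        self.acl = acl              # user -> set of authorized repositories
        self.locations = locations  # (repository, document name) -> bitfile ID

    def request_handle(self, user, repository, document):
        if repository not in self.acl.get(user, set()):
            raise PermissionError(f"{user} may not access repository {repository}")
        return self.archiver.open_handle(self, self.locations[(repository, document)])


archiver = Archiver({"bf-1": b"trade confirm"})
db = ObjectDatabase(archiver, {"alice": {"A"}}, {("A", "doc-1"): "bf-1"})
archiver.trust(db)
handle = db.request_handle("alice", "A", "doc-1")
```

An HTTP server holding `handle` can then call `archiver.read(handle)`, but it cannot obtain a handle without going through the access check in the object database.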
In one implementation, the electronic document includes an HTML representation of the document on the screen. This representation can include hidden routing data, as specified by RFC 821. The customer can obtain the original email used to archive the electronic document by emailing it to himself, downloading it to an application such as Notepad, or forwarding it to someone else. This is especially useful for delivering stop trade confirms to customers of on-line brokerages; in this application the timestamp of a confirm is crucial.
In one implementation, multiple interconnected archives are provided. FIG. 5 is a context diagram depicting two archives 502A and 502B. The archives are connected by a private asynchronous transfer mode (ATM) backbone 512. Data is replicated between the two archives so that if one archive fails, the other can serve the customers. In addition, if the private ATM backbone goes down, the archives can operate independently. For enhanced security, access to each archive is protected by firewalls 506. For example, customer 504A connects to the archives via Internet 502. This connection is secured by firewalls 506A and 506B. Customer 504B accesses archive 502B using a virtual private network (VPN) 510. This connection is protected by firewall 506B. Customer 504C accesses archive 502A directly by a private connection. This connection is protected by firewall 506C. Customer 504D accesses the archives using a private frame relay network 508. This connection is protected by firewalls 506D and 506E.
Implementations of the present invention provide other security features. Each customer is allowed to select a level of security based upon its own needs. For example, these security mechanisms can include a logon/password using secure sockets layer (SSL), validating certificates, using soft or hard tokens, encryption, and the like. The firewalls prevent hackers from exploiting holes in the operating system. The authentication mechanisms described above prevent attacks at the application level, such as masquerade attacks. In one implementation, separate front end servers are used. In addition, the system back-end is insulated from the front-end by using separate servers. Finally, data stored on WORM disks can be stored in an encrypted manner.
Non-tamperability of archive data can be demonstrated by various methods. For example, the electronic documents can be stored on WORM disks, as required by the SEC. However, these disks could be maliciously substituted. According to one implementation, electronic documents are signed electronically as they enter the archive. Thus, non-tamperability can be shown by demonstrating that the digital signature is still intact.
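The idea of sealing a document on entry and later checking the seal can be sketched with the Python standard library. Note the substitution: this sketch uses a keyed hash (HMAC-SHA256) as a stand-in, because the standard library has no public-key signature scheme; a true digital signature would use an asymmetric key pair. The function names and the key are hypothetical.

```python
import hashlib
import hmac


def seal(document: bytes, key: bytes) -> str:
    """Compute a tamper-evidence tag when the document enters the archive."""
    return hmac.new(key, document, hashlib.sha256).hexdigest()


def is_intact(document: bytes, key: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time with the stored tag."""
    return hmac.compare_digest(seal(document, key), tag)


archive_key = b"hypothetical-archive-secret"
original = b"Trade confirm 1001"
stored_tag = seal(original, archive_key)
```

If a stored disk were substituted with altered content, `is_intact` would return `False` for the altered document, provided the key and the tag are kept apart from the disk.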
FIG. 6 is a block diagram depicting an implementation of an archive 600 according to the present invention. Archive 600 includes a portal module 602, a migratory module 604, an archiver module 608, a billing server 610, and customer modules 606A and 606B. The separation of portal module 602 and migratory module 604 permits greater scalability and also provides an additional ring for a "defense-in-depth" architecture for improved security. In one implementation, portal modules are permitted to communicate only with migratory modules. This provides an additional ring in the security architecture.
Portal module 602 includes an SMTP front-end (FE) 612, an SMTP back-end (BE) 614, and an HTTP server 618. The SMTP front-end handles incoming email, and the SMTP back-end handles outgoing email. The HTTP server handles HTTP communications, as described above. The present invention provides two methods for obtaining electronic documents via email for archiving. According to the first method described above, the customer merely sends a copy of each email to the archive. According to a second method, however, the customer uses the archive 600 as a store-and-forward site for all of the customer's email to be archived. Email that is inbound to the customer is rerouted to the archive SMTP front-end using the MX record according to well-known techniques.
Email that is outbound from the customer can simply be rerouted to the SMTP front-end of the archive.
The migratory module includes a buffer archiver 620, a queue manager 622, an account database 624, a domain database 626, a log server 628, and an object router 650. When an email arrives at the SMTP front-end, a copy is sent to the buffer archiver, and a record of the email is placed in the queue manager. The queue manager permits scalability of the SMTP front-ends and back-ends. For example, the queue manager permits multiple portal modules to be used, or alternatively, permits multiple SMTP front-ends and back-ends to be used within each portal module. Domain database 626 contains a listing of customers and their associated domains.
Object router 650 determines, using the domain database, which customer owns each incoming message. Based on this information, the object router causes the document to be transferred from buffer archiver 620 to the appropriate cache archiver 630 and the appropriate customer module 606. The document is eventually copied onto optical storage media and is also replicated to other archives within the system. The log server stores events such as security events and errors.
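The ownership lookup performed by the object router can be sketched as a mapping from email domains to customers. The dictionary contents, the function name `owning_customers`, and the decision to check both sender and recipient addresses are assumptions made for illustration.

```python
# Hypothetical domain database: email domain -> customer that owns it.
DOMAIN_DATABASE = {
    "broker.example": "customer-A",
    "bank.example": "customer-B",
}


def owning_customers(sender, recipients, domain_db):
    """Return the customers whose domains appear on the message, in first-seen order."""
    owners = []
    for address in [sender, *recipients]:
        domain = address.rpartition("@")[2].lower()
        customer = domain_db.get(domain)
        if customer is not None and customer not in owners:
            owners.append(customer)
    return owners
```

For each customer returned, the router would then transfer the buffered document to that customer's cache archiver and customer module.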
When a user logs in, the domain database uses the login information to direct the user to the appropriate account database. The web server authenticates with the account database and so determines which repositories the user is authorized to access. The account database performs user authentication and maintains an access control list for each repository.
The object database receives document descriptors, such as JavaBeans, XML DTDs (Document Type Definitions), and the like, which create and populate repositories. In addition, the object database maps user queries to SQL queries to search the archiver.
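The mapping from a user query (repositories, bias range, search criteria) to SQL can be sketched with the standard library `sqlite3` module. The table name `cell_index`, its columns, and the use of a subject substring as the search criterion are hypothetical; the specification does not describe a schema.

```python
import sqlite3


def build_query(repositories, month_from, month_to, subject_contains):
    """Translate a user query into a parameterized SQL statement and its parameters."""
    placeholders = ", ".join("?" for _ in repositories)
    sql = (
        "SELECT bitfile_id FROM cell_index "
        f"WHERE repository IN ({placeholders}) "
        "AND month BETWEEN ? AND ? "
        "AND subject LIKE ? "
        "ORDER BY bitfile_id"
    )
    params = [*repositories, month_from, month_to, f"%{subject_contains}%"]
    return sql, params


# In-memory example index: one row per (document, cell) association.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cell_index (bitfile_id TEXT, repository TEXT, month TEXT, subject TEXT)"
)
conn.executemany(
    "INSERT INTO cell_index VALUES (?, ?, ?, ?)",
    [
        ("bf-1", "A", "2000-01", "Trade confirm 1001"),
        ("bf-2", "A", "2000-02", "Trade confirm 1002"),
        ("bf-3", "B", "2000-01", "Trade confirm 2001"),
        ("bf-4", "A", "2000-01", "Lunch"),
    ],
)
sql, params = build_query(["A"], "2000-01", "2000-01", "confirm")
rows = [row[0] for row in conn.execute(sql, params)]
```

The repository and month conditions play the role of cell selection, so the text criterion is evaluated only against rows in the selected cells. Passing values as parameters rather than formatting them into the SQL string avoids injection through user-supplied search text.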
Each customer module includes similar elements. For example, customer module 606A includes a cache archiver 630A, an object database 632A, an account database 634A, a log server 636A, and a billing logger 638A. Likewise, customer module 606B contains a cache archiver 630B, an object database 632B, an account database 634B, a log server 636B, and a billing logger 638B. The account database 634A performs functions similar to those of account database 624. Log server 636A performs functions similar to those performed by log server 628. Billing logger 638A collects billing events, which are fed to billing server 610. Billing server 610 includes a log server 640 for logging events, and a master billing unit 642 for preparing customer bills. Archiver 608 performs functions similar to those of archiver 208, as described above.
The above implementation is described in terms of archiving email messages. However, the present invention is easily extended to implementations that accommodate other types of electronic documents, as would be apparent to one skilled in the relevant arts. For example, the present invention contemplates implementations for archiving electronic documents transmitted according to protocols including HTML, EDI, FTP, XML, and the like.
The invention can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. Apparatus of the invention can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and method steps of the invention can be performed by a programmable processor executing a program of instructions to perform functions of the invention by operating on input data and generating output. The invention can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. Each computer program can be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language if desired; and in any case, the language can be a compiled or interpreted language. Suitable processors include, by way of example, both general and special purpose microprocessors. Generally, a processor will receive instructions and data from a read-only memory and/or a random access memory. Generally, a computer will include one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM disks. Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
To provide for interaction with a user, the invention can be implemented on a computer system having a display device such as a monitor or LCD screen for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer system. The computer system can be programmed to provide a graphical user interface through which computer programs interact with users.
A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

Claims

WHAT IS CLAIMED IS:
1. A method, comprising: receiving an electronic document; storing the electronic document; associating the electronic document with a routing rule specifying a meta variable and a bias variable, each defining a separate axis in a multi-dimensional data space; parsing the electronic document to obtain metadata corresponding to the meta variable and bias data corresponding to the bias variable; and associating the electronic document with a cell in the data space, the cell selected according to the metadata and the bias data.
2. The method of claim 1, wherein the bias data includes temporal data indicating the time of receipt of the electronic document.
3. The method of claim 1, further comprising: storing the electronic document on a storage device associated with the metadata.
4. The method of claim 3, wherein the metadata indicates a geographic area, further comprising: selecting a storage device associated with the geographic area.
5. The method of claim 4, wherein the routing rule specifies a repository, and wherein the step of associating the electronic document with a cell comprises: selecting a cell associated with the repository.
6. The method of claim 5, wherein an entity is associated with the electronic document, and the step of associating the electronic document with a repository comprises: selecting the routing rule according to the identity of the entity.
7. A method comprising: receiving a query for an electronic document stored in a multi-dimensional data space having a metadata axis and a bias data axis, the query specifying a repository, a bias range and search criteria; selecting cells in the data space according to the repository and bias range; and searching for electronic documents associated with the selected cells using the search criteria.
8. The method of claim 7, wherein the selecting step comprises: selecting one or more bias data axis bins based on the bias range; and selecting one or more metadata axis bins based on the repository.
9. The method of claim 8, wherein the bias range specifies a time of arrival for the electronic document.
10. The method of claim 9, further comprising: identifying one or more electronic documents as search results; and informing a sender of the query of the search results.
11. The method of claim 10, further comprising: receiving a selection identifying at least one of the search results as a target; and sending the electronic documents associated with the target to the sender of the selection.
12. A computer program product comprising a computer usable medium having computer readable program code means embodied in said medium, said computer readable program code means comprising: computer readable program code means for causing a computer to receive an electronic document; computer readable program code means for causing a computer to store the electronic document; computer readable program code means for causing a computer to associate the electronic document with a routing rule specifying a meta variable and a bias variable, each defining a separate axis in a multi-dimensional data space; computer readable program code means for causing a computer to parse the electronic document to obtain metadata corresponding to the meta variable and bias data corresponding to the bias variable; and computer readable program code means for causing a computer to associate the electronic document with a cell in the data space, the cell selected according to the metadata and the bias data.
13. The computer program product of claim 12, wherein the bias data includes temporal data indicating the time of receipt of the electronic document.
14. The computer program product of claim 12, further comprising: computer readable program code means for causing a computer to store the electronic document on a storage device associated with the metadata.
15. The computer program product of claim 14, wherein the metadata indicates a geographic area, further comprising: computer readable program code means for causing a computer to select a storage device associated with the geographic area.
16. The computer program product of claim 15, wherein the routing rule specifies a repository, and wherein the computer readable program code means for causing a computer to associate the electronic document with a cell comprises: computer readable program code means for causing a computer to select a cell associated with the repository.
17. The computer program product of claim 16, wherein an entity is associated with the electronic document, and wherein the computer readable program code means for causing a computer to associate the electronic document with a repository comprises: computer readable program code means for causing a computer to select the routing rule according to the identity of the entity.
18. A computer program product comprising a computer usable medium having computer readable program code means embodied in said medium, said computer readable program code means comprising: computer readable program code means for causing a computer to receive a query for an electronic document stored in a multi-dimensional data space having a metadata axis and a bias data axis, the query specifying a repository, a bias range and search criteria; computer readable program code means for causing a computer to select cells in the data space according to the repository and bias range; and computer readable program code means for causing a computer to search for electronic documents associated with the selected cells using the search criteria.
19. The computer program product of claim 18, wherein the computer readable program code means for causing a computer to select comprises: computer readable program code means for causing a computer to select one or more bias data axis bins based on the bias range; and computer readable program code means for causing a computer to select one or more metadata axis bins based on the repository.
20. The computer program product of claim 19, wherein the bias range specifies a time of arrival for the electronic document.
21. The computer program product of claim 20, further comprising: computer readable program code means for causing a computer to identify one or more electronic documents as search results; and computer readable program code means for causing a computer to inform a sender of the query of the search results.
22. The computer program product of claim 21, further comprising: computer readable program code means for causing a computer to receive a selection identifying at least one of the search results as a target; and computer readable program code means for causing a computer to send the electronic documents associated with the target to the sender of the selection.

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US48408500A 2000-01-18 2000-01-18
US484085 2000-01-18
PCT/US2001/002023 WO2001053995A1 (en) 2000-01-18 2001-01-18 Internet-based archive service for electronic documents

Publications (2)

Publication Number Publication Date
EP1248996A1 EP1248996A1 (en) 2002-10-16
EP1248996A4 EP1248996A4 (en) 2003-03-12

Family

ID=23922677

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01906618A Withdrawn EP1248996A4 (en) 2000-01-18 2001-01-18 Internet-based archive service for electronic documents

Country Status (3)

Country Link
EP (1) EP1248996A4 (en)
AU (1) AU2001234506A1 (en)
WO (1) WO2001053995A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7386439B1 (en) 2002-02-04 2008-06-10 Cataphora, Inc. Data mining by retrieving causally-related documents not individually satisfying search criteria used
US8135711B2 (en) 2002-02-04 2012-03-13 Cataphora, Inc. Method and apparatus for sociological data analysis
US7421660B2 (en) 2003-02-04 2008-09-02 Cataphora, Inc. Method and apparatus to visually present discussions for data mining purposes
US7519589B2 (en) 2003-02-04 2009-04-14 Cataphora, Inc. Method and apparatus for sociological data analysis
WO2003067497A1 (en) * 2002-02-04 2003-08-14 Cataphora, Inc A method and apparatus to visually present discussions for data mining purposes
EP1910949A4 (en) 2005-07-29 2012-05-30 Cataphora Inc An improved method and apparatus for sociological data analysis
US8819021B1 (en) 2007-01-26 2014-08-26 Ernst & Young U.S. Llp Efficient and phased method of processing large collections of electronic data known as “best match first”™ for electronic discovery and other related applications

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5806061A (en) * 1997-05-20 1998-09-08 Hewlett-Packard Company Method for cost-based optimization over multimeida repositories

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5555346A (en) * 1991-10-04 1996-09-10 Beyond Corporated Event-driven rule-based messaging system
US5793888A (en) * 1994-11-14 1998-08-11 Massachusetts Institute Of Technology Machine learning apparatus and method for image searching

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BEARD M K ET AL: "Multilevel and graphical views of metadata" RESEARCH AND TECHNOLOGY ADVANCES IN DIGITAL LIBRARIES, 1998. ADL 98. PROCEEDINGS. IEEE INTERNATIONAL FORUM ON SANTA BARBARA, CA, USA 22-24 APRIL 1998, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 22 April 1998 (1998-04-22), pages 256-265, XP010276895 ISBN: 0-8186-8464-X *
ISHIKAWA H ET AL: "Document warehousing based on a multimedia database system" DATA ENGINEERING, 1999. PROCEEDINGS., 15TH INTERNATIONAL CONFERENCE ON SYDNEY, NSW, AUSTRALIA 23-26 MARCH 1999, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 23 March 1999 (1999-03-23), pages 168-173, XP010326187 ISBN: 0-7695-0071-4 *
See also references of WO0153995A1 *
WANG JICHENG ET AL: "Web mining: knowledge discovery on the Web" SYSTEMS, MAN, AND CYBERNETICS, 1999. IEEE SMC '99 CONFERENCE PROCEEDINGS. 1999 IEEE INTERNATIONAL CONFERENCE ON TOKYO, JAPAN 12-15 OCT. 1999, PISCATAWAY, NJ, USA,IEEE, US, 12 October 1999 (1999-10-12), pages 137-141, XP010363423 ISBN: 0-7803-5731-0 *

Also Published As

Publication number Publication date
EP1248996A4 (en) 2003-03-12
AU2001234506A1 (en) 2001-07-31
WO2001053995A1 (en) 2001-07-26

Similar Documents

Publication Publication Date Title
US6965904B2 (en) Query Service for electronic documents archived in a multi-dimensional storage space
US8041719B2 (en) Personal computing device-based mechanism to detect preselected data
US8312553B2 (en) Mechanism to search information content for preselected data
US20080052284A1 (en) System and Method for the Capture and Archival of Electronic Communications
US9515998B2 (en) Secure and scalable detection of preselected data embedded in electronically transmitted messages
US7886359B2 (en) Method and apparatus to report policy violations in messages
US8566305B2 (en) Method and apparatus to define the scope of a search for information from a tabular data source
US7512814B2 (en) Secure and searchable storage system and method
CA2597083C (en) Method and apparatus for handling messages containing pre-selected data
US20060041533A1 (en) Encrypted table indexes and searching encrypted tables
US20060184549A1 (en) Method and apparatus for modifying messages based on the presence of pre-selected data
CN100424704C (en) Full text search system based on ciphertext
US20060224589A1 (en) Method and apparatus for handling messages containing pre-selected data
US7376652B2 (en) Personal portal and secure information exchange
JP4903386B2 (en) Searchable information content for pre-selected data
EP1248996A1 (en) Internet-based archive service for electronic documents
WO2007068279A1 (en) Method and computer system for updating a database from a server to at least one client
Shewale et al. Efficient Multi-Keyword Ranked Search over Encrypted Cloud Computing
Scholar Privacy Preserving Multi-Keyword Ranked Search over Encrypted Cloud Data

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase. Free format text: ORIGINAL CODE: 0009012
17P Request for examination filed. Effective date: 20020718
AK Designated contracting states. Kind code of ref document: A1. Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR
AX Request for extension of the european patent. Free format text: AL;LT;LV;MK;RO;SI
A4 Supplementary search report drawn up and despatched. Effective date: 20030129
RIC1 Information provided on ipc code assigned before grant. Ipc: 7G 06F 17/30 A
17Q First examination report despatched. Effective date: 20030509
STAA Information on the status of an ep patent application or granted ep patent. Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN
18D Application deemed to be withdrawn. Effective date: 20031120