US20090063587A1 - Method and system for function-specific time-configurable replication of data manipulating functions - Google Patents

Method and system for function-specific time-configurable replication of data manipulating functions

Info

Publication number
US20090063587A1
Authority
US
United States
Prior art keywords
replication
function
data
storage system
functions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/262,308
Inventor
Holger JAKOB
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seagate Technology LLC
Original Assignee
Jakob Holger
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US12/140,296 (published as US20090019443A1)
Application filed by Jakob Holger
Priority to US12/262,308 (published as US20090063587A1)
Publication of US20090063587A1
Assigned to III HOLDINGS 1, LLC (assignment of assignors interest). Assignors: JAKOB, HOLGER
Assigned to III HOLDINGS 3, LLC (assignment of assignors interest). Assignors: III HOLDINGS 1, LLC
Assigned to SEAGATE TECHNOLOGY LLC (assignment of assignors interest). Assignors: III HOLDINGS 3, LLC
Priority to US16/377,703 (published as US11467931B2)
Legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2094 Redundant storage or storage space
    • G06F11/2056 Error detection or correction of the data by redundancy in hardware using active fault-masking where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
    • G06F11/2082 Data synchronisation
    • G06F11/2071 Error detection or correction of the data by redundancy in hardware using active fault-masking where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring using a plurality of controllers

Definitions

  • the present invention relates generally to storage systems, which are able to store digital objects or files. More specifically, the present invention relates to data replication systems and methods.
  • Storage systems provide data replication capabilities for the purpose of either logical error recovery or disaster tolerance, which requires respectively high availability and relatively high integrity.
  • Storage systems allow block, object or file access and provide a means to replicate data from source data storage to a backup data storage system.
  • the method and system for function-specific time-configurable replication of data manipulating functions applies to storage systems allowing object and file access only.
  • Object-based storage arrays allow applications to integrate a set of commands, typically called an Application Programming Interface (API).
  • the API allows the creation of new objects as well as the modification of existing objects.
  • File-oriented storage arrays provide users or applications the possibility of accessing the system using a file-share. These storage systems provide access to the installed capacity using standard file sharing protocols like NFS (meaning Network File System) or CIFS (meaning Common Internet File System). These protocols may also have proprietary extensions to implement special functionality like WORM file systems or WORM shares.
  • the storage array may also be a standard server running an operating system available from one of the many providers of operating systems.
  • the server would provide access to the available capacity using file shares similar to a file-oriented storage array.
  • the set of data manipulation functions for object or file oriented storage arrays usually contains functions like write, delete, update, write-disable until expiration date or delete-disable before expiration date. The exact implementation, however, is dependent on the storage array. Each individual function on a storage array is described in the array specific documentation. If the storage array provides special functions that are not standardized in the protocols like NFS and CIFS, the array vendor provides a detailed description of the required integration with the storage array.
  • Existing object or file oriented storage arrays already provide ways to replicate data between two or more storage arrays.
  • the replication may be implemented on the storage array or on a dedicated system that performs the replication of data.
  • the replication may include or exclude specific functions. If a function is replicated, it is generally replicated as soon as possible.
  • the changes made to objects or file systems are made by the users or applications making these changes.
  • Users may typically access file oriented storage systems and perform the normal operations like writes, reads, updates or deletes of files.
  • Applications may access both object and/or file oriented storage arrays. As applications are programmed, they may implement rules to make data read-only up to a certain expiration date.
  • the capability to generate new versions of documents and other advanced functionality exist in various solutions available on the market. Among these advanced storage array functionalities in the prior art are applications which also use WORM functionality on storage arrays.
  • Data replication functionalities of current replication systems are based on fixed, pre-established and non-configurable delays. Consequently, deletion of data that is referred to by otherwise non-deleted files, objects or applications prevents recovery of such data.
  • U.S. Pat. No. 6,260,125 to McDowell discloses an asynchronous disk mirroring system for use within a network computer system, wherein a write queue operates to delay the time of receipt of write requests to storage volumes, with a view to increasing data replication performance.
  • the write queues include several write buffers, wherein the write requests pass through the write queue in a first-in, first-out (FIFO) sequence; and so transmission of write requests may be subject to a time-delay by either a pre-determined amount of time or when the storage or write buffer is full.
  • McDowell also discloses a log file configured to receive the delayed write requests, for log-based mirror reconstruction and check-pointing of the mirrored volumes.
  • the replication of data by the system of McDowell is limited to updating and writing and does not provide function-dependent data replication, nor does it provide configurable replication of data manipulating functions such as delete or write-disable.
  • Patent application WO 99/50747 to Arnon discloses a method and apparatus for asynchronously updating a mirror of data from a source device, whose purpose is to prevent the overwriting of data on a source storage that has not yet been committed to a target storage system.
  • the Arnon method and apparatus addresses the need for data integrity but does not allow a user to configure replication operations on a function base or time base, and only prevents overwrite of data on a source storage in the situation where data has not been replicated on target storage.
  • Patent application WO 02/25445 to Kamel discloses a method and system for electronic file lifecycle management. Similar applications are also called Hierarchical Storage Management (HSM) applications. File Lifecycle management and HSM software move files based on rules between different storage systems. The system might also create multiple copies on different storage systems if the defined rules or policies define the lifecycle of a file accordingly.
  • What is needed is a system or method that allows synchronizing or configuring the time frame within which a data restore is possible from a target storage system and which enables replicating data manipulating functions performed on object or file based storage arrays.
  • the system and method of the invention provides for function-specific replication for data manipulating functions of digital data, such as files or objects, with a configurable time delay for each function to be replicated.
  • the system includes a source storage system from which a data manipulating function is to be replicated, one or more destination storage systems to which the function performed on digital data is replicated, and a replication management module for managing the function-specific replication delay and the function replication between the source storage system(s) and the destination storage system(s).
  • the replication management module of the invention provides functionality allowing: (1) configuration of a delay after which a data manipulating function will be performed on the destination storage system when data stored on the source storage system, modified or created by the function, is replicated on corresponding data on the destination storage system; (2) the replication of the data manipulating function performed on data stored on the source storage system with the configured delay to the destination storage system; and (3) querying of function-specific changes to data of the source storage system in a given timeframe.
  • the system and method solves the business need of combining both data replication for high availability and disaster tolerance as well as providing recoverability of data in case of logical errors.
  • the combination of object or file replication for disaster tolerance with the ability to configure the delay of the replication for each function that can be performed on the stored objects or files provides both disaster tolerance and the ability of recovering from logical errors.
  • the method makes replication of data manipulating functions dependent on the function that was performed on the data as well as makes the delay of the replication time-configurable, in that the replication of new objects or files can be performed as quickly as possible but the replication of another function like deletes of objects or files may be delayed for a configurable amount of time, thereby providing a solution for both disaster tolerance and logical error recovery.
  • This allows the customer to ensure that data on storage arrays that is not backed up is recoverable for the same time that a restore and recovery of references to these objects or files is possible.
  • Such system thus guarantees that all objects and files are available for recovery as long as references to that data may be restored from backups.
  • the system and method of the invention delays the deletion of data from the source storage array for a period N until the data is also deleted from the target storage array, thereby allowing the restoring of an application database using the standard recovery procedure as well as providing the possibility of accessing the previously deleted data on the secondary storage array without having to have a complete backup of all data having ever been written to the source storage array.
  • Once the standard recovery procedure is no longer capable of restoring and recovering references to data, the file or object referenced can also be deleted on the target storage array.
  • FIG. 1 is a schematic diagram of a block-based storage system of the prior art where the replication management module is located in the source storage array.
  • FIG. 2 is a schematic diagram of an object or file based storage array of the prior art where the replication management module is implemented in a separate system.
  • FIG. 3A and FIG. 3B are schematic diagrams showing the elements of the system for function specific replication of data manipulating functions on digital data with a configurable time delay, where the replication management module is located on the source storage system.
  • FIG. 4 is a schematic diagram showing the elements of the system for function-specific replication of data manipulating functions with a configurable time delay, where the replication management module is located between the application or user and the source and the destination storage system, thereby providing access to the storage systems.
  • FIG. 5 is a schematic diagram showing the elements of the system for function specific replication of data manipulating functions on digital data with a configurable time delay, having several destination storage systems.
  • FIG. 6 is a schematic diagram showing the elements of the system for function specific replication of data manipulating functions on digital data with a configurable time delay, having several source storage systems.
  • FIG. 7 is a flow chart showing the necessary main steps to implement a function-specific function replication system and method of the present invention.
  • FIG. 8 is a flow chart showing the steps of the information gathering process of the invention for proprietary storage systems of a first class of storage arrays, such class not allowing the querying of the array for changes that were made to the objects or files that are stored on the array.
  • FIG. 9 is a flow chart showing the steps for implementing the replications monitoring process of the invention for proprietary storage systems of a first class of storage arrays for which the task of replication monitoring requires the creation of a replication monitoring database.
  • FIG. 10 is a flow chart showing the steps for implementing the replications monitoring process of the invention for a second class of storage arrays, such class not requiring the creation of the replication monitoring database.
  • FIG. 11 is a flow chart describing the steps necessary to maintain a consistent set of objects or files on the target storage array.
  • FIG. 12 is a flow chart showing the steps for implementing the delayed function-specific replication of data manipulating functions for a first class of storage arrays based on the replication monitoring database.
  • FIG. 13 is a flow chart showing the steps for implementing the delayed function-specific replication of data manipulating functions for a second class of storage arrays that do not require the replication monitoring database.
  • FIG. 14 is a schematic representation of the configuration table of the invention.
  • FIG. 15 is a schematic representation of the Source Change Table of the invention.
  • FIG. 16 is a schematic representation of the Outstanding Replications Table of the invention.
  • FIG. 17 is a schematic representation of the Replication Audit Table of the invention.
  • FIG. 18 lists examples of different customer requirements and how they are implemented in a configuration table.
  • a block-based source storage system 60 of the prior art provides a server 80 access to a certain disk capacity.
  • the operating system installed on server 80 possesses the knowledge of where and which object or file lies within this disk capacity. This information can, for example, be stored in the File Allocations Table or I-nodes.
  • An application or User 90 accessing a file on such a server 80 would therefore issue any function-based calls like write, update and delete to that server 80 , which in turn knows where the file is located on the block-based source storage system 60 . Any function performed by an application or user 90 will result in an update or read of a block on the disk capacity available to the server 80 .
  • the replication of writes or update of a block on the source storage array 60 is embodied in the source storage system 60 .
  • object or file based storage arrays 65 and 75 provide the functionality of the server 80 mentioned above directly from within the storage array 65 and 75 .
  • the application or user 90 accessing a file issues the functions directly to the storage array.
  • a server 80 providing file based access to the available disk capacity on source storage array 65 is also contained in the file based storage arrays, because an application or user cannot differentiate between accessing the server and accessing the storage array; both provide the same file level access using file access protocols like CIFS or NFS.
  • the replication from the file or object based source storage system 65 to the corresponding target storage array 76 is embodied in the source storage system 65 .
  • a system 10 for function specific replication of data manipulating functions 12 on digital data, such as files or objects, allows for a configurable time delay 14 for each function to be replicated.
  • the system 10 includes a source storage system 20 from which performed data manipulating functions on data are replicated, at least one destination storage system 30 to which performed data manipulating functions are replicated to, a replication management module 40 for managing the function specific replication delay and the replication of data manipulating functions between the source storage systems and at least one destination storage system, optionally comprising a replication monitoring database 42 .
  • the system 10 provides replication for at least one standard data manipulating function of a group of functions including: write, delete, update, modify, write-disable, write disable until expiration date, delete-disable and delete-disable until expiration date.
  • the replication management module 40 provides several novel features.
  • One feature allows for the configuration of a delay after which a specific data manipulating function on data stored on the source storage system is replicated on corresponding data on the destination storage system.
  • Another feature allows for replication of the data manipulating function performed on data stored on the source storage system with the configured delay to the destination storage system.
  • Still another feature allows for querying function-specific changes to data of the source storage system in a given timeframe.
  • At least one destination storage system 30 is based on one of the following architectures: object-based storage arrays comprising an application programming interface, file-based storage arrays or a computer server, comprising memory 36 , a CPU 38 and an operating system 39 .
  • the system 10 may directly provide access to storage systems based on either of the following architectures: object-based storage systems having an application programming interface 34 , file-based storage arrays, and a computer server 80 , including memory 36 , a CPU 38 and an operating system 39 as shown in FIG. 5 .
  • the system 10 is adaptable to several different system configurations. Referring now to FIG. 3A , a configuration where the replication management module 40 is located on the source storage system 20 is shown.
  • the information about functions performed by applications or users 90 on objects or files stored is gathered by the replication management module from the source storage system 20 and used to replicate each data manipulating function with a configurable delay to the Destination Storage system 30 .
  • the information gathered may optionally be stored for future reference in the replication monitoring database 42 .
  • Referring now to FIG. 4 , a configuration where the replication management module 40 is located between the application or user 90 and the source and destination storage systems 20 and 30 is shown.
  • the Replication management module 40 gathers the information for function-specific replication of data manipulating functions from the Source storage system 20 and replicates to multiple Destination Storage systems 30 .
  • a Destination storage system 30 may be used by a second Replication management module as the source storage system to replicate to a secondary destination storage system 32 .
  • One replication management system 40 is gathering information from multiple source storage systems 20 . All data manipulating functions performed on multiple source storage systems 20 are replicated to a common destination storage system 30 .
  • the source storage system 20 or the destination storage system 30 are file-based storage arrays, including a server 80 which enables file based access to the available storage capacity of the storage array.
  • the method 100 for implementing a function-specific replication of data using system 10 involves three functions performed in parallel, either continuously or based on a schedule: Gathering information 120 , Pending replications monitoring 140 and Delayed function-specific data replication 160 .
  • FIG. 8 shows the gathering of information 120 required for the replication of data manipulating functions that are performed on data stored on a source storage system and replicated to a target storage system. This is achieved by:
  • the running of an information gathering process 122 includes the substeps of:
  • the priority and the delay are correlated to each other to ensure consistency in the target environment.
  • a typical priority order would assume that new objects created with a write function are of highest priority, changes performed with the update function are of mid-level priority and delete functions are of lowest priority. The consequence is that the highest priority function must be assigned the shortest delay and the lowest priority function the longest one. Priority and corresponding delay times 14 are required to ensure the consistency of the target objects or files.
  • Consistency between source and target storage arrays with respect to replication of data manipulating functions is defined at the data manipulating function level, and the integrity rules are defined by business criteria based on the objectives to be achieved.
  • Where consistency covers all data manipulating functions (e.g. create/write, update and delete) and recoverability is the main objective, the priority for data creation functions would be high. Data changes could be of medium priority and delay and, most important, data deletion functions would be replicated with the lowest priority and the longest delay. The delay would be configured to be as long as the required recoverability period.
  • Table 26 lists, by way of example, different customer requirements and their implementation in configuration tables.
  • the replication management module can be used to further specify the granularity on which the function-specific data replication should act. For example, the module would allow the replication of delete functions from a SEC compliant application as quickly as possible to ensure that content is deleted once it is permissible under the SEC rules to do so and to delay the replication of a delete function from a file archive application that does not fall under regulatory requirements. This behaviour is specified using the modifier 129 entry in the configuration table.
  • a differentiation based on a part of the UNC path may provide similar functionality.
  • Application functions performed by accessing the share \\server1\share1 can be replicated differently than functions performed by users accessing \\server1\share2 or \\server2\share1.
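  • By way of a non-authoritative illustration only, the configuration table (see FIG. 14 and FIG. 18 ) can be sketched as a small lookup structure; the field names, values and the lookup_rule helper below are invented for the example and are not prescribed by the invention.
```python
from datetime import timedelta

# Illustrative configuration table: each rule maps a data manipulating function,
# optionally narrowed by a modifier (here a UNC share path), to a priority and a
# replication delay. Shorter delays go with higher priorities (priority 1 = highest).
CONFIGURATION_TABLE = [
    {"function": "write",  "priority": 1, "delay": timedelta(minutes=5), "modifier": r"\\server1\share1"},
    {"function": "update", "priority": 2, "delay": timedelta(hours=4),   "modifier": r"\\server1\share1"},
    {"function": "delete", "priority": 3, "delay": timedelta(days=90),   "modifier": r"\\server1\share1"},
    # Deletes from a share used by a regulated application may be replicated immediately.
    {"function": "delete", "priority": 1, "delay": timedelta(0),         "modifier": r"\\server1\share2"},
]

def lookup_rule(function, path):
    """Return the most specific rule whose modifier matches the given path, if any."""
    matches = [rule for rule in CONFIGURATION_TABLE
               if rule["function"] == function and path.lower().startswith(rule["modifier"].lower())]
    return max(matches, key=lambda rule: len(rule["modifier"]), default=None)
```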
  • the pending replications monitoring process 140 is a monitoring process for pending replications, which watches for outstanding replications and passes them to the process that performs the actual function replication.
  • the Pending replications monitoring periodically queries the source system for changes and inserts them into the database of what has happened on the source (the source change table). In simpler variations, this just creates a list of objects, provided the source array allows querying based on timeframe and function performed.
  • the interval in which the pending replications monitoring takes place must be specified.
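  • A minimal sketch of one such monitoring pass follows; it assumes a SQLite-backed source change table and a caller-supplied query_source_changes callable standing in for the array-specific change query, neither of which is mandated by the invention.
```python
import sqlite3
import time

def monitoring_pass(db, query_source_changes, last_run, now):
    """One pass of the pending replications monitoring: record source changes."""
    db.execute("""CREATE TABLE IF NOT EXISTS source_change (
                      object_id TEXT, function TEXT, performed_at REAL,
                      performed_by TEXT, processed INTEGER DEFAULT 0)""")
    for change in query_source_changes(since=last_run, until=now):
        db.execute("INSERT INTO source_change (object_id, function, performed_at, performed_by) "
                   "VALUES (?, ?, ?, ?)",
                   (change["object_id"], change["function"],
                    change["performed_at"], change.get("performed_by", "")))
    db.commit()

def monitor_forever(db, query_source_changes, interval_seconds=300):
    """Run the monitoring pass repeatedly at the specified interval."""
    last_run = 0.0
    while True:
        now = time.time()
        monitoring_pass(db, query_source_changes, last_run, now)
        last_run = now
        time.sleep(interval_seconds)
```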
  • the inputs into the system 10 and method 100 of the invention implementing the function-specific replication of data manipulating functions 12 are gathered in a Graphical user interface 19 and stored in the replication monitoring database configuration input table 22 .
  • the required configuration information may be provided in a configuration file. This file may be created using a Graphical user interface or by editing the configuration file in a text editor.
  • the possibility of specifying more than one destination storage system 30 also allows replicating functions with a different delay for each target system.
  • In order to implement function-specific replication including a configurable time delay 14 , the pending replications monitoring process must provide a means for monitoring pending replications and for determining the delay 14 or exact time 16 to replicate the data manipulating function.
  • the replication time 16 to replicate a functional change may be stored in the replication monitoring database 42 and will be used by the pending replications monitoring process 140 and the delayed function-specific data replication process 160 .
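  • In the simplest reading, the replication time 16 is the time the function was performed plus the configured delay 14 ; the helper below is only an illustrative sketch of that arithmetic, not the patented procedure itself.
```python
from datetime import datetime, timedelta

def replication_time(performed_at: datetime, configured_delay: timedelta) -> datetime:
    """Earliest time at which the data manipulating function may be replicated."""
    return performed_at + configured_delay

# Example: a delete performed now with a 90-day configured delay becomes due in 90 days.
due_at = replication_time(datetime.now(), timedelta(days=90))
```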
  • Tracking of which function was performed on a storage array is dependent on the functionality that the specific storage array provides.
  • the functionality of the storage array also defines the granularity that can be provided to the user of the application.
  • the existence of the replication monitoring database 42 with all of the required information stored allows changing the delay with which the replication of a data manipulating function should be performed.
  • the replication time in the outstanding replications table 18 can be changed for data manipulating functions that are not yet replicated.
  • the pending replications monitoring process 140 takes into account the changed replication time to initiate the delayed function-specific replication of data manipulating functions 160 .
  • the delay might also be configured independently for each application, file system or subset of objects.
  • the replication monitoring database 42 must be configured for each source storage system, notably with regard to identification of the information to be gathered and tracked, so as to enable the correct and consistent replication of the data manipulating function of the present invention to be used.
  • An object based storage system does not require the same information as a file based storage system for the replication of data manipulating functions.
  • the required information is condensed into the least amount of data necessary to implement a function-specific and time delayed replication of data manipulating functions.
  • Storage virtualization software abstracts the physical storage systems into logical entities. Virtualization is a good way to mask ongoing migrations or systems being replaced with newer ones. Virtualization software thus knows which function is being performed on which file or object.
  • the method of the present invention in particular, the replication features thereof, can be implemented in a virtualization layer that provides direct access to source or target storage systems.
  • the system of the present invention can directly provide access to source and target storage systems as shown in FIG. 4 .
  • the way the function-specific information is retrievable from a storage array depends on the functionality that is implemented on a storage array. It also depends on other functionality aspects like the ability to install and run a process on that storage array.
  • file oriented storage may be implemented using hardware that provides file level access based on standard operating systems. These operating systems, such as UNIX, Linux or Windows, allow the installation of additional software that can be used to facilitate the implementation of the present invention.
  • object and File oriented storage arrays may be implemented using proprietary operating systems like Data “ONTAP”, “DART” or “CENTRASTAR”.
  • Standard Operating systems based storage allows the installation and creation of additional software and services on the server that provides the storage services in the network.
  • the pending replications monitoring process 140 runs as such a process on the storage array or storage server. Changes in the file systems may either be intercepted or detected, and the required information for the function-specific delayed replication of data manipulating functions may be inserted in the database source change table directly from the pending replications monitoring process.
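  • The snippet below sketches, under stated assumptions, one way such a process could detect file system changes on a standard-operating-system storage server by periodic polling; a real deployment might instead intercept calls or use OS change notification, and the helper names are illustrative only.
```python
import os

def snapshot(root):
    """Map every file under root to its last modification time."""
    state = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                state[path] = os.stat(path).st_mtime
            except FileNotFoundError:
                pass  # file vanished between listing and stat
    return state

def detect_changes(before, after):
    """Derive (path, function) pairs suitable for the source change table."""
    changes = []
    for path, mtime in after.items():
        if path not in before:
            changes.append((path, "write"))
        elif mtime != before[path]:
            changes.append((path, "update"))
    for path in before:
        if path not in after:
            changes.append((path, "delete"))
    return changes
```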
  • the whole system or an implementation of the method of the present invention may run on a standard operating system based storage server or storage array.
  • the implementation of the pending replications monitoring process for proprietary storage systems must at least provide the function-specific information for the process 160 for delayed function-specific replication of data manipulating functions. There are two general approaches that need to be differentiated depending on the class of the storage array.
  • a first class of storage arrays does not allow querying the array for changes that were made to the objects or files that are stored on the array.
  • the pending replications monitoring process 140 of the system implementing the function-specific delayed data replication is described in FIG. 9 .
  • FIG. 11 the process to maintain consistency in the replication of data manipulating functions is described.
  • FIG. 12 describes the replication of data manipulating functions.
  • the task of the pending replications monitoring process 140 does not require the creation of an additional database.
  • the pending replications monitoring process as described in FIG. 10 continuously, or in a scheduled way, queries the source storage arrays for changes made to objects or files based on the function to be replicated and additional information such as when or who performed the function.
  • FIG. 13 shows the delayed function-specific replication of data manipulating functions for the second class of storage arrays.
  • A good example in the category of object-based storage systems with this query functionality is "EMC CENTERA", described at emc.com/products/family/emc-centera-family.htm, the content of which, including content in links therein, is incorporated herein by reference.
  • the Query API allows the listing of content based on a timeframe the query is targeted to. The default query would provide a list of objects that may be used to find out when the object was written and who created it. With the same query functionality, the information gathering process 122 can determine which objects were deleted in order to replicate the delete function with the configured delay.
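  • As a sketch only: for arrays of this second class, the information gathering step reduces to timeframe-based queries per function. The store.query call below is a hypothetical client method invented for illustration, not the actual vendor API.
```python
from datetime import datetime

def changes_in_timeframe(store, start: datetime, end: datetime):
    """List (object_id, function) pairs for writes and deletes in the timeframe."""
    written = store.query(function="write", after=start, before=end)
    deleted = store.query(function="delete", after=start, before=end)
    return [(obj, "write") for obj in written] + [(obj, "delete") for obj in deleted]
```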
  • the available proprietary storage systems today already provide replication functionality based on independent software or software installed on the storage arrays. The implementation of a function-specific delayed replication on the storage systems has heretofore not been implemented.
  • the pending replications monitoring process 140 for the first class of storage arrays requiring a replications monitoring database is built in two steps:
  • the pending replications monitoring process 140 is made up of the following substeps which build the replications monitoring database source change table:
  • the pending replications monitoring process 140 is provided for the second class of storage arrays which does not require a replications monitoring database.
  • the pending replications monitoring is implemented in two steps:
  • the outstanding replications maintenance process 150 ensures the maintenance of a consistent set of files or objects on the target storage array.
  • the maintenance of the replications monitoring database requires two steps:
  • the maintenance process 150 itself consists of the steps:
  • the delayed function-specific replication process 162 is made up of several steps, including:
  • the delayed function-specific replication of data manipulating functions is performed for storage arrays with query functionality not requiring the maintenance of a replications monitoring database.
  • the delayed function-specific replication process 170 invokes the data manipulation function replication process 171 continuously or based on a schedule.
  • the list of objects or files is available from the replications monitoring process 149 based on the function to be replicated with the configured delay.
  • the function is performed on the target storage array's objects or files.
  • the function-specific time-configurable replication of data manipulating functions for the first class of storage arrays requires a replication monitoring database that provides all the information required to implement a function-specific delayed replication in a consistent manner.
  • New entries in the Source Change Table might trigger a function that inserts the corresponding entry or entries in the outstanding replication table.
  • This outstanding replications maintenance process 150 may also run continuously or based on a schedule to update the outstanding replication table.
  • the replication table is based on the configuration table and the inserts into the source change table. Changes in the configuration may require the replication table to be rebuilt for entries in the source change table that were not yet completed.
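  • The following sketch shows one possible way the outstanding replications table could be built from unprocessed source change entries and the configuration table; the table layouts and the lookup_rule helper are assumptions made for the example, not the claimed implementation.
```python
def build_outstanding_replications(db, lookup_rule):
    """Turn unprocessed source changes into outstanding replication entries."""
    db.execute("""CREATE TABLE IF NOT EXISTS outstanding_replications (
                      object_id TEXT, function TEXT, replicate_at REAL,
                      replicated INTEGER DEFAULT 0)""")
    rows = db.execute("SELECT rowid, object_id, function, performed_at "
                      "FROM source_change WHERE processed = 0").fetchall()
    for rowid, object_id, function, performed_at in rows:
        rule = lookup_rule(function, object_id)
        if rule is not None:
            db.execute("INSERT INTO outstanding_replications (object_id, function, replicate_at) "
                       "VALUES (?, ?, ?)",
                       (object_id, function, performed_at + rule["delay"].total_seconds()))
        db.execute("UPDATE source_change SET processed = 1 WHERE rowid = ?", (rowid,))
    db.commit()
```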
  • the audit table provides a means of proving which replications have already been performed.
  • the delayed function-specific replication of data manipulating functions 160 for delaying a deletion of data from the source storage system until the data is also deleted from the destination storage system, is achieved by configuring the delete function with the lowest priority and the longest delay used for the function-specific replication of data manipulating functions.
  • the delayed function-specific replication of data manipulating functions may run continuously or based on a schedule. In the scheduled way the replication is initiated in regular intervals at specific times. Every time this interval expires, the pending replications monitoring process updates the source change table with non-replicated data manipulating functions, the outstanding replications maintenance process is run if the replication takes place between storage arrays requiring the replication monitoring database and the delayed function-specific replication of data manipulating functions is run for functions with an expired delay.
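  • A hedged sketch of one such scheduled run is given below: entries whose replication time has expired are applied to the destination storage system and recorded in an audit table. The apply_function callable stands in for the array-specific write, update or delete operation and is an assumption of the example.
```python
import time

def run_due_replications(db, apply_function):
    """Replicate every outstanding data manipulating function whose delay expired."""
    db.execute("""CREATE TABLE IF NOT EXISTS replication_audit (
                      object_id TEXT, function TEXT, replicated_at REAL)""")
    now = time.time()
    due = db.execute("SELECT rowid, object_id, function FROM outstanding_replications "
                     "WHERE replicated = 0 AND replicate_at <= ?", (now,)).fetchall()
    for rowid, object_id, function in due:
        apply_function(object_id, function)  # perform the function on the target array
        db.execute("UPDATE outstanding_replications SET replicated = 1 WHERE rowid = ?", (rowid,))
        db.execute("INSERT INTO replication_audit VALUES (?, ?, ?)",
                   (object_id, function, time.time()))
    db.commit()
```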
  • the delayed function-specific replication of data manipulating functions needs to follow the same directions as described in the information gathering and pending replications monitoring sections herein.
  • the replication needs to be implemented for Standard Operating system based storage arrays differently than for proprietary storage systems and depends on the functionality of the storage arrays to be supported.
  • For an Operating System based storage array that provides file sharing, the function-specific replication would, in case of a write on \\Server1\Share1\File1.doc, create the new file on the target storage array under \\Server2\Share5\File1.doc.
  • the method makes replication of data manipulating functions dependent on the function that was performed on the data as well as makes the delay of the replication time-configurable, thereby providing a solution for both disaster tolerance and logical error recovery.
  • This allows the customer to ensure that data on storage arrays is recoverable for the same time that a restore and recovery from the production references of the objects or files is possible.
  • Such system thus guarantees that all objects and files are available as long as references to that data may be restored from backups.
  • the system and method of the invention can extend existing function-specific replications without configurable delay by replicating some data manipulating functions with a specified delay.
  • the replication between a source and a destination storage array would continue to replicate write functions but the replication of delete functions from the source storage array would be delayed using the current invention for a period N until the data is also deleted from the target storage array, thereby allowing the restoring of an application database using the standard recovery procedure and providing the possibility of accessing the previously deleted data on the secondary storage array without having to have a complete backup of all data having ever been written to the source storage array.
  • Once the standard recovery procedure is no longer capable of recovering data, the file or object referenced can also be deleted on the target storage array by the delayed function-specific replication of data manipulating functions.

Abstract

The system (10) and method (100) of the invention provides for function-specific replication of data manipulating functions (12) performed on data, such as files or objects, with a configurable time delay (14) for each function to be replicated. The system (10) and method (100) includes a replication management module (40) for managing the consistent function specific replication of data manipulating functions (12) with a function-specific delay (14) between a source storage system(s) (20, 65) and a destination storage system(s) (30, 75) and optionally includes a replication monitoring database (42).

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation-in-part of U.S. patent application Ser. No. 12/140,296, of the same title, filed Jun. 17, 2008, which is a continuation-in-part of U.S. patent application Ser. No. 11/939,633, of the same title, filed Nov. 14, 2007, which claims priority to US provisional application No. 60/949,357, of the same title, filed Jul. 12, 2007, the contents of which are incorporated by reference hereto.
  • FIELD OF THE INVENTION
  • The present invention relates generally to storage systems, which are able to store digital objects or files. More specifically, the present invention relates to data replication systems and methods.
  • BACKGROUND OF THE INVENTION
  • Several storage systems provide data replication capabilities for the purpose of either logical error recovery or disaster tolerance, which requires respectively high availability and relatively high integrity. Storage systems allow block, object or file access and provide a means to replicate data from source data storage to a backup data storage system. The method and system for function-specific time-configurable replication of data manipulating functions applies to storage systems allowing object and file access only.
  • Object-based storage arrays allow applications to integrate a set of commands, typically called an Application Programming Interface (API). The API allows the creation of new objects as well as the modification of existing objects. For Storage arrays that are also providing Write-Once-Read-Many (WORM) functionality, it may not be possible to modify already stored objects. Deletion of objects is possible and in case of WORM storage arrays, deletion is prevented before the specified retention time has expired.
  • File-oriented storage arrays provide users or applications the possibility of accessing the system using a file-share. These storage systems provide access to the installed capacity using standard file sharing protocols like NFS (meaning Network File System) or CIFS (meaning Common Internet File System). These protocols may also have proprietary extensions to implement special functionality like WORM file systems or WORM shares.
  • The storage array may also be a standard server running an operating system available from one of the many providers of operating systems. The server would provide access to the available capacity using file shares similar to a file-oriented storage array.
  • The set of data manipulation functions for object or file oriented storage arrays usually contains functions like write, delete, update, write-disable until expiration date or delete-disable before expiration date. The exact implementation, however, is dependent on the storage array. Each individual function on a storage array is described in the array specific documentation. If the storage array provides special functions that are not standardized in the protocols like NFS and CIFS, the array vendor provides a detailed description of the required integration with the storage array.
  • Existing object or file oriented storage arrays already provide ways to replicate data between two or more storage arrays. The replication may be implemented on the storage array or on a dedicated system that performs the replication of data.
  • Existing systems also allow replicating changes to the target system. The replication may include or exclude specific functions. If a function is replicated, it is generally replicated as soon as possible.
  • The changes made to objects or file systems are made by the users or applications making these changes. Users may typically access file oriented storage systems and perform the normal operations like writes, reads, updates or deletes of files. Applications may access both object and/or file oriented storage arrays. As applications are programmed, they may implement rules to make data read-only up to a certain expiration date. The capability to generate new versions of documents and other advanced functionality exist in various solutions available on the market. Among these advanced storage array functionalities in the prior art are applications which also use WORM functionality on storage arrays.
  • Data replication functionalities of current replication systems are based on fixed, pre-established and non-configurable delays. Consequently, deletion of data that is referred to by otherwise non-deleted files, objects or applications prevents recovery of such data.
  • U.S. Pat. No. 6,260,125 to McDowell, the content of which is incorporated herein by reference thereto, discloses an asynchronous disk mirroring system for use within a network computer system, wherein a write queue operates to delay the time of receipt of write requests to storage volumes, with a view to increasing data replication performance. The write queues include several write buffers, wherein the write requests pass through the write queue in a first-in, first-out (FIFO) sequence; and so transmission of write requests may be subject to a time-delay by either a pre-determined amount of time or when the storage or write buffer is full. McDowell also discloses a log file configured to receive the delayed write requests, for log-based mirror reconstruction and check-pointing of the mirrored volumes. The replication of data by the system of McDowell is limited to updating and writing and does not provide function-dependent data replication, nor does it provide configurable replication of data manipulating functions such as delete or write-disable.
  • Patent application WO 99/50747 to Arnon, the content of which is incorporated herein by reference thereto, discloses a method and apparatus for asynchronously updating a mirror of data from a source device, whose purpose is to prevent the overwriting of data on a source storage that has not yet been committed to a target storage system. The Arnon method and apparatus addresses the need for data integrity but does not allow a user to configure replication operations on a function base or time base, and only prevents overwrite of data on a source storage in the situation where data has not been replicated on target storage.
  • User-controlled data replication of the prior art allows users to control whether replication occurs, but not when it occurs. A system designed by Denehy et al. (Bridging the Information Gap in Storage Protocol Stacks, Denehy et al., Proceedings of the General Track, 2002 USENIX Annual Technical Conference, USENIX Association, Berkeley Calif., USA, the content of which is incorporated by reference thereto) allows a user to prioritize data replication actions on specific files based on file designations such as "non-replicated", "immediately replicated" or "lazily replicated." However, such configuration only addresses system performance needs for short lifetime data storage systems, and does not address the needs for system integrity and accident recovery.
  • Patent application WO 02/25445 to Kamel, the content of which is incorporated herein by reference thereto, discloses a method and system for electronic file lifecycle management. Similar applications are also called Hierarchical Storage Management (HSM) applications. File Lifecycle management and HSM software move files based on rules between different storage systems. The system might also create multiple copies on different storage systems if the defined rules or policies define the lifecycle of a file accordingly.
  • Given the current interrelationship of data stored on networks, what is needed therefore is a way of ensuring that deleted data on devices that are not backed up may be recovered as long as a user wishes to preserve the ability to restore data including references to the deleted data of such devices from backups.
  • What is needed is a user-controlled replication system for function-specific replication of data manipulating functions that allows users to control both whether and when replication of data manipulating functions occurs.
  • What is needed is a system or method that allows synchronizing or configuring the time frame within which a data restore is possible from a target storage system and which enables replicating data manipulating functions performed on object or file based storage arrays.
  • Further, what is needed is a system which more fully addresses the needs for system high availability, integrity and accident recovery.
  • SUMMARY OF THE INVENTION
  • The system and method of the invention provides for function-specific replication for data manipulating functions of digital data, such as files or objects, with a configurable time delay for each function to be replicated. The system includes a source storage system from which a data manipulating function is to be replicated, one or more destination storage systems to which the function performed on digital data is replicated, and a replication management module for managing the function specific replication delay and the function replication between the source storage system(s) and the destination storage system(s).
  • The replication management module of the invention provides functionality allowing: (1) configuration of a delay after which a data manipulating function will be performed on the destination storage system when data stored on the source storage system, modified or created by the function, is replicated on corresponding data on the destination storage system; (2) the replication of the data manipulating function performed on data stored on the source storage system with the configured delay to the destination storage system; and (3) querying of function-specific changes to data of the source storage system in a given timeframe.
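  • A minimal, non-binding sketch of these three capabilities as a programming interface is given below; the class and method names are invented for illustration, the destination.schedule and source.changes calls are placeholders, and the patent does not prescribe any concrete API.
```python
from datetime import datetime, timedelta

class ReplicationManagementModule:
    """Illustrative interface for the replication management module (40)."""

    def __init__(self):
        self.delays = {}  # data manipulating function name -> configured delay

    def configure_delay(self, function: str, delay: timedelta) -> None:
        """(1) Configure the delay for one data manipulating function."""
        self.delays[function] = delay

    def replicate(self, function: str, object_id: str, performed_at: datetime, destination) -> None:
        """(2) Schedule the function on the destination after the configured delay."""
        destination.schedule(function, object_id, performed_at + self.delays[function])

    def query_changes(self, source, function: str, start: datetime, end: datetime):
        """(3) Query function-specific changes on the source in a given timeframe."""
        return source.changes(function=function, after=start, before=end)
```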
  • It is an object of the invention to provide a system and method which meets the business need of combining both data replication for high availability and disaster tolerance as well as recoverability of data in case of logical errors.
  • It is another object of the present invention to provide a system and method for function specific replication of data manipulating functions on digital data that is adaptable to a wide range of storage system architectures, including object-based storage arrays having an application programming interface, file-based storage arrays, and standard computer servers.
  • It is a further object of the present invention to provide a system and method for function specific replication of data manipulating functions on digital data that can be implemented in hardware abstraction and virtualization software.
  • It is yet a further object of the present invention to provide a system and method for function specific replication of data manipulating functions on digital data that is easily scalable to several and even a large number of destination storage systems.
  • It is an object of the invention to provide a system and method which replicates the data manipulating function itself and not the data changes.
  • In an advantage, the system and method solves the business need of combining both data replication for high availability and disaster tolerance as well as providing recoverability of data in case of logical errors.
  • In another advantage, the combination of object or file replication for disaster tolerance with the ability to configure the delay of the replication for each function that can be performed on the stored objects or files provides both disaster tolerance and the ability of recovering from logical errors.
  • In another advantage, the method makes replication of data manipulating functions dependent on the function that was performed on the data as well as makes the delay of the replication time-configurable, in that the replication of new objects or files can be performed as quickly as possible but the replication of another function like deletes of objects or files may be delayed for a configurable amount of time, thereby providing a solution for both disaster tolerance and logical error recovery. This allows the customer to ensure that data on storage arrays that is not backed up is recoverable for the same time that a restore and recovery of references to these objects or files is possible. Such system thus guarantees that all objects and files are available for recovery as long as references to that data may be restored from backups.
  • In another advantage, the system and method of the invention delays the deletion of data from the source storage array for a period N until the data is also deleted from the target storage array, thereby allowing the restoring of an application database using the standard recovery procedure as well as providing the possibility of accessing the previously deleted data on the secondary storage array without having to have a complete backup of all data having ever been written to the source storage array. Once the standard recovery procedure is no longer capable of restoring and recovering references to data, the file or object referenced can also be deleted on the target storage array.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a schematic diagram of a block-based storage system of the prior art where the replication management module is located in the source storage array.
  • FIG. 2 is a schematic diagram of an object or file based storage array of the prior art where the replication management module is implemented in a separate system.
  • FIG. 3A and FIG. 3B are schematic diagrams showing the elements of the system for function specific replication of data manipulating functions on digital data with a configurable time delay, where the replication management module is located on the source storage system.
  • FIG. 4 is a schematic diagram showing the elements of the system for function-specific replication of data manipulating functions with a configurable time delay, where the replication management module is located between the application or user and the source and the destination storage system, thereby providing access to the storage systems.
  • FIG. 5 is a schematic diagram showing the elements of the system for function specific replication of data manipulating functions on digital data with a configurable time delay, having several destination storage systems.
  • FIG. 6 is a schematic diagram showing the elements of the system for function specific replication of data manipulating functions on digital data with a configurable time delay, having several source storage systems.
  • FIG. 7 is a flow chart showing the necessary main steps to implement a function-specific function replication system and method of the present invention.
  • FIG. 8 is a flow chart showing the steps of the information gathering process of the invention for proprietary storage systems of a first class of storage arrays, such class not allowing the querying of the array for changes that were made to the objects or files that are stored on the array.
  • FIG. 9 is a flow chart showing the steps for implementing the replications monitoring process of the invention for proprietary storage systems of a first class of storage arrays for which the task of replication monitoring requires the creation of a replication monitoring database.
  • FIG. 10 is a flow chart showing the steps for implementing the replications monitoring process of the invention for a second class of storage arrays, such class not requiring the creation of the replication monitoring database.
  • FIG. 11 is a flow chart describing the steps necessary to maintain a consistent set of objects or files on the target storage array.
  • FIG. 12 is a flow chart showing the steps for implementing the delayed function-specific replication of data manipulating functions for a first class of storage arrays based on the replication monitoring database.
  • FIG. 13 is a flow chart showing the steps for implementing the delayed function-specific replication of data manipulating functions for a second class of storage arrays that do not require the replication monitoring database.
  • FIG. 14 is a schematic representation of the configuration table of the invention.
  • FIG. 15 is a schematic representation of the Source Change Table of the invention.
  • FIG. 16 is a schematic representation of the Outstanding Replications Table of the invention.
  • FIG. 17 is a schematic representation of the Replication Audit Table of the invention.
  • FIG. 18 lists examples of different customer requirements and how they are implemented in a configuration table.
  • DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
  • Referring now to FIG. 1, a block-based source storage system 60 of the prior art provides a server 80 access to a certain disk capacity. The operating system installed on server 80 possesses the knowledge of where and which object or file lies within this disk capacity. This information can, for example, be stored in the File Allocation Table or i-nodes. An application or user 90 accessing a file on such a server 80 would therefore issue function-based calls like write, update and delete to that server 80, which in turn knows where the file is located on the block-based source storage system 60. Any function performed by an application or user 90 will result in an update or read of a block on the disk capacity available to the server 80. The replication of writes or updates of a block on the source storage array 60 is embodied in the source storage system 60.
  • Referring now to FIG. 2, object or file based storage arrays 65 and 75, respectively, provide the functionality of the server 80 mentioned above directly from within the storage array 65 and 75. The application or user 90 accessing a file issues the functions directly to the storage array. For the purpose of abstraction, a server 80 providing file based access to the available disk capacity on source storage array 65 is also considered part of the file based storage arrays, because an application or user cannot differentiate whether it accesses the server or the storage array. To the application or user, both provide the same functionality of file level access using file access protocols like CIFS or NFS. The replication from the file or object based source storage system 65 to the corresponding target storage array 76 is embodied in the source storage system 65.
  • Referring now to FIGS. 3A to 6, a system 10 for function specific replication of data manipulating functions 12 on digital data, such as files or objects, allows for a configurable time delay 14 for each function to be replicated. The system 10 includes a source storage system 20 from which performed data manipulating functions on data are replicated, at least one destination storage system 30 to which performed data manipulating functions are replicated, and a replication management module 40 for managing the function specific replication delay and the replication of data manipulating functions between the source storage system and the at least one destination storage system, optionally comprising a replication monitoring database 42.
  • The system 10 provides replication for at least one standard data manipulating function of a group of functions including: write, delete, update, modify, write-disable, write disable until expiration date, delete-disable and delete-disable until expiration date.
  • The replication management module 40 provides several novel features. One feature allows for the configuration of a delay after which a specific data manipulating function on data stored on the source storage system is replicated on corresponding data on the destination storage system. Another feature allows for replication of the data manipulating function performed on data stored on the source storage system with the configured delay to the destination storage system. Still another feature allows for querying function-specific changes to data of the source storage system in a given timeframe.
  • The source storage system 20 and the at least one destination storage system 30 for replicating data manipulating functions on digital data are each based on one of the following architectures: object-based storage arrays comprising an application programming interface, file-based storage arrays, or a computer server comprising memory 36, a CPU 38 and an operating system 39.
  • The system 10 may directly provide access to storage systems based on any of the following architectures: object-based storage systems having an application programming interface 34, file-based storage arrays, and a computer server 80, including memory 36, a CPU 38 and an operating system 39, as shown in FIG. 5.
  • The system 10 is adaptable to several different system configurations. Referring now to FIG. 3A, a configuration where the replication management module 40 is located on the source storage system 20 is shown. The information about functions performed by applications or users 90 on objects or files stored is gathered by the replication management module from the source storage system 20 and used to replicate each data manipulating function with a configurable delay to the Destination Storage system 30. The information gathered may optionally be stored for future reference in the replication monitoring database 42.
  • Referring again to FIG. 4, a configuration where the replication management module 40 is located between the application or user 90 and the source and destination storage systems 20 and 30 is shown.
  • Referring now to FIG. 5, a configuration is shown with several destination storage systems 30, one being a secondary destination storage system 32. The Replication management module 40 gathers the information for function-specific replication of data manipulating functions from the Source storage system 20 and replicates to multiple Destination Storage systems 30. A Destination storage system 30 may be used by a second Replication management module as the source storage system to replicate to a secondary destination storage system 32.
  • Referring now to FIG. 6, a configuration with several source storage systems 20 is shown. One replication management system 40 is gathering information from multiple source storage systems 20. All data manipulating functions performed on multiple source storage systems 20 are replicated to a common destination storage system 30.
  • The source storage system 20 or the destination storage system 30 may be file-based storage arrays, including a server 80 which enables file based access to the available storage capacity of the storage array.
  • The method 100 for implementing a function-specific replication of data using system 10, as shown in FIG. 7, involves three main functions that are performed continuously in parallel or based on a schedule: gathering information 120, pending replications monitoring 140 and delayed function-specific data replication 160.
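  • The following is a minimal sketch, not the patented implementation, of how these three processes could be driven on a simple schedule; the function names mirror the steps of FIG. 7 and are purely illustrative assumptions.

```python
# Minimal sketch of the overall cycle of method 100 (steps 120, 140, 160),
# assuming the three processes are plain callables supplied by the caller.
import time

def run_cycle(gather_information, monitor_pending_replications,
              replicate_delayed_functions, interval_seconds=60):
    while True:
        gather_information()                # information gathering 120
        monitor_pending_replications()      # pending replications monitoring 140
        replicate_delayed_functions()       # delayed function-specific replication 160
        time.sleep(interval_seconds)        # scheduled interval between runs
```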
  • FIG. 8 shows the gathering of information 120 required for the replication of data manipulating functions that are performed on data stored on a source storage system and replicated to a target storage system. This is achieved by:
      • running an information gathering process 122 using information gathering software,
      • building a replication configuration database 123 including information on the data manipulating functions to be replicated, the source and target storage system, and
      • launching the pending replications monitoring process 140
  • The running of an information gathering process 122 includes the substeps of:
      • inserting information from the replication configuration database 123 for the function-specific delayed replication of data manipulating functions in the configuration table 22 of the replication monitoring database, directly from the information gathering software,
      • wherein the information that the information gathering software inserts into the database are:
        • the definition of the source storage array 124,
        • the definition(s) of the target storage array(s) 125,
        • the data manipulating function 126 to be replicated,
        • the priority 127 of the specified function,
        • the delay 128 after which the specified function is replicated, and
        • optionally the definition of a modifier 129 for more granular function-specific replication.
  • The priority and the delay are correlated to each other to ensure consistency in the target environment. A typical priority order would assume that new objects created with a write function are of highest priority, changes performed with the update function are of mid-level priority and delete functions are of lowest priority. The consequence is that the highest priority function must be assigned the shortest delay and the lowest priority function the longest one. Priority and corresponding delay times 14 are required to ensure the consistency of the target objects or files.
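  • By way of illustration only, the correlation between priority and delay could be represented as in the following minimal sketch; it assumes integer priorities (1 = highest), delays in seconds, and a validation step, none of which are mandated by the invention.

```python
# Minimal sketch of configuration table 22 entries and the priority/delay rule:
# a higher-priority function must never have a longer delay than a lower-priority one.
from dataclasses import dataclass

@dataclass
class ReplicationRule:
    source: str         # source storage system the function is performed on
    target: str         # destination storage system to replicate to
    function: str       # "write", "update", "delete", ...
    priority: int       # 1 = highest priority
    delay_seconds: int  # delay after which the function is replicated
    modifier: str = ""  # optional application/user/UNC-path filter

rules = [
    ReplicationRule("array-A", "array-B", "write",  priority=1, delay_seconds=0),
    ReplicationRule("array-A", "array-B", "update", priority=2, delay_seconds=3600),
    ReplicationRule("array-A", "array-B", "delete", priority=3, delay_seconds=30 * 24 * 3600),
]

def validate(rules):
    """Raise if a higher-priority function is configured with a longer delay."""
    ordered = sorted(rules, key=lambda r: r.priority)
    for earlier, later in zip(ordered, ordered[1:]):
        if earlier.delay_seconds > later.delay_seconds:
            raise ValueError(f"{earlier.function} outranks {later.function} "
                             f"but has a longer delay")

validate(rules)
```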
  • Consistency between source and target storage arrays with respect to replication of data manipulating functions is defined at the data manipulating function level, and the integrity rules are defined by business criteria based on the objectives to be achieved. There are many different requirements that can be addressed with the current invention. If high availability and disaster recovery are the main requirements, all data manipulating functions (e.g. create/write, update and delete) would be associated with a high priority and short delay. If recoverability is the main objective, the priority for data creation functions would be high, data changes could be of medium priority and delay, and, most importantly, data deletion functions would be replicated with lowest priority and longest delay. The delay would be configured to be as long as the required recoverability period.
  • There may be compliance reasons that require a change in priorities for a replication of data manipulating functions. If an employee who leaves a company requests that his employer delete his personal data in accordance with local law, the current invention is able to handle this. In such a situation, the subset of data is configured to replicate deletion functions for this employee with highest priority and then the personal data is deleted. This removes all pending write or update functions from the replication. In this case, the business requirement is to comply with compliance regulations and not to ensure recoverability or high availability.
  • In order to make the current invention suitable for today's changing business requirements, preference has been given to the implementation of a priority parameter that allows validating that the delay corresponds with the priority of the function. Of course, other implementations are possible to a person skilled in the art given the teachings of this application.
  • Referring now to FIG. 18, Table 26 lists, by way of example, different customer requirements and their implementation in configuration tables.
  • For storage systems that provide information on the authors of changes, such as the originating applications or users, the replication management module can be used to further specify the granularity at which the function-specific data replication should act. For example, the module would allow the replication of delete functions from a SEC compliant application as quickly as possible, to ensure that content is deleted once it is permissible under the SEC rules to do so, and to delay the replication of a delete function from a file archive application that does not fall under regulatory requirements. This behaviour is specified using the modifier 129 entry in the configuration table.
  • For file-based storage arrays, a differentiation based on a part of the UNC path may provide similar functionality. Application functions performed by accessing the share \\server1\share1 can be replicated differently than functions performed by users accessing \\server1\share2 or \\server2\share1.
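  • A minimal sketch of such modifier-based granularity follows; it assumes the modifier 129 is a simple prefix match on an application name or UNC path, and the shares and delays shown are illustrative assumptions only.

```python
# Minimal sketch: pick the configured delay for a delete based on the most
# specific matching modifier prefix (an empty prefix acts as the fallback).
RULES = [
    # (function, modifier prefix, delay in seconds)
    ("delete", r"\\server1\share1", 0),               # regulated share: replicate deletes at once
    ("delete", r"\\server1\share2", 90 * 24 * 3600),  # unregulated archive: delay deletes
    ("delete", "",                  30 * 24 * 3600),  # generic fallback
]

def delay_for(function, reference):
    """Return the delay of the most specific matching rule, or None."""
    matches = [(len(prefix), delay) for f, prefix, delay in RULES
               if f == function and reference.startswith(prefix)]
    return max(matches)[1] if matches else None

print(delay_for("delete", r"\\server1\share1\contract.pdf"))  # -> 0
```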
  • The pending replications monitoring process 140 is a monitoring process for pending replications, which watches for outstanding replications and passes them to the process that does the actual function replication. The pending replications monitoring periodically queries the source system for changes and inserts them into the database of what has happened on the source (the source change table). In simpler variations, this just creates a list of objects, provided the source array allows querying based on timeframe and function performed.
  • For source storage arrays 20 that do not allow sending event-based information of the functions performed, the interval in which the pending replications monitoring takes place must be specified.
  • The inputs into the system 10 and method 100 of the invention implementing the function-specific replication of data manipulating functions 12 are gathered in a Graphical user interface 19 and stored in the replication monitoring database configuration input table 22. When replicating data manipulating functions between storage systems that do not require a replication monitoring database, the required configuration information may be provided in a configuration file. This file may be created using a Graphical user interface or by editing the configuration file in a text editor.
  • The possibility of specifying more than one destination storage system 30 also allows replicating functions with a different delay for each target system.
  • In order to implement function-specific replication including a configurable time delay 14, the pending replications monitoring process must provide a means for monitoring pending replications and for determining the delay 14 or exact time 16 to replicate the data manipulating function. The replication time 16 to replicate a functional change may be stored in the replication monitoring database 42 and will be used by the pending replications monitoring process 140 and the delayed function-specific data replication process 160.
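  • As a minimal sketch, the replication time 16 could be computed by adding the configured delay to the time the change was detected; the field names below mirror the outstanding replications table and are illustrative assumptions.

```python
# Minimal sketch: compute and store the replication time for one detected change.
from datetime import datetime, timedelta

def schedule_replication(outstanding, change, delay_seconds, priority):
    """Append one pending replication with its computed replication time 16."""
    outstanding.append({
        "target": change["target"],
        "reference": change["reference"],            # object id or UNC path
        "function": change["function"],
        "priority": priority,
        "replication_time": change["detected_at"] + timedelta(seconds=delay_seconds),
        "completion": None,
    })

outstanding = []
schedule_replication(
    outstanding,
    {"target": "array-B", "reference": r"\\server1\share1\file1.doc",
     "function": "delete", "detected_at": datetime.utcnow()},
    delay_seconds=30 * 24 * 3600, priority=3,
)
```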
  • Tracking of which function was performed on a storage array is dependent on the functionality that the specific storage array provides. The functionality of the storage array also defines the granularity that can be provided to the user of the application.
  • The existence of the replication monitoring database 42 with all of the required information stored allows changing the delay with which the replication of a data manipulating function should be performed. The replication time in the outstanding replications table 18 can be changed for data manipulating functions that are not yet replicated. The pending replications monitoring process 140 takes into account the changed replication time to initiate the delayed function-specific replication of data manipulating functions 160. Depending on the environment, it allows increasing or decreasing of the delay based on the customer's actual needs. Based on the information that can be queried from the storage systems, the delay might also be configured independently for each application, file system or subset of objects.
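  • A minimal sketch of such a change of delay is shown below; it assumes the in-memory outstanding replications sketched above and simply shifts the replication time of every not-yet-completed entry of the affected function.

```python
# Minimal sketch: re-time pending replications of one function after the
# configured delay has been changed from old_delay to new_delay (in seconds).
from datetime import timedelta

def apply_new_delay(outstanding, function, old_delay, new_delay):
    shift = timedelta(seconds=new_delay - old_delay)
    for entry in outstanding:
        if entry["function"] == function and entry["completion"] is None:
            entry["replication_time"] += shift
```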
  • The implementation of the system and method for function-specific replication of data manipulating functions requires different versions of the software. Function-specific replication between standard servers running standard operating systems cannot be implemented the same way as replication between proprietary API-based storage systems. Further detail is provided below of the different functions that need to be present dependent on the storage array.
  • The replication monitoring database 42 must be configured for each source storage system, notably with regard to identification of the information to be gathered and tracked, so as to enable the correct and consistent replication of the data manipulating function of the present invention to be used. As an example: An object based storage system does not require the same information as a file based storage system for the replication of data manipulating functions.
  • The required information is condensed into the least amount of data necessary to implement a function-specific and time delayed replication of data manipulating functions.
  • In order to reduce the complexity of today's storage environments, the virtualization of infrastructures is rapidly being adopted in the market. Storage virtualization software abstracts the physical storage systems into logical entities. Virtualization is a good way to mask ongoing migrations or systems being replaced with newer ones. Virtualization software thus knows which function is being performed on which file or object. The method of the present invention, in particular, the replication features thereof, can be implemented in a virtualization layer that provides direct access to source or target storage systems. The system of the present invention can directly provide access to source and target storage systems as shown in FIG. 4.
  • The way the function-specific information is retrievable from a storage array depends on the functionality that is implemented on a storage array. It also depends on other functionality aspects like the ability to install and run a process on that storage array.
  • Today, object or file oriented storage arrays are built based on two different approaches.
  • In a first approach, file oriented storage may be implemented using hardware that provides file level access based on standard operating systems. These operating systems, such as UNIX, Linux or Windows, allow the installation of additional software that can be used to facilitate the implementation of the present invention.
  • In a second approach, object and file oriented storage arrays may be implemented using proprietary operating systems like “Data ONTAP”, “DART” or “CENTRASTAR”. To allow maximum flexibility in changing the time delay for the function-specific replication of data manipulating functions, all detected performed data manipulating functions are gathered as quickly as possible. This means that a deletion of content is recorded once it is discovered by the pending replications monitoring process. This ensures that increasing or decreasing configured delays replicates all outstanding data manipulating functions even when changes are made to the replication delay. It allows updating the replication monitoring database with a new time of replication for all function-specific replications of data manipulating functions not yet completed.
  • Standard Operating System Based Storage
  • Standard operating system based storage allows the installation and creation of additional software and services on the server that provides the storage services in the network. The pending replications monitoring process 140 runs as such a process on the storage array or storage server. Changes in the file systems may either be intercepted or detected, and the required information for the function-specific delayed replication of data manipulating functions may be inserted in the database source change table directly from the pending replications monitoring process.
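  • A minimal sketch of such change detection on a standard operating system based storage server follows; it uses simple periodic polling of a directory tree for illustration, whereas a production implementation might intercept file system calls instead. The path and interval are illustrative assumptions.

```python
# Minimal sketch: detect writes, updates and deletes by comparing two scans of
# an exported directory tree; detected changes would be inserted into the
# source change table of the replication monitoring database.
import os
import time

def detect_changes(root, previous):
    """Return a list of (reference, function) changes and the new snapshot."""
    current = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            current[path] = os.path.getmtime(path)
    changes = [(p, "write" if p not in previous else "update")
               for p, mtime in current.items() if previous.get(p) != mtime]
    changes += [(p, "delete") for p in previous if p not in current]
    return changes, current

snapshot = {}
for _ in range(3):                       # a real monitor would loop indefinitely
    changes, snapshot = detect_changes("/exported/share1", snapshot)
    for reference, function in changes:
        print("source change:", function, reference)   # insert into source change table here
    time.sleep(60)                       # monitoring interval
```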
  • The whole system or an implementation of the method of the present invention may run on a standard operating system based storage server or storage array.
  • Proprietary Storage Systems
  • The implementation of the pending replications monitoring process for proprietary storage systems must at least provide the function-specific information for the process 160 for delayed function-specific replication of data manipulating functions. There are two general approaches that need to be differentiated depending on the class of the storage array.
  • A first class of storage arrays does not allow querying the array for changes that were made to the objects or files that are stored on the array. In this situation, the pending replications monitoring process 140 of the system implementing the function-specific delayed data replication is described in FIG. 9. Referring to FIG. 11, the process to maintain consistency in the replication of data manipulating functions is described. FIG. 12 describes the replication of data manipulating functions.
  • In a second class of storage arrays, the task of the pending replications monitoring process 140 does not require the creation of an additional database. The pending replications monitoring process as described in FIG. 10 continuously, or in a scheduled way, queries the source storage arrays for changes made to objects or files based on the function to be replicated and additional information such as when or who performed the function. FIG. 13 shows the delayed function-specific replication of data manipulating functions for the second class of storage arrays.
  • A good example in the category of object-based storage systems with this query functionality is “EMC CENTERA”, described at http://emc.com/products/family/emc-centera-family.htm, the content of which, including content in links therein, is incorporated herein by reference. The Query API allows the listing of content based on the timeframe the query is targeted to. The default query would provide a list of objects that may be used to find out when the object was written and who created it. With the same query functionality, the information gathering process 122 can determine which objects were deleted in order to replicate the delete function with the configured delay. The available proprietary storage systems today already provide replication functionality based on independent software or software installed on the storage arrays. A function-specific delayed replication on these storage systems has heretofore not been implemented.
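  • The query-based gathering for this second class of arrays could look like the following minimal sketch; query_objects stands in for a vendor query interface that lists content by timeframe and function and is purely hypothetical, not the actual EMC Centera API.

```python
# Minimal sketch: list objects whose configured delay has expired for a given
# function, using a hypothetical vendor query callable.
from datetime import datetime, timedelta

def gather_pending(query_objects, function, delay_seconds, now=None):
    """Return objects changed long enough ago that their delay has expired."""
    now = now or datetime.utcnow()
    cutoff = now - timedelta(seconds=delay_seconds)
    # Only changes older than the configured delay are due for replication.
    return query_objects(function=function, start=datetime.min, end=cutoff)
```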
  • Now referring to FIG. 9, the pending replications monitoring process 140 for the first class of storage arrays requiring a replications monitoring database is built in two steps:
      • (1) running the pending replications monitoring process 142 continuously or based on a schedule; and
      • (2) running the outstanding replications maintenance process 150, using the information gathered, to ensure consistency in the replication process.
  • The pending replications monitoring process 140 is made up of the following substeps which build the replications monitoring database source change table:
      • inserting the function-specific information required to replicate data manipulating functions 143;
      • adding the Source information 144 for the function to be replicated, made up of the source storage system as well as the reference for the file or object the function was performed on;
      • inserting the function 145 that was performed on the referenced source file or object;
      • specifying the following:
        • date and time 146 the function was performed; and optionally,
        • the modifier 147 who performed the function, and if required,
        • the before and after image 148 required to perform the function with the configured delay on the target storage system
  • Referring now to FIG. 10, the pending replications monitoring process 140 is provided for the second class of storage arrays, which does not require a replications monitoring database. The pending replications monitoring is implemented in two steps:
      • (1) listing the files or objects depending on the function to be replicated with the configured delay 149, and
      • (2) passing this information to the delayed function-specific replication of data manipulating functions 160
  • Referring to FIG. 11, the outstanding replications maintenance process 150 ensures the maintenance of a consistent set of files or objects on the target storage array. The maintenance of the replications monitoring database requires two steps:
      • (1) the outstanding replications maintenance process 152 implemented with several substeps detailed below, and
      • (2) once the consistency of the functions to be replicated is ensured, launching the delayed function-specific replication of data manipulating functions 160 for the class of storage arrays requiring the replications monitoring database.
  • The maintenance process 150 itself consists of the steps:
      • (1) using the outstanding replications monitoring process 153, checking the source change table for newly arrived functions to be replicated,
      • (2) inserting non-completed functions to be replicated in the outstanding replications table 154 with the required information to perform the change;
      • (3) determining whether the new function needs to be replicated 155, wherein, dependent on the source, reference and priority of the function, it is decided if the function is replicated;
      • (4) if the function is replicated, it is inserted into the outstanding replications table 156 together with the target, reference, function and replication time for the data manipulating function to be replicated;
      • (5) ensuring the consistency of non-completed data manipulating functions in step 157: if the new function has a higher priority and a shorter delay than other pending functions in the outstanding replications table for the same source and reference, the already pending replications are removed from the outstanding replications table and only the new, higher-priority function is maintained (a sketch of this step follows this list);
      • (6) for all functions, updating the source change table 158 with the information that the corresponding function has completed its maintenance step;
      • (7) now having ensured a consistent set of outstanding replications, invoking the delayed function-specific replication of data manipulating functions 160.
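  • The consistency step 157 could, purely by way of illustration, be implemented as in the following minimal sketch; it assumes the in-memory outstanding replications used in the earlier sketches, where each pending entry also records the priority of its function (lower number = higher priority).

```python
# Minimal sketch of step 157: when a higher-priority function arrives for a
# reference, drop still-pending lower-priority replications for that reference.
def ensure_consistency(outstanding, new_entry):
    kept = [e for e in outstanding
            if e["completion"] is not None              # completed entries stay for auditing
            or e["reference"] != new_entry["reference"]
            or e["priority"] <= new_entry["priority"]]  # keep equal or higher priority
    kept.append(new_entry)
    return kept
```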
  • Referring now to FIG. 12, which shows the delayed function-specific replication of data manipulating functions 160 for the first class of storage arrays, the delayed function-specific replication process 162 is made up of several steps (a sketch follows this list), including:
  • (1) performing the delayed replication using the data manipulating function replication process 164;
  • (2) querying all pending functions to be replicated since the last invocation of the process from the outstanding replications table 165 with a replication time prior to the current time;
  • (3) performing the data manipulating function on the target storage arrays files or objects 166; and
  • (4) updating the completion of the functional replication in the replication monitoring database outstanding replications table 167, and optionally inserting the completion of the replication of the data manipulating function in the replication audit table 168 for audit purposes.
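  • A minimal sketch of this delayed replication process follows; perform stands in for whatever call applies a data manipulating function on the target array and is an illustrative assumption.

```python
# Minimal sketch of process 162/164: replicate every pending function whose
# replication time has been reached, then record completion and audit entries.
from datetime import datetime

def replicate_due(outstanding, audit, perform, now=None):
    now = now or datetime.utcnow()
    for entry in outstanding:
        if entry["completion"] is None and entry["replication_time"] <= now:
            perform(entry["target"], entry["reference"], entry["function"])  # step 166
            entry["completion"] = datetime.utcnow()                          # step 167
            audit.append(dict(entry))                                        # optional step 168
```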
  • Referring now to FIG. 13, the delayed function-specific replication of data manipulating functions is performed for storage arrays with query functionality not requiring the maintenance of a replications monitoring database. The delayed function-specific replication process 170 invokes the data manipulation function replication process 171 continuously or based on a schedule. The list of objects or files is available from the replications monitoring process 149 based on the function to be replicated with the configured delay. In step 172, the function is performed on the target storage array's objects or files.
  • The function-specific time-configurable replication of data manipulating functions for the first class of storage arrays requires a replication monitoring database that provides all the information required to implement a function-specific delayed replication in a consistent manner.
  • The minimum information that must be available to implement a function-specific replication of data manipulating functions is found in the detailed description of the four tables below; a consolidated schema sketch follows the table descriptions.
  • Configuration Table 22 (FIG. 14)
      • Source: the source object or file based storage system that the function was performed on.
      • Targets: the target storage systems for each source storage system, derived from the configuration input.
      • Function: function to be replicated.
      • Priority: priority of the specified function
      • Delay: delay for the specified function.
      • Modifier: provides possibility to add a higher degree of granularity for the replication of data manipulating functions.
        Examples of different business requirements and their implementation in a configuration table are listed in Table 26 in FIG. 18.
    Source Change Table 24 (FIG. 15)
      • Source: source object or file based storage system that the function was performed on.
      • Reference: object or file (UNC) reference that the function was performed on.
      • Function: function that was performed.
      • Time: date and time the function was performed on the object or file.
      • Modifier: additional information like application, user, part of the UNC path to provide a more granular data replication.
      • Completed: once a functional change has been treated by the outstanding replications maintenance process the completion is stored in the Source Change Table. Simple yes/no flag for quick rebuilds of the outstanding replication table.
      • Before/After image: All information required for the replication of the functional change on the object or file if necessary for the class of storage array.
  • New entries in the Source Change Table might trigger a function that inserts the corresponding entry or entries in the outstanding replication table. This outstanding replications maintenance process 150 may also run continuously or based on a schedule to update the outstanding replication table.
  • Outstanding Replication Table 18 (FIG. 16)
  • The replication table is based on the configuration table and the inserts into the source change table. Changes in the configuration may require the replication table to be rebuilt for entries in the source change table that were not yet completed
      • Target: target system the function needs to be replicated to.
      • Reference: object or file (UNC) reference that the function will be replicated to.
      • Function: function to be replicated.
      • Replication Time: date and time at which the function needs to be replicated.
      • Completion: date and time the replication has been performed.
        An update with a Completion might trigger a function that creates an insert with the required information in the Replication Audit table
    Replication Audit Table 24 (FIG. 17)
  • The audit table provides a means of proving which replications have already been performed.
      • Source: source object or file based storage system that the function was performed on.
      • Reference: object or file (UNC) reference that the function was performed on.
      • Function: function that was performed.
      • Time: date and time the function was performed on the object or file.
      • Modifier: additional information like application, user, part of the UNC path to provide a more granular data replication.
      • Target: target system the function needs to be replicated to.
      • Replication Time: date and time the function needs to be replicated by.
      • Completion: date and time the replication has been performed.
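  • The four tables could, for example, be held in a relational database as in the following minimal sketch; the column names follow the field descriptions above, while the types and the use of SQLite are illustrative assumptions only.

```python
# Minimal sketch of the replication monitoring database schema (configuration
# table 22, source change table 24, outstanding replication table 18 and the
# replication audit table) using an in-memory SQLite database.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE configuration (
    source TEXT, target TEXT, function TEXT,
    priority INTEGER, delay_seconds INTEGER, modifier TEXT);
CREATE TABLE source_change (
    source TEXT, reference TEXT, function TEXT, time TIMESTAMP,
    modifier TEXT, completed INTEGER DEFAULT 0, before_after_image BLOB);
CREATE TABLE outstanding_replication (
    target TEXT, reference TEXT, function TEXT,
    replication_time TIMESTAMP, completion TIMESTAMP);
CREATE TABLE replication_audit (
    source TEXT, reference TEXT, function TEXT, time TIMESTAMP, modifier TEXT,
    target TEXT, replication_time TIMESTAMP, completion TIMESTAMP);
""")
```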
  • The delayed function-specific replication of data manipulating functions 160, for delaying a deletion of data from the source storage system until the data is also deleted from the destination storage system, is achieved by configuring the delete function with the lowest priority and the longest delay used for the function-specific replication of data manipulating functions.
  • Generally, the delayed function-specific replication of data manipulating functions may run continuously or based on a schedule. In the scheduled mode, the replication is initiated at regular intervals at specific times. Every time this interval expires, the pending replications monitoring process updates the source change table with non-replicated data manipulating functions, the outstanding replications maintenance process is run if the replication takes place between storage arrays requiring the replication monitoring database, and the delayed function-specific replication of data manipulating functions is run for functions with an expired delay.
  • The delayed function-specific replication of data manipulating functions needs to follow the same directions as described in the information gathering and pending replications monitoring sections herein. The replication needs to be implemented differently for standard operating system based storage arrays than for proprietary storage systems and depends on the functionality of the storage arrays to be supported. In an example of an operating system based storage array that provides file sharing, the function-specific replication would, in case of a write of \\Server1\Share1\File1.doc, create the new file on the target storage array under \\Server2\Share5\File1.doc. In case of a proprietary storage array like EMC Centera, the function-specific replication would read object FGLSO3eJ90S2 from the source storage array reachable at IP address 192.168.2.1 and create the same object FGLSO3eJ90S2 on the target storage array at IP address 156.172.50.33. In case of the operating system based replication, the replication involves standard file system operations, and in the case of EMC Centera, the function-specific replication needs to integrate the API required to access the source and target storage arrays.
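  • For the operating system based case, the replication of a write or delete could be as simple as the following minimal sketch; the share names are taken from the example above and the direct path mapping is an illustrative assumption.

```python
# Minimal sketch: apply a replicated write or delete on a file-based target
# array using standard file system operations.
import os
import shutil

def perform_on_target(function, source_path, target_path):
    if function == "write" and os.path.exists(source_path):
        os.makedirs(os.path.dirname(target_path), exist_ok=True)
        shutil.copy2(source_path, target_path)    # create the new file on the target share
    elif function == "delete" and os.path.exists(target_path):
        os.remove(target_path)                    # replicate the delayed delete

perform_on_target("write",
                  r"\\Server1\Share1\File1.doc",
                  r"\\Server2\Share5\File1.doc")
```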
  • In an advantage, the method makes replication of data manipulating functions dependent on the function that was performed on the data and makes the delay of the replication time-configurable, thereby providing a solution for both disaster tolerance and logical error recovery. This allows the customer to ensure that data on storage arrays is recoverable for as long as a restore and recovery of the production references to these objects or files is possible. Such a system thus guarantees that all objects and files are available as long as references to that data may be restored from backups.
  • In another advantage, the system and method of the invention can extend existing function-specific replications without configurable delay by replicating some data manipulating functions with a specified delay. As an example, the replication between a source and a destination storage array would continue to replicate write functions, but the replication of delete functions from the source storage array would be delayed using the current invention for a period N until the data is also deleted from the target storage array. This allows restoring an application database using the standard recovery procedure and provides the possibility of accessing the previously deleted data on the secondary storage array without having to keep a complete backup of all data ever written to the source storage array. Once the standard recovery procedure is no longer capable of recovering data, the file or object referenced can also be deleted on the target storage array by the delayed function-specific replication of data manipulating functions.
  • The patents and articles mentioned above are hereby incorporated by reference herein, unless otherwise noted, to the extent that the same are not inconsistent with this disclosure.
  • Other characteristics and modes of execution of the invention are described in the appended claims.
  • Further, the invention should be considered as comprising all possible combinations of every feature described in the instant specification, appended claims, and/or drawing figures which may be considered new, inventive and industrially applicable.
  • Multiple variations and modifications are possible in the embodiments of the invention described here. Although certain illustrative embodiments of the invention have been shown and described here, a wide range of modifications, changes, and substitutions is contemplated in the foregoing disclosure. While the above description contains many specifics, these should not be construed as limitations on the scope of the invention, but rather as exemplifications of one or another preferred embodiment thereof. In some instances, some features of the present invention may be employed without a corresponding use of the other features. Accordingly, it is appropriate that the foregoing description be construed broadly and understood as being given by way of illustration and example only, the spirit and scope of the invention being limited only by the claims which ultimately issue in this application.

Claims (10)

1. A system (10) for function specific replication of data manipulating functions performed on files or objects stored on a source system (20, 65) and to be backed-up on at least one destination storage system (30, 75), the replication system comprising:
a replication management module (40) for managing consistent replication of data manipulating functions (12) from the source storage system (20, 65) to the destination storage system (30, 75), including replication of data manipulating functions (12) between the source storage system (20, 65) and the at least one destination storage system (30, 75), optionally comprising a replication monitoring database (42), the system (10) characterised in that the managing of replication includes replication of functions (12) with a configurable time delay (14) for each function to be replicated.
2. The replication system (10) of claim 1, wherein the replication system is adapted to replicate data manipulating functions (12) after receiving a command function selected from a group of functions consisting of write, delete, update, modify, write-disable, write disable until expiration date, delete-disable and delete-disable until expiration date.
3. The replication system (10) of claim 1 wherein the replication management module (40) provides functionality allowing:
configuration of a delay (14) after which a specific data manipulating function (12) performed on data stored on the source storage system (20, 65) is replicated on corresponding data on the destination storage system (30, 75),
replication of the data manipulating function (12) performed on data stored on the source storage system (20, 65) with the configured delay (14) to the destination storage system (30, 75), and
querying function-specific changes on data of the source storage system (20, 65) in a given timeframe.
4. The replication system (10) of claim 1, wherein the storage system (20, 65, 30, 75) is based on one of a group of architectures consisting of:
object-based storage arrays (60) comprising an application programming interface (34),
file-based storage arrays (60), and
a computer server (80), comprising memory (36), a CPU (38) and an operating system (39).
5. The replication system (10) of claim 1, wherein the instructions of the replication management module (40) are stored on one of either the source storage system (20, 65) or the destination storage system (30, 75).
6. The replication system (10) of claim 1, wherein the replication management module (40) is configured to provide access to storage systems (20, 65, 30, 75) based on one of a group of architectures consisting of:
object-based storage systems (60) comprising an application programming interface,
file-based storage arrays (60), and
a computer server (80), comprising memory (36), a CPU (38) and an operating system (39).
7. A computerized method (100) encoded on a computer readable medium (36), the method (100) managing consistent replication of data manipulating functions between a source storage system (20, 65) and at least one destination storage system (30, 75), the method comprising instructions for:
(a) configuration of a delay (14) after which a specific data manipulating function (12) performed on data stored on the source storage array (20, 65) will be replicated to data stored on the destination storage array(s) (30, 75);
(b) gathering information (120) on functions (12) that were performed on data stored on a source storage system (20, 65), optionally including the step of building a replication monitoring database (42) including information on the functions (12) that were performed on data stored on a source storage system (20, 65);
(c) querying the replication monitoring database (42) on the replication time (16) for outstanding data manipulating functions (12′) to be replicated by running a pending replications monitoring process (140); and
(d) replicating the data manipulating function (12) performed on the source storage system (20, 65) to the destination storage system(s) (30, 75).
8. The method (100) of claim 7, wherein the replication monitoring process (140) comprises configuring a query for a function-specific replication of data manipulating functions (12′) on a per function basis, using an input table (22) accessible to the user (90) via a user interface (19), comprising the steps of:
(1) defining a source storage system (20, 65) and at least one destination storage system (30, 75),
(2) listing the data manipulating functions (12) to be replicated between source and destination storage system,
(3) specifying a function-specific delay (14) for each function (12) and relationship of source to destination storage system (30, 75),
(4) specifying the frequency (26) at which the replication monitoring database (42) is queried for outstanding replications of data manipulating functions (12′) to be sent to the function replication processes (160),
(5) delaying function-specific replication of data manipulating functions (12), including the sub-steps of configuring the time delay (14) used for the function-specific replication of data manipulating functions, and specifying a function replication delay (14), thereby delaying execution of a function until predetermined conditions are met.
9. The method (100) of claim 7, wherein the source storage system (20, 65) is a storage array (65) comprising an operating system (39) that provides file level access to data, from which information on functions (12) that were performed on data can be obtained, and which stores self-installing information gathering software encoded with instructions for executing an information gathering process (122) allowing for installation and running on a client computer.
10. The method (100) of claim 9, wherein the step (122) of gathering information comprises the substeps of:
inserting information for the function-specific delayed replication of data manipulating functions (12) in a source change table (24) of a replication monitoring database (42), directly from the information gathering software,
wherein the information to be inserted into the database (42) by the information gathering software includes:
a file reference (144) in form of the UNC path to the file,
the function (12) that was performed on the file,
date and time the function was performed, and
optionally, the modifier (129) that performed the function, and
a before and after image (148) of the object or file modified by the function.
US12/262,308 2007-07-12 2008-10-31 Method and system for function-specific time-configurable replication of data manipulating functions Abandoned US20090063587A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/262,308 US20090063587A1 (en) 2007-07-12 2008-10-31 Method and system for function-specific time-configurable replication of data manipulating functions
US16/377,703 US11467931B2 (en) 2007-07-12 2019-04-08 Method and system for function-specific time-configurable replication of data manipulating functions

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
US94935707P 2007-07-12 2007-07-12
EP07023056.0 2007-11-27
EP07023056 2007-11-28
EP08009002 2008-05-15
EP08009002.0 2008-05-15
US12/140,296 US20090019443A1 (en) 2007-07-12 2008-06-17 Method and system for function-specific time-configurable replication of data manipulating functions
US12/262,308 US20090063587A1 (en) 2007-07-12 2008-10-31 Method and system for function-specific time-configurable replication of data manipulating functions

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US12/140,296 Continuation-In-Part US20090019443A1 (en) 2007-07-12 2008-06-17 Method and system for function-specific time-configurable replication of data manipulating functions

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/377,703 Continuation US11467931B2 (en) 2007-07-12 2019-04-08 Method and system for function-specific time-configurable replication of data manipulating functions

Publications (1)

Publication Number Publication Date
US20090063587A1 true US20090063587A1 (en) 2009-03-05

Family

ID=40409167

Family Applications (2)

Application Number Title Priority Date Filing Date
US12/262,308 Abandoned US20090063587A1 (en) 2007-07-12 2008-10-31 Method and system for function-specific time-configurable replication of data manipulating functions
US16/377,703 Active 2029-09-16 US11467931B2 (en) 2007-07-12 2019-04-08 Method and system for function-specific time-configurable replication of data manipulating functions

Family Applications After (1)

Application Number Title Priority Date Filing Date
US16/377,703 Active 2029-09-16 US11467931B2 (en) 2007-07-12 2019-04-08 Method and system for function-specific time-configurable replication of data manipulating functions

Country Status (1)

Country Link
US (2) US20090063587A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100131468A1 (en) * 2008-11-25 2010-05-27 International Business Machines Corporation Arbitration Token for Managing Data Integrity and Data Accuracy of Information Services that Utilize Distributed Data Replicas
US20110191396A1 (en) * 2010-02-03 2011-08-04 Fujitsu Limited Storage device and data storage control method
US20110239309A1 (en) * 2008-12-08 2011-09-29 Nec Corporation Data dependence analyzer, information processor, data dependence analysis method and program
US20130054523A1 (en) * 2011-08-30 2013-02-28 International Business Machines Corporation Replication of data objects from a source server to a target server
US20150046398A1 (en) * 2012-03-15 2015-02-12 Peter Thomas Camble Accessing And Replicating Backup Data Objects
US9824131B2 (en) 2012-03-15 2017-11-21 Hewlett Packard Enterprise Development Lp Regulating a replication operation
US9892005B2 (en) * 2015-05-21 2018-02-13 Zerto Ltd. System and method for object-based continuous data protection
US10496490B2 (en) 2013-05-16 2019-12-03 Hewlett Packard Enterprise Development Lp Selecting a store for deduplicated data
US10592347B2 (en) 2013-05-16 2020-03-17 Hewlett Packard Enterprise Development Lp Selecting a store for deduplicated data
US10725708B2 (en) 2015-07-31 2020-07-28 International Business Machines Corporation Replication of versions of an object from a source storage to a target storage
US10754559B1 (en) * 2019-03-08 2020-08-25 EMC IP Holding Company LLC Active-active storage clustering with clock synchronization
US20200364181A1 (en) * 2015-08-31 2020-11-19 Netapp Inc. Event based retention of read only files

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10685114B2 (en) * 2015-09-23 2020-06-16 University Of Florida Research Foundation, Incorporated Malware detection via data transformation monitoring
US10021120B1 (en) * 2015-11-09 2018-07-10 8X8, Inc. Delayed replication for protection of replicated databases
FR3087021B1 (en) * 2018-10-04 2021-06-25 Amadeus Sas SOFTWARE DEFINED DATABASE REPLICATION LINKS

Citations (52)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5685696A (en) * 1994-06-10 1997-11-11 Ebara Corporation Centrifugal or mixed flow turbomachines
US5778165A (en) * 1995-10-20 1998-07-07 Digital Equipment Corporation Variable-level backup scheduling method and apparatus
US5799321A (en) * 1996-07-12 1998-08-25 Microsoft Corporation Replicating deletion information using sets of deleted record IDs
US6062819A (en) * 1995-12-07 2000-05-16 Ebara Corporation Turbomachinery and method of manufacturing the same
US6256634B1 (en) * 1998-06-30 2001-07-03 Microsoft Corporation Method and system for purging tombstones for deleted data items in a replicated database
US6260125B1 (en) * 1998-12-09 2001-07-10 Ncr Corporation Asynchronous write queues, reconstruction and check-pointing in disk-mirroring applications
US20010056425A1 (en) * 2000-06-19 2001-12-27 Hewlett-Packard Company Automatic backup/recovery process
US20020009991A1 (en) * 1997-09-11 2002-01-24 Interwave Communications International Ltd. Cellular private branch exchanges
US20030010132A1 (en) * 2001-06-09 2003-01-16 Romel Scorteanu Method and device to measure the brake force for railroad vehicles
US20040107199A1 (en) * 2002-08-22 2004-06-03 Mdt Inc. Computer application backup method and system
US6785786B1 (en) * 1997-08-29 2004-08-31 Hewlett Packard Development Company, L.P. Data backup and recovery systems
US6807632B1 (en) * 1999-01-21 2004-10-19 Emc Corporation Content addressable information encapsulation, representation, and transfer
US20050028915A1 (en) * 2001-12-28 2005-02-10 Michelin Recherche Et Technique S.A. Tire with reinforcement structure forming internal and external loops
US20050125411A1 (en) * 2003-12-09 2005-06-09 Michael Kilian Method and apparatus for data retention in a storage system
US20050193084A1 (en) * 2004-02-26 2005-09-01 Stephen Todd Methods and apparatus for increasing data storage capacity
US6976165B1 (en) * 1999-09-07 2005-12-13 Emc Corporation System and method for secure storage, transfer and retrieval of content addressable information
US20050278385A1 (en) * 2004-06-10 2005-12-15 Hewlett-Packard Development Company, L.P. Systems and methods for staggered data replication and recovery
US20050289152A1 (en) * 2004-06-10 2005-12-29 Earl William J Method and apparatus for implementing a file system
US20060004689A1 (en) * 2004-06-30 2006-01-05 Venkat Chandrasekaran Systems and methods for managing content on a content addressable storage system
US20060031653A1 (en) * 2004-08-04 2006-02-09 Emc Corporation Methods and apparatus for accessing content in a virtual pool on a content addressable storage system
US20060235893A1 (en) * 2005-04-15 2006-10-19 Emc Corporation Methods and apparatus for managing the storage of content
US20060235821A1 (en) * 2005-04-15 2006-10-19 Emc Corporation Methods and apparatus for retrieval of content units in a time-based directory structure
US20060235908A1 (en) * 2005-04-15 2006-10-19 Emc Corporation Methods and apparatus for managing the replication of content
US20060271605A1 (en) * 2004-11-16 2006-11-30 Petruzzo Stephen E Data Mirroring System and Method
US20060294163A1 (en) * 2005-06-23 2006-12-28 Emc Corporation Methods and apparatus for accessing content stored in a file system
US20060294115A1 (en) * 2005-06-23 2006-12-28 Emc Corporation Methods and apparatus for storing content in a file system
US7159070B2 (en) * 2003-12-09 2007-01-02 Emc Corp Methods and apparatus for caching a location index in a data storage system
US7162571B2 (en) * 2003-12-09 2007-01-09 Emc Corporation Methods and apparatus for parsing a content address to facilitate selection of a physical storage location in a data storage system
US20070050415A1 (en) * 2005-08-26 2007-03-01 Emc Corporation Methods and apparatus for scheduling an action on a computer
US7222233B1 (en) * 2000-09-14 2007-05-22 At&T Corp. Method for secure remote backup
US7249251B2 (en) * 2004-01-21 2007-07-24 Emc Corporation Methods and apparatus for secure modification of a retention period for data in a storage system
US20070185936A1 (en) * 2006-02-07 2007-08-09 Derk David G Managing deletions in backup sets
US7263576B2 (en) * 2003-12-09 2007-08-28 Emc Corporation Methods and apparatus for facilitating access to content in a data storage system
US7281084B1 (en) * 2005-01-12 2007-10-09 Emc Corporation Method and apparatus for modifying a retention period
US7320059B1 (en) * 2005-08-26 2008-01-15 Emc Corporation Methods and apparatus for deleting content from a storage system
US20080034018A1 (en) * 2006-08-04 2008-02-07 Pavel Cisler Managing backup of content
US7350041B1 (en) * 2005-08-26 2008-03-25 Emc Corporation Methods and apparatus for managing the storage of content
US7366836B1 (en) * 2004-12-23 2008-04-29 Emc Corporation Software system for providing storage system functionality
US7376681B1 (en) * 2004-12-23 2008-05-20 Emc Corporation Methods and apparatus for accessing information in a hierarchical file system
US7428621B1 (en) * 2005-01-12 2008-09-23 Emc Corporation Methods and apparatus for storing a reflection on a storage system
US7430645B2 (en) * 2004-01-21 2008-09-30 Emc Corporation Methods and apparatus for extending a retention period for data in a storage system
US7444389B2 (en) * 2003-12-09 2008-10-28 Emc Corporation Methods and apparatus for generating a content address to indicate data units written to a storage system proximate in time
US20080307527A1 (en) * 2007-06-05 2008-12-11 International Business Machines Corporation Applying a policy criteria to files in a backup image
US20090035122A1 (en) * 2007-08-03 2009-02-05 Manabu Yagi Centrifugal compressor, impeller and operating method of the same
US7539813B1 (en) * 2004-08-04 2009-05-26 Emc Corporation Methods and apparatus for segregating a content addressable computer system
US7580961B2 (en) * 2004-01-21 2009-08-25 Emc Corporation Methods and apparatus for modifying a retention period for data in a storage system
US7689599B1 (en) * 2005-01-31 2010-03-30 Symantec Operating Corporation Repair of inconsistencies between data and metadata stored on a temporal volume using transaction log replay
US7698516B2 (en) * 2005-01-12 2010-04-13 Emc Corporation Methods and apparatus for managing deletion of data
US7801920B2 (en) * 2004-01-21 2010-09-21 Emc Corporation Methods and apparatus for indirectly identifying a retention period for data in a storage system
US7805470B2 (en) * 2005-06-23 2010-09-28 Emc Corporation Methods and apparatus for managing the storage of content in a file system
US7966293B1 (en) * 2004-03-09 2011-06-21 Netapp, Inc. System and method for indexing a backup using persistent consistency point images
US20130006938A1 (en) * 2005-12-19 2013-01-03 Commvault Systems, Inc. Systems and methods for performing data replication

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6157991A (en) 1998-04-01 2000-12-05 Emc Corporation Method and apparatus for asynchronously updating a mirror of a source device
US7392234B2 (en) 1999-05-18 2008-06-24 Kom, Inc. Method and system for electronic file lifecycle management
US6405295B1 (en) 1999-09-07 2002-06-11 Oki Electric Industry, Co., Ltd. Data storage apparatus for efficient utilization of limited cycle memory material
KR100449708B1 (en) 2001-11-16 2004-09-22 삼성전자주식회사 Flash memory management method
JPWO2004102396A1 (en) * 2003-05-14 2006-07-13 富士通株式会社 Delay storage apparatus and delay processing method
US7369977B1 (en) * 2004-09-20 2008-05-06 The Mathworks, Inc. System and method for modeling timeouts in discrete event execution
JP4681374B2 (en) * 2005-07-07 2011-05-11 株式会社日立製作所 Storage management system
US7921268B2 (en) * 2007-07-12 2011-04-05 Jakob Holger Method and system for function-specific time-configurable replication of data

Patent Citations (54)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5685696A (en) * 1994-06-10 1997-11-11 Ebara Corporation Centrifugal or mixed flow turbomachines
US5778165A (en) * 1995-10-20 1998-07-07 Digital Equipment Corporation Variable-level backup scheduling method and apparatus
US6062819A (en) * 1995-12-07 2000-05-16 Ebara Corporation Turbomachinery and method of manufacturing the same
US5799321A (en) * 1996-07-12 1998-08-25 Microsoft Corporation Replicating deletion information using sets of deleted record IDs
US6785786B1 (en) * 1997-08-29 2004-08-31 Hewlett Packard Development Company, L.P. Data backup and recovery systems
US20020009991A1 (en) * 1997-09-11 2002-01-24 Interwave Communications International Ltd. Cellular private branch exchanges
US6256634B1 (en) * 1998-06-30 2001-07-03 Microsoft Corporation Method and system for purging tombstones for deleted data items in a replicated database
US6260125B1 (en) * 1998-12-09 2001-07-10 Ncr Corporation Asynchronous write queues, reconstruction and check-pointing in disk-mirroring applications
US6807632B1 (en) * 1999-01-21 2004-10-19 Emc Corporation Content addressable information encapsulation, representation, and transfer
US6976165B1 (en) * 1999-09-07 2005-12-13 Emc Corporation System and method for secure storage, transfer and retrieval of content addressable information
US20010056425A1 (en) * 2000-06-19 2001-12-27 Hewlett-Packard Company Automatic backup/recovery process
US7222233B1 (en) * 2000-09-14 2007-05-22 At&T Corp. Method for secure remote backup
US20030010132A1 (en) * 2001-06-09 2003-01-16 Romel Scorteanu Method and device to measure the brake force for railroad vehicles
US20050028915A1 (en) * 2001-12-28 2005-02-10 Michelin Recherche Et Technique S.A. Tire with reinforcement structure forming internal and external loops
US20040107199A1 (en) * 2002-08-22 2004-06-03 Mdt Inc. Computer application backup method and system
US7162571B2 (en) * 2003-12-09 2007-01-09 Emc Corporation Methods and apparatus for parsing a content address to facilitate selection of a physical storage location in a data storage system
US7159070B2 (en) * 2003-12-09 2007-01-02 Emc Corporation Methods and apparatus for caching a location index in a data storage system
US20050125411A1 (en) * 2003-12-09 2005-06-09 Michael Kilian Method and apparatus for data retention in a storage system
US7444389B2 (en) * 2003-12-09 2008-10-28 Emc Corporation Methods and apparatus for generating a content address to indicate data units written to a storage system proximate in time
US7263576B2 (en) * 2003-12-09 2007-08-28 Emc Corporation Methods and apparatus for facilitating access to content in a data storage system
US7249251B2 (en) * 2004-01-21 2007-07-24 Emc Corporation Methods and apparatus for secure modification of a retention period for data in a storage system
US7430645B2 (en) * 2004-01-21 2008-09-30 Emc Corporation Methods and apparatus for extending a retention period for data in a storage system
US7580961B2 (en) * 2004-01-21 2009-08-25 Emc Corporation Methods and apparatus for modifying a retention period for data in a storage system
US7801920B2 (en) * 2004-01-21 2010-09-21 Emc Corporation Methods and apparatus for indirectly identifying a retention period for data in a storage system
US20050193084A1 (en) * 2004-02-26 2005-09-01 Stephen Todd Methods and apparatus for increasing data storage capacity
US7966293B1 (en) * 2004-03-09 2011-06-21 Netapp, Inc. System and method for indexing a backup using persistent consistency point images
US20050278385A1 (en) * 2004-06-10 2005-12-15 Hewlett-Packard Development Company, L.P. Systems and methods for staggered data replication and recovery
US20050289152A1 (en) * 2004-06-10 2005-12-29 Earl William J Method and apparatus for implementing a file system
US20060004689A1 (en) * 2004-06-30 2006-01-05 Venkat Chandrasekaran Systems and methods for managing content on a content addressable storage system
US7539813B1 (en) * 2004-08-04 2009-05-26 Emc Corporation Methods and apparatus for segregating a content addressable computer system
US20060031653A1 (en) * 2004-08-04 2006-02-09 Emc Corporation Methods and apparatus for accessing content in a virtual pool on a content addressable storage system
US20060271605A1 (en) * 2004-11-16 2006-11-30 Petruzzo Stephen E Data Mirroring System and Method
US7376681B1 (en) * 2004-12-23 2008-05-20 Emc Corporation Methods and apparatus for accessing information in a hierarchical file system
US7366836B1 (en) * 2004-12-23 2008-04-29 Emc Corporation Software system for providing storage system functionality
US7281084B1 (en) * 2005-01-12 2007-10-09 Emc Corporation Method and apparatus for modifying a retention period
US7698516B2 (en) * 2005-01-12 2010-04-13 Emc Corporation Methods and apparatus for managing deletion of data
US7428621B1 (en) * 2005-01-12 2008-09-23 Emc Corporation Methods and apparatus for storing a reflection on a storage system
US7689599B1 (en) * 2005-01-31 2010-03-30 Symantec Operating Corporation Repair of inconsistencies between data and metadata stored on a temporal volume using transaction log replay
US20060235908A1 (en) * 2005-04-15 2006-10-19 Emc Corporation Methods and apparatus for managing the replication of content
US7392235B2 (en) * 2005-04-15 2008-06-24 Emc Corporation Methods and apparatus for retrieval of content units in a time-based directory structure
US7765191B2 (en) * 2005-04-15 2010-07-27 Emc Corporation Methods and apparatus for managing the replication of content
US20060235821A1 (en) * 2005-04-15 2006-10-19 Emc Corporation Methods and apparatus for retrieval of content units in a time-based directory structure
US20060235893A1 (en) * 2005-04-15 2006-10-19 Emc Corporation Methods and apparatus for managing the storage of content
US7805470B2 (en) * 2005-06-23 2010-09-28 Emc Corporation Methods and apparatus for managing the storage of content in a file system
US20060294163A1 (en) * 2005-06-23 2006-12-28 Emc Corporation Methods and apparatus for accessing content stored in a file system
US20060294115A1 (en) * 2005-06-23 2006-12-28 Emc Corporation Methods and apparatus for storing content in a file system
US7320059B1 (en) * 2005-08-26 2008-01-15 Emc Corporation Methods and apparatus for deleting content from a storage system
US7350041B1 (en) * 2005-08-26 2008-03-25 Emc Corporation Methods and apparatus for managing the storage of content
US20070050415A1 (en) * 2005-08-26 2007-03-01 Emc Corporation Methods and apparatus for scheduling an action on a computer
US20130006938A1 (en) * 2005-12-19 2013-01-03 Commvault Systems, Inc. Systems and methods for performing data replication
US20070185936A1 (en) * 2006-02-07 2007-08-09 Derk David G Managing deletions in backup sets
US20080034018A1 (en) * 2006-08-04 2008-02-07 Pavel Cisler Managing backup of content
US20080307527A1 (en) * 2007-06-05 2008-12-11 International Business Machines Corporation Applying a policy criteria to files in a backup image
US20090035122A1 (en) * 2007-08-03 2009-02-05 Manabu Yagi Centrifugal compressor, impeller and operating method of the same

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8180730B2 (en) * 2008-11-25 2012-05-15 International Business Machines Corporation Arbitration token for managing data integrity and data accuracy of information services that utilize distributed data replicas
US20100131468A1 (en) * 2008-11-25 2010-05-27 International Business Machines Corporation Arbitration Token for Managing Data Integrity and Data Accuracy of Information Services that Utilize Distributed Data Replicas
US20110239309A1 (en) * 2008-12-08 2011-09-29 Nec Corporation Data dependence analyzer, information processor, data dependence analysis method and program
US9027123B2 (en) * 2008-12-08 2015-05-05 Nec Corporation Data dependence analyzer, information processor, data dependence analysis method and program
US8914336B2 (en) * 2010-02-03 2014-12-16 Fujitsu Limited Storage device and data storage control method
US20110191396A1 (en) * 2010-02-03 2011-08-04 Fujitsu Limited Storage device and data storage control method
US10664493B2 (en) 2011-08-30 2020-05-26 International Business Machines Corporation Replication of data objects from a source server to a target server
US20130054523A1 (en) * 2011-08-30 2013-02-28 International Business Machines Corporation Replication of data objects from a source server to a target server
US20130054524A1 (en) * 2011-08-30 2013-02-28 International Business Machines Corporation Replication of data objects from a source server to a target server
US10664492B2 (en) 2011-08-30 2020-05-26 International Business Machines Corporation Replication of data objects from a source server to a target server
US9904717B2 (en) * 2011-08-30 2018-02-27 International Business Machines Corporation Replication of data objects from a source server to a target server
US9910904B2 (en) * 2011-08-30 2018-03-06 International Business Machines Corporation Replication of data objects from a source server to a target server
US9824131B2 (en) 2012-03-15 2017-11-21 Hewlett Packard Enterprise Development Lp Regulating a replication operation
US20150046398A1 (en) * 2012-03-15 2015-02-12 Peter Thomas Camble Accessing And Replicating Backup Data Objects
US10496490B2 (en) 2013-05-16 2019-12-03 Hewlett Packard Enterprise Development Lp Selecting a store for deduplicated data
US10592347B2 (en) 2013-05-16 2020-03-17 Hewlett Packard Enterprise Development Lp Selecting a store for deduplicated data
US9892005B2 (en) * 2015-05-21 2018-02-13 Zerto Ltd. System and method for object-based continuous data protection
US10725708B2 (en) 2015-07-31 2020-07-28 International Business Machines Corporation Replication of versions of an object from a source storage to a target storage
US20200364181A1 (en) * 2015-08-31 2020-11-19 Netapp Inc. Event based retention of read only files
US11880335B2 (en) * 2015-08-31 2024-01-23 Netapp, Inc. Event based retention of read only files
US10754559B1 (en) * 2019-03-08 2020-08-25 EMC IP Holding Company LLC Active-active storage clustering with clock synchronization

Also Published As

Publication number Publication date
US20190235981A1 (en) 2019-08-01
US11467931B2 (en) 2022-10-11

Similar Documents

Publication Publication Date Title
US11467931B2 (en) Method and system for function-specific time-configurable replication of data manipulating functions
US7921268B2 (en) Method and system for function-specific time-configurable replication of data
US20090019443A1 (en) Method and system for function-specific time-configurable replication of data manipulating functions
US10831614B2 (en) Visualizing restoration operation granularity for a database
US10671635B2 (en) Decoupled content and metadata in a distributed object storage ecosystem
US7509468B1 (en) Policy-based data protection
US8108429B2 (en) System for moving real-time data events across a plurality of devices in a network for simultaneous data protection, replication, and access services
US7689602B1 (en) Method of creating hierarchical indices for a distributed object system
US9495264B2 (en) Data replication techniques using incremental checkpoints
EP3040886A1 (en) Service oriented data management and architecture
US7631151B2 (en) Systems and methods for classifying and transferring information in a storage network
US8171246B2 (en) Ranking and prioritizing point in time snapshots
US20160048427A1 (en) Virtual subdirectory management
US20060230244A1 (en) System and method for performing auxiliary storage operations
EP3685268A1 (en) File system point-in-time restore using recycle bin and version history
WO2010090761A1 (en) System, method, and computer program product for allowing access to backup data
US10628298B1 (en) Resumable garbage collection
EP3796174B1 (en) Restoring a database using a fully hydrated backup
US10809922B2 (en) Providing data protection to destination storage objects on remote arrays in response to assignment of data protection to corresponding source storage objects on local arrays
US11500738B2 (en) Tagging application resources for snapshot capability-aware discovery
US11436089B2 (en) Identifying database backup copy chaining
EP2060973B1 (en) Method and system for function-specific time-configurable replication of data manipulating functions
US20210334165A1 (en) Snapshot capability-aware discovery of tagged application resources
KR102089710B1 (en) Continuous data management system and method
US11137931B1 (en) Backup metadata deletion based on backup data deletion

Legal Events

Date Code Title Description
AS Assignment

Owner name: III HOLDINGS 1, LLC, DELAWARE

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:JAKOB, HOLGER;REEL/FRAME:033524/0673

Effective date: 20131219

AS Assignment

Owner name: III HOLDINGS 3, LLC, DELAWARE

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:III HOLDINGS 1, LLC;REEL/FRAME:046274/0626

Effective date: 20180614

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

AS Assignment

Owner name: SEAGATE TECHNOLOGY LLC, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:III HOLDINGS 3, LLC;REEL/FRAME:048167/0679

Effective date: 20180619

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION