WO2001004755A1 - Modular backup and retrieval system with an integrated storage area file system - Google Patents

Modular backup and retrieval system with an integrated storage area file system

Info

Publication number
WO2001004755A1
WO2001004755A1 PCT/US2000/019363 US0019363W WO0104755A1 WO 2001004755 A1 WO2001004755 A1 WO 2001004755A1 US 0019363 W US0019363 W US 0019363W WO 0104755 A1 WO0104755 A1 WO 0104755A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage system
information
component
media
backup
Prior art date
Application number
PCT/US2000/019363
Other languages
French (fr)
Inventor
John Crescenti
Srinivas Kavuri
David Alan Oshinsky
Anand Prahlad
Original Assignee
Commvault Systems, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Commvault Systems, Inc.
Publication of WO2001004755A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments

Definitions

  • TITLE: MODULAR BACKUP AND RETRIEVAL SYSTEM WITH AN INTEGRATED STORAGE AREA FILE SYSTEM
  • the present invention is directed towards backup systems for computer networks.
  • the present invention is directed towards the implementation of a distributed, modular backup system with a storage area network (SAN) system, and the use of the modular backup system under the SAN file system.
  • Conventional backup devices usually employ a monolithic backup and retrieval system servicing a single server with attached storage devices. These systems usually control all aspects of a data backup or retrieval, including timing the backup, directing the files to be backed up, directing the mode of the archival request, and directing the storage process itself through attached library media. Further, these backup and retrieval systems are not scalable, and often direct only one type of backup and retrieval system, whether it is a network backup or a single machine backup.
  • a user on an archived network has no easy and readily understandable way to access archived data for informational or retrieval purposes.
  • the archived backups and their information are not easily accessible in a meaningful and clear manner over a network.
  • Various aspects of the present invention may be realized through a storage system having a computing device for storing information in the storage system.
  • the computing device includes a management component that directs the storing of information in the storage system, and at least one client component operating on at least one other computing device.
  • the management component coordinates the storing of information in the storage system by interaction with the at least one client component.
  • the management component of the storage system manages backup and retrieval of information according to predetermined storage policies.
  • the storage policies of the storage system may consist of the following: scheduling policies, aging policies, index pruning policies, drive cleaning policies, configuration information, tracking all running and waiting jobs, allocating drives, selecting a type of backup, tracking different applications running on each client, and tracking media types.
  • the management component contains scheduling information for a timetable of backups for the computing devices and the computing devices may be interconnected.
  • the storage system may include a modular backup system that works in conjunction with a storage area network (SAN) system.
  • the information in the storage system may be data or files and the computing device may include an attached data storage device, to which it can store data and files locally.
  • the computing devices of the storage system may be connected to the SAN system via a direct fiber channel connection, a SCSI connection, or another equivalent type connection.
  • the storage system may also archive information on a library media managed by the media component, which maintains an index to easily locate the particular information that has been archived.
  • At least one of the media component, the client component, or the management component may use the indices created during each backup and retrieval request to create a logical file system extension based on the contents of the library media, which contains all the archived information.
  • at least one of the media component, the client component, or the management component may create a pathname to any particular archived information, the pathname indicating to the file processor that the information is stored as an archive in the media library.
  • the at least one of the media component or the client component uses the pathname information to correlate the pathname with the actual storage location in the library media by reconverting the pathname to the index as created by the media component at the time of backup.
  • the method includes configuring a computing device for storing information in the storage system; directing the storing of information in the computing device of the storage system with a management component; and coordinating the storing of information in the storage system by interaction with at least one client component, the at least one client component operating on at least one other computing device.
  • Fig. 1 is a schematic block diagram of a modular backup and retrieval system built in accordance with principles according to the present invention.
  • Fig. 2 is a schematic block diagram of a modular backup system working in conjunction with a storage area network (SAN) system according to principles of the present invention.
  • Fig. 3 is a schematic block diagram of the interaction of the library media of Fig. 2 with the SAN system.
  • Fig. 4 is a tree diagram of a network file system maintained by the SAN system implementing a path extension to data archived by the modular backup system, all of Fig. 2.
  • Fig. 1 is a schematic block diagram of a modular backup system.
  • a modular backup system 100 comprises three components: a management component 110, one or more client components 120, and one or more media components 130.
  • the three components comprising the management component 110, the client component 120, and the media component 130 may reside on several different machines.
  • the management component 110, the client component 120, and the media component 130 may all reside on a single computing device.
  • the management component 110 and one of the media components 130 may reside on a single computing device with a client component 120 residing on a different computing device.
  • the management component 110 and one of the client components 120 may reside on a single computing device with a media component 130 residing on a different computing device.
  • a media component 130 and a client component 120 may reside on the same computing device with the management component 110 residing on a different computing device.
  • the management component 110, the client component 120, and the media component 130 may all reside on different computing devices.
  • of course, other arrangements in accordance with principles of the present invention are contemplated and upon viewing the present disclosure will become apparent to those of ordinary skill in the art.
  • the management component 110 is coupled to the client components 120 and the media components 130
  • the media components 130 are also coupled to the client components 120
  • the client component 120 controls the actions and parameters of a backup or retrieval for a particular client computing device.
  • a client computing device is the computing device in need of backup and retrieval assistance.
  • the client components 120 can each reside on a client computing device, or be in active communication with the client computing device.
  • the particular client component 120 for a particular client computing device communicates with a management director component 110 regarding such parameters as backup schedules, types of files in the backup schedule, the method of backup or retrieval, and other broad scope archival management functions for the client computing device.
  • the particular client component 120 communicates with a particular media component 130 responsible for the actual backup or retrieval function.
  • the media component 130 controls the actions and parameters of the actual physical level backup or retrieval at the library media containing the archived data.
  • Each media component 130 is responsible for one or more physical backup media devices. As shown in Fig. 1, the media component 130 may be responsible for a single backup device 140, or for a plurality of backup devices 150 through 160.
  • the particular media component 130 directs the data that is the subject of an archival type request to or from, as the case may be, the particular backup devices 140, 150, or 160 that it is responsible for. In the case of a retrieval type archival request, the particular media component 130 directs the retrieved data to a requesting client component 120.
  • the particular media component 130 also creates a library index for the data contained on the particular backup devices 140, 150, or 160 for which it is responsible. Additionally, the particular media component 130 indexes the location of the archived data and files on the particular associated backup media devices 140, 150, or 160 that it is responsible for operating, and allows the management component 110 and the client component 120 access to certain information about the index entries.
  • the media component 130 uses this library index to quickly and easily locate a particular backed up file or other piece of data on the physical devices at its disposal.
  • the particular media component 130 resides on a computing device physically responsible for operating the library media for which the particular media component is responsible, or it must be in active communication with that computing device.
  • the media component also communicates with the management component 110, since the management component is responsible for the allocation of physical media for archival purposes.
  • the backup devices 140, 150, and 160 can comprise many different types of media, such as massively parallel fast access magnetic media, tape jukebox media, or optical jukebox media devices.
  • which backup device is used is determined by several parameters. These include the time-related frequency of accesses, the importance of the backup file or data and the urgency of its retrieval, and how long ago the backup was made.
  • the management component 110 directs many aspects of a backup and a retrieval, including scheduling policies, aging policies, index pruning policies, drive cleaning policies, configuration information, keeping track of all running and waiting jobs, allocation of drives, type of backup (i.e. full, incremental, or differential), tracking different applications running on each client, and tracking media.
  • the management component 110 contains the scheduling information for a timetable of backups for the computing devices. It should be noted that any number of computing devices might be involved, and that the computing devices may be interconnected.
  • FIG. 2 is a schematic block diagram of a modular backup system working in conjunction with a storage area network (SAN) system 250.
  • a computing device 200 contains and operates a management component 202, which is responsible for the coordination of backup, storage, retrieval, and restoration of files and data on a computer network system 290.
  • the management component 202 coordinates the aspects of these functions with a client component 212, running on another computing device 210, and a client component 222 running on yet another computing device 220.
  • the computing device 220 also has an attached data storage device 214, to which it can store data and files locally.
  • the computing devices 210, 220, and 230 are connected to the SAN system 250 via a connection 264, such as a direct fiber channel connection, or a SCSI connection.
  • any type of network connection is possible.
  • the SAN system 250 environment comprises the connection media 264, routers, and associated hubs for the actual data communication functions of the network, and a file processor 252.
  • the elements of the SAN system 250 not explicitly numbered are implied in a remainder of the SAN system 250.
  • Another computing device 230 contains another client component 232. However, the computing device 230 is connected, through a network 270, to a file processor 252 for interaction with the SAN system 250 through another network 265.
  • This network could be any type of network, such as a LAN operating under a TCP/IP protocol.
  • the client components 232, 222, and 212 coordinate and direct local backup and retrieval functions on the computing devices 230, 220, and 210, respectively.
  • the management component 202 coordinates and directs the overall network backup of the computer network 290.
  • the computing devices 210, 220, and 230 can all be different architectures of machines running different operating systems.
  • Hardware systems could include those made by Sun, Hewlett-Packard, Intel-based families of processors, and machines based on the RS6000 and PowerPC families of processors, to name a few.
  • Operating systems can include the many flavors of UNIX and UNIX-like operating systems, such as HP-UX, Solaris, AIX, and Linux, to name a few, as well as Windows NT by Microsoft.
  • the file processor 252 of the SAN system 250 contains a client component 262 and a media component 260.
  • Storage media 257, 258, and 259 are communicatively coupled to the file processor 252 for storage of network files from the computing devices 210, 220, and 230.
  • These storage devices can be magnetic media for fast retrieval, tape media for longer term storage, or optical media for much longer term storage.
  • the overall SAN system 250 acts as a block access device to the computing devices.
  • the overall SAN system 250 acts as a virtual media device and centralizes the network file system from the computing devices 210, 220, and 230. As such, true dynamic sharing of the data and files through the SAN system 250 is possible. These data and files are available to the computing devices 210, 220, and 230. The computing devices 210, 220, and 230 present their network file and data requests to the file processor 252 over the SAN network media 264 and the remainder of the SAN system 250, as they would to any other storage media available to that computing device.
  • the file processor 252 working in accordance with its software, interprets the data and file requests from the external computing devices.
  • the file processor 252 then performs the file or data request based on the information it is given, and responds accordingly to the file or data request.
  • the network file system is maintained and operated on solely by the file processor 252 of the SAN system 250.
  • All accesses, writes, reads, and requests for information on any files and/or data under the network file system are handled by the SAN system 250, and in particular the file processor 252.
  • the file processor 252 keeps track of all the stored files and/or data stored on the media devices 257, 258, and 259.
  • the file processor 252 maintains and presents a file system view of the stored data and/or files to the computing devices 210, 220, and 230 over the remainder of the SAN system 250 and the SAN network media 264.
  • the computing devices 210, 220, and 230 when accessing or inquiring about portions of the network file system, perform these functions by requesting them through the file processor 252 of the SAN system 250.
  • the SAN system 250 allows access to the files and/or data stored in its storage media, and actually performs all the functions of a file system for the attached computing devices 210, 220, and 230. Opening, closing, reading, and writing of data to files and of files themselves actually look and perform like a normal file system to the attached computing devices 210, 220, and 230. These actions are transparent to the computing devices. As such, the SAN system 250 acts and performs as a file system to the rest of the computing devices connected to the file processor 252. Also, from the perspective of the computing devices, each computing device can access and view the data and/or files stored by the file processor 252 of the SAN system 250 as part of a large, monolithic file system.
  • a client component 262 and a media component 260 can be part of the SAN system 250. These components work in conjunction with other components present in the network environment, including the file processor 252 itself, to make up a network backup and retrieval system for the computer network 290.
  • the files and/or data, as stored on the media devices 257, 258, and 259, are routed by the file processor 252 to the media component 260.
  • This request can be made by either the client component 262 local to the file processor 252, or the management component 202 overseeing the network backup as a whole.
  • the files and/or data are then archived on a library media 275 managed by the media component 260, which maintains an index to easily locate the particular data and/or file that has been archived.
  • the data and/or files stored on the network file system can be archived to the library media 275 through the interaction of these components.
  • the media component 260 maintains an index to the locations of the stored files and/or data.
  • the media component 260, the client component 262, or the management component 202 may use the indices created during each backup and retrieval request to create a logical file system extension based on the contents of the library media 275, which contains all the archived data and/or files.
  • the media component 260, the client component 262, or the management component 202 may create a pathname to any particular archived data or file, and this pathname specially indicates to the file processor 252 that the data and/or file is stored as an archive in the media library 275. Specifically, the media component 260 or the client component 262 can use this information to correlate the pathname with the actual storage location in the library media 275 by reconverting the pathname to the index as created by the media component 260 at the time of backup.
  • This pathname information can be made available to the file processor 252, which integrates the file extensions and the information to the existing network file system that it currently maintains.
  • the file processor 252 can query the client component 262 to access the archived file as directed in the requested manner. It may also query the management component 202 to do the same.
  • the client component 262 or the management component 202 then translates the pathname to the appropriate index as stored by the media component 260, and requests that the media component 260 access the file as requested.
  • the media component 260 then accesses the requested entry and returns the result of the query, whether that is information about the data or file, or the data or file itself.
  • the file processor 252 directs the result to that particular computing device.
  • Fig. 3 is a schematic block diagram of the interaction of the library media and the media component in the file processor of Fig. 2 as implemented in a SAN system.
  • a library media 310 controlled by a media component 320 may comprise a number of different storage media, or may just comprise one.
  • the library media 310 comprises a fast, alterable random access device 312, a fast, non-alterable random access device 314, a serial device 316, a slow, alterable random access device 318, and a slow, non-alterable random access device 319.
  • An example of the fast, alterable random access device 312 includes various magnetic media, such as a disc drive, that could include multiple writing surfaces.
  • An example of the fast, non-alterable random access device 314 includes a multi disc magneto-optical system.
  • An example of the slow, alterable random access device 318 includes jukeboxes containing CD-ROM disc drive cartridges.
  • An example of the slow, non-alterable random access device 319 includes jukeboxes containing WORM optical discs.
  • An example of the serial device 316 could include a magnetic tape cartridge jukebox.
  • the media component 320 would control the placement of files, sectors, and other archival information on the appropriate library media. This placement could be controlled according to the parameters of the backup, such as proximity in date, or whether the archived data is alterable in the archived form. Other parameters to consider could be the relative frequency of requests to the data or the importance of the data as determined by a client component or a management component directing those parameters. Thus, in the case of differential backups, portions of the archived file may reside across several different media. Older portions may be contained in the device 314, while newer updated versions of that block may be contained in the device 312. Portions that have not changed may still be in other library devices.
  • Fig. 4 is a tree diagram of a network file system maintained by the file processor of a SAN system implementing a path extension to data archived by the modular backup system, all of Fig. 2.
  • the normal network file system has a root directory "/", with various partitions relating to different functions and/or different data sets as determined by the functions and configurations of the computing devices 210, 220, and 230, all of Fig. 2.
  • the archived portion of the network file system resides in the subdirectory "Backups/".
  • the network file structure would change to reflect those additions or deletions.
  • the subdirectories to the backup directory could be grouped by machine, by date, or by combination, just to name a few schemes.
  • when the media component 260 performs an archival backup of the computing device 210, both of Fig. 2, the media component 260 would perform an index conversion of the backup index to a filename reflecting the relationship.
  • the archived data and/or files are grouped by machine first, then by archive date.
  • the files backed up on a given day from computing device 210 are given a name corresponding to that particular machine.
  • the name of the computing device 210 corresponds to "Computer1."
  • the media component 260 derives a filename for each file backed up to the library media 275 from the computing device 210; the filename would take the form of "/Backups/Computer1/<date>/<filename>" in the network file system.
  • the file system indicates that the files "File_A", "File_B", and "File_C" on the computer corresponding to "Computer1" were backed up on January 1, 1999.
  • the file system also indicates that these same files were backed up on January 2, as well as "File_D", a sector of data indicated by "Sector_E", and a portion of another file, "File_F". It should be noted that the client component 262 or the management component 202 might perform this naming function based on the backed up items, as well as from the indexing functions of the media component 260.
  • the complete information about archived data and/or files is readily visible and accessible to any of the computing devices 210, 220, and 230, all of Fig. 2, presently using the network file system.
  • an archival backup could take several forms.
  • a backup can target data and files on a sector or block write basis, or can be used on a file basis.
  • In an incremental backup, for example, only those blocks or files that have been altered would be stored for backup and retrieval purposes.
  • in a differential backup, only those changed blocks as contained within an altered file would be stored.
  • criteria such as file size, can be used to determine a hybrid backup strategy wherein both files and blocks are saved, depending on the criteria employed.
  • the media component 260, Fig. 2 can readily incorporate these forms of backups into the scheme.
  • the data and/or file is displayed as part of the file system as relayed to the file processor 252, Fig. 2.
  • the media component 260 can maintain internally a linked list of the actual blocks or sectors that make up a file or a chunk of data for each modification.
  • the file system can display the file and/or data individually as grouped by modification date, and the media component 260 would coordinate in pulling the appropriate sectors or blocks out of the media library 275, Fig. 2 that make up that particular file or data set as of the modification as shown in the network file system.
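The per-modification block lists described in the last two items can be illustrated with a minimal Python sketch. It is not taken from the specification: `BlockStore`, `FileVersions`, and their methods are hypothetical names, the library media is modeled as an in-memory dictionary, and an ordinary Python list stands in for the linked list of blocks that the text mentions.

```python
from dataclasses import dataclass, field


@dataclass
class BlockStore:
    """Hypothetical stand-in for the library media: location -> raw bytes."""
    blocks: dict = field(default_factory=dict)
    next_location: int = 0

    def write(self, data: bytes) -> int:
        # Store one block and return the location it was written to.
        location = self.next_location
        self.blocks[location] = data
        self.next_location += 1
        return location

    def read(self, location: int) -> bytes:
        return self.blocks[location]


@dataclass
class FileVersions:
    """One ordered list of block locations per modification date (assumed layout)."""
    versions: dict = field(default_factory=dict)  # date -> list of locations

    def record_full(self, store: BlockStore, date: str, blocks: list) -> None:
        # First backup: every block of the file is written.
        self.versions[date] = [store.write(block) for block in blocks]

    def record_changes(self, store: BlockStore, date: str,
                       previous_date: str, changed: dict) -> None:
        # Later backup: only changed blocks are written; unchanged blocks
        # keep pointing at the locations recorded for the previous version.
        locations = list(self.versions[previous_date])
        for index, data in changed.items():
            locations[index] = store.write(data)
        self.versions[date] = locations

    def restore(self, store: BlockStore, date: str) -> bytes:
        # Pull the blocks that make up the file as of the given modification.
        return b"".join(store.read(loc) for loc in self.versions[date])
```

In this sketch each modification date keeps its own complete list of locations, so restoring any version is a single pass over that list, while unchanged blocks are stored only once.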

Abstract

A storage system having a computing device for storing information in the storage system. The computing device includes a management component that directs the storing of information in the storage system, and at least one client component operating on at least one other computing device. The management component coordinates the storing of information in the storage system by interaction with the at least one client component. Various aspects of the present invention may also be found in a method for storing information in a storage system. The method includes configuring a computing device for storing information in the storage system; directing the storing of information in the computing device of the storage system with a management component; and coordinating the storing of information in the storage system by interaction with at least one client component, the at least one client component operating on at least one other computing device.

Description

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
TITLE: MODULAR BACKUP AND RETRIEVAL SYSTEM WITH AN
INTEGRATED STORAGE AREA FILE SYSTEM
SPECIFICATION
CROSS-REFERENCES TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Patent Application Serial Nos. 60/143,744, and 60/143,743, both filed July 14, 1999, pending, and U.S. Patent Application entitled "Modular Backup and Retrieval System With An Integrated Storage Area File System", filed July 5, 2000, Serial No. .
INCORPORATION BY REFERENCE
This application hereby incorporates by reference, in their entirety, U.S. Provisional Patent Application Serial Nos. 60/143,744, and 60/143,743, both filed July 14, 1999, pending, and U.S. Patent Application entitled "Modular Backup and Retrieval System With An Integrated Storage Area File System", filed July 5, 2000, Serial No. .
BACKGROUND
1. Technical Field.
The present invention is directed towards backup systems for computer networks. In particular, the present invention is directed towards the implementation of a distributed, modular backup system with a storage area network (SAN) system, and the use of the modular backup system under the SAN file system.
2. Related Art.
Conventional backup devices usually employ a monolithic backup and retrieval system servicing a single server with attached storage devices. These systems usually control all aspects of a data backup or retrieval, including timing the backup, directing the files to be backed up, directing the mode of the archival request, and directing the storage process itself through attached library media. Further, these backup and retrieval systems are not scalable, and often direct only one type of backup and retrieval system, whether it is a network backup or a single machine backup.
Further, especially in a network environment, a user on an archived network has no easy and readily understandable way to access archived data for informational or retrieval purposes. The archived backups and their information are not easily accessible in a meaningful and clear manner over a network.
The operation of a backup and retrieval system across a network containing several different types of hardware and operating systems presents other challenges to displaying and using the archived data. The possibility of differences in the hardware components, operating systems, and file structures of different components in the network complicates the accessibility of files and data.
Many other problems and disadvantages of the prior art will become apparent to one skilled in the art after comparing such prior art with the present invention as described herein.
SUMMARY OF THE INVENTION
Various aspects of the present invention may be realized through a storage system having a computing device for storing information in the storage system. The computing device includes a management component that directs the storing of information in the storage system, and at least one client component operating on at least one other computing device. The management component coordinates the storing of information in the storage system by interaction with the at least one client component.
In certain embodiments, the management component of the storage system manages backup and retrieval of information according to predetermined storage policies. For example, the storage policies of the storage system may consist of the following: scheduling policies, aging policies, index pruning policies, drive cleaning policies, configuration information, tracking all running and waiting jobs, allocating drives, selecting a type of backup, tracking different applications running on each client, and tracking media types. In other embodiments, the management component contains scheduling information for a timetable of backups for the computing devices and the computing devices may be interconnected. In still other embodiments, the storage system may include a modular backup system that works in conjunction with a storage area network (SAN) system. Of course, the information in the storage system may be data or files and the computing device may include an attached data storage device, to which it can store data and files locally. The computing devices of the storage system may be connected to the SAN system via a direct fiber channel connection, a SCSI connection, or another equivalent type connection.
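As a rough illustration of how a management component might combine a scheduling policy with a selected backup type, consider the following Python sketch. Everything in it is an assumption made for illustration: `StoragePolicy`, `BackupType`, and `due_jobs` are hypothetical names, and the specification does not prescribe any particular data layout or scheduling rule.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class BackupType(Enum):
    FULL = "full"
    INCREMENTAL = "incremental"
    DIFFERENTIAL = "differential"


@dataclass
class StoragePolicy:
    """Hypothetical per-client policy record held by the management component."""
    client: str
    backup_type: BackupType
    interval: timedelta   # scheduling policy: how often to back up
    retention: timedelta  # aging policy: how long to keep the archive


def due_jobs(policies, last_run, now):
    """Return (client, backup type) pairs whose interval has elapsed.

    `last_run` maps a client name to the time of its last backup; a client
    with no recorded backup is treated as due (an assumption of this sketch).
    """
    jobs = []
    for policy in policies:
        previous = last_run.get(policy.client)
        if previous is None or now - previous >= policy.interval:
            jobs.append((policy.client, policy.backup_type))
    return jobs
```

The sketch covers only the timetable aspect; drive allocation, index pruning, and the other listed policies would need their own records.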
The storage system may also archive information on a library media managed by the media component, which maintains an index to easily locate the particular information that has been archived. At least one of the media component, the client component, or the management component may use the indices created during each backup and retrieval request to create a logical file system extension based on the contents of the library media, which contains all the archived information. Further, at least one of the media component, the client component, or the management component may create a pathname to any particular archived information, the pathname indicating to the file processor that the information is stored as an archive in the media library. In this case, the at least one of the media component or the client component uses the pathname information to correlate the pathname with the actual storage location in the library media by reconverting the pathname to the index as created by the media component at the time of backup.
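The conversion between an index entry and a pathname, and back again, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `IndexEntry`, `MediaIndex`, the `device` and `offset` fields, and the use of a `/Backups` prefix as the archive marker are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

ARCHIVE_ROOT = "/Backups"  # assumed marker directory for archived entries


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical index record created at backup time."""
    machine: str
    date: str
    filename: str
    device: str  # which library device holds the data (assumed field)
    offset: int  # location on that device (assumed field)


class MediaIndex:
    def __init__(self):
        self._entries = {}

    def add(self, entry: IndexEntry) -> None:
        self._entries[(entry.machine, entry.date, entry.filename)] = entry

    def to_pathname(self, entry: IndexEntry) -> str:
        # Index -> pathname: the prefix signals that the item is an archive.
        return f"{ARCHIVE_ROOT}/{entry.machine}/{entry.date}/{entry.filename}"

    def is_archived(self, pathname: str) -> bool:
        return pathname.startswith(ARCHIVE_ROOT + "/")

    def from_pathname(self, pathname: str) -> IndexEntry:
        # Pathname -> index: recover the key and look up the storage location.
        if not self.is_archived(pathname):
            raise ValueError(f"not an archive pathname: {pathname}")
        parts = pathname[len(ARCHIVE_ROOT) + 1:].split("/", 2)
        if len(parts) != 3:
            raise ValueError(f"malformed archive pathname: {pathname}")
        return self._entries[(parts[0], parts[1], parts[2])]
```

Because the pathname carries the same fields that key the index, the lookup needs no extra mapping table; a file processor would only need to recognize the prefix and hand the pathname to whichever component holds the index.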
Various aspects of the present invention may also be found in a method for storing information in a storage system. The method includes configuring a computing device for storing information in the storage system; directing the storing of information in the computing device of the storage system with a management component; and coordinating the storing of information in the storage system by interaction with at least one client component, the at least one client component operating on at least one other computing device.
Other aspects of the present invention will become apparent with further reference to the drawings and specification which follow.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a schematic block diagram of a modular backup and retrieval system built in accordance with principles according to the present invention.
Fig. 2 is a schematic block diagram of a modular backup system working in conjunction with a storage area network (SAN) system according to principles of the present invention.
Fig. 3 is a schematic block diagram of the interaction of the library media of Fig. 2 with the SAN system.
Fig. 4 is a tree diagram of a network file system maintained by the SAN system implementing a path extension to data archived by the modular backup system, all of Fig. 2.
DETAILED DESCRIPTION OF THE DRAWINGS
Fig. 1 is a schematic block diagram of a modular backup system. A modular backup system 100 comprises three components: a management component 110, one or more client components 120, and one or more media components 130.
Typically, the three components, namely the management component 110, the client component 120, and the media component 130, may reside on several different machines, although other arrangements are possible. For example, the management component 110, the client component 120, and the media component 130 may all reside on a single computing device. Or, the management component 110 and one of the media components 130 may reside on a single computing device with a client component 120 residing on a different computing device. Or, the management component 110 and one of the client components 120 may reside on a single computing device with a media component 130 residing on a different computing device. Or, a media component 130 and a client component 120 may reside on the same computing device with the management component 110 residing on a different computing device. Or, the management component 110, the client component 120, and the media component 130 may all reside on different computing devices. Of course, other arrangements in accordance with principles of the present invention are contemplated and, upon viewing the present disclosure, will become apparent to those of ordinary skill in the art.
As shown in Fig. 1, the management component 110 is coupled to the client components 120 and the media components 130. The media components 130 are also coupled to the client components 120.
These components, namely the management component 110, the client component 120, and the media component 130, are typically software programs running on the respective computing devices. The computing devices may not be the same devices, but communication should exist between these components, as demonstrated. The client component 120 controls the actions and parameters of a backup or retrieval for a particular client computing device. A client computing device is the computing device in need of backup and retrieval assistance. Each client component 120 either resides on a client computing device or is in active communication with the client computing device. The particular client component 120 for a particular client computing device communicates with the management component 110 regarding such parameters as backup schedules, types of files in the backup schedule, the method of backup or retrieval, and other broad scope archival management functions for the client computing device. The particular client component 120 communicates with a particular media component 130 responsible for the actual backup or retrieval function.
The media component 130 controls the actions and parameters of the actual physical level backup or retrieval at the library media containing the archived data. Each media component 130 is responsible for one or more physical backup media devices. As shown in Fig. 1, the media component 130 may be responsible for a single backup device 140, or for a plurality of backup devices 150 through 160. The particular media component 130 directs the data that is the subject of an archival type request to or from, as the case may be, the particular backup devices 140, 150, or 160 that it is responsible for. In the case of a retrieval type archival request, the particular media component 130 directs the retrieved data to a requesting client component 120.
The particular media component 130 also creates a library index for the data contained on the particular backup devices 140, 150, or 160 that it is responsible for operating. Additionally, the particular media component 130 indexes the location of the archived data and files on the particular associated backup media devices 140, 150, or 160 that it is responsible for operating, and allows the management component 110 and the client component 120 access to certain information about the index entries. The media component 130 uses this library index to quickly and easily locate a particular backed up file or other piece of data on the physical devices at its disposal.
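By way of illustration only, the library index described above might be sketched as follows. The key layout (client, item name, backup date) and the `offset` and `length` fields are assumptions made for this sketch and are not specified by the present disclosure; the `describe` method stands in for the limited information exposed to the management and client components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    device_id: int  # backup device holding the item, e.g. 140, 150, or 160
    offset: int     # hypothetical position of the item on that device
    length: int     # hypothetical size of the item in bytes


class LibraryIndex:
    """Maps (client, item name, backup date) to a location on a backup device."""

    def __init__(self):
        self._entries = {}

    def record(self, client, name, date, entry):
        self._entries[(client, name, date)] = entry

    def locate(self, client, name, date):
        # Returns None when the item was never archived.
        return self._entries.get((client, name, date))

    def describe(self, client):
        # Limited view for other components: names and dates, no physical locations.
        return sorted((name, date) for (c, name, date) in self._entries if c == client)


index = LibraryIndex()
index.record("Computer1", "File_A", "1999-01-01", IndexEntry(140, 0, 4096))
index.record("Computer1", "File_B", "1999-01-01", IndexEntry(150, 8192, 1024))
```

Keeping the physical location inside the entry, and exposing only names and dates through `describe`, is one possible way to let other components inspect the index without depending on the media layout.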
The particular media component 130 resides on a computing device physically responsible for operating the library media for which the particular media component is responsible, or it must be in active communication with that computing device. The media component also communicates with the management component 110, since the management component is responsible for the allocation of physical media for archival purposes.
The backup devices 140, 150, and 160 can comprise many different types of media, such as massively parallel fast access magnetic media, tape jukebox media, or optical jukebox media devices. Which backup device is used depends on several parameters. These include the frequency of accesses over time, the importance of the backup file or data and the urgency of its retrieval, and how long ago the backup was made.
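As a non-limiting sketch, the selection among media types according to the parameters listed above could look like the following. The function name and the numeric thresholds are purely hypothetical; the disclosure names the parameters but does not prescribe how they are weighed.

```python
def choose_device_class(accesses_per_month, urgent, age_days):
    """Pick a media class from access frequency, retrieval urgency, and backup age.

    All thresholds below are illustrative assumptions, not values from the disclosure.
    """
    if urgent or accesses_per_month >= 10:
        return "magnetic"  # fast access media for frequently or urgently needed data
    if age_days <= 365:
        return "tape"      # tape jukebox for longer term storage
    return "optical"       # optical jukebox for much longer term storage
```

For example, `choose_device_class(0, False, 1000)` selects the optical class under these assumed thresholds, while any urgent item is placed on magnetic media regardless of its age.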
The management component 110 directs many aspects of a backup and a retrieval, including scheduling policies, aging policies, index pruning policies, drive cleaning policies, configuration information, keeping track of all running and waiting jobs, allocation of drives, type of backup (i.e. full, incremental, or differential), tracking different applications running on each client, and tracking media. First, for storage, the management component 110 contains the scheduling information for a timetable of backups for the computing devices. It should be noted that any number of computing devices might be involved, and that the computing devices may be interconnected.
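The timetable of backups held by the management component might, in one illustrative sketch, be represented as a per-client mapping from weekday to backup type. The `SCHEDULE` layout, the `DEFAULT_TYPE` value, and the `jobs_due` helper are assumptions introduced here for illustration only.

```python
from datetime import date

# Hypothetical timetable: client name -> {weekday number: backup type}.
# Weekday numbers follow date.weekday(), where Monday is 0 and Sunday is 6.
SCHEDULE = {
    "Computer1": {6: "full"},
    "Computer2": {2: "differential", 6: "full"},
}
DEFAULT_TYPE = "incremental"  # assumed type for days without an explicit entry


def jobs_due(schedule, day):
    """Return (client, backup type) pairs that the timetable calls for on a given day."""
    jobs = []
    for client in sorted(schedule):
        backup_type = schedule[client].get(day.weekday(), DEFAULT_TYPE)
        jobs.append((client, backup_type))
    return jobs
```

Under this assumed timetable, a Sunday yields a full backup for both clients, while other days fall back to the default type unless a client overrides it.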
Fig. 2 is a schematic block diagram of a modular backup system working in conjunction with a storage area network (SAN) system 250. A computing device 200 contains and operates a management component 202, which is responsible for the coordination of backup, storage, retrieval, and restoration of files and data on a computer network system 290. The management component 202 coordinates the aspects of these functions with a client component 212, running on another computing device 210, and a client component 222 running on yet another computing device 220. The computing device 220 also has an attached data storage device 214, to which it can store data and files locally.
The computing devices 210, 220, and 230 are connected to the SAN system 250 via a connection 264, such as a direct fiber channel connection, or a SCSI connection. However, it should be realized that any type of network connection is possible.
The SAN system 250 environment comprises the connection media 264, routers, and associated hubs for the actual data communication functions of the network, and a file processor 252. The elements of the SAN system 250 not explicitly numbered are implied in a remainder of the SAN system 250.
Another computing device 230 contains another client component 232. However, the computing device 230 is connected, through a network 270, to a file processor 252 for interaction with the SAN system 250 through another network 265. This network could be any type of network, such as a LAN operating under a TCP/IP protocol.
The client components 232, 222, or 212 coordinate and direct local backup and retrieval functions on the computing devices 230, 220, and 210, respectively. The management component 202 coordinates and directs the overall network backup of the computer network 290.
The computing devices 210, 220, and 230 can all be different architectures of machines running different operating systems. Hardware systems could include those made by SUN, Hewlett/Packard, Intel based families of processors, and machines based on the RS6000 and PowerPC families of processors, to name a few. Operating systems can include the many flavors of UNIX and UNIX-like operating systems, such as HP/UX, Solaris, AIX, and Linux, to name a few, as well as Windows NT by Microsoft.
The file processor 252 of the SAN system 250 contains a client component 262 and a media component 260. Storage media 257, 258, and 259 are communicatively coupled to the file processor 252 for storage of network files from the computing devices 210, 220, and 230. These storage devices can be magnetic media for fast retrieval, tape media for longer term storage, or optical media for much longer term storage.
The overall SAN system 250 acts as a block access device to the computing devices 210, 220, and 230. Thus, the overall SAN system 250 acts as a virtual media device and centralizes the network file system from the computing devices 210, 220, and 230. As such, true dynamic sharing of the data and files through the SAN system 250 is possible. These data and files are available to the computing devices 210, 220, and 230. The computing devices 210, 220, and 230 present their network file and data requests to the file processor 252 over the SAN network media 264 and the remainder of the SAN system 250 as they would to any other storage media available to that computing device. The file processor 252, working in accordance with its software, interprets the data and file requests from the external computing devices. The file processor 252 then performs the file or data request based on the information it is given, and responds accordingly to the file or data request. The network file system is maintained and operated on solely by the file processor 252 of the SAN system 250. All accesses, writes, reads, and requests for information on any files and/or data under the network file system are handled by the SAN system 250, and in particular the file processor 252.
The file processor 252 keeps track of all the stored files and/or data stored on the media devices 257, 258, and 259. The file processor 252 maintains and presents a file system view of the stored data and/or files to the computing devices 210, 220, and 230 over the remainder of the SAN system 250 and the SAN network media 264. The computing devices 210, 220, and 230, when accessing or inquiring about portions of the network file system, perform these functions by requesting them through the file processor 252 of the SAN system 250.
The SAN system 250 allows access to the files and/or data stored in its storage media, and actually performs all the function of a file system to the attached computing devices 210, 220, and 230. Opening, closing, reading, and writing of data to files and of files themselves actually look and perform like a normal file system to the attached computing devices 210, 220, and 230. These actions are transparent to the computing devices. As such, the SAN system 250 acts and performs as a file system to the rest of the computing devices connected to the file processor 252. Also, from the perspective of the computing devices, each computing device can access and view the data and/or files stored by the file processor 252 of the SAN system 250 as part of a large, monolithic file system.
A client component 262 and a media component 260 can be part of the SAN system 250. These components work in conjunction with other components present in the network environment, including the file processor 252 itself, to make up a network backup and retrieval system for the computer network 290.
In the case of a local storage type archival request, the files and/or data, as stored on the media devices 257, 258, and 259, are routed by the file processor 252 to the media component 260. This request can be made by either the client component 262 local to the file processor 252, or the management component 202 overseeing the network backup as a whole. The files and/or data are then archived on a library media 275 managed by the media component 260, which maintains an index to easily locate the particular data and/or file that has been archived. As such, the data and/or files stored on the network file system can be archived to the library media 275 through the interaction of these components. When archived, the media component 260 maintains an index to the locations of the stored files and/or data. In the present invention, the media component 260, the client component 262, or the management component 202 may use the indices created during each backup and retrieval request to create a logical file system extension based on the contents of the library media 275, which contains all the archived data and/or files.
The media component 260, the client component 262, or the management component 202 may create a pathname to any particular archived data or file, and this pathname specially indicates to the file processor 252 that the data and/or file is stored as an archive in the media library 275. Specifically, the media component 260 or the client component 262 can use this information to correlate the pathname with the actual storage location in the library media 275 by reconverting the pathname to the index as created by the media component 260 at the time of backup.
This pathname information can be made available to the file processor 252, which integrates the file extensions and the information into the existing network file system that it currently maintains. In one embodiment, given a filename, the file processor 252 can query the client component 262 to access the archived file as directed in the requested manner. It may also query the management component 202 to do the same. The client component 262 or the management component 202, as the case may be, then translates the pathname to the appropriate index as stored by the media component 260, and requests that the media component 260 access the file as requested. The media component 260 then accesses the requested entry and returns the result of the query, whether that is information about the data or file, or the data or file itself. The client component 262 or the management component 202, as the case may be, then directs the data or the file itself to the file processor 252. If an external device, such as the computing device 210, initially requested the information, the file processor 252 would direct the result to that particular computing device.
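The request flow just described, from the file processor through the client component to the media component, may be sketched as follows. The class and method names, the in-memory dictionaries, and the three-part pathname layout under "/Backups/" (machine, date, item name, following the Fig. 4 example) are assumptions of this sketch rather than a definitive implementation.

```python
ARCHIVE_PREFIX = "/Backups/"  # marks a pathname as referring to archived data


def pathname_to_index_key(pathname):
    """Convert an archive pathname back into a (machine, date, name) index key."""
    if not pathname.startswith(ARCHIVE_PREFIX):
        raise ValueError("not an archive pathname: " + pathname)
    parts = pathname[len(ARCHIVE_PREFIX):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError("malformed archive pathname: " + pathname)
    return tuple(parts)


class MediaComponent:
    def __init__(self, index):
        self._index = index  # index key -> archived content (stands in for library media)

    def access(self, key):
        return self._index[key]


class ClientComponent:
    def __init__(self, media):
        self._media = media

    def fetch(self, pathname):
        # Translate the pathname to the index, then ask the media component for the entry.
        return self._media.access(pathname_to_index_key(pathname))


class FileProcessor:
    def __init__(self, client, live_files):
        self._client = client
        self._live_files = live_files  # ordinary network file system content

    def read(self, pathname):
        if pathname.startswith(ARCHIVE_PREFIX):
            return self._client.fetch(pathname)
        return self._live_files[pathname]


media = MediaComponent({("Computer1", "1999-01-01", "File_A"): b"archived bytes"})
processor = FileProcessor(ClientComponent(media), {"/home/notes.txt": b"live bytes"})
```

In this sketch a requesting computing device would call only `processor.read`, so archived and ordinary files are reached through the same interface.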
Fig. 3 is a schematic block diagram of the interaction of the library media and the media component in the file processor of Fig. 2 as implemented in a SAN system. As shown, a library media 310 controlled by a media component 320 may comprise a number of different storage media, or may just comprise one. In Fig. 3, the library media 310 comprises a fast, alterable random access device 312, a fast, non-alterable random access device 314, a serial device 316, a slow, alterable random access device 318, and a slow, non-alterable random access device 319.
An example of the fast, alterable random access device 312 includes various magnetic media, such as a disc drive, that could include multiple writing surfaces. An example of the fast, non-alterable random access device 314 includes a multi disc magneto-optical system. An example of the slow, alterable random access device 318 includes jukeboxes containing CD-ROM disc drive cartridges. An example of the slow, non-alterable random access device 319 includes jukeboxes containing WORM optical discs. An example of the serial device 316 could include a magnetic tape cartridge jukebox. Upon viewing the present disclosure, one skilled in the art will realize that the media storage devices can include many other types of storage devices adapted to store data from such a system, and are not limited to those listed.
The media component 320 would control the placement of files, sectors, and other archival information on the appropriate library media. This placement could be controlled according to the parameters of the backup, such as proximity in date, or whether the archived data is alterable in the archived form. Other parameters to consider could be the relative frequency of requests to the data or the importance of the data as determined by a client component or a management component directing those parameters. Thus, in the case of differential backups, portions of the archived file may reside across several different media. Older portions may be contained in the device 314, while newer updated versions of that block may be contained in the device 312. Portions that have not changed may still be in other library devices.
Fig. 4 is a tree diagram of a network file system maintained by the file processor of a SAN system implementing a path extension to data archived by the modular backup system, all of Fig. 2. The normal network file system has a root directory "/", with various partitions relating to different functions and/or different data sets as determined by the functions and configurations of the computing devices 210, 220, and 230, all of Fig. 2. In this embodiment, the archived portion of the network file system resides in the subdirectory "Backups/".
As each archive backup is performed, or as each archive cleanup and erasing occurs, the network file structure would change to reflect those additions or deletions. For example, the subdirectories to the backup directory could be grouped by machine, by date, or by combination, just to name a few schemes.
Thus, according to the example shown, when the media component 260 performs an archival backup of the computing device 210, both of Fig. 2, the media component 260 would perform an index conversion of the backup index to a filename reflecting the relationship. In the case presented, the archived data and/or files are grouped by machine first, then by archive date. The files backed up on a given day from computing device 210 are given a name corresponding to that particular machine. In one example, detailed in Fig. 4, the name of the computing device 210 corresponds to "Computer1." Thus, the media component 260 derives a filename for each file backed up to the library media 275 from the computing device 210, and the filename would take the form of "/Backups/Computer1/<date>/<filename>" in the network file system. In the example as depicted in Fig. 4, the file system indicates that the files "File_A", "File_B", and "File_C" on the computer corresponding to "Computer1" were backed up on January 1, 1999. The file system also indicates that these same files were backed up on January 2, along with "File_D", a sector of data indicated by "Sector_E", and a portion of another file, "File_F". It should be noted that the client component 262 or the management component 202 might perform this naming function based on the backed up items, as well as from the indexing functions of the media component 260.
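The index conversion described above, grouping by machine and then by archive date, can be illustrated with the short sketch below. The helper names and the ISO-style date strings are assumptions; the entries mirror the items named in the Fig. 4 example.

```python
def archive_pathname(machine, backup_date, item_name):
    """Derive the network file system name for one archived item."""
    return f"/Backups/{machine}/{backup_date}/{item_name}"


def build_tree(entries):
    """Group index entries by machine first, then by archive date."""
    tree = {}
    for machine, backup_date, item_name in entries:
        tree.setdefault(machine, {}).setdefault(backup_date, []).append(item_name)
    return tree


# Entries corresponding to the Fig. 4 example; the date format is hypothetical.
day_one = ["File_A", "File_B", "File_C"]
day_two = day_one + ["File_D", "Sector_E", "File_F"]
entries = [("Computer1", "1999-01-01", name) for name in day_one]
entries += [("Computer1", "1999-01-02", name) for name in day_two]

tree = build_tree(entries)
paths = [archive_pathname(*entry) for entry in entries]
```

Each dated subdirectory produced this way corresponds to one backup, which is what allows the file system view to preserve a separate entry for every archived version.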
The media component 260, the client component 262, or the management component 202, as the case may be, relays this pathname to the file processor 252, which can then modify the file system view as presented to the other computing devices. In this way, versions of the file system, which are in effect snapshots of the file system image, are preserved with each backup.
As such, the complete information about archived data and/or files is readily visible and accessible to any of the computing devices 210, 220, and 230, all of Fig. 2, presently using the network file system.
It should be noted that an archival backup could take several forms. A backup can target data and files on a sector or block write basis, or can be performed on a file basis. In the case of an incremental backup, for example, only those blocks or files that have been altered would be stored for backup and retrieval purposes. In the case of a differential backup, only those changed blocks as contained within an altered file would be stored. Or, criteria, such as file size, can be used to determine a hybrid backup strategy wherein both files and blocks are saved, depending on the criteria employed. The media component 260, Fig. 2, can readily incorporate these forms of backups into the scheme. In the case of an incremental backup, the data and/or file is displayed as part of the file system as relayed to the file processor 252, Fig. 2. Or, should a differential backup be used, the media component 260 can maintain internally a linked list of the actual blocks or sectors that make up a file or a chunk of data for each modification. The file system can display the file and/or data individually as grouped by modification date, and the media component 260 would coordinate in pulling the appropriate sectors or blocks out of the media library 275, Fig. 2, that make up that particular file or data set as of the modification as shown in the network file system.
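A minimal sketch of how the blocks making up a file as of a given modification might be assembled is shown below. It assumes, for illustration only, that each backup records just the blocks changed since the previous one, that the first entry is a full copy, and that dates are ISO-style strings that sort chronologically; an ordered Python list stands in for the linked list mentioned above.

```python
def reconstruct(versions, as_of):
    """Rebuild a file as it stood on the backup dated `as_of`.

    `versions` is ordered oldest first; each item is (date, {block number: bytes}),
    holding only the blocks changed in that backup. The first item is a full copy.
    """
    blocks = {}
    for backup_date, changed in versions:
        if backup_date > as_of:
            break
        blocks.update(changed)  # newer blocks replace older ones with the same number
    return b"".join(blocks[number] for number in sorted(blocks))


versions = [
    ("1999-01-01", {0: b"AAAA", 1: b"BBBB"}),  # full copy
    ("1999-01-02", {1: b"bbbb"}),              # only block 1 changed
]
```

Here the January 2 view combines the unchanged block 0 from the first backup with the updated block 1, even though the two blocks may reside on different library devices.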
In view of the above detailed description of the present invention and associated drawings, other modifications and variations will now become apparent to those skilled in the art. It should also be apparent that such other modifications and variations may be effected without departing from the spirit and scope of the present invention as set forth in this specification.

Claims

1. A storage system comprising: a computing device for storing information in the storage system, the computing device having a management component that directs the storing of information in the storage system; at least one client component operating on at least one other computing device; the management component coordinating the storing of information in the storage system by interaction with the at least one client component.
2. The storage system of claim 1 wherein the management component manages backup and retrieval of information according to predetermined storage policies.
3. The storage system of claim 2 wherein the storage policies consist of the following: scheduling policies, aging policies, index pruning policies, drive cleaning policies, configuration information, tracking all running and waiting jobs, allocating drives, selecting a type of backup, tracking different applications running on each client, and tracking media types.
4. The storage system of claim 1 wherein the management component contains scheduling information for a timetable of backups for the computing devices.
5. The storage system of claim 4 wherein the computing devices are interconnected.
6. The storage system of claim 1 further comprising a modular backup system that works in conjunction with a storage area network (SAN) system.
7. The storage system of claim 1 wherein the information in the storage system comprises data.
8. The storage system of claim 1 wherein the information in the storage system comprises files.
9. The storage system of claim 1 wherein the computing device further comprises an attached data storage device, to which it can store data and files locally.
10. The storage system of claim 6 wherein the computing devices are connected to the SAN system via a direct fiber channel connection.
11. The storage system of claim 6 wherein the computing devices are connected to the SAN system via a SCSI connection.
12. The storage system of claim 1 wherein the information is archived on a library media managed by the media component, which maintains an index to easily locate the particular information that has been archived.
13. The storage system of claim 12 wherein at least one of the media component, the client component, or the management component use the indices created during each backup and retrieval request to create a logical file system extension based on the contents of the library media, which contains all the archived information.
14. The storage system of claim 12 wherein at least one of the media component, the client component, or the management component create a pathname to any particular archived information, the pathname indicating to the file processor that the information is stored as an archive in the media library.
15. The storage system of claim 14 wherein at least one of the media component or the client component use the pathname information to correlate the pathname with the actual storage location in the library media by reconverting the pathname to the index as created by the media component at the time of backup.
16. A method for storing information in a storage system comprising: configuring a computing device for storing information in the storage system; directing the storing of information in the computing device of the storage system with a management component; coordinating the storing of information in the storage system by interaction with at least one client component, the at least one client component operating on at least one other computing device.
PCT/US2000/019363 1999-07-14 2000-07-14 Modular backup and retrieval system with an integrated storage area file system WO2001004755A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US14374499P 1999-07-14 1999-07-14
US14374399P 1999-07-14 1999-07-14
US60/143,744 1999-07-14
US60/143,743 1999-07-14
US60997700A 2000-07-05 2000-07-05
US09/609,977 2000-07-05

Publications (1)

Publication Number Publication Date
WO2001004755A1 true WO2001004755A1 (en) 2001-01-18

Family

ID=27385974

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/019363 WO2001004755A1 (en) 1999-07-14 2000-07-14 Modular backup and retrieval system with an integrated storage area file system

Country Status (1)

Country Link
WO (1) WO2001004755A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0259912A1 (en) * 1986-09-12 1988-03-16 Hewlett-Packard Limited File backup facility for a community of personal computers
US5005122A (en) * 1987-09-08 1991-04-02 Digital Equipment Corporation Arrangement with cooperating management server node and network service node
WO1995013580A1 (en) * 1993-11-09 1995-05-18 Arcada Software Data backup and restore system for a computer network
US5673381A (en) * 1994-05-27 1997-09-30 Cheyenne Software International Sales Corp. System and parallel streaming and data stripping to back-up a network
EP0809184A1 (en) * 1996-05-23 1997-11-26 International Business Machines Corporation Availability and recovery of files using copy storage pools
EP0899662A1 (en) * 1997-08-29 1999-03-03 Hewlett-Packard Company Backup and restore system for a computer network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JANDER M: "LAUNCHING STORAGE-AREA NET", DATA COMMUNICATIONS,US,MCGRAW HILL. NEW YORK, vol. 27, no. 4, 21 March 1998 (1998-03-21), pages 64 - 72, XP000740968, ISSN: 0363-6399 *
LUIS-FELIPE CABRERA ET AL: "ADSM: A MULTI-PLATFORM, SCALABLE, BACKUP AND ARCHIVE MASS STORAGE SYSTEM", DIGEST OF PAPERS OF THE COMPUTER SOCIETY COMPUTER CONFERENCE (SPRING) COMPCON,US,LOS ALAMITOS, IEEE COMP. SOC. PRESS, vol. CONF. 40, 5 March 1995 (1995-03-05), pages 420 - 427, XP000545451, ISBN: 0-7803-2657-1 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8898411B2 (en) 2002-10-07 2014-11-25 Commvault Systems, Inc. Snapshot storage and management system with indexing and user interface
US8285671B2 (en) 2003-11-13 2012-10-09 Commvault Systems, Inc. System and method for performing integrated storage operations
US7979389B2 (en) 2003-11-13 2011-07-12 Commvault Systems, Inc. System and method for performing integrated storage operations
US8959299B2 (en) 2004-11-15 2015-02-17 Commvault Systems, Inc. Using a snapshot as a data source
US10402277B2 (en) 2004-11-15 2019-09-03 Commvault Systems, Inc. Using a snapshot as a data source
US8595253B2 (en) 2007-10-05 2013-11-26 Imation Corp. Methods for controlling remote archiving systems
US8250088B2 (en) * 2007-10-05 2012-08-21 Imation Corp. Methods for controlling remote archiving systems
US9116900B2 (en) 2007-10-05 2015-08-25 Imation Corp. Methods for controlling remote archiving systems
US8699178B2 (en) 2008-07-11 2014-04-15 Imation Corp. Library system with connector for removable cartridges
US10997035B2 (en) 2008-09-16 2021-05-04 Commvault Systems, Inc. Using a snapshot as a data source
US9092500B2 (en) 2009-09-03 2015-07-28 Commvault Systems, Inc. Utilizing snapshots for access to databases and other applications
US9268602B2 (en) 2009-09-14 2016-02-23 Commvault Systems, Inc. Systems and methods for performing data management operations using snapshots
US10831608B2 (en) 2009-09-14 2020-11-10 Commvault Systems, Inc. Systems and methods for performing data management operations using snapshots
US8595191B2 (en) 2009-12-31 2013-11-26 Commvault Systems, Inc. Systems and methods for performing data management operations using snapshots
US10379957B2 (en) 2009-12-31 2019-08-13 Commvault Systems, Inc. Systems and methods for analyzing snapshots
US9298559B2 (en) 2009-12-31 2016-03-29 Commvault Systems, Inc. Systems and methods for analyzing snapshots
US10311150B2 (en) 2015-04-10 2019-06-04 Commvault Systems, Inc. Using a Unix-based file system to manage and serve clones to windows-based computing clients
US11232065B2 (en) 2015-04-10 2022-01-25 Commvault Systems, Inc. Using a Unix-based file system to manage and serve clones to windows-based computing clients

Similar Documents

Publication Publication Date Title
US7035880B1 (en) Modular backup and retrieval system used in conjunction with a storage area network
US7389311B1 (en) Modular backup and retrieval system
US8171244B2 (en) Methods for implementation of worm mode on a removable disk drive storage system
US6438642B1 (en) File-based virtual storage file system, method and computer program product for automated file management on multiple file system storage devices
US6047294A (en) Logical restore from a physical backup in a computer storage system
US5887151A (en) Method and apparatus for performing a modified prefetch which sends a list identifying a plurality of data blocks
CN100416508C (en) Copy operations in storage networks
US20070174580A1 (en) Scalable storage architecture
US7386552B2 (en) Methods of migrating data between storage apparatuses
US20020069324A1 (en) Scalable storage architecture
US20050193235A1 (en) Emulated storage system
EP0535922A2 (en) Automated library data retrieval system
WO2004034197A2 (en) System and method for managing stored data
JPH0589583A (en) Method of deleting object from volume and hierarchical storage system
CN101147118A (en) Methods and apparatus for reconfiguring a storage system
EP1934756A2 (en) System for archival storage of data
WO2001004756A1 (en) Modular backup and retrieval system used in conjunction with a storage area network
WO2001004755A1 (en) Modular backup and retrieval system with an integrated storage area file system
US7080223B2 (en) Apparatus and method to manage and copy computer files
Koltsidas et al. Seamlessly integrating disk and tape in a multi-tiered distributed file system
Kishi The IBM virtual tape server: Making tape controllers more autonomic
Castets et al. The IBM TotalStorage Tape Selection and Differentiation Guide

Legal Events

Date Code Title Description
AL Designated countries for regional patents; Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
122 EP: PCT application non-entry in European phase