CA2352736A1 - Collection recognizer - Google Patents

Publication number
CA2352736A1
CA2352736A1 CA002352736A CA2352736A CA2352736A1 CA 2352736 A1 CA2352736 A1 CA 2352736A1 CA 002352736 A CA002352736 A CA 002352736A CA 2352736 A CA2352736 A CA 2352736A CA 2352736 A1 CA2352736 A1 CA 2352736A1
Authority
CA
Canada
Prior art keywords
collection
collections
information
signatures
search space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002352736A
Other languages
French (fr)
Inventor
Kevin W. Jameson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Publication of CA2352736A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99936 Pattern matching access
    • Y10S707/99939 Privileged access
    • Y10S707/99951 File or database maintenance
    • Y10S707/99952 Coherency, e.g. same view to multiple users
    • Y10S707/99953 Recoverability

Abstract

Collection recognizers improve the productivity of knowledge workers by enabling automated systems to recognize interesting collections of arbitrary computer files for automated processing. In operation, a collection recognizer detects collections within a search space, selects interesting collections from the group of detected collections, and finally makes information about the selected collections available to software programs for subsequent automated processing. Collection recognizers help to enable the construction of fully automated collection processing systems.

Description

Patent Application of Kevin W Jameson For COLLECTION RECOGNIZER
CROSS REFERENCES TO RELATED APPLICATIONS
None.
FIELD OF THE INVENTION
This invention relates to automated software systems for processing collections of computer files in arbitrary ways, thereby improving the productivity of software developers, web media developers, and other humans and computer systems that work with collections of computer files.
BACKGROUND OF THE INVENTION
The general problem addressed by this invention is the low productivity of human knowledge workers who use labor-intensive manual processes to work with collections of computer files. One promising solution strategy for this software productivity problem is to build automated systems to replace manual human effort.
Unfortunately, replacing arbitrary manual processes performed on arbitrary computer files with automated systems is a difficult thing to do. Many challenging subproblems must be solved before competent automated systems can be constructed. As a consequence, the general software productivity problem has not been solved yet, despite large industry investments of time and money over several decades.
The present invention provides one piece of the overall functionality required to implement automated systems for processing collections of computer files. In particular, the current invention has a practical application in the technological arts because it provides application programs with a convenient, precise, scalable, and fully automated means for recognizing particular collections of files for automated processing.
The Collection Recognition problem is one important problem that must be solved to enable the construction of automated processing systems. It is the problem of how to automatically recognize particular collections of files for automated processing.
Some interesting characteristics of the collection recognition problem that make it difficult to solve include at least these: collections can have arbitrary data type; collections can have arbitrary size and content; collections can have arbitrary internal structure; collections can require arbitrary processing; collections can be arbitrarily located within a filesystem, database, or network search space; only a few interesting collections might be selected from a large pool of collections; selection processes can use internal content or external filesystem attributes; and arbitrary numbers of collections may be involved.
General Shortcomings of the Prior Art
A professional prior art search for the present invention was performed, but produced no meaningful, relevant works of prior art. Therefore the following discussion is general in nature, and highlights the significant conceptual differences between file-oriented mechanisms in the prior art and the novel collection-oriented mechanisms represented by the present invention.
Prior art approaches lack support for collections. This is the largest limitation of all because it prevents the use of high-level collection abstractions that can significantly aid productivity.
Prior art approaches lack user-defined data types for collections of files. This is a significant limitation because user-defined data types are a primary mechanism for carrying relevant semantic information about collections of files.
Prior art approaches lack shared data types for collections of files. This is a significant limitation because sharable type definitions are a primary mechanism for propagation and reuse of important collection type information.
Prior art approaches lack user-defined per-collection instance data. This is a significant limitation because per-instance data is the primary mechanism for augmenting or overriding general type definition information shared among all collections of a particular type.
Prior art approaches lack the ability to use collection type definition information and collection instance data for match criteria in collection recognition searches. This is a significant limitation because collection type definition and collection instance data are both rich sources of useful recognition matching information.
As can be seen from the above description, prior art approaches have several important disadvantages. Notably, prior art approaches do not support collections, do not support user-defined collection instance information, and do not support user-defined collection data types. These are the three most important limitations of all.
In contrast, the present collection recognizer invention has none of these limitations, as the following disclosure will show.
SUMMARY OF THE INVENTION
A collection recognizer dynamically detects and selects collections from within a search space, and makes the resulting collection recognition information available to software programs, thereby enabling the construction of fully automated software systems for processing collections of arbitrary computer files.
In operation, a collection recognizer is used by an application program to recognize interesting collections of files for processing. A collection recognizer first detects a set of interesting collection signatures from within a search space using signature detection criteria, thereby forming a first pool of detected collections. From the first pool of detected collections, a second pool of selected collections is created, using various selection criteria. Selection criteria can include search space information, collection instance information, collection content information, and collection type definition information. Ultimately, a collection recognizer returns information about detected and selected collections to a calling program for subsequent processing.
Collection recognizers solve the collection recognition problem by providing software programs with a generalized, precise, scalable, customizable, and extensible means for recognizing collections within a filesystem search space. In particular, collection recognizers return information-rich collection data structures back to calling software programs. Collection recognizers thus enable automated collection processing systems to recognize collections of arbitrary computer files in more precise, more automated, more scalable, and more knowledgeable ways than were previously possible.
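The detect-then-select flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation disclosed here: the names `Candidate`, `recognize`, `detect`, and `select`, the sample paths, and the type values "web-page" and "program" are all hypothetical. The search space is modeled as any iterable of candidates, so the same function could sit in front of a filesystem, database, or network source.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Candidate:
    """One entry in a search space (hypothetical shape)."""
    location: str                                   # e.g. a directory path or table key
    attributes: dict = field(default_factory=dict)  # e.g. file names, collection type


def recognize(search_space: Iterable[Candidate],
              detect: Callable[[Candidate], bool],
              select: Callable[[Candidate], bool]) -> dict:
    """Run detection, then selection, and return both pools."""
    # First pool: every candidate that matches the signature detection criteria.
    detected = [c for c in search_space if detect(c)]
    # Second pool: detected collections that also pass the selection criteria.
    selected = [c for c in detected if select(c)]
    return {"detected": detected, "selected": selected}


# Usage: detect anything holding a "cspec" file, then keep one collection type.
space = [
    Candidate("/site/c-myhomepage", {"files": ["cspec"], "type": "web-page"}),
    Candidate("/site/notes", {"files": ["todo.txt"]}),
    Candidate("/site/c-tool", {"files": ["cspec"], "type": "program"}),
]
result = recognize(
    space,
    detect=lambda c: "cspec" in c.attributes.get("files", []),
    select=lambda c: c.attributes.get("type") == "web-page",
)
```

Passing the criteria in as functions keeps the recognizer independent of any one search space or selection policy, which is the property the summary emphasizes.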
OBJECTS AND ADVANTAGES
The present collection recognizer invention solves all of the general prior art limitations described previously. Specifically, collection recognizers support collections of files, support user-defined collection types, support shared collection types, support user-defined per-collection instance data, and support use of collection type and instance data in recognition searches.

The present collection recognizer invention also has the following additional objects and advantages.
One object of the present invention is to provide a generalized, fully automated collection recognizer means for software programs, thereby enabling the construction of generalized, large-scale, automated collection processing systems.
Another object is to provide sufficient flexibility, extensibility, and capacity to strongly resist scale-up failure, thereby enabling automated collection recognizers and collection processing systems to scale up smoothly, with reduced risk of scale-up failure.
Another object is to provide a collection recognition model that is independent of search space type, thereby enabling collection recognition searches to be conducted using various search spaces including filesystems, databases, and distributed networks.
Another object is to produce information-rich data structures from the recognition process, containing both collection information and recognition process information, thereby saving application programs the effort of obtaining collection and process information themselves.
Other features and advantages of the present Collection Recognizer invention will become apparent upon further reading of the drawings and disclosure that follow.
BRIEF DESCRIPTION OF DRAWINGS
FIG 1 shows a sample prior art filesystem folder in a typical personal computer filesystem.
FIG 2 shows how a portion of the prior art folder in FIG 1 has been converted into a collection 100 by the addition of a collection specifier file 102 named "cspec" FIG 2 Line 5.
FIG 3 shows an example physical representation of a collection specifier 102, implemented as a simple text file such as would be used on a typical personal computer filesystem.
FIG 4 shows four major information groupings for collections, including collection type definition 101, collection specifier 102, collection content 103, and collection 100.
FIG 5 shows a more detailed view of the information groupings in FIG 4, illustrating several particular kinds of per-collection-instance and per-collection-type information.
FIG 6 shows a logical diagram of how a Collection Information Manager Means would act as an interface between an application program means 110 and a collection information means 107, including collection information sources 101-103.
FIG 7 shows a physical software embodiment of how an Application Program Means would use a Collection Information Manager Means 111 to obtain collection information from various collection information API means 112-114 connected to various collection information server means 115-117.
FIG 8 shows an example software collection datastructure that relates collection specifier and collection content information for a single collection instance.
FIG 9 shows an example collection type definition datastructure, such as might be used by software programs that process collections.
FIG 10 shows a more detailed example of the kinds of information found in collection type definitions.
FIG 11 shows a simplified architecture of a Collection Recognizer Means 130 connected to a Collection Signature Search Space Means 108 and a Collection Information Means 107.
FIG 12 shows possible information flows across a collection recognizer API (Application Programming Interface) interface, illustrating various input and output information flows across the interface.
FIG 13 shows an expanded architecture of the Collection Recognizer Means 130 shown in FIG 11.
FIG 14 shows a simplified algorithm for performing collection recognition, using the software components shown in FIG 13.
FIG 15 shows an example datastructure of collection recognizer output information, containing a list of detected and selected collections and other information.
FIG 16 shows a tree of collections stored within a typical personal computer filesystem.
FIG 17 shows a derived list search space view based on the collection tree shown in FIG 16. The derived list search space, a simple text file, is comprised of collection specifier accessor pathnames and collection type values.
FIG 18 shows a simplified algorithm for a collection recognizer using the derived text-file search space of FIG 17.
FIG 19 shows an example logical database table layout for a derived database search space based on the collection tree shown in FIG 16. The derived database search space is composed of one database table containing at least 2 columns describing collection accessor and collection type values.
FIG 20 shows example software function interfaces from a non-collection-aware (NCA) filesystem API 163 typical of a modern personal computer.
FIG 21 shows a simplified architecture of how a Collection Recognizer Means 130 might use both collection-aware (CA) 162 and non-collection-aware (NCA) 163 API interfaces to perform collection recognition activities.
FIG 22 shows example software function interfaces that might be part of a collection-aware (CA) filesystem API 162.
FIG 23 shows sample collection signature criteria constructed from filesystem attributes provided by a typical non-collection-aware (NCA) filesystem API 163 implementation.
FIG 24 shows a simplified upsearch algorithm for detecting a collection signature above the current working directory, using a non-collection-aware (NCA) 163 filesystem search space.
FIG 25 shows how the algorithm of FIG 24 would proceed to change directories while attempting to detect a collection signature, using a non-collection-aware (NCA) 163 filesystem search space.
FIG 26 shows a simplified up search algorithm for detecting a collection signature, using a collection-aware (CA) 162 filesystem search space.
FIG 27 shows a simplified down search algorithm for detecting collection signatures below an initial starting directory, using a typical non-collection-aware (NCA) 163 filesystem search space.
FIG 28 shows how a down search algorithm might sequentially visit the collections of FIG 16, first according to depth within the tree, and second, according to alphabetic order of collection names.
FIG 29 shows a simplified down search algorithm for detecting collection signatures using a collection-aware (CA) 162 filesystem search space.
FIG 30 shows sample policies for selecting interesting collections from sets of detected collections produced by the up search or down search algorithms mentioned above.
FIG 31 shows sample selection tests based on information contained outside (signature) and inside (content) the collections being selected.
FIG 32 shows an example collection specifier that contains a special command option requesting collection recognizers to skip the host collection during recognition actions.

FIG 33 shows a simplified collection recognition algorithm that includes both detection and selection actions.
FIG 34 shows sample collection recognition values that reflect various recognition policy decisions for filesystem implementations of collections. Three sets of recognition policies are shown.
FIG 35 shows an example database schema and query expression that might be used to represent and perform collection signature detection activities, using a database implementation of collections.
FIG 36 shows an example database schema and query expression that might be used to represent and perform collection specifier accessor calculation activities, using a database implementation of collections.
FIG 37 shows an example database schema and query expression that might be used to represent and perform collection specifier access activities, using a database implementation of collections.
FIG 38 shows an example database schema and query expression that might be used to represent and perform collection content access activities, using a database implementation of collections.
FIG 39 shows an example high-level recognition algorithm that includes both detection and selection actions, for the sample database implementation of collections shown in previous diagrams.
FIG 40 shows sample recognition values that reflect various recognition policy decisions and values for the sample database implementation of collections shown in previous diagrams.
FIG 41 shows a simplified logical architecture for a generic, non-collection-enabled, prior art application program.
FIG 42 shows a simplified logical architecture for a generic, Collection-Enabled Application Program 171, made collection-aware by internally modifying said application program to call a Collection Recognizer Means 130 to recognize collections within an Application Data Server Means 172.
FIG 43 shows a simplified logical architecture for a generic, Collection-Enabled Application Program 171, made collection-aware by adding an external wrapper program to relate said application program with a Collection Recognizer Means 130 to recognize collections within an Application Data Server Means 172.

LIST OF DRAWING REFERENCE NUMBERS
100 A collection formed from a prior art folder
101 Collection type definition information
102 Collection specifier information
103 Collection content information
104 Per-collection collection processing information
105 Per-collection collection type indicator
106 Per-collection content link specifiers
107 Collection information means
108 Collection signature search space
110 Application program means
111 Collection information manager means
112 Collection type definition API means
113 Collection specifier API means
114 Collection content API means
115 Collection type definition server means
116 Collection specifier server means
117 Collection content server means
130 Collection recognizer means
140 Module for managing collection recognition process
141 Module for obtaining runtime information
142 Module for getting detected collections
143 Module for detecting collection signatures
144 Module for selecting from detected collection pool
145 Module for deriving additional recognition information
146 Module for formatting recognition output information
160 Collection recognition enabled application architecture
162 Collection-aware storage system API means
163 Non-collection-aware storage system API means
164 A computer operating system
165 A computer disk storage means
170 A non-collection-enabled application architecture
171 Collection-enabled application program means
172 Application data server means
175 A collection-enabled application architecture
176 A collection-enabled application wrapper program
177 A collection-enabled application wrapper architecture
DETAILED DESCRIPTION
Overview of Collections
This section introduces collections and some related terminology.
Collections are sets of computer files that can be manipulated as a set, rather than as individual files.
Collections are composed of three major parts: (1) a collection specifier that contains information about a collection instance, (2) a collection type definition that contains information about how to process all collections of a particular type, and (3) optional collection content in the form of arbitrary computer files that belong to a collection.
Collection specifiers contain information about a collection instance. For example, collection specifiers may define such things as the collection type, a text summary description of the collection, collection content members, derivable output products, collection processing information such as process parallelism limits, special collection processing steps, and program option overrides for programs that manipulate collections.
Collection specifiers are typically implemented as simple key-value pairs in text files or database tables.
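As an illustration of the key-value form, the sketch below parses a specifier held as text. The line format (a key, whitespace, then the value, with `#` starting a comment) and the key names `coll-type` and `coll-desc` are assumptions made for this example; the text above does not fix a syntax.

```python
def parse_specifier(text: str) -> dict:
    """Parse 'key value' lines into a dict that maps each key to a list of values."""
    spec: dict = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and surrounding space
        if not line:
            continue
        parts = line.split(None, 1)           # split on the first run of whitespace
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        # A list per key keeps every value when a key appears more than once.
        spec.setdefault(key, []).append(value)
    return spec


example = """
# hypothetical specifier content
coll-type  web-page
coll-desc  A sample homepage collection
"""
spec = parse_specifier(example)
```

Storing a list per key is one way to accommodate repeated entries such as several content directives; a database-table implementation would carry the same pairs as rows.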
Collection type definitions are user-defined sets of attributes that can be shared among multiple collections. In practice, collection specifiers contain collection type indicators that reference detailed collection type definitions that are externally stored and shared among all collections of a particular type. Collection type definitions typically define such things as collection types, product types, file types, action types, administrative policy preferences, and other information that is useful to application programs for understanding and processing collections.
Collection content is the set of all files and directories that are members of the collection.
By convention, all files and directories recursively located within an identified set of subtrees are usually considered to be collection members. In addition, collection specifiers can contain collection content directives that add further files to the collection membership. Collection content is also called collection membership.
Collection is a term that refers to the union of a collection specifier and a set of collection content.
Collection information is a term that refers to the union of collection specifier information, collection type definition information, and collection content information.
Collection membership information describes collection content.
Collection information managers are software modules that obtain and organize collection information from collection information stores into information-rich collection data structures that are used by application programs.

Collection Physical Representations - Main Embodiment
Figures 1-3 show the physical form of a simple collection, as would be seen on a personal computer filesystem.
FIG 1 shows an example prior art filesystem folder from a typical personal computer filesystem.
FIG 2 shows the prior art folder of FIG 1, but with a portion of the folder converted into a collection 100 by the addition of a collection specifier file FIG 2 Line 5 named "cspec".
In this example, the collection contents 103 of collection 100 are defined by two implicit policies of a preferred implementation.
First is a policy to specify that the root directory of a collection is a directory that contains a collection specifier file. In this example, the root directory of a collection 100 is a directory named "c-myhomepage" FIG 2 Line 4, which in turn contains a collection specifier file 102 named "cspec" FIG 2 Line 5.
Second is a policy to specify that all files and directories in and below the root directory of a collection are part of the collection content. Therefore directory "s" FIG 2 Line 6, file "homepage.html" FIG 2 Line 7, and file "myphoto.jpg" FIG 2 Line 8 are part of collection content 103 for said collection 100.
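The two policies above can be expressed as a short Python sketch. The function names are hypothetical, and leaving the specifier file itself out of the returned content is an assumption of this example rather than something the text states.

```python
from pathlib import Path
from typing import List

SPECIFIER_NAME = "cspec"  # specifier file name used in the example above


def is_collection_root(directory: Path) -> bool:
    """Policy 1: a collection root is a directory that holds a specifier file."""
    return (directory / SPECIFIER_NAME).is_file()


def collection_content(root: Path) -> List[str]:
    """Policy 2: everything in and below the root is collection content.

    Returns paths relative to the root, sorted. The specifier file is
    excluded, which is an assumption of this sketch.
    """
    if not is_collection_root(root):
        raise ValueError(f"{root} is not a collection root")
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p != root / SPECIFIER_NAME
    )
```

For a root such as "c-myhomepage", `collection_content` would list directory "s" and the files beneath it without any explicit membership list, which is what makes the policies implicit.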
FIG 3 shows an example physical representation of a collection specifier file 102, FIG 2 Line 5, such as would be used on a typical personal computer filesystem.
Collection Information Types
Figures 4-5 show three main kinds of information that are managed by collections.
FIG 4 shows a high-level logical structure of three types of information managed by collections: collection processing information 101, collection specifier information 102, and collection content information 103. A logical collection 100 is comprised of a collection specifier 102 and collection content 103 together. This diagram best illustrates the logical collection information relationships that exist within a preferred filesystem implementation of collections.
FIG 5 shows a more detailed logical structure of the same three types of information shown in FIG 4. There is only one instance of collection type definition information 101 per collection type. Collection content information FIG 4 103 has been labeled as per-instance information in FIG 5 103(i) because there is one instance of collection content information per collection instance. Collection specifier information 102 has been partitioned into collection instance processing information 104, collection-type link information 105, and collection content link information 106. FIG 5 is intended to show several important types of information 104-106 that are contained within collection specifiers 102.
Suppose that an application program means 110 knows (a) how to obtain collection processing information 101, (b) how to obtain collection content information 103, and (c) how to relate the two with per-collection-instance information 102. It follows that application program means 110 would have sufficient knowledge to use collection processing information 101 to process said collection content 103 in useful ways.
Collection specifiers 102 are useful because they enable all per-instance, non-collection-content information to be stored in one physical location. Collection content 103 is not included in collection specifiers because collection content 103 is often large and dispersed among many files.
All per-collection-instance information, including both collection specifier 102 and collection content 103, can be grouped into a single logical collection 100 for illustrative purposes.
Collection Application Architectures
Figures 6-7 show example collection-enabled application program architectures.
FIG 6 shows how a collection information manager means 111 acts as an interface between an application program means 110 and collection information means 107 that includes collection information sources 101-103. Collectively, collection information sources 101-103 are called a collection information means 107. A collection information manager means 111 represents the union of all communication mechanisms used directly or indirectly by an application program means 110 to interact with collection information sources 101-103.
FIG 7 shows a physical software embodiment of how an application program means could use a collection information manager means 111 to obtain collection information from various collection information API (Application Programming Interface) means 112-114 connected to various collection information server means 115-117.
Collection type definition API means 112 provides access to collection type information available from collection type definition server means 115. Collection specifier API means 113 provides access to collection specifier information available from collection specifier server means 116. Collection content API means 114 provides access to collection content available from collection content server means 117.
API means 112-114, although shown here as separate software components for conceptual clarity, may optionally be implemented wholly or in part within a collection information manager means 111, or within said server means 115-117, without loss of functionality.
API means 112-114 may be implemented by any functional communication mechanism known to the art, including but not limited to command line program invocations, subroutine calls, interrupts, network protocols, or file passing techniques.
Server means 115-117 may be implemented by any functional server mechanism known to the art, including but not limited to database servers, local or network file servers, HTTP web servers, FTP servers, NFS servers, or servers that use other communication protocols such as TCP/IP, etc.
Server means 115-117 may use data storage means that may be implemented by any functional storage mechanism known to the art, including but not limited to magnetic or optical disk storage, digital memory such as RAM or flash memory, network storage devices, or other computer memory devices.
Collection information manager means 111, API means 112-114, and server means 115-117 may each or all optionally reside on a separate computer to form a distributed implementation. Alternatively, if a distributed implementation is not desired, all components may be implemented on the same computer.
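One way to picture the layering of FIG 7 is with structural interfaces. The Python sketch below is an assumption-laden illustration: the class and method names (`TypeDefinitionAPI`, `get_specifier`, `InMemoryStore`, and so on) and the sample data are hypothetical, and a toy in-memory backend stands in for server means 115-117.

```python
from typing import Protocol


class TypeDefinitionAPI(Protocol):      # stands in for API means 112
    def get_type_definition(self, type_name: str) -> dict: ...


class SpecifierAPI(Protocol):           # stands in for API means 113
    def get_specifier(self, accessor: str) -> dict: ...


class ContentAPI(Protocol):             # stands in for API means 114
    def list_content(self, accessor: str) -> list: ...


class CollectionInformationManager:     # stands in for manager means 111
    def __init__(self, types: TypeDefinitionAPI, specs: SpecifierAPI,
                 content: ContentAPI):
        self._types, self._specs, self._content = types, specs, content

    def load(self, accessor: str) -> dict:
        """Gather all three kinds of collection information for one collection."""
        spec = self._specs.get_specifier(accessor)
        return {
            "specifier": spec,
            # The specifier's type indicator selects the shared type definition.
            "type_definition": self._types.get_type_definition(spec["type"]),
            "content": self._content.list_content(accessor),
        }


class InMemoryStore:
    """Toy backend that satisfies all three protocols from plain dicts."""
    def __init__(self, types: dict, specs: dict, content: dict):
        self.types, self.specs, self.content = types, specs, content

    def get_type_definition(self, type_name: str) -> dict:
        return self.types[type_name]

    def get_specifier(self, accessor: str) -> dict:
        return self.specs[accessor]

    def list_content(self, accessor: str) -> list:
        return self.content[accessor]


store = InMemoryStore(
    types={"web-page": {"actions": ["publish"]}},
    specs={"/site/c-myhomepage/cspec": {"type": "web-page"}},
    content={"/site/c-myhomepage/cspec": ["s/homepage.html", "s/myphoto.jpg"]},
)
manager = CollectionInformationManager(store, store, store)
info = manager.load("/site/c-myhomepage/cspec")
```

Because the manager depends only on the three interfaces, each backend could be swapped for a database, a network file server, or an HTTP service without changing the calling application, in line with the implementation freedom described above.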
Collection Data Structures
Figures 8-10 show several major collection data structures.
FIG 8 shows an example collection datastructure that contains collection specifier and collection content information for a collection instance. Application programs could use such a datastructure to manage collection information for a collection that is being processed.
In particular, preferred implementations would use collection datastructures to manage collection information for collections being processed. The specific information content of a collection datastructure is determined by implementation policy. However, a collection specifier typically contains at least a collection type indicator FIG 8 Line 4 to link a collection instance to a collection type definition.
FIG 9 shows an example collection type definition datastructure that could be used by application programs to process collections. Specific information content of a collection type definition datastructure is determined by implementation policy. However, collection type definitions typically contain information such as shown in Figures 9-10.
FIG 10 shows example information content for a collection type definition datastructure such as shown in FIG 9. FIG 10 shows information concerning internal collection directory structures, collection content location definitions, collection content datatype definitions, collection processing definitions, and collection results processing definitions. The specific information content of a collection type definition is determined by implementation policy. If desired, more complex definitions and more complex type definition information structures can be used to represent more complex collection structures, collection contents, or collection processing requirements.
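The relationship between these datastructures can be sketched as follows. This is a hedged illustration only: the field names, the `effective_settings` helper, and the sample type "web-page" are hypothetical and are not taken from FIG 8-10. It shows a type indicator linking a collection instance to a shared type definition, with per-instance values layered over the shared defaults.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionTypeDefinition:
    name: str
    defaults: dict = field(default_factory=dict)   # shared by all collections of this type


@dataclass
class CollectionSpecifier:
    coll_type: str                                  # type indicator linking to a definition
    overrides: dict = field(default_factory=dict)   # per-instance processing information


@dataclass
class Collection:
    specifier: CollectionSpecifier
    content: list = field(default_factory=list)     # member file paths


def effective_settings(coll: Collection, registry: dict) -> dict:
    """Resolve the type indicator, then layer instance overrides on top."""
    type_def = registry[coll.specifier.coll_type]
    # Later keys win, so per-instance values override the shared defaults.
    return {**type_def.defaults, **coll.specifier.overrides}


registry = {
    "web-page": CollectionTypeDefinition("web-page", {"parallelism": 4, "publish": True}),
}
page = Collection(CollectionSpecifier("web-page", {"parallelism": 1}), ["s/homepage.html"])
settings = effective_settings(page, registry)
```

Keeping the defaults in one shared definition and only the differences in each specifier is what lets a type definition be reused across many collections.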
Collection Recognizer Architecture
Figures 11-15 show software architectures and algorithms for collection recognizers.
FIG 11 shows a simplified architecture of a Collection Recognizer Means 130 connected to sources of collection signature search space information 108 and collection information 107. Although shown here as separate entities for conceptual clarity, collection signature search spaces 108 and collection information sources 107 are often implemented within the same computer filesystem.
FIG 12 shows example information flows across a collection recognizer means API (Application Programming Interface), illustrating various input and output flows across the interface. The input flows depict search space, detection, and selection criteria.
The output flows depict lists of detected collections, selected collections, and other information provided by the recognition process.
Collection Recognizer Terminology
Collection signature search spaces are computer data storage mechanisms that store collection signatures. Examples of typical collection signature search spaces are typical personal computer filesystems, databases, and network storage mechanisms such as FTP servers, HTTP servers, and so on. In essence, a collection signature search space can be any searchable computer storage medium.
Collection signatures are particular sets of attributes from computer data storage media that indicate the presence of a collection. Examples of typical collection signatures include particular filenames, particular directory names, particular filesystem timestamp attributes, or combinations thereof. FIG 23 lists several possible combinations of filesystem attributes that could be used to define collection signatures. The main purpose of a collection signature is to provide sufficient information to derive a collection specifier accessor for the collection belonging to the signature.
Collection specifier accessors are computer storage system expressions that can be used to access collection specifier information. In preferred filesystem implementations, collection specifier accessors are pathnames to collection specifier files. An explicit collection specifier accessor is an explicit pathname that points to a valid collection specifier file.
FIG 34 shows several collection specifier accessors for typical filesystems.
In database implementations, collection specifier accessors are database expressions that can be used to access collection specifier information. FIG 36 shows an example database table that could be used to store collection specifier accessors for a database implementation.
Collection detection criteria are combinations of search space attributes that define signature match criteria. In operation, collections are detected when their signatures match the current collection detection criteria being used by a searching software module.
Typically, collection detection criteria are designed to be exact matches to collection signatures. However, detection criteria can also be made broader, to detect multiple different collection signatures. FIG 23 shows some example collection signatures that could be used for exact-match detection criteria. FIG 34 shows some additional collection detection policies.
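The difference between exact-match and broader detection criteria can be sketched as follows. This is a simplified illustration that assumes signatures are plain filenames and that detection criteria are shell-style wildcard patterns; the function name and the example patterns are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def detect_signatures(filenames: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Return the filenames that match at least one detection pattern.

    A pattern without wildcards behaves as an exact-match criterion;
    a pattern such as '*.cspec' is a broader criterion that can match
    several different signatures.
    """
    patterns = list(patterns)
    return [name for name in filenames
            if any(fnmatchcase(name, pattern) for pattern in patterns)]
```

For example, the single pattern `cspec` detects only files named exactly "cspec", whereas adding `*.cspec` also detects any file with that suffix.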
Collection selection criteria are collection characteristics that are used to select interesting collections from a pool of detected collections. Selection criteria can be composed of any property, attribute, or content associated with collections, including signature attributes, collection contents, or collection type definition attributes. FIG 30 shows some example selection policies. FIG 31 shows some possible selection tests based on signature properties and content properties.
Collection Recognizer - Operation
FIG 13 shows a detailed architectural view of the collection recognizer means software shown in FIG 11. A collection recognizer manager 140 oversees the collection recognition process.
Module Get Runtime Info 141 obtains and prepares input arguments and runtime information required by the collection recognizer manager 140. Runtime information typically includes command line arguments, environment information, explicit collection accessors provided on the command line, and other implementation configuration options.
If appropriate input arguments and explicit collection specifier accessors are provided directly to the invocation, collection detection activities may be omitted.
This is because the major purpose of the collection detection process, which is to derive collection specifier accessors, has already been fulfilled by the explicit accessors.
Having obtained all necessary input values for a recognition process, Collection Recognizer Manager 140 proceeds to carry out a recognition process with the help of modules 142-145.
Module Get Detected Collections 142 obtains and returns a list of collections that match collection signature match criteria provided to the invocation. Module Detect Collection Signatures API Means 143 interacts with a Collection Signature Search Space 108 means to obtain matching collection signatures.
Module Collection Information Manager 111 is used to retrieve collection information about the detected collections. Collection Information Managers are described in a related patent application. See the cross-references to related applications section of this document for more information.
Collection Information Sources 107 are used to provide collection specifier, collection data type, and collection content information. Collection information sources 107 are not special data storage mechanisms. Rather, they are normal data storage mechanisms known to the art, but with the additional expectation that they contain valid collection information.
Finally, Get Detected Collections 142 returns a list of interesting detected collections and associated information to Collection Recognizer Manager 140.
Module Select Collections 144 selects interesting collections from the pool of detected collections, according to selection criteria provided to the invocation.
Module Derive Additional Recognition Info 145 obtains more detailed information about selected collections and about the recognition process itself. Module Collection Information Manager 111 is used to retrieve collection information about selected collections. Collection Information Sources 107 are used to provide collection specifier, collection data type, and collection content information.
Module Output Recognition Information 146 organizes output collection recognition information from the collection recognition process in preparation for returning final information to Collection Recognizer Manager 140. Optionally, this module could write recognition information to disk, print it to a printer, or otherwise display or distribute recognition information.
FIG 14 shows a simplified algorithm for performing collection recognition, using the software components shown in FIG 13.
FIG 15 shows a datastructure view of example collection recognizer output information such as might be produced by the architecture of FIG 13 and algorithm of FIG 14. The example recognizer output information contains lists of detected and selected collections and other recognition information.
Although a single datastructure has been used in FIG 15 to illustrate and relate detection, selection, and other recognition information for clarity, a single data structure is not required. Other separate datastructures could also achieve the same result, providing that proper associations were maintained among the various information elements returned by the recognition process.
Derived Search Spaces
Figures 16-19 show example derived collection search spaces in text file and database formats, along with collection recognizer algorithms for processing the derived search spaces.
FIG 16 shows a tree of collections stored within a typical personal computer filesystem.
Collections within a filesystem can be organized in arbitrary ways, with a caution that nested collections may confuse some application programs. The acceptance, meaning, and proper treatment of nested collections are determined by implementation policy. For example, one implementation may choose to disallow nested collections, while another implementation may accept them.
FIG 17 shows a collection list search space derived from the collection tree of FIG 16.
The derived list search space is comprised of a list of collection specifier accessor pathnames and collection type indicators. Specific information content of derived search spaces is determined by implementation policy, with the constraint that search spaces must provide enough information to support detection and selection operations.
Specific detection and selection criteria are also determined by implementation policy.
Policy examples for preferred implementations are shown later in this document.
FIG 18 shows a simplified algorithm for a collection recognizer, using the derived text-file search space of FIG 17. In particular, the algorithm does not perform any detection activities, since there is no need to detect or discover collections. All entries within derived search spaces are assumed to be formed from valid collections.
The recognition algorithm FIG 18 proceeds by sequentially performing various kinds of selection activities on the set of collections contained within the search space of FIG 17.
Input control arguments to algorithm FIG 18 could specify which types of selection procedures should be performed. If later selection procedures were not required, the algorithm could optionally return (e.g. FIG 18, Line 8) without executing all selection procedures shown in the figure.
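A recognizer of this kind might be sketched as follows. The sketch assumes a text format of one whitespace-separated "pathname type" pair per line, with pathnames that contain no spaces; that format, the function names, and the `max_stages` control argument are hypothetical and are not taken from FIG 17 or FIG 18.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# (collection specifier accessor pathname, collection type indicator)
Entry = Tuple[str, str]


def parse_derived_list(text: str) -> List[Entry]:
    """Parse a derived list search space, one 'pathname type' pair per line."""
    entries: List[Entry] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comment lines
        accessor, collection_type = line.split(None, 1)
        entries.append((accessor, collection_type.strip()))
    return entries


def recognize_from_list(entries: Iterable[Entry],
                        selectors: Iterable[Callable[[Entry], bool]],
                        max_stages: Optional[int] = None) -> List[Entry]:
    """Apply selection procedures in order; no detection step is performed."""
    selected = list(entries)
    for stage, keep in enumerate(selectors):
        if max_stages is not None and stage >= max_stages:
            break  # early return: later selection procedures were not requested
        selected = [entry for entry in selected if keep(entry)]
    return selected
```

Because every entry is assumed to describe a valid collection, the recognizer reduces to a chain of filters over the list.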
FIG 19 shows an example logical database table layout for a derived database search space based on the collection tree shown in FIG 16. The derived database search space is composed of one database table containing at least 2 columns describing collection specifier accessor and collection type values. Specific information content of derived search spaces is determined by implementation policy, with the constraint that derived search spaces must provide at least collection specifier accessor information sufficient for accessing valid collection specifiers.
Collections - Filesystem Implementation
Figures 20-23 show physical embodiments of collection-aware (CA) and non-collection-aware (NCA) filesystems that contain collection signatures that can be detected by collection recognizers.
FIG 20 shows example software function interfaces that might be part of an example NCA filesystem API 163, such as might be found on a typical personal computer.
One important feature of NCA filesystem APIs is that they do not provide functions that "understand" or manipulate collections directly. NCA filesystem APIs understand only files and directories, not collections. It follows that collection recognizers built on top of NCA APIs must provide additional software logic to implement collection-aware operations that use the underlying NCA filesystem services.

FIG 21 shows an architectural view of how a collection recognizer means 130 might use both CA 162 and NCA 163 API interfaces to perform collection recognition activities. A collection-aware API means 162 is built on top of an NCA API means 163, which in turn is part of a computer operating system 164.
FIG 22 shows example software function interfaces that might be part of a collection-aware filesystem API 162. Function interfaces shown in this figure "understand" and manipulate collections directly, as evidenced by their function names.
FIG 23 shows example collection signature criteria policies that are based on filesystem attributes provided by a typical NCA filesystem API 163 implementation. The policies shown define collection signatures that are composed of various file names, suffixes, owners, timestamps, and other attributes provided by an NCA filesystem API 163 implementation.
Collection Detection - Upward Search
Figures 24-26 show how collection recognizers can use up search algorithms to detect collection signatures. The main purpose of collection recognizer up searches is to identify the current working collection that is being used by an application program.
Automatic recognition of the current working collection allows automated programs to act more autonomously, and saves human workers the effort of manually identifying current working collections to programs.
FIG 24 shows an example up search algorithm for detecting a collection signature above the current working directory, using an NCA filesystem 163 search space.
FIG 25 shows pictorially how the up search algorithm of FIG 24 would proceed to change directories generally upward while attempting to detect a collection signature, using an NCA filesystem 163 search space. As shown by the arrows in FIG 25, a collection recognizer will change directories upward to find a collection signature that leads to a collection specifier. In this example, the search begins in the "images"
directory, and proceeds upward to the "s" directory, and thence to the "c-myhomepage"
directory, where a collection signature match is found. In this example, a valid collection signature is defined by the implementation to be a directory that contains a collection specifier file 102, FIG 25 Line 5, named "cspec".
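An up search over an NCA filesystem can be sketched in a few lines. The sketch assumes the signature described above (a directory containing a specifier file named "cspec") and walks parent directories without changing the process working directory, which is a simplification of the directory-changing procedure described for FIG 24. The function name is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_working_collection(start: Path,
                            specifier_name: str = "cspec") -> Optional[Path]:
    """Search upward from `start` for a directory containing the specifier file.

    Returns the pathname of the collection specifier file (the collection
    specifier accessor), or None if no signature is found up to the root.
    """
    current = start.resolve()
    # Examine the starting directory first, then each ancestor in turn.
    for directory in [current, *current.parents]:
        candidate = directory / specifier_name
        if candidate.is_file():
            return candidate
    return None
```

Starting in ".../c-myhomepage/s/images", the function would examine "images", then "s", and would return the "cspec" pathname found in "c-myhomepage".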
FIG 26 shows an example up search algorithm for detecting a collection signature, using a CA 162 filesystem search space. It is worth noting how much simpler up search algorithms are for CA interfaces. Such interfaces provide the means to directly ask for collections that meet particular search criteria. No detailed programmatic manipulation of search space information is required of software programs that use CA
interfaces.
Collection Detection - Downward Search
Figures 27-29 show how collection recognizers use down search algorithms to detect collection signatures. The main purpose of recognizer down searches is to detect and organize multiple collections within a search space into a logical group. This enables processing of the whole group of collections with a single processing command, thereby improving the productivity and efficiency of both automated programs and human information workers.
FIG 27 shows an example down search algorithm for detecting collection signatures below an initial starting directory, using a typical NCA 163 filesystem search space. This algorithm is appropriate for use in preferred filesystem implementations of collections.
Various kinds of tree traversal algorithms known to the art can be used successfully, as implementation preferences dictate.
FIG 28 shows how the down search algorithm of FIG 27 might sequentially visit all collections shown in the tree of FIG 16, in order according to (a) the depth of each collection within the tree, and to (b) the alphabetic sort order of each collection name. In particular, collections near the top of the tree are visited earlier, and collections with names that sort toward the front of the alphabet are visited earlier.
The term "visit order" refers to the order in which collections are visited by an application program. Different programs may calculate different visit orders using the same set of physical collections, according to the needs and policies of the program.
However, it is more convenient for human operators if all collection processing programs within an implementation environment follow the same visit order conventions. That way, human programmers can have more confidence that particular visit orders specified by them will actually be obeyed by automated programs.
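A down search with the visit order described above (depth first as the primary key, alphabetic collection name as the secondary key) might be sketched as follows. The sketch assumes a "cspec" file signature and descends into nested collections, which is only one possible implementation policy. The function name is hypothetical.

```python
import os
from pathlib import Path
from typing import List, Tuple


def find_collections_below(start: Path,
                           specifier_name: str = "cspec") -> List[Path]:
    """Return collection directories below `start`, sorted into visit order."""
    start = start.resolve()
    found: List[Path] = []
    for dirpath, _dirnames, filenames in os.walk(start):
        if specifier_name in filenames:
            found.append(Path(dirpath))

    def visit_key(collection_dir: Path) -> Tuple[int, str]:
        # Shallower collections are visited first; ties are broken by name.
        depth = len(collection_dir.relative_to(start).parts)
        return (depth, collection_dir.name)

    return sorted(found, key=visit_key)
```

Computing the visit order in one shared function is one way for all programs in an implementation environment to follow the same convention.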
FIG 29 shows an example down search algorithm for detecting collection signatures using a collection-aware 162 filesystem search space. Algorithms for CA search spaces are considerably less complex than algorithms for NCA search spaces because CA
interfaces "understand" collections and can therefore provide higher-level, collection-oriented functionality through the CA interface.
Collection Selection
Figures 30-32 show how collection recognizers can select interesting collections from sets of detected collections. The main purpose of selection is to create a logical group of collections that have specific properties that are interesting to the programs that are driving the recognition process. For example, one application program might want to process collections of a particular collection type, whereas another application program might want to identify all collections that have no content files.
FIG 30 shows example policies for selecting interesting collections from sets of detected collections that have been produced by up search or down search algorithms. In particular, selection tests can be classified into two major groups: (1) outside-collection tests based on attributes of the search space such as filename, suffix, owner, timestamps, and (2) inside-collection tests based on attributes of the collection and the collection type definition.
FIG 31 shows example selection tests based on outside-collection (collection signature) and inside-collection (collection specifier, type, content) selection criteria. The content of specific selection tests is decided by the implementation, or by the recognizer invocation parameters.
FIG 32 shows an example collection specifier that contains a special command option Line 4 that requests collection recognizers to skip the host collection during detection activities. As a consequence of Line 4 in the collection specifier shown in FIG 32, the host collection owning the specifier would normally be excluded from all lists of collections (detected and selected) that were returned by a recognizer to a calling program.
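The two groups of selection tests, together with the skip option, can be illustrated with a short sketch. All names below are hypothetical, including the option key "skip" and its value "yes"; the actual option syntax of FIG 32 is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class DetectedCollection:
    specifier_path: str       # outside-collection attribute (search space)
    collection_type: str      # inside-collection attribute (specifier)
    content_files: List[str] = field(default_factory=list)
    options: Dict[str, str] = field(default_factory=dict)


SelectionTest = Callable[[DetectedCollection], bool]


def drop_skipped(detected: Iterable[DetectedCollection]) -> List[DetectedCollection]:
    """Exclude collections whose specifier asks recognizers to skip them."""
    return [c for c in detected if c.options.get("skip") != "yes"]


def select_collections(detected: Iterable[DetectedCollection],
                       tests: Iterable[SelectionTest]) -> List[DetectedCollection]:
    """Keep the non-skipped collections that pass every selection test."""
    tests = list(tests)
    return [c for c in drop_skipped(detected) if all(t(c) for t in tests)]


def has_suffix(suffix: str) -> SelectionTest:
    """Outside-collection test on the specifier filename suffix."""
    return lambda c: c.specifier_path.endswith(suffix)


def has_type(collection_type: str) -> SelectionTest:
    """Inside-collection test on the collection type."""
    return lambda c: c.collection_type == collection_type


def has_no_content(c: DetectedCollection) -> bool:
    """Inside-collection test for collections with no content files."""
    return not c.content_files
```

Expressing each test as a small predicate lets the implementation or the recognizer invocation parameters combine them freely.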
Collection Recognition
Figures 33-34 summarize the recognition process from the viewpoints of algorithm and information content.
FIG 33 shows an example high-level recognition algorithm that includes both detection and selection actions. The algorithm first performs a collection signature detection process to obtain collection specifier accessors. Having obtained the collection specifier accessors, the algorithm proceeds to read collection specifier, type, and content information, in preparation for selection testing. Selection testing is then performed.
Finally, the algorithm optionally derives more recognition information from the recognition process, and returns recognition process output to the calling program.
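The overall flow (detect, read collection information, select, derive additional information, return output) can be sketched as one function that receives each stage as a callable. The function and parameter names, and the keys of the returned dictionary, are hypothetical; they are not the layout of FIG 15 or the steps of FIG 33 verbatim.

```python
from typing import Any, Callable, Dict, Iterable, List


def recognize(search_space: Any,
              detect: Callable[[Any], Iterable[str]],
              read_info: Callable[[str], Dict[str, Any]],
              select: Callable[[Dict[str, Any]], bool]) -> Dict[str, Any]:
    """Run detection, information retrieval, and selection in sequence."""
    # 1. Detect signatures and obtain collection specifier accessors.
    accessors = list(detect(search_space))
    # 2. Read specifier, type, and content information for each accessor.
    detected = [read_info(accessor) for accessor in accessors]
    # 3. Apply selection testing to the pool of detected collections.
    selected = [info for info in detected if select(info)]
    # 4. Derive additional recognition information and return everything.
    return {
        "detected": detected,
        "selected": selected,
        "info": {"detected_count": len(detected),
                 "selected_count": len(selected)},
    }
```

Passing the stages as arguments keeps the same control flow usable for filesystem, database, or in-memory search spaces.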
FIG 34 shows example recognition policy values for filesystem implementations of collections. Three sets of recognition policy values are shown.
The first policy set Lines 1-4 detects collections whose signatures contain a special filename "cspec", selects all detected collections, and admits all files in the subtree below the collection specifier file as collection content.
The second policy set Lines 5-8 detects collections whose signatures contain a special filename suffix ".cspec", selects only collections whose type is "html homepage", and uses content boundary information from the collection specifier to delimit content files for the selected collections.
The third policy set Lines 9-12 detects collections whose signatures contain a special hidden directory named ".collection", selects only collections that are C
programs named "helloworld", and uses content boundary information from the collection specifier to delimit content files for the selected collections.
The policy sets shown in FIG 34 are completely arbitrary, and are provided as examples only. In practice, recognition policies are decided by the implementation and by particular recognizer invocations.
Collections - Database Implementation
Figures 35-40 show an example database implementation of collections. The main purpose of these figures is to show that preferred implementations of collection search spaces are not limited to simple filesystems involving many files and directories. Instead, database implementations may well be more efficient and manageable for large-scale collection implementations. The specific characteristics and suitability of particular implementations for particular situations are matters to be decided by implementation designers.
FIG 35 shows an example database schema and query expression that might be used to represent and perform collection signature detection activities, using a database implementation of collections. Although this example uses a separate database table to hold collection signature information, signature information could also be part of a larger table that served other design requirements.
FIG 36 shows an example database schema and query expression that might be used to represent and perform collection specifier accessor calculation activities, using a database implementation of collections. Collection identifier values derived from the collection signature table of FIG 35 are used as keys into the collection accessor table of FIG 36.
FIG 37 shows an example database schema and query expression that might be used to represent and perform collection specifier value access activities, using a database implementation of collections. Collection accessor values obtained from the collection accessor table of FIG 36 are used as keys into the collection specifier values table of FIG 37.
FIG 38 shows an example database schema and query expression that might be used to represent and perform collection content access activities, using a database implementation of collections. In this example, collection content identifier values obtained from the collection specifier values table of FIG 37 are used as keys into the collection content table of FIG 38.
Specific information content, structure, and query chaining patterns among database tables are design policy matters that are determined by the implementation.
The examples shown here are for illustration purposes only.
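The key chaining among the four tables can be sketched with an in-memory SQLite database. The table names, column names, and sample rows below are hypothetical and do not reproduce the schemas of FIGs 35-38. The sketch also collapses the four chained queries into a single join, which follows the same keys but is a different query pattern from the step-by-step one described above.

```python
import sqlite3
from typing import List, Tuple


def build_example_db() -> sqlite3.Connection:
    """Create a small in-memory database with the four chained tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE signatures (collection_id TEXT, signature TEXT);
        CREATE TABLE accessors (collection_id TEXT, accessor TEXT);
        CREATE TABLE specifier_values (accessor TEXT, collection_type TEXT,
                                       content_id TEXT);
        CREATE TABLE content (content_id TEXT, item TEXT);
        INSERT INTO signatures VALUES ('c1', 'cspec'), ('c2', 'cspec'),
                                      ('x9', 'other');
        INSERT INTO accessors VALUES ('c1', 'acc-1'), ('c2', 'acc-2'),
                                     ('x9', 'acc-9');
        INSERT INTO specifier_values VALUES
            ('acc-1', 'html-homepage', 'k1'), ('acc-2', 'c-program', 'k2');
        INSERT INTO content VALUES ('k1', 'index.html'), ('k1', 'logo.png'),
                                   ('k2', 'hello.c');
    """)
    return db


def recognize_in_db(db: sqlite3.Connection, signature: str,
                    collection_type: str) -> List[Tuple[str, str]]:
    """Detect by signature, select by type, and list content items."""
    return db.execute("""
        SELECT a.accessor, c.item
        FROM signatures AS s
        JOIN accessors AS a ON a.collection_id = s.collection_id
        JOIN specifier_values AS v ON v.accessor = a.accessor
        JOIN content AS c ON c.content_id = v.content_id
        WHERE s.signature = ? AND v.collection_type = ?
        ORDER BY a.accessor, c.item
    """, (signature, collection_type)).fetchall()
```

An implementation that needs intermediate results, such as the list of all detected collections before selection, would issue the queries separately instead.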
FIG 39 shows an example high-level recognition algorithm that includes both detection and selection actions, for the sample database implementation of collections shown in previous diagrams. The example algorithm shown here parallels the structure and function of the example algorithm shown in FIG 33, but uses a database implementation instead of a filesystem implementation of collections.

FIG 40 shows example recognition policy decisions and values for the example database implementation of collections shown in previous diagrams.
Recognition-Enabled Applications
Figures 41-42 compare high-level software architectures for non-collection-enabled and collection-enabled application programs.
FIG 41 shows simplified software architecture 170 for a generic, non-collection-enabled application program means 110. A non-collection-aware application program means 110 uses an application data server means 172 to provide data to the application.
In this figure, application program means 110 has no knowledge of collections, and cannot work with collections in meaningful, collection-oriented ways.
FIG 42 shows simplified software architecture 175 for a generic, collection-enabled application program. A collection-aware application program means 171 uses a collection recognizer means 130 to recognize collections stored on an application data server means 172. A collection recognizer means 130 would return a list of recognized collections back to said application program means 171, in a datastructure such as the one shown in FIG 15. Now having a list of recognized collections in its possession, the collection-aware application program means 171 can process the recognized collections in meaningful, collection-oriented ways.
Collection-aware application architecture 175 is appropriate for situations where it is both feasible and desirable to add collection support to application programs by making internal modifications to the application programs. Note that application program means 171 differs from NCA application program means 110 by the internal modifications required to integrate collection recognizer means 130 into said application architecture 175.
FIG 43 shows an alternate high-level architecture 177 for a generic, collection-enabled application program. Collection-aware architecture 177 is appropriate for situations in which an NCA application program means 110 cannot be internally modified to interact with a collection recognizer means 130 in the architectural pattern 175 shown in FIG 42.
Instead, a CA application wrapper program 176 provides a desired CA interface to users by using the services of both an NCA application program means 110 and a collection recognizer means 130.
Collection-aware application architecture 177 is appropriate for situations where it is not feasible or not desirable to add internal collection support to NCA application program means 110 by making internal modifications to the application program means 110.
Instead, new wrapper programs are created to serve as new and value-added collection-aware interfaces to existing application programs.
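The wrapper pattern can be sketched as a function that combines a recognizer with an unmodified NCA program. Both are passed in as callables so that the sketch stays self-contained; the function name and the shape of the returned mapping are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable


def run_on_collections(recognize: Callable[[str], Iterable[str]],
                       nca_program: Callable[[str], Any],
                       search_root: str) -> Dict[str, Any]:
    """Invoke an unmodified NCA program once per recognized collection.

    `recognize` stands in for the collection recognizer and returns the
    locations of recognized collections; `nca_program` stands in for the
    existing application, which knows nothing about collections.
    """
    results: Dict[str, Any] = {}
    for collection_location in recognize(search_root):
        results[collection_location] = nca_program(collection_location)
    return results
```

In a real wrapper, `nca_program` might launch the existing application as a separate process, so that no internal modification of that application is required.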

CONCLUSION
The present collection recognizer invention is a general, customizable, extensible, and scalable solution to the collection recognition problem faced by automated collection processing systems.
In particular, collection recognizers provide programs with a practical means for recognizing and obtaining detailed information about interesting collections for processing, and thereby enable such automated systems to perform automated computations that were not possible before.
RAMIFICATIONS
Although the foregoing descriptions are specific, they should be considered as sample embodiments of the invention, and not as limitations. Those skilled in the art will understand that many other possible ramifications can be imagined without departing from the spirit and scope of the present invention.
General Software Ramifications
The foregoing disclosure has recited particular combinations of program architecture, data structures, and algorithms to describe preferred embodiments. However, those of ordinary skill in the software art can appreciate that many other equivalent software embodiments are possible within the teachings of the present invention.
As one example, data structures have been described here as coherent single data structures for convenience of presentation. But information could also be spread across a different set of coherent data structures, or could be split into a plurality of smaller data structures for implementation convenience, without loss of purpose or functionality.
As a second example, particular software architectures have been presented here to more strongly associate primary algorithmic functions with primary modules in the software architectures. However, because software is so flexible, many different associations of algorithmic functionality and module architecture are also possible, without loss of purpose or technical capability. At the under-modularized extreme, all algorithmic functionality could be contained in one software module. At the over-modularized extreme, each tiny algorithmic function could be contained in a separate software module.
As a third example, particular simplified algorithms have been presented here to generally describe the primary algorithmic functions and operations of the invention.
However, those skilled in the software art know that other equivalent algorithms are also easily possible. For example, if independent data items are being processed, the algorithmic order of nested loops can be changed, the order of functionally treating items can be changed, and so on.
Those skilled in the software art can appreciate that architectural, algorithmic, and resource tradeoffs are ubiquitous in the software art, and are typically resolved by particular implementation choices made for particular reasons that are important for each implementation at the time of its construction. The architectures, algorithms, and data structures presented above comprise one such conceptual implementation, which was chosen to emphasize conceptual clarity.
From the above, it can be seen that there are many possible equivalent implementations of almost any software architecture or algorithm, regardless of most implementation differences that might exist. Thus when considering algorithmic and functional equivalence, the essential inputs, outputs, associations, and applications of information that truly characterize an algorithm should also be considered. These characteristics are much more fundamental to a software invention than are flexible architectures, simplified algorithms, or particular organizations of data structures.
Practical Applications
Collection recognizers can be used in various practical applications. One application is to improve the productivity of human computer programmers by providing them with an automated means of detecting and selecting interesting collections for processing.
Another application is to enable the construction of automated collection processing systems that are capable of detecting, selecting, and processing collections according to dynamic input values provided to the invocation.
Another application is to enable application programs to dynamically discover the current working collection at program invocation time, thereby helping the program to react to the current computational situation.
Other applications can also be imagined by those skilled in the art.
Functional Enhancements
One possible functional enhancement is to modify a collection recognizer to work with various formats of collection specifiers and collection type definitions. For example, a collection specifier could be modified to work with popular markup languages such as SGML, XML, or HTML, or with other more formally structured languages.
Collection Search Space Variations
Example filesystem and database search space implementations were discussed in the foregoing specification. However, other search space mechanisms are also possible.

For example, in-memory search spaces could be used for greater speed, using datastructures known to the art, such as hash tables, lists, and tree structures. Similarly, network search spaces available by network protocol API means could be used for distributed implementations. In addition, these alternative search spaces could be either collection-aware or non-collection-aware implementations, as design considerations dictate.
Collection Identification Means
The fundamental purpose of collection recognizers is to identify interesting collections on behalf of calling programs. Although the examples given here described collection recognizers as returning lists of recognized collections to calling programs, returning lists of collections is not the only method of identifying interesting collections.
As one alternative, a collection recognizer could physically mark or modify each recognized collection within the search space, thereby making it possible for other humans or programs to identify marked collections at a later time.
As another alternative, a collection recognizer could copy or relocate recognized collections into different physical locations, thereby identifying interesting collections by their new physical locations.
Still another alternative would be to write a list of interesting collections to an external location, for later use.
In all of these alternative collection identification means variations, the goals of recognition are accomplished without requiring a recognizer to return a list of recognized collections to a calling program. Even so, returning a list of collections in a collection recognition data structure is the preferred mechanism.
Alternative Implementations
Each API means identified in the specification may be implemented by any functional API mechanism known to the art, including using command line program invocations, subroutine calls, interrupts, network protocols, remote procedure invocations, or other file and information passing techniques.
Each server means identified in the specification may be implemented by any functional server mechanism known to the art, including but not limited to database servers, local or network file servers, HTTP web servers, FTP servers, NFS servers, or servers that use other network communication protocols known to the art, such as TCP/IP.
Each server means identified in the specification may use a data storage means that may be implemented by any functional storage mechanism known to the art, including but not limited to magnetic or optical disk storage, digital memory such as RAM or flash memory, network storage devices, or other computer memory devices known to the art.
Each software component identified in the specification may optionally reside on a separate computer to form a distributed implementation. However, if a distributed implementation is not desired, all components may reside on the same computer.
Although collection and recognizer data structures have been described here as coherent single structures, other implementations are possible. For example, information could be split into a plurality of smaller data structures for implementation or communication convenience, without loss of functionality.
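One way to picture such a split is sketched below in Python: a single collection record is divided into a specifier part and a content part that share a key, and the two parts can be rejoined without loss. All class names, field names, and the sample values are assumptions introduced for illustration; the specification does not define them in this passage.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CollectionRecord:
    """Single coherent structure (field names are illustrative)."""
    root: str
    specifier_file: str
    content_files: Tuple[str, ...]


@dataclass(frozen=True)
class SpecifierPart:
    """Smaller structure carrying only specifier information."""
    root: str
    specifier_file: str


@dataclass(frozen=True)
class ContentPart:
    """Smaller structure carrying only content information."""
    root: str
    content_files: Tuple[str, ...]


def split_record(record: CollectionRecord) -> Tuple[SpecifierPart, ContentPart]:
    """Split one record into two parts that share the collection root as a key."""
    return (SpecifierPart(record.root, record.specifier_file),
            ContentPart(record.root, record.content_files))


def join_parts(spec: SpecifierPart, content: ContentPart) -> CollectionRecord:
    """Rebuild the original record; refuse parts that describe different collections."""
    if spec.root != content.root:
        raise ValueError("parts describe different collections")
    return CollectionRecord(spec.root, spec.specifier_file, content.content_files)


record = CollectionRecord("/work/app", "cspec", ("main.c", "util.c"))
spec_part, content_part = split_record(record)
rebuilt = join_parts(spec_part, content_part)
```

Since the rebuilt record compares equal to the original, the split loses no information, which is the sense in which functionality is preserved.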
As can be seen by one of ordinary skill in the art, many other ramifications are also possible within the teachings of this disclosure. However, all implementations share the same general conceptual goal of enabling application programs to use collection recognizers to detect and select interesting collections from collection search spaces.
SCOPE
The present invention is not limited to any particular computer architecture, operating system, filesystem, database, or other software implementation.
Therefore the full scope of the present invention should be determined by the accompanying claims and their legal equivalents, rather than from the examples given in the specification.

Claims (35)

1. A collection recognizer method for making information about recognized collections available to software programs, to be performed on or with the aid of a programmable device, comprising the following steps:
(a) detecting collection signatures within a collection signature search space, thereby forming a set of detected collections, wherein collection signatures are particular sets of attributes from computer data storage media that are capable of indicating the presence of valid collections, and collection signature search spaces are searchable computer data storage mechanisms that are capable of storing collection signatures, and
(b) making information about said detected collections available for use by software programs, wherein collections are data structures comprised of a collection specifier and collection content containing zero or more collection content files.
2. The method of claim 1, further comprising the steps of:
(a) obtaining collection content information for one or more said detected collections, and
(b) making said collection content information available for use by software programs.
3. The method of claim 2, wherein (a) said step of obtaining collection content information defines collection content for a collection to include all files in a subtree that is rooted at a directory containing an associated collection specifier file for said collection.
4. The method of claim 2, wherein (a) said step of obtaining collection content information determines collection content for a collection in part by using associated collection type definition information for said collection.
5. The method of claim 1, further comprising the steps of:
(a) selecting collections from said set of detected collections, thereby forming a set of selected collections, and
(b) making information about said selected collections available for use by software programs.
6. The method of claim 5, wherein (a) said step of selecting collections from the set of detected collections uses associated collection type definition information for collections being selected.
7. The method of claim 5, wherein (a) said step of selecting collections from the set of detected collections uses associated collection content information for collections being selected.
8. The method of claim 1, wherein (a) said step of detecting collection signatures within a collection signature search space uses collection specifier filename information.
9. The method of claim 1, wherein (a) said step of detecting collection signatures within a collection signature search space uses a collection signature up search algorithm.
10. The method of claim 1, wherein (a) said step of detecting collection signatures within a collection signature search space uses a collection signature down search algorithm.
11. The method of claim 1, wherein (a) said collection signature search space is a typical hierarchical computer filesystem.
12. The method of claim 1, wherein (a) said collection signature search space is implemented using a relational database.
13. The method of claim 1, wherein (a) said collection signature search space is implemented using a network protocol interface.
14. A programmable collection recognizer apparatus for making information about recognized collections available to software programs, comprising:
(a) means for detecting collection signatures within a collection signature search space, thereby forming a set of detected collections, wherein collection signatures are particular sets of attributes from computer data storage media that are capable of indicating the presence of valid collections, and collection signature search spaces are searchable computer data storage mechanisms that are capable of storing collection signatures, and
(b) means for making information about said detected collections available for use by software programs, wherein collections are data structures comprised of a collection specifier and collection content containing zero or more collection content files.
15. The programmable apparatus of claim 14, further comprising:
(a) means for obtaining collection content information for one or more said detected collections, and
(b) means for making said collection content information available for use by software programs.
16. The programmable apparatus of claim 15, wherein (a) said means for obtaining collection content information defines collection content for a collection to include all files in a subtree that is rooted at a directory containing an associated collection specifier file for said collection.
17. The programmable apparatus of claim 15, wherein (a) said means for obtaining collection content information determines collection content for a collection in part by using associated collection type definition information for said collection.
18. The programmable apparatus of claim 14, further comprising:
(a) means for selecting collections from said set of detected collections, thereby forming a set of selected collections, and
(b) means for making information about said selected collections available for use by software programs.
19. The programmable apparatus of claim 18, wherein (a) said means for selecting collections from the set of detected collections uses associated collection type definition information for collections being selected.
20. The programmable apparatus of claim 18, wherein (a) said means for selecting collections from the set of detected collections uses associated collection content information for collections being selected.
21. The programmable apparatus of claim 14, wherein (a) said collection signature search space is implemented using a relational database.
22. The programmable apparatus of claim 14, wherein (a) said collection signature search space is implemented using a network protocol interface.
23. A computer program product, comprising a computer readable storage medium having computer readable program code means for making information about recognized collections available to software programs, the computer program product comprising computer readable program code means for:
(a) detecting collection signatures within a collection signature search space, thereby forming a set of detected collections, wherein collection signatures are particular sets of attributes from computer data storage media that are capable of indicating the presence of valid collections, and collection signature search spaces are searchable computer data storage mechanisms that are capable of storing collection signatures, and
(b) making information about said detected collections available for use by software programs, wherein collections are data structures comprised of a collection specifier and collection content containing zero or more collection content files.
24. The computer program product of claim 23, further comprising computer readable program code means for:
(a) obtaining collection content information for one or more said detected collections, and
(b) making said collection content information available for use by software programs.
25. The computer program product of claim 24, wherein (a) said means for obtaining collection content information defines collection content for a collection to include all files in a subtree that is rooted at a directory containing an associated collection specifier file for said collection.
26. The computer program product of claim 24, wherein (a) said means for obtaining collection content information determines collection content for a collection in part by using associated collection type definition information for said collection.
27. The computer program product of claim 23, further comprising computer readable program code means for:
(a) selecting collections from said set of detected collections, thereby forming a set of selected collections, and
(b) making information about said selected collections available for use by software programs.
28. The computer program product of claim 27, wherein (a) said means for selecting collections from the set of detected collections uses associated collection type definition information for collections being selected.
29. The computer program product of claim 27, wherein (a) said means for selecting collections from the set of detected collections uses associated collection content information for collections being selected.
30. The computer program product of claim 23, wherein (a) said means for detecting collection signatures within a collection signature search space uses collection specifier filename information.
31. The computer program product of claim 23, wherein (a) said means for detecting collection signatures within a collection signature search space uses a collection signature up search algorithm.
32. The computer program product of claim 23, wherein (a) said means for detecting collection signatures within a collection signature search space uses a collection signature down search algorithm.
33. The computer program product of claim 23, wherein (a) said means for detecting collection signatures is capable of using a collection signature search space that is a hierarchical computer filesystem.
34. The computer program product of claim 23, wherein (a) said means for detecting collection signatures is capable of using a collection signature search space that is implemented using a relational database.
35. The computer program product of claim 23, wherein (a) said means for detecting collection signatures is capable of using a collection signature search space that is implemented using a network protocol interface.
CA002352736A 2001-06-21 2001-07-05 Collection recognizer Abandoned CA2352736A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/885080 2001-06-21
US09/885,080 US6768989B2 (en) 2001-06-21 2001-06-21 Collection recognizer

Publications (1)

Publication Number Publication Date
CA2352736A1 true CA2352736A1 (en) 2002-12-21

Family

ID=25386089

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002352736A Abandoned CA2352736A1 (en) 2001-06-21 2001-07-05 Collection recognizer

Country Status (2)

Country Link
US (1) US6768989B2 (en)
CA (1) CA2352736A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7447703B2 (en) * 2001-06-21 2008-11-04 Coverity, Inc. Collection information manager
US6917947B2 (en) * 2001-06-21 2005-07-12 Kevin Wade Jameson Collection command applicator
US20040044653A1 (en) * 2002-08-27 2004-03-04 Jameson Kevin Wade Collection shortcut expander
US20040044668A1 (en) * 2002-08-27 2004-03-04 Jameson Kevin Wade Collection view expander
US20040044692A1 (en) * 2002-08-27 2004-03-04 Jameson Kevin Wade Collection storage system
US8732245B2 (en) * 2002-12-03 2014-05-20 Blackberry Limited Method, system and computer software product for pre-selecting a folder for a message
US20050234964A1 (en) * 2004-04-19 2005-10-20 Batra Virinder M System and method for creating dynamic workflows using web service signature matching
US8850209B2 (en) * 2006-09-12 2014-09-30 Microsoft Corporation Schema signing
US10853311B1 (en) * 2014-07-03 2020-12-01 Pure Storage, Inc. Administration through files in a storage system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6460036B1 (en) * 1994-11-29 2002-10-01 Pinpoint Incorporated System and method for providing customized electronic newspapers and target advertisements
US5941944A (en) * 1997-03-03 1999-08-24 Microsoft Corporation Method for providing a substitute for a requested inaccessible object by identifying substantially similar objects using weights corresponding to object features
US6374402B1 (en) * 1998-11-16 2002-04-16 Into Networks, Inc. Method and apparatus for installation abstraction in a secure content delivery system
US6564263B1 (en) * 1998-12-04 2003-05-13 International Business Machines Corporation Multimedia content description framework
US7506034B2 (en) * 2000-03-03 2009-03-17 Intel Corporation Methods and apparatus for off loading content servers through direct file transfer from a storage center to an end-user

Also Published As

Publication number Publication date
US6768989B2 (en) 2004-07-27
US20030084026A1 (en) 2003-05-01

Similar Documents

Publication Publication Date Title
US5581755A (en) Method for maintaining a history of system data and processes for an enterprise
US7020644B2 (en) Collection installable knowledge
US5557793A (en) In an object oriented repository, a method for treating a group of objects as a single object during execution of an operation
US7325007B2 (en) System and method for supporting non-native data types in a database API
RU2398275C2 (en) File system presented inside database
JP4222947B2 (en) Method, program, and system for representing multimedia content management objects
US20070106629A1 (en) System and method for accessing data
US8495510B2 (en) System and method for managing browser extensions
JPH03191467A (en) Method of discriminating document at- tribute
WO2000075849A2 (en) Method and apparatus for data access to heterogeneous data sources
US7543004B2 (en) Efficient support for workspace-local queries in a repository that supports file versioning
US8577865B2 (en) Document searching system
US20070156653A1 (en) Automated knowledge management system
US6735598B1 (en) Method and apparatus for integrating data from external sources into a database system
US6768989B2 (en) Collection recognizer
US20020089551A1 (en) Method and apparatus for displaying a thought network from a thought's perspective
CA2352643A1 (en) Collection content classifier
Koutrika et al. Rule-based query personalization in digital libraries
CA2352407C (en) Collection information manager
CN111666115B (en) Device, method and storage medium for searching engine plug-in
Brdjanin et al. On suitability of standard UML notation for relational database schema representation
GB2431257A (en) System and method for accessing data
US20040044692A1 (en) Collection storage system
JPH07210568A (en) File management device
Huang et al. A SPARQL query processing system using map-phase-multi join for big data in clouds

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued