WO2009021072A1 - Interpretation analysis and decision support in research information systems - Google Patents


Info

Publication number
WO2009021072A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
information
interpretation
metadata
tools
Prior art date
Application number
PCT/US2008/072383
Other languages
French (fr)
Inventor
Joseph Fanelli
Lane Watson
Samuel Scott Beckey
Original Assignee
Intelli-Services, Inc.
Priority date
Filing date
Publication date
Application filed by Intelli-Services, Inc. filed Critical Intelli-Services, Inc.
Publication of WO2009021072A1 publication Critical patent/WO2009021072A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • G06F16/24573Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata

Abstract

A system and method, implemented as a software application, that processes, identifies, selects, analyzes, interprets, and presents research information is disclosed herein. More particularly, the system utilizes contextualized data, partially contextualized data, and/or metadata to refine and process information. Within the system, data is manipulated using tools and queries to extract particular data sets. Additionally, methods are provided to report, visualize, analyze, and retain data. In a preferred embodiment, the present invention is realized through Information, Interrogation, and Interpretation phases.

Description

INTERPRETATION ANALYSIS AND DECISION SUPPORT IN RESEARCH
INFORMATION SYSTEMS
Inventors:
Joseph Fanelli, Escondido, CA; Samuel Scott Beckey, San Diego, CA; Lane Watson, Burlington, NC.
PRIORITY CLAIM
[0001] This patent application contains subject matter claiming the benefit of the priority date of United States Provisional Patent Application Serial No. 60/954,223, filed on August 6, 2007, entitled METHOD OF PROCESSING INPUT DATA FOR INTERPRETATION AND DECISION SUPPORT OF SAME; accordingly, the entire contents of this provisional patent application are hereby expressly incorporated by reference.
BACKGROUND OF THE INVENTION
Field of the Invention
[0002] The present invention pertains generally to the management of intellectual property in a research environment. More specifically, the present invention relates to data integration, unification, record creation, and interpretation in support of decision making regarding intellectual property.
Description of the Prior Art
[0003] Management of research information has become an increasing priority for commercial and institutional enterprises. Since research is often an unstructured process, simple and efficient information management systems have been difficult to apply in many cases. Electronic record-keeping systems exist in parts of the research process, but are often redundant to, and must be supplemented by, other systems that contain the original source data. Where multiple information systems are provided, they can be incompatible with each other, and common indices or metadata are often non-existent.
[0004] Research information and its associated metadata are critically important to accurately memorialize ownership of intellectual property. Certified and validated records are the basis for first to invent status in patentable subject matter. Additionally, they establish a basis for evidence in legal determinations, and further can be used in documentation of research for filings with government agencies such as the FDA.
[0005] Differing locations and multiple labs increase the need for information systems that allow comparability of data contemporaneously across geographic locations. Collaborative science is increasingly important, and collaborators, both within and outside the enterprise, can best share information through common databases.
[0006] Recording enough information so that the experiment is fully embodied in the data is a further challenge. Often, although the data is collected in computers, incompatible databases and differing priorities for data retention mean that the researchers themselves are left to define the means of data collection, storage, organization, and retention.
[0007] Fully contextualized data can be achieved through the techniques defined in our previous and concurrent patent application filings having common inventorship, specifically: PCT/US08/65557, entitled "Electronic Voice-Enabled Laboratory Notebook"; and another non-provisional patent application filed concurrently herewith in the United States, entitled "Multi-Dimensional Metadata in Research Recordkeeping." Once the contextualized information provided for in the concurrent patent application is available, the Interpretation, Analysis, and Decision Support in Research Recordkeeping can be utilized and is among the subject matter of the present invention.
[0008] There is a vast array of tools and techniques that can be used to process rich data as provided in the contextual database. Making those tools available to users with varying degrees of time and energy can greatly enhance the value of the data. Methods and processes for organizing such tools into a user workbench, or series of user workbenches, are described herein. The selection of tools to be provided begins with the most powerful, most accessible, and most familiar techniques. Also included are the tools most suited to particular applications.
[0009] Accordingly, it is an object of the present invention to provide Interpretation, Analysis, and Decision Support in Research Recordkeeping that allows users to produce value as they operate these tools, so that bringing new data into the application makes that data available to subsequent users. It is further an object of the present invention to provide a system and method that captures the effort of identifying new sources of data (such as from public database sources) and adds value to the data through annotation and the creation of metadata that is further retained for each session.
BRIEF SUMMARY OF THE INVENTION
[0010] The present invention specifically addresses and alleviates the above-mentioned deficiencies. More specifically, the present invention in a first aspect is a system and method to contextualize data and subsequently organize and interpret the data. Contextual data comprises the addition of multi-dimensional metadata to original data objects so that they can be oriented in terms of important indices. The indices are not simply unique and sorted values, but allow more complex structures, such as hierarchies, groupings, sets, and hash-based indices for each dimension.
[0011] In another aspect, the invention is a method for presenting a range of data manipulation operations to a user. More specifically, the method is divisible into "Information, Interrogation, and Interpretation" phases, wherein tools and techniques are defined in each area. An implementation of this may be as menu choices in a computerized graphical user interface.
[0012] The Information phase allows both internal and external data sources to be obtained, recognized, and mapped. All connections are retained so that users may continue to add value to previous work. External data (e.g., from a non-contextualized database) is supplemented with metadata allowing it to be fully or partially contextualized. Users are presented with tools to add contextual information to external data automatically, with partial assistance, or manually. The external data can remain external to take advantage of online updates, or can be imported after being contextualized.
[0013] The Interrogation phase provides a range of manipulations on the collected data sources, resulting in a selection of information to be passed to the analysis and Interpretation phase. Graphical tools which understand the content of data objects are further provided, and as with the other phases, a plug-in architecture allows the method to be expanded as new techniques become available. Standard tools for the building of queries (such as SQL builders), graphical tools, and semantic tools are provided.
[0014] The Interpretation phase provides the user with standardized and easy-to-use operations. Users can easily invoke visualization routines that allow a computer interface to display and manipulate data as graphical objects, often in multiple dimensions. A library of visualizations is provided, and it is relatively easy to assign metadata and data object contents to each dimension. In addition to graphical display, the program provides a range of reporting options, allowing the resulting data to be formatted into printed reports, or exported into standard desktop applications like spreadsheets.
[0015] A further function of the Interpretation phase is to transfer data to a set of
Interpretation algorithms. Most of these interpretation algorithms are numerical techniques. Advanced AI operations involving machine learning are provided. The intent is to make a range of possible manipulations easy for the user to invoke, and allow experimentation with the data.
[0016] In still another aspect, the invention is characterized as a method of development of contextualized data from external data sources, where a series of metadata indices is derived from existing data records, related data records, and user-guided rules. Such metadata indices are further characterized in that they are not limited to strict database index rules, and may contain hierarchical, sequential, groups, sets and subsets, and other data structures as indices.
[0017] In yet still another aspect, the invention is characterized as a learning mechanism wherein a software application can retain information from and between multiple sessions, allowing each user's contributions to add to an enterprise-wide knowledge base, particularly as applied to the management of external data sources.
[0018] While the apparatus and method has or will be described for the sake of grammatical fluidity with functional explanations, it is to be expressly understood that the claims, unless expressly formulated under 35 USC 112, or similar applicable law, are not to be construed as necessarily limited in any way by the construction of "means" or "steps" limitations, but are to be accorded the full scope of the meaning and equivalents of the definition provided by the claims under the judicial doctrine of equivalents, and in the case where the claims are expressly formulated under 35 USC 112 are to be accorded full statutory equivalents under 35 USC 112, or similar applicable law. The invention can be better visualized by turning now to the following drawings wherein like elements are referenced by like numerals.
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] The novel features of this invention, as well as the invention itself, both as to its structure and its operation, will be best understood from the accompanying drawings, taken in conjunction with the accompanying description, in which similar reference characters refer to similar parts, and in which:
[0020] Fig. 1A and Fig. 1B are schematic illustrations showing the creation of contextualized, enhanced data objects from data sources;
[0021] Fig. 2 is a functional block diagram defining a three phase operation (Information,
Interrogation and Interpretation) of a preferred method for interpretation, analysis, and decision support;
[0022] Fig. 3 is a schematic illustration defining the steps, features, and capabilities of a preferred Information phase, where data sources are managed and the method allows user-created metadata to be associated with internal and external data sources;
[0023] Fig. 4 is a schematic illustration showing a preferred Interrogation phase, where internal and external data, now in contextualized form, is subject to selection to produce a dataset for analysis, reporting, and visualization;
[0024] Fig. 5 is a schematic diagram showing tools made available in a preferred
Interpretation phase;
[0025] Fig. 6 is a functional block diagram of an exemplary application using laboratory data in the context of an electronic lab notebook application; and
[0026] Fig. 7 is an additional schematic illustration showing relevant information regarding the types of data and techniques used in each phase, as described herein, during the analysis of laboratory data.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0027] Broadly speaking, the present invention is a method and process for interpretation, analysis, and decision support beginning with contextualized data objects 100. By contextualized, we mean that metadata has been provided for the data objects such that they can be oriented in a multi-dimensional space defined for the applications. Embodied in a software program, this method and process provides an aided system for a researcher, scientist, or analyst to discover, understand, and present observations about the data.
[0028] By way of example and not by way of limitation, our focus has been on the application of the methods herein in the area of management of data objects obtained in research processes, such as in a laboratory. Those skilled in the art will appreciate that the method and process can be applied in other areas, such as legal research, forensics, or data mining in commercial applications.
[0029] Also by way of example, we are referring to a machine-based information storage mechanism where the information is organized according to the principles and techniques disclosed in the aforementioned U.S. Non-provisional patent application entitled "Multi-Dimensional Metadata in Research Recordkeeping." In general, this information can be thought of as data objects from various sources, supplemented by multiple metadata indices so that as much as possible of the context of the data is recorded. The ultimate goal is to establish enough metadata for each record so that complete reproducibility of an experiment is possible.
[0030] Data objects can be received in many forms. Often, they are packets or datagrams received from laboratory instruments. They may also be structured records themselves, so that certain metadata is already contained in the data object. Other times, they may be received in well-known computerized formats, such as JPG, MPG, TIFF, WAV, etc. There also exist both public and proprietary representations of certain scientific data, such as genomic sequences, chemical structures, and even 2-D and 3-D objects from computer-assisted design (CAD) programs.
[0031] Now with reference to Fig. 1A and Fig. 1B, a schematic representation of a typical contextualized data object 100 is shown. This object could be stored in a relational database as a record, maintained in an XML file, or kept in a variety of ways according to data processing practice. An original data object 10 represents the information received from some source. It 10 could be simple data such as a numeric value, or very complex information from a structured application. In the most complex case, the original data object 10 could be an entire database itself.
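By way of illustration only, and not by way of limitation, a contextualized data object 100 of this kind might be sketched as a simple record type; the field and method names below are illustrative assumptions rather than the literal schema of the invention:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class ContextualizedDataObject:
    """An original data object plus the metadata indices that orient it
    in the application's multi-dimensional space (names are illustrative)."""
    original_data: Any                                    # raw value, file, or record
    embedded_metadata: dict = field(default_factory=dict) # extracted from the object itself (111)
    derived_metadata: dict = field(default_factory=dict)  # inferred from related records (114)
    user_metadata: dict = field(default_factory=dict)     # supplied by the researcher (115)

    def index_value(self, dimension: str):
        """Resolve one metadata dimension, preferring embedded,
        then derived, then user-supplied values."""
        for source in (self.embedded_metadata, self.derived_metadata, self.user_metadata):
            if dimension in source:
                return source[dimension]
        return None

obj = ContextualizedDataObject(
    original_data=42.7,
    embedded_metadata={"instrument": "spectrometer-3"},
    derived_metadata={"sample": "INV-0091"},
)
print(obj.index_value("sample"))   # INV-0091
```

Note that the dimensions here are ordinary dictionary keys only for brevity; as the description states, the actual indices may reference hierarchies, groupings, or sets.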
[0032] Typically, data import programs 120 are provided which accept data from specific types of sources. These 120 may be listening for information sent 121 by laboratory instruments in the example. Where import programs 120 are used, they may be sensitive to information embedded 111 in the data object 110 which can be extracted and used to construct indices 113, such as shown in Fig. 1B.
[0033] Additionally, other metadata indices 114 may be derived from other related records.
For example, if a laboratory instrument reports 121 loading a new inventory item based on a barcode, the inventory item may be applied to subsequent measurements from the instrument. This can also be useful across data sources 120, so that a change in context in one import program can be used in other import programs according to established rules.
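A minimal sketch of such an import program 120, including the carrying of barcode-derived context into subsequent measurements, might look as follows; the message shapes and field names are assumptions made for illustration:

```python
class ImportProgram:
    """Hypothetical import program 120: listens for instrument messages
    and applies the most recent inventory context to later measurements."""

    def __init__(self):
        self.context = {}   # e.g. the current inventory item from a barcode scan
        self.records = []   # contextualized records produced so far

    def receive(self, message: dict):
        if message.get("type") == "barcode":
            # A newly loaded inventory item becomes context for what follows.
            self.context["inventory_item"] = message["value"]
        elif message.get("type") == "measurement":
            # Each measurement is enriched with the prevailing context.
            record = {"value": message["value"], **self.context}
            self.records.append(record)
            return record

imp = ImportProgram()
imp.receive({"type": "barcode", "value": "INV-0091"})
rec = imp.receive({"type": "measurement", "value": 3.14})
print(rec)   # {'value': 3.14, 'inventory_item': 'INV-0091'}
```

A rule sharing `self.context` across several such importers would give the cross-source behavior described above.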
[0034] As a last resort, it may occasionally be necessary to ask a researcher or user, in real time or after the fact, to establish context 115. According to the needs of the enterprise, such queries should be made in the least obtrusive manner to establish the contextualization 115.
[0035] An important distinction is drawn herein between relational database indices and the metadata indices 111-116 presented. The metadata indices may be references to other data structures, so that they may represent a place in a hierarchy, a grouping according to a tree structure, or some other technique. They 111-116 may or may not be unique.
[0036] Another function of import programs 120 may be to apply validation information 112 to the incoming data objects. By doing so, it is possible to establish ownership and security of the data objects 100 in order to address requirements for confidentiality, integrity, authentication, non-repudiation, and authorization.
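One sketch of validation information 112 of this kind is given below, using a content digest to support integrity checking; the actual certification scheme is not specified in this description, and the digest-based approach is only an illustrative stand-in:

```python
import hashlib
import json

def apply_validation(data_object: dict, owner: str) -> dict:
    """Attach illustrative validation metadata: an owner and a content
    digest that lets later readers check integrity."""
    payload = json.dumps(data_object, sort_keys=True).encode()
    return {
        "data": data_object,
        "owner": owner,
        "sha256": hashlib.sha256(payload).hexdigest(),
    }

def verify(record: dict) -> bool:
    """Recompute the digest and compare; any tampering breaks the match."""
    payload = json.dumps(record["data"], sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest() == record["sha256"]

rec = apply_validation({"value": 3.14}, owner="jdoe")
print(verify(rec))            # True
rec["data"]["value"] = 2.71   # tampering
print(verify(rec))            # False
```

A production system would add signatures and timestamps for non-repudiation; this sketch covers only the integrity requirement.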
[0037] Now referring to Fig. 2, a general process to perform interpretation, analysis, and decision support from multi-dimensional metadata enhanced databases, as well as other data sources, is illustrated. This process is divided into three phases: (1) Information 20, (2) Interrogation 30, and (3) Interpretation 40. The process is implemented in a software program so that users are guided and facilitated through the three phases. Freedom of movement 21 between phases allows a user to go back and forth between phases 20, 30, 40. Fluid movement 21 is important because research typically proceeds as a recursive process. Observations made in the Interpretation 40 phase may result in new selections for Interrogation 30, and the software will retain such information, even between sessions. Similarly, operation of the program during the Information 20 phase will be retained and possibly remain available to other users.
[0038] The Information 20 phase is characterized as the management of data sources. In modern research, information is required not just from internal sources, but also from publicly available data. Examples include scientific databases, publications, results of internet searches, etc.
[0039] When accessing external data 201, the program retains information about sources and then applies metadata indices 215 in a manner similar to the import programs 120 discussed above. In a research information system, use of the program itself creates data objects as a user interacts. These data objects represent the recording of work operations in a protocol and are themselves subject to multi-dimensional metadata indices.
[0040] By internal data 100, we refer to information that is already processed by data import programs 120 and is stored in the multi-dimensional metadata structure 111, 113-116. The Information 20 phase applies this structure to external data 201 so that it can be analyzed, compared, and interpreted as well.
[0041] Fig. 3 more specifically illustrates the Information 20 phase according to a preferred embodiment of the present invention. In this phase the user can explore available data sources 100, 201, 210, 211, choose the data sources to use, and assign, in both an automated and a manual manner, the metadata indices necessary for each data source. Work performed in the Information 20 phase is saved, shared, and re-usable by the user and others.
[0042] By assigning metadata indices to source data 100, 201, 210, 211, we produce contextualized data 210 which has more value than the original information. By finding data sources 100, 201, 210, 211 the user is building a more comprehensive collection of information for the rest of the enterprise. This information supplements the research data already in the system, including the data produced by the research organization itself. This data is located in internal data sources 100 and should already contain multidimensional metadata indices 111, 113-116.
[0043] New sources of data are obtained from outside data sources 201, typically the internet, but also perhaps from collaborators, subscription databases, or external applications. In some cases the method of transfer 202 will be defined by available standards, such as XML, SQL, HTML, or other methods. In other cases, a transfer program may be constructed similar to the data import programs 120 used for other research data sources. In either case, the program will provide a plug-in architecture 212 so that data import programs may be continuously updated, re-used, and managed for all users.
[0044] Using data import programs 120 and standard provided transfers, the process creates contextualized data 210 in the Information 20 phase. Where information is embedded 111 in the imported data objects, rich data sources 211 yield context by interpreting the data objects themselves. In addition, metadata assignment rules 213 may be created and reviewed by the user as the data is selected. Users themselves can assign 214 metadata where necessary. The metadata index structures 215 are as defined in the aforementioned U.S. Pat. App. entitled "Multi-Dimensional Metadata in Research Recordkeeping," filed concurrently herewith, and are not necessarily the same as relational indices.
[0045] The method in a preferred embodiment herein further provides an information workbench 220 to the user, allowing them to browse and select from an annotated card catalog of available data sources 221. These data sources are additionally retained and shared according to enterprise rules so that the work of identifying and indexing new information is shared 222 and maintained for other users in the enterprise.
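Metadata assignment rules 213 of the sort described might be sketched as user-reviewable pairs of a predicate and the metadata it contributes; the rule shapes below are illustrative assumptions, not the rule language of the invention:

```python
def assign_metadata(record: dict, rules: list) -> dict:
    """Apply assignment rules to an external record. Each rule is a
    (predicate, metadata_to_add) pair that a user could review and share."""
    metadata = {}
    for predicate, extra in rules:
        if predicate(record):
            metadata.update(extra)
    return metadata

# Two hypothetical rules: tag genomic data, and mark public provenance.
rules = [
    (lambda r: "sequence" in r, {"data_type": "genomic"}),
    (lambda r: r.get("source", "").endswith(".gov"), {"provenance": "public"}),
]

md = assign_metadata({"sequence": "GATTACA", "source": "ncbi.nlm.nih.gov"}, rules)
print(md)   # {'data_type': 'genomic', 'provenance': 'public'}
```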
[0046] Information from external data sources 201 may be transferred to internal databases, but this is not a requirement, and in some cases may be undesirable. Instead, the shared libraries of data sources 222, along with the conditions for producing multi-dimensional metadata indices are maintained, and the original data continues to reside externally 201.
[0047] Using workbench 220, the user selects a subset 223 of the available data sources that are to be passed to the Interrogation 30 phase. Fig. 4 illustrates the Interrogation 30 phase of the process and method. Both internal 100 and external 201 data sources have been selected and are contextualized 210 with multi-dimensional metadata 215. Users can now build their query, or ask questions of the data sources 100, 201. To do this the program provides an Interrogation 30 workbench 220 supplied with graphical tools 31 to create interrogations.
[0048] Yet again, a plug-in architecture is provided so that new query tools 31-36 can be incorporated. Using a combination of these tools 31-36, the user creates a subset of the data objects 37, as well as components from within those data objects, to be used in the final Interpretation 40 phase.
[0049] Tools for selection in the Interrogation 30 phase include graphical 31 and traditional 32 query builders, structure editors 33, data-specific queries 35, and intelligent queries based on structured 36 data in enterprise systems.
[0050] In a preferred embodiment shown in Fig. 4, primary tools on the Interrogation 30 workbench include: (a) graphical query builders 31, (b) traditional SQL editors 32, and (c) XML structure editors 33. Specialized features of the graphical query builders include the ability to operate with complex metadata index structures, such as hierarchy browsers for some of the metadata types.
[0051] Additional selection capacity is provided by specialized selection tools 34. Selection tools allow information within data objects 100 to be accessed and only those data objects 100 of interest to be retained. Selection 34 is possible by simple numeric lists, ranges, and groups, but also based on semantic (language based) criteria, where text is searched for matching or not matching strings. Advanced semantic selection can take into account the meaning of words.
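A toy version of such a selection tool 34, combining a numeric range with simple string-matching (semantic) criteria, might read as follows; the parameter names are illustrative only:

```python
def select(objects, value_range=None, contains=None, excludes=None):
    """Illustrative selection tool 34: keep data objects whose value falls
    in a range and whose notes match (or avoid) given strings."""
    kept = []
    for obj in objects:
        if value_range is not None:
            low, high = value_range
            if not (low <= obj["value"] <= high):
                continue
        notes = obj.get("notes", "")
        if contains is not None and contains not in notes:
            continue
        if excludes is not None and excludes in notes:
            continue
        kept.append(obj)
    return kept

data = [
    {"value": 1.2, "notes": "control well"},
    {"value": 7.5, "notes": "stained sample"},
    {"value": 8.1, "notes": "control well"},
]
hits = select(data, value_range=(5.0, 10.0), excludes="control")
print(hits)   # [{'value': 7.5, 'notes': 'stained sample'}]
```

The advanced semantic selection mentioned above, which considers word meaning, would replace the plain substring tests here with a language-aware matcher.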
[0052] Selection criteria 34 related to complex searches or extremely complex data can be accomplished through the use of hashes, signatures, and indices, where the complexity of the data is abbreviated in a shorter form that can be more easily manipulated.
[0053] Using the plug-in architecture as explained herein, specific queries 35 for specialized data types can be created. Scientific data, for example, can represent more complex information. Where these data types exist, specialized query tools can be applied. In scientific data, sequence matching patterns for finding patterns in genomic data are common. Molecular structures can be represented in standard proprietary and non-proprietary formats and can be matched as structures and substructures with specific searches.
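By way of illustration, a simplified sequence-matching query of the kind described for genomic data, where N stands for any base, might be sketched as:

```python
import re

def find_motif(sequence: str, pattern: str):
    """Find start positions of a motif in a genomic sequence.
    'N' in the pattern matches any base; a simplified stand-in for the
    specialized sequence query tools described."""
    regex = pattern.upper().replace("N", "[ACGT]")
    return [m.start() for m in re.finditer(regex, sequence.upper())]

print(find_motif("GATTACAGATTACA", "GATNACA"))   # [0, 7]
```

Structure and substructure matching for molecular data would follow the same plug-in pattern with a chemistry-aware matcher in place of the regular expression.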
[0054] Image processing 37 is also accommodated through these plug-ins. By defining image processing algorithms and outputs, the user can select images based on specific characteristics. Open source tools, such as ImageJ, developed by the National Institutes of Health (NIH), or other more proprietary tools can be used to create queries. In some cases, it will be necessary to use such tools to preprocess a large number of images, creating additional metadata which represents the outcome of image processing algorithms. Examples of applications of this technique include: (a) counting the number and proportions of stained objects (cells) in a microscope image, (b) measuring and quantifying prevalent color information in a chromatograph, or (c) applying techniques to measure the size of objects (tumors, perhaps) in an image from radiology.
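A toy analogue of such image preprocessing, counting connected bright regions in a small grayscale grid (standing in for stained cells in a microscope image), might be sketched as follows; real applications would use a tool such as ImageJ rather than this illustration:

```python
def count_objects(image, threshold):
    """Count connected bright regions (4-connectivity) in a grayscale grid,
    a toy version of counting stained cells in a microscope image."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] >= threshold and not seen[r][c]:
                count += 1                     # a new region starts here
                stack = [(r, c)]               # flood-fill the whole region
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < rows and 0 <= x < cols \
                            and image[y][x] >= threshold and not seen[y][x]:
                        seen[y][x] = True
                        stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
    return count

grid = [
    [0, 9, 0, 0],
    [0, 9, 0, 8],
    [0, 0, 0, 8],
]
print(count_objects(grid, threshold=5))   # 2
```

The resulting count would be stored as additional metadata on the image object, as the paragraph above describes.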
[0055] In any case, the results of the Interrogation 30 phase are used directly, or passed on as a selection of data objects 41 from the internal 100 and external 201 data sources, to the Interpretation 40 phase.
[0056] With reference now to Fig. 5, features and functions of the Interpretation 40 phase of the preferred method herein are illustrated. Users are presented with a graphical user interface providing menu access to three functions. Initially, reporting 44 uses spreadsheet functionality to design reports in tabular form. Users build reports, or select a series of pre-formatted reports from menus. Where the user builds a report, it can be stored, annotated, and re-used within the enterprise.
[0057] Reporting 44 techniques include plug-ins to allow reporting of many original data object formats. In some cases, graphical images are part of the report, if they were included in the data object. The architecture of the application allows various report writers to be included or substituted as required by the user or enterprise needs. Reports can be output in standard formats or exported to spreadsheets or other data manipulation programs.
[0058] Visualization 42 herein allows the user to explore large datasets using the metadata indices 215 as dimensions. Multi-dimensional visualization can be provided by computer graphics, mapping data to multi-dimensional surfaces and allowing the data to be explored in real time on the computer screen.
[0059] Topology based visualizations 42 allow any information in the original data object to be used to generate either presentation axes or brought up on selection of a specific point in multidimensional space.
[0060] Further processing of selected records is accomplished under the menu header of numerical methods 43. The method and process used in the program allow an expandable set of numerical methods algorithms to be presented. Using an API (application programming interface) developed for the program, the program provides a flexible mechanism to transfer information to numerical algorithms, for instance, as an object-oriented Java class.
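The description leaves the transfer mechanism general; a hedged sketch of such a plug-in API, shown here as a Python registry rather than the Java class mentioned, might read:

```python
REGISTRY = {}

def numerical_method(name):
    """Decorator registering a numerical routine under a menu name,
    an illustrative stand-in for the program's plug-in API."""
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap

@numerical_method("mean")
def mean(values):
    return sum(values) / len(values)

@numerical_method("stdev")
def stdev(values):
    """Population standard deviation, built on the registered mean."""
    m = mean(values)
    return (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5

def run(name, dataset):
    """Dispatch a selected dataset to a registered numerical method."""
    return REGISTRY[name](dataset)

print(run("mean", [1.0, 2.0, 3.0]))   # 2.0
```

New algorithms register themselves the same way, which is the expandability the paragraph above calls for.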
[0061] Numerical methods 43 further can include the following representative categories, as are well defined in the literature. Using standard source code available through academic, software, and scientific organizations, a range of numerical methods can be obtained.
[0062] Still further, statistical tools 431 such as standard statistical libraries can be easily applied. Commands for statistical libraries make available a variety of techniques which are not commonly understood. As a result, researchers are encouraged to use the best available techniques. Allowing annotations to the tools can allow users to share information about appropriate tools with each other within the enterprise.
[0063] Additionally, machine learning tools allow the computer to attempt to detect relations in large datasets. Three recent techniques include support vector machines 432, neural networks 433, and boost algorithms 434.
[0064] Discovering relationships between multidimensional data 215 can be aided with clustering algorithms 435. Such tools attempt to find related data by reorganizing information across each of the available dimensions and looking for significant peaks and valleys.
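A toy illustration of clustering along one metadata dimension, grouping values wherever a large gap (a "valley") separates them, might be sketched as follows; the gap-threshold approach is an illustrative assumption, not the specific clustering algorithm 435 of the invention:

```python
def cluster_1d(values, gap):
    """Group sorted values into clusters wherever consecutive points are
    separated by more than `gap`, a simple analogue of finding peaks and
    valleys along one metadata dimension."""
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] > gap:
            clusters.append([v])   # a valley: start a new cluster
        else:
            clusters[-1].append(v)
    return clusters

print(cluster_1d([1.0, 1.2, 5.0, 5.1, 9.9], gap=1.0))
# [[1.0, 1.2], [5.0, 5.1], [9.9]]
```

Repeating this across each available dimension approximates the reorganization the paragraph above describes.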
[0065] Pattern recognition 436 is a means to take large sets of arrayed or matrixed information in defined dimensions and look for repeating patterns. This technique may be used, for example, on scientific laboratory data and can detect certain artifacts or patterns of control experiments. Allowing the program to include and annotate previously found patterns allows the algorithms to be repeatedly applied.
[0066] As stated herein, outcomes from the Interpretation 40 phase can then be recursively returned 21 to earlier phases 20, 30 to refine the data being used and operated upon. Differential 437 and regression 438 algorithms allow the system to search for relationships (cause and effect) across each of the available multiple dimensions 215.
[0067] Statistical validation 439 can be used as a QA/QC technique in the example scientific research laboratory. Additional methods include confirmatory methods 441 and cross-validation techniques 442.
[0068] Using data-specific pattern detection and recognition, the Interpretation 40 phase can include specific algorithms and setup data for application-specific processing. Artifact detection in images, control well mapping in array data, and probability-based validation 439 of data can be pre-configured. As users create additional processing, the method herein allows the software program to retain each new capability for later use by the same or different users. Using metadata indices 215 to detect similar experiments in the past, the program can suggest comparisons of outcomes, possible discrepancies, and ambiguities in relation to previous data. Quality Control and Quality Assurance algorithms are further provided, and trends can be observed in experimental outcomes when protocols are similar.
[0069] Lastly, a user can end a session by storing, annotating, and/or returning selected data objects 41 for further refinement 49.
[0070] In Fig. 6, the application of these techniques to a laboratory notebook application is given as an example. Fig. 7 shows a sequence of information that could be used in the example laboratory notebook application and some of the functions that would be presented to the user.
QUERY SCENARIOS
[0071] The following questions are examples of the types of information that users should be able to access and that are very difficult to answer with present levels of information integration in research-based applications.
[0072] Management Decision Support:
[0073] "What new data has been produced under a specific research program?"
[0074] "What members of the staff contributed to the discovery of a new target compound?"
[0075] "How much of our research is devoted to a specific therapeutic area?"
[0076] Operations Queries:
[0077] "When and by whom was this protocol used before, and what were the results?"
[0078] "Show me all the microscope images obtained when the species was mice and the disease target was pancreatic cancer."
[0079] "Same as above, but count the number of cell nuclei in each 200x magnification slide."
[0080] "What is the standard deviation of the ratio of stained cells to unstained cells in all the images representing blood disease in rats where the disease target is lymphoma?"

[0081] "Have we seen other unusual results using this family of protocols where the reagents were provided by this supplier?"
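Operations queries such as [0078] become straightforward once every data object carries metadata-index columns. The schema, column names, and rows below are a fabricated illustration of that idea, not a schema from the disclosure.

```python
import sqlite3

# Hypothetical metadata-indexed store: one row per data object.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE data_objects (
    id INTEGER PRIMARY KEY,
    kind TEXT, species TEXT, disease_target TEXT, path TEXT)""")
conn.executemany(
    "INSERT INTO data_objects (kind, species, disease_target, path) VALUES (?, ?, ?, ?)",
    [("microscope_image", "mouse", "pancreatic cancer", "img1.tif"),
     ("microscope_image", "rat", "lymphoma", "img2.tif"),
     ("spectrum", "mouse", "pancreatic cancer", "s1.csv")])

# "Show me all the microscope images obtained when the species was mice
# and the disease target was pancreatic cancer."
rows = conn.execute("""
    SELECT path FROM data_objects
    WHERE kind = 'microscope_image'
      AND species = 'mouse'
      AND disease_target = 'pancreatic cancer'""").fetchall()
print([r[0] for r in rows])   # → ['img1.tif']
```

Queries like [0079] and [0080] would additionally chain an analysis step (nucleus counting, ratio statistics) onto the selected object set, which is exactly the hand-off from the Interrogation 30 phase to the Interpretation 40 phase.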
[0082] Many alterations and modifications may be made by those having ordinary skill in the art without departing from the spirit and scope of the invention. Therefore, it must be understood that the illustrated embodiments have been set forth only for the purposes of example and that they should not be taken as limiting the invention as defined by the following claims. For example, notwithstanding the fact that the elements of a claim are set forth below in a certain combination, it must be expressly understood that the invention includes other combinations of fewer, more or different elements, which are disclosed above even when not initially claimed in such combinations.
[0083] While the particular Interpretation Analysis And Decision Support in Research Recordkeeping as herein shown and disclosed in detail is fully capable of obtaining the objects and providing the advantages hereinbefore stated, it is to be understood that it is merely illustrative of the presently preferred embodiments of the invention and that no limitations are intended to the details of construction or design herein shown other than as described in the appended claims.
[0084] Insubstantial changes from the claimed subject matter as viewed by a person with ordinary skill in the art, now known or later devised, are expressly contemplated as being equivalently within the scope of the claims. Therefore, obvious substitutions now or later known to one with ordinary skill in the art are defined to be within the scope of the defined elements.

Claims

What is claimed is:
1. A guided system for performing analysis of intellectual property records comprising: a collection of data objects (100) from a plurality of different sources (100, 201, 210, 211); a plurality of metadata indices (215) wherein the data objects are organized using dimensions of the metadata indices; a plurality of tools (e.g. 214, 221, 31, 32, 33, 34, 35, 36, 37, 42, 43, 44, 431, 432, 433, 434, 435, 436, 437, 438, 439, 442), the tools offered to assist a user in data information organization (20), interrogation (30), and interpretation (40).
2. The system of claim 1, wherein the collection of data objects comprises data generated from an electronic recordkeeping system in a scientific laboratory.
3. A system for the manipulation of data comprising: a software application wherein a user is guided by the software application through a three-phase process comprising: an Information (20) phase providing metadata indices (215) for internal (100) and external (210) data; an Interrogation (30) phase providing selection (34) of data objects (41) based on a query (e.g. 32, 35, 36); and an Interpretation (40) phase providing an array of possible presentation and analysis capabilities (e.g. 42, 43, 44, 431, 432, 433, 434, 435, 436, 437, 438, 439, 441, 442) for the selection of data objects.
4. The system of claim 3, wherein a plug-in architecture (e.g. 212) determines available features and operations in each of the Information, Interrogation, and Interpretation phases.
PCT/US2008/072383 2007-08-06 2008-08-06 Interpretation analysis and decision support in research information systems WO2009021072A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US95422307P 2007-08-06 2007-08-06
US60/954,223 2007-08-06

Publications (1)

Publication Number Publication Date
WO2009021072A1 (en) 2009-02-12

Family

ID=40341722

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/072383 WO2009021072A1 (en) 2007-08-06 2008-08-06 Interpretation analysis and decision support in research information systems

Country Status (1)

Country Link
WO (1) WO2009021072A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010032210A1 (en) * 2000-01-31 2001-10-18 Frank Gregory Daniel Method and apparatus for research management
US20030033295A1 (en) * 2001-07-11 2003-02-13 Adler Marc Stephen Method for analyzing and recording innovations



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08782654

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08782654

Country of ref document: EP

Kind code of ref document: A1