US5655116A - Apparatus and methods for retrieving information - Google Patents

Apparatus and methods for retrieving information

Publication number
US5655116A
Authority
US
United States
Prior art keywords
information
query
knowledge base
set forth
concepts
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US08/203,082
Inventor
Thomas Kirk
Alon Yitzchak Levy
Divesh Srivastava
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alcatel Lucent SAS
AT&T Corp
Sound View Innovations LLC
Original Assignee
Lucent Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lucent Technologies Inc
Priority to US08/203,082 (US5655116A)
Assigned to AMERICAN TELEPHONE AND TELEGRAPH COMPANY reassignment AMERICAN TELEPHONE AND TELEGRAPH COMPANY ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: KIRK, THOMAS, LEVY, ALON YITZCHAK, SRIVASTAVA, DIVESH
Priority to US08/347,016 (US5600831A)
Priority to US08/394,867 (US5768578A)
Priority to JP7522489A (JPH09503088A)
Priority to EP95911924A (EP0696366A4)
Priority to CA002161233A (CA2161233A1)
Priority to PCT/US1995/002338 (WO1995023371A1)
Assigned to AT&T IPM CORP. reassignment AT&T IPM CORP. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: AT&T CORP.
Assigned to AT&T CORP. reassignment AT&T CORP. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: AMERICAN TELELPHONE AND TELEGRAPH COMPANY
Assigned to LUCENT TECHNOLOGIES INC. reassignment LUCENT TECHNOLOGIES INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: AT&T CORP.
Publication of US5655116A
Application granted
Assigned to THE CHASE MANHATTAN BANK, AS COLLATERAL AGENT reassignment THE CHASE MANHATTAN BANK, AS COLLATERAL AGENT CONDITIONAL ASSIGNMENT OF AND SECURITY INTEREST IN PATENT RIGHTS Assignors: LUCENT TECHNOLOGIES INC. (DE CORPORATION)
Assigned to LUCENT TECHNOLOGIES INC. reassignment LUCENT TECHNOLOGIES INC. TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENT RIGHTS Assignors: JPMORGAN CHASE BANK, N.A. (FORMERLY KNOWN AS THE CHASE MANHATTAN BANK), AS ADMINISTRATIVE AGENT
Assigned to CREDIT SUISSE AG reassignment CREDIT SUISSE AG SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ALCATEL-LUCENT USA INC.
Assigned to SOUND VIEW INNOVATIONS, LLC reassignment SOUND VIEW INNOVATIONS, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ALCATEL LUCENT
Assigned to ALCATEL-LUCENT USA INC. reassignment ALCATEL-LUCENT USA INC. RELEASE OF SECURITY INTEREST Assignors: CREDIT SUISSE AG
Anticipated expiration
Assigned to ALCATEL-LUCENT USA INC. reassignment ALCATEL-LUCENT USA INC. RELEASE BY SECURED PARTY (SEE DOCUMENT FOR DETAILS). Assignors: CREDIT SUISSE AG
Assigned to ALCATEL-LUCENT USA INC. reassignment ALCATEL-LUCENT USA INC. MERGER AND CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: ALCATEL USA MARKETING, INC., ALCATEL USA SOURCING, INC., ALCATEL-LUCENT USA INC., LUCENT TECHNOLOGIES, INC.
Assigned to NOKIA OF AMERICA CORPORATION reassignment NOKIA OF AMERICA CORPORATION CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: ALCATEL-LUCENT USA INC.
Assigned to ALCATEL LUCENT reassignment ALCATEL LUCENT NUNC PRO TUNC ASSIGNMENT (SEE DOCUMENT FOR DETAILS). Assignors: NOKIA OF AMERICA CORPORATION
Current legal status: Expired - Lifetime

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2452 Query translation
    • G06F16/24524 Access plan code generation and invalidation; Reuse of access plans
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing

Definitions

  • the invention concerns information retrieval generally. More specifically, the invention concerns the use of a knowledge base to integrate multiple sources of information into one uniform view.
  • each subject area has a number and all of the books about the subject area have that number. Keeping track of the physical locations is done by giving each number in the cataloging system a place on the shelves and putting the books having the number in that place. Maps and labels on the shelves tell users of the library where to look for a book. Finding a book thus involves going to the card catalogs, looking up the subject category in the catalog to find the catalog number, and then using the map to find the shelf where books having that catalog number are stored.
  • the invention integrates information about the location and access of the information into the information retrieval system by adding the information to the knowledge base which is used to provide the conceptual organization of the information.
  • the knowledge base not only includes a world view made up of the concepts which are employed in conceptual queries made to the system, but also a system view made up of concepts which indicate how the sources of the information are to be accessed.
  • when the system responds to a user's conceptual query, it uses concepts in both the world view and the system view to produce an information access description.
  • the information access description describes how the information is to be accessed in the information sources available locally or by means of the network.
  • the information access description is interpreted in another component of the invention to produce the protocols required to retrieve the information needed to answer the query.
  • because the system view is part of the knowledge base, the knowledge base can be used to change how the protocols are generated in the course of the search. For example, results of one part of the search may be returned, and those results and the system view may be used to alter how the remainder of the search is carried out.
  • FIG. 1 is a conceptual overview of an information retrieval apparatus which employs the principles of the present invention.
  • FIG. 1 presents an overview of an information retrieval apparatus 101 which incorporates the principles of the invention.
  • a preferred embodiment of information retrieval apparatus is implemented using a digital computer system and information sources which are accessible via the Internet communications network.
  • the central component of apparatus 101 is a knowledge base 109 built upon a description logic based knowledge representation system CLASSIC capable of performing inferences of classification, subsumption, and completion.
  • Knowledge-base systems are described generally in Jeffery D. Ullman, Principles of Database and Knowledge-base Systems, Vols. I-II, Computer Science Press, Rockville, Md., 1989. Descriptions of CLASSIC may be found in Alex Borgida, Ronald Brachman, Deborah McGuinness, and Lori Resnick, "CLASSIC: A Structural Data Model for Objects", in Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data, pp. 59-67, 1989, R. J.
  • Knowledge base 109 is used to construct a domain model 111 which organizes information accessible via apparatus 101 into a set of concepts which fit the manner in which the user of system 101 is intending to view and use the information.
  • domain model 111 has three components: world view 115, which contains concepts corresponding to the way in which a user of the system looks at the information being retrieved, system/network view 117, which contains concepts corresponding to the way in which the information is described in the context of the data bases which contain it and the communications protocols through which it is accessed, and information source descriptions 113, which contains concepts describing the information sources at a conceptual level.
  • System/network view 117 and information source descriptions 113 are normally not visible to the user.
  • the concepts in these portions of domain model 111 do, however, participate fully in the reasoning processes that determine how to satisfy a query.
  • Information sources 123 are generally (though not limited to) network-based information servers that are accessed by standard internet communication protocols. Sources can also include databases, ordinary files and directories, and other knowledge bases.
  • the user interacts with the system through a graphical user interface 103.
  • the two primary modes of interaction supported by this interface are querying and browsing. In both cases the user expresses both browsing and querying operations in terms of concepts from "world view" portion 115 of domain model 111.
  • a knowledge base browser in CLASSIC 109 allows the user to view and interactively explore the concept taxonomy.
  • the concept taxonomy is represented graphically as a directed graph 105, where the nodes correspond to concepts and edges indicate parent/child relationships among concepts.
  • the knowledge base browser also serves as an editor, allowing the user to define new concepts in terms of existing ones.
  • the classification inferences in knowledge representation system 109 automatically place new concepts at the correct place in the taxonomy with respect to existing concepts.
  • the query language used in system 101 is based on CLASSIC, but has additional constructors that enable the user to express queries more easily.
  • the query is formulated in terms of the concepts and objects that appear in the world view part 115 of the knowledge base.
  • Query translator 107 translates queries expressed in the query language into CLASSIC description language expressions which are used to consult the knowledge base. Due to the limited expressive power of the description language and the need for special purpose query operators, the query language may contain elements not expressible in the description language of knowledge representation system 109. After partial translation to a description language expression, the remaining fragments of the query are translated to procedural code that is executed as part of the query evaluation.
  • the knowledge base is a virtual information store in the sense that the information artifacts themselves remain external to the knowledge base; the system instead stores detailed information (in terms of domain model 111) about the location of these information artifacts and how to retrieve them. Retrieval of a particular piece of information is done on demand, when it is needed to satisfy part of a query.
  • the types of information managed in this manner include files, directories, indexes, databases, etc.
  • the domain model embodied in the knowledge base is logically decomposed into world view 115, system/network view 117, and information source descriptions 113.
  • World view 115 is the set of concepts with which the user interacts and queries are expressed.
  • System/network view 117 concerns low level details which, though essential for generating successful query results, are normally of no interest to the user.
  • Information source descriptions 113 is a collection of concepts for describing information sources. These information source descriptions are expressed in terms of both world and system concepts. The purpose of encoding information source descriptions 113 in the domain model is to make it possible for CLASSIC to reason about what information sources must be consulted in order to satisfy a query.
  • the system concepts comprising system/network view 117 are those concepts that describe the low-level details of information access. This includes concepts related to network communication protocols, location addressing, storage formats, index types, network topology and connectivity, etc. Since the knowledge base generally merely retrieves information instead of storing previously-retrieved information, system/network view 117 includes all those concepts relevant to determining attributes like location, retrieval methods, and content format.
  • concepts within world view 115 describe things with which the user is familiar; they are the concepts that describe characteristics of information artifacts of interest to users.
  • Concepts within information source descriptions 113 relate the concepts in world view 115 to concepts concerning the semantic content of information sources.
  • knowledge representation system 109 can employ the concepts in information source descriptions 113 to relate the concepts used in the query to actual information sources and can employ system/network view 117 to relate the concepts used in the query to an access plan which describes how to retrieve information from the sources as required to answer the query.
  • query translator 107 translates the query into a form to which knowledge representation system 109 can respond. Then the translated query is analyzed in knowledge base system 109 to decide which of the external information sources are relevant to the query, and which subqueries need to be sent to each information source. This step uses world view 115 and system/network view 117. The information in system/network view 117 is expressed in a site description language which will be described in more detail later.
  • Knowledge base 109 uses the conceptual information from world view 115 and system/network view 117 to produce an information access description describing how to access the information required for the query in information sources 123.
  • Knowledge base 109 provides the information access description to access plan generation and execution component 119, which formulates an access plan including the actual commands needed to retrieve the information from sources 123.
  • Plan formulation: Given the information access description, planner 119 decides on the order in which to access sources 123 and how the partial answers will be combined in order to answer the user's query. The key distinction between this step and traditional database techniques is that planner 119 can change the plan after partial answers are obtained. Replanning may of course involve inferences based on concepts from information source descriptions 113 and/or system/network view 117 and the results of the search thus far.
  • Plan materialization: The previous step produced a plan at the level of logical source accesses. This step takes these logical accesses and translates them to specific network commands. This phase has two aspects:
  • Specific network commands are generated to access the sites.
  • information from the system/network view is taken into account.
  • the system will generate the appropriate commands for performing the access.
  • the translations to service and site-specific access commands are performed by Information Access Protocol Modules 121 (0 . . . n), described in the following section.
  • system 101 uses a work space in the computer system upon which system 101 is implemented to store its intermediate results.
  • system 101 may decide to replan for the rest of the query.
  • Access to information sources is done using a variety of standard information access protocols.
  • the purpose of these modules is to translate generic information access operations (retrieval, listing collections, searching indexes) into corresponding operations of the form expected by the information source. For many standard Internet access protocols, the translation is straightforward.
  • Examples of access protocols supported by these modules include several network protocols defined by Internet RFC draft standard documents, including FTP (File Transfer Protocol), Gopher, NNTP (Network News Transfer Protocol), HTTP (Hypertext Transfer Protocol).
  • other modules support access to local (as opposed to network-based) information repositories, such as local filesystems and databases.
  • the concepts in information source descriptions 113 relate concepts in world view 115 to information sources 123. These relationships are expressed using a site description language.
  • CLASSIC and related knowledge representation systems employ description languages which can function as site description languages, but such site description languages do not permit efficient reasoning. In a preferred embodiment, efficiency has been substantially increased by the use of a site description language which extends CLASSIC.
  • a description language consists of three types of entities: concepts (representing unary relations), roles (binary relations) and individuals (object constants).
  • Concepts can be defined in terms of descriptions that specify the properties that individuals must satisfy to belong to the concept.
  • Binary relationships between objects are referred to as roles and are used to construct complex descriptions for defining concepts.
  • Description logics vary by the type of constructors available in the language used to construct descriptions. Description logics are very convenient for representing and reasoning in domains with rich hierarchical structure. Description languages other than the one used in CLASSIC exist and may be used as starting points for site description languages.
  • the concept customer is a primitive concept that includes all customers and specifically the disjoint subconcepts Business and Residential. Each instance of a business customer has a role BusinessType, specifying the types of business it performs. Given these primitive concepts, we can define a concept TravelAgent by the description (AND Business (fills BusinessType "Travel")).
  • Quote(ag,al,src,dest,c,d) denotes that a travel agent ag quoted a price of c to travel from src to dest on airline al on date d.
  • Dir(cust,ac,telNo) gives the directory listing of customer cust as area code ac and phone number telNo.
  • Arbitrary constraints are formed from atomic constraints using the logical operators ∧ and ∨. CLASSIC can determine efficiently whether one constraint entails another using subsumption reasoning in the description logic. Other well-known techniques are used for implication reasoning of order constraints.
  • constraints play a major role in information gathering and are used in several ways.
  • semantic knowledge about the general n-ary relations E can be expressed by constraints over the arguments of the relations.
  • the first argument of the relation Quote must be an instance of the concept TravelAgent.
  • constraints can be used to specify subsets of information that exist at external sites. For example, a travel agent may have only flights whose cost is less than $1000.
  • constraints are extremely useful in specifying complex queries.
  • Constraints may be used together with concepts and knowledge base relations to describe properties of extensions of the knowledge base relations, that is, information specified by the knowledge base relations and the properties.
  • the information in the extension may come from the knowledge base, but most often it will come from one or more of the information sources 123.
  • constraints contain only concepts whose extensions exist in the knowledge base.
  • Given a query (defined formally below), the knowledge base system must infer the missing portions of the extensions of relations needed to answer the query, using the information present at the external sites.
  • the knowledge base can also be viewed as an information source containing part of the extensions.
  • each L_i is either a constraint or an atom of a relation in E ∪ I, and L is a relation in I.
  • the predicates in I may be recursively defined. For example, we might be interested in the following query that retrieves phone numbers of travel agents in New York City who sell tickets from NYC to Paris for under $500, on Air France:
  • SL 0 is the language of Horn rules containing relation names from R, E and D, and constraints.
  • the site relations R appear only in the antecedents of the rules.
  • a set of rules in SL 0 enables us to compute extensions of the knowledge base relations in E and D, given the relations in R.
  • Agent1DB(al,dest,cost,date) ⇒ (cost < $1000).
  • Agent0 is an agent that sells tickets only on United Airlines. Agent1 specializes in cheap deals out of New York City. Note that the third rule specifies a constraint on the information in Agent1DB, and is allowed in SL 0 . We also have three sites containing directory information:
  • the site 212Residential (212Business) contains the residential (business) customers in the 212 area code.
  • the site 800Dir contains all the toll-free numbers.
  • Answering a specific query first requires that knowledge base system 109 determine which sources 123 contain information that can be used to answer the query. Having found the set of relevant sources 123, the next problem is to devise an optimal information access description. After access plan generation and execution component 119 has generated an access plan from the access description and executed the access plan to retrieve the information, knowledge representation system 109 uses the information to compute extensions to knowledge base relations and then uses the knowledge base as extended to compute the answer to the query. The challenge in this procedure is to determine the minimal portions of the site relations that are relevant to answering the query. Clearly, for some site relations the minimal portion can be empty, indicating that the site relation does not contain any relevant information.
  • Since we are looking for a flight on Air France, Agent0 will be deemed irrelevant, and therefore, Agent0DB will be ignored. Similarly, the 800 directory listing database will not be queried. Moreover, since the first argument of Quote must be an instance of the concept TravelAgent, and since TravelAgent is subsumed by Business, which is disjoint from Residential, only the 212 business directory listing will be considered for the query.
  • Processing updates on the knowledge base requires updating relevant site relations and hence, determining the relevant sites.
  • Maintaining consistency among site relations again requires that we determine which sites contain information relevant to a given consistency condition.
  • Finding the relevant sites is done by extending the algorithm described in Alon Y. Levy and Yehoshua Sagiv, "Constraints and Redundancy in Datalog", Proceedings of the Eleventh ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, San Diego, Calif., 1992.
  • the key observation that enables us to use that algorithm is that the language for expressing constraints (concept descriptions and order constraints) satisfies the requirements of the query-tree algorithm outlined in that paper.
  • Finding minimal portions of the sites is done in two steps. The first step determines which portions of the knowledge base relations are needed to solve the query, and the second step determines which portions of the site relations are needed to compute the relevant portions of the knowledge base relations.
  • the algorithm uses the query-tree, which is a tool that, given a query which is expressed in terms of certain relations will specify which portions of the mentioned relations are relevant to the query.
  • the first step is done by building a query-tree for the user query, in terms of the knowledge base relations, and pushing the constraints from the query to the KB relations.
  • the second step is done by building a query-tree for each relevant KB relation (which is defined in terms of the external sites), and pushing the constraints to the external site relations.
  • A major limitation of SL 0 is that we cannot express knowledge about the sites that enables finding such a non-redundant set. For instance, suppose that in our example, the query was to find some flight from NYC to Paris on United Airlines. There is no way in SL 0 to express the fact that Agent0 contains all the information about flights on United. Had we been able to express this knowledge, we could have determined that Agent0 by itself would suffice for answering this query, and hence, that knowledge of Agent1 would have been redundant.
  • the language SL 1 has several important properties. First, it enables us to express knowledge stating that a certain site contains complete information of a certain type. Second, given arbitrary formulas in SL 1 and a set of site relations, it is possible to uniquely determine the extensions of the knowledge base relations. Finally, given a query, we show that an agent can find a non-redundant site set for computing the answer to the query. That is, the agent can find a set of minimal portions of sites such that there are no redundancies among these portions. This notion is made formal as follows:
  • a site set is a set {(R_1, C_1), . . . , (R_n, C_n)}, denoting the set of facts of the site relations R_i satisfying the constraints C_i.
  • the site relations contain the same attributes of the knowledge base relations. Often, a site relation will be specialized w.r.t. a knowledge base relation and therefore will contain fewer attributes.
  • Agent0DB does not contain the name of the agent as an attribute, nor the name of the airline.
  • the site relation R projects out the arguments y of E.
  • One such restriction is to require that all elements of y be constants.
  • Another is to allow the rule to specify linear functions of x that uniquely determine y. Adding the restricted completion axioms can be done as before.
  • Agent0 has all the flights of United Airlines:
  • SIMS site description language
  • both systems must first find the relevant sites and then access them.
  • both of these tasks are approached as planning problems.
  • in system 101, at least the first task can be done by the more economical techniques of logical inference.
  • SIMS does not allow arbitrary n-ary relations, and therefore, the system must map each external relation to a concept in LOOM. This can be done only when the relation has a primary key (i.e., an attribute that uniquely determines the rest of the attributes of the tuple). Although one can always conceptually add another such attribute to a relation, modeling a relation in such a way is unnatural.
  • the source is connected to a communications network to which the system has access;
  • the source has a protocol for obtaining information which is known to the system.
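
The site-relevance reasoning in the definitions (Agent0 sells only United tickets, the 800 directory holds only toll-free numbers, and TravelAgent is subsumed by Business, which is disjoint from Residential) can be illustrated with a minimal Python sketch. This is not CLASSIC and not the query-tree algorithm the patent cites; the names Site, Subquery, is_relevant, and compatible are hypothetical, constraints are reduced to attribute equalities plus one concept per site, and the 212 area code in the directory subquery is assumed for the example.

```python
from dataclasses import dataclass, field

# Toy taxonomy (assumed for illustration): child concept -> parent concept.
PARENT = {"TravelAgent": "Business", "Business": "Customer", "Residential": "Customer"}
# Pairs of concepts declared disjoint.
DISJOINT = [("Business", "Residential")]


def ancestors(concept):
    """Return the concept together with every concept that subsumes it."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return set(chain)


def compatible(a, b):
    """False when a and b fall under two concepts declared disjoint."""
    up_a, up_b = ancestors(a), ancestors(b)
    return not any((x in up_a and y in up_b) or (y in up_a and x in up_b)
                   for x, y in DISJOINT)


@dataclass
class Site:
    name: str
    relation: str                               # knowledge base relation the site feeds
    equals: dict = field(default_factory=dict)  # attribute -> value shared by all its facts
    concept: str = "Customer"                   # concept its customer argument belongs to


@dataclass
class Subquery:
    relation: str
    equals: dict
    concept: str = "Customer"


def is_relevant(site, q):
    """A site is pruned when its constraints contradict those of the subquery."""
    if site.relation != q.relation:
        return False
    for attr, value in q.equals.items():
        if attr in site.equals and site.equals[attr] != value:
            return False
    return compatible(site.concept, q.concept)


SITES = [
    Site("Agent0DB", "Quote", {"airline": "United"}),
    Site("Agent1DB", "Quote", {"src": "NYC"}),
    Site("212Residential", "Dir", {"ac": "212"}, "Residential"),
    Site("212Business", "Dir", {"ac": "212"}, "Business"),
    Site("800Dir", "Dir", {"ac": "800"}),
]

quote_q = Subquery("Quote", {"airline": "Air France", "src": "NYC", "dest": "Paris"})
dir_q = Subquery("Dir", {"ac": "212"}, "TravelAgent")

relevant = [s.name for s in SITES if is_relevant(s, quote_q) or is_relevant(s, dir_q)]
print(relevant)  # prints ['Agent1DB', '212Business']
```

Only Agent1DB and the 212 business directory survive, matching the outcome the text describes for the Air France query. Order constraints such as the cost bound are left out of this sketch because two upper bounds on cost can never contradict each other.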

Abstract

A query translator translates a query between a graphical user interface and a knowledge representation system. The knowledge representation system reformulates the query and generates an access plan to access data requested by the query. The access plan utilizes several different protocols to access the query information located in dissimilar databases distributed throughout a network. The knowledge representation system generates the access plan by first processing the query through a world view which defines the information in conceptual terms that a human being would understand and then processes the query through a system/network view which redefines the query into network and database access information so that the data requested by the query can be located. Placing the world view and the system/network view in the knowledge representation system enables real-time intelligence to be used in the search process by providing a feedback loop between the searched databases and the knowledge representation system.
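
The feedback loop between the searched sources and the knowledge representation system can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration: sources are plain callables, the plan is an ordered list of source names, the fare data is made up, and the replanning rule (drop the remaining sources once a source known to hold complete information has answered) is only one possible policy, not the patent's planner.

```python
def execute_with_replanning(plan, sources, complete_sources):
    """Access sources in plan order, revising the remaining plan after each partial answer."""
    answers = []
    accessed = []
    remaining = list(plan)
    while remaining:
        name = remaining.pop(0)
        partial = sources[name]()   # hypothetical source access; returns a list of facts
        accessed.append(name)
        answers.extend(partial)
        # Replanning step: a source marked complete has answered,
        # so the sources still in the plan would be redundant.
        if partial and name in complete_sources:
            remaining = []
    return answers, accessed


# Made-up sources returning (airline, src, dest, cost) tuples.
sources = {
    "Agent0DB": lambda: [("United", "NYC", "Paris", 480)],
    "Agent1DB": lambda: [("United", "NYC", "Paris", 450)],
}

answers, accessed = execute_with_replanning(
    ["Agent0DB", "Agent1DB"], sources, complete_sources={"Agent0DB"}
)
print(accessed)  # prints ['Agent0DB']
```

In this sketch the second source is never contacted because the first is declared complete for the query, which is the kind of plan change after partial answers that distinguishes the described approach from a fixed access plan.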

Description

BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention concerns information retrieval generally. More specifically, the invention concerns the use of a knowledge base to integrate multiple sources of information into one uniform view.
2. Description of the Prior Art
People have been collecting information for a long time. Even the ancient world had its libraries and archives. In our day, we have astronomical amounts of information stored in libraries, archives, and data base systems around the world. In the relatively recent past, we have begun to connect the data base systems to communications networks, so that a user at a workstation anywhere in the world can quickly access information in a data base system anywhere else in the world.
Ever since people began collecting information, they have had two problems with their collections:
Keeping track of the physical locations of items of information in the collection; and
Imposing some kind of conceptual organization on the information.
In the context of a traditional library, the conceptual organization is provided by the cataloging system; for example, in the Dewey decimal cataloging system, each subject area has a number and all of the books about the subject area have that number. Keeping track of the physical locations is done by giving each number in the cataloging system a place on the shelves and putting the books having the number in that place. Maps and labels on the shelves tell users of the library where to look for a book. Finding a book thus involves going to the card catalogs, looking up the subject category in the catalog to find the catalog number, and then using the map to find the shelf where books having that catalog number are stored.
As the size of a collection of information increases, it becomes more and more difficult to get from a concept to the physical location of the information. For example, many very large traditional libraries do not permit ordinary users to go to the shelves and get a book. Instead, the user looks the book up in the card catalog and writes the title, author, and catalog number on a request slip. A specialist in finding books on the shelves then goes and gets the book for the user. A major disadvantage of such a system is that it does not permit the user to look up one book on a subject in the card catalog and then go to the shelf and browse to see what else is there.
While data base systems and networks have enormously increased the accessibility of information, they have made the problems of keeping track of the physical location and imposing a conceptual organization even more difficult. Keeping track of the physical location now involves not only knowing which of the enormous number of interconnected collections of information contains the information the user wants, but also knowing what sequences of commands (or protocols) are required to access the information over the network. Imposition of a conceptual organization has also become more difficult. Unlike human librarians, computers cannot deal directly with concepts. For example, a computer is helpless with a request like "tell me everything you know about Napoleon's youth", since it has no idea either that Napoleon is a historical person or what period of time could reasonably be termed his "youth". Before the computer can do anything, the request must be broken down so that the computer searches for the right Napoleon in a historical data base instead of a cooking data base and searches over the span of time which makes up the first 21 years of that Napoleon's life.
One technique which is now being used to impose an organization is to interpose a knowledge base system between the user and the data base systems which contain the information. In this technique, the conceptual organization of the information is provided by the knowledge base. Queries involving concepts are made to the knowledge base, which translates them into the commands needed to reference the data base system. See for example European Patent Application 0 542 430 A2, Alexander Borgida and Ronald Brachman, Information Access Apparatus and Methods, published May 19, 1993.
Attempts are also being made to build information retrieval systems which employ knowledge bases not only to impose a conceptual organization, but also to access information across a network. One example of such a system is that being built by the SIMS project, described in Yigal Arens and Craig A. Knoblock, "Planning and Reformulating Queries for Semantically-modeled Multidatabase Systems", in: Proceedings of the First International Conference on Information and Knowledge Management, Baltimore, Md., 1992. Problems left unsolved by these attempts include efficient location of the relevant information sources and the manner in which the system represents its knowledge about the location of the information. It is an object of the present invention to solve these and other problems and thereby to provide more efficient and usable information access methods and apparatus.
SUMMARY OF THE INVENTION
The invention integrates information about the location and access of the information into the information retrieval system by adding the information to the knowledge base which is used to provide the conceptual organization of the information. In the information retrieval system of the invention, the knowledge base not only includes a world view made up of the concepts which are employed in conceptual queries made to the system, but also a system view made up of concepts which indicate how the sources of the information are to be accessed. When the system responds to a user's conceptual query, it uses concepts in both the world view and the system view to produce an information access description. The information access description describes how the information is to be accessed in the information sources available locally or by means of the network. The information access description is interpreted in another component of the invention to produce the protocols required to retrieve the information needed to answer the query.
An important advantage of the fact that the system view is part of the knowledge base is that the knowledge base can be used to change how the protocols are generated in the course of the search. For example, results of one part of the search may be returned, and those results and the system view may be used to alter how the remainder of the search is carried out.
Other objects and advantages of the apparatus and methods disclosed herein will be apparent to those of ordinary skill in the art upon perusal of the following Drawing and Detailed Description, wherein:
BRIEF DESCRIPTION OF THE DRAWING
FIG. 1 is a conceptual overview of an information retrieval apparatus which employs the principles of the present invention.
Reference numbers in the Drawing have two parts: the two least-significant digits are the number of an item in a figure; the remaining digits are the number of the figure in which the item first appears. Thus, an item with the reference number 103 first appears in FIG. 1.
DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT
The following Detailed Description begins with an overview of the architecture and continues with a detailed description of the Site Description Language used in a preferred embodiment.
ARCHITECTURE
Architecture Overview
FIG. 1 presents an overview of an information retrieval apparatus 101 which incorporates the principles of the invention. A preferred embodiment of information retrieval apparatus is implemented using a digital computer system and information sources which are accessible via the Internet communications network.
The central component of apparatus 101 is a knowledge base 109 built upon a description logic based knowledge representation system CLASSIC capable of performing inferences of classification, subsumption, and completion. Knowledge-base systems are described generally in Jeffrey D. Ullman, Principles of Database and Knowledge-base Systems, Vols. I-II, Computer Science Press, Rockville, Md., 1989. Descriptions of CLASSIC may be found in Alex Borgida, Ronald Brachman, Deborah McGuinness, and Lori Resnick, "CLASSIC: A Structural Data Model for Objects", in Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data, pp. 59-67, 1989; R. J. Brachman, et al., "Living with CLASSIC", in: J. Sowa, ed., Principles of Semantic Networks: Explorations in the Representation of Knowledge, Morgan-Kaufmann, 1991, pp. 401-456; and L. A. Resnick, et al., CLASSIC: The CLASSIC User's Manual, AT&T Bell Laboratories Technical Report, 1991.
Knowledge base 109 is used to construct a domain model 111 which organizes information accessible via apparatus 101 into a set of concepts which fit the manner in which the user of system 101 is intending to view and use the information. In system 101, domain model 111 has three components: world view 115, which contains concepts corresponding to the way in which a user of the system looks at the information being retrieved, system/network view 117, which contains concepts corresponding to the way in which the information is described in the context of the data bases which contain it and the communications protocols through which it is accessed, and information source descriptions 113, which contains concepts describing the information sources at a conceptual level. System/network view 117 and information source descriptions 113 are normally not visible to the user. The concepts in these portions of domain model 111 do, however, participate fully in the reasoning processes that determine how to satisfy a query.
An important benefit of using a description logic system like CLASSIC is that as new information is added to the system, much of the work of organizing the new information with respect to the concepts already in knowledge base 109 is done automatically. Only a description of the known attributes of the information must be specified; CLASSIC's inference mechanisms then automatically classify these descriptions into appropriate places in the concept hierarchy.
User interaction with the system is accomplished through browsing and querying operations in terms of high-level concepts (concepts that are meaningful to a user unsophisticated in the details of information location and access). These concepts are intended to reflect the terms in which the user thinks about the type and content of information being queried. By working with these high-level concepts, the user is relieved of the details of the location and distribution of information across multiple remote information servers.
Information sources 123 are generally (though not limited to) network-based information servers that are accessed by standard internet communication protocols. Sources can also include databases, ordinary files and directories, and other knowledge bases.
User Interface
The user interacts with the system through a graphical user interface 103. The two primary modes of interaction supported by this interface are querying and browsing. In both modes, the user expresses operations in terms of concepts from "world view" portion 115 of domain model 111.
A knowledge base browser in CLASSIC 109 allows the user to view and interactively explore the concept taxonomy. The concept taxonomy is represented graphically as a directed graph 105, where the nodes correspond to concepts and edges indicate parent/child relationships among concepts.
To support extension of the concept taxonomy, the knowledge base browser also serves as an editor, allowing the user to define new concepts in terms of existing ones. The classification inferences in knowledge representation system 109 automatically place new concepts at the correct place in the taxonomy with respect to existing concepts.
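The placement of a new concept by classification can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions: each concept is reduced to a set of required features, subsumption becomes a subset test, and the names TAXONOMY, subsumes, and parents, as well as the sample concepts, are hypothetical and are not part of CLASSIC.

```python
# Hypothetical taxonomy: each concept is modeled as the set of features it requires.
TAXONOMY = {
    "Customer": frozenset({"customer"}),
    "Business": frozenset({"customer", "business"}),
    "Residential": frozenset({"customer", "residential"}),
}

def subsumes(general, specific):
    """A concept subsumes another when all of its required features are also required by the other."""
    return general <= specific

def parents(new_definition, taxonomy):
    """Return the most specific existing concepts that subsume the new definition."""
    subsumers = {name for name, d in taxonomy.items() if subsumes(d, new_definition)}
    return sorted(
        name for name in subsumers
        # Drop a subsumer if another subsumer lies below it in the taxonomy.
        if not any(other != name and subsumes(taxonomy[name], taxonomy[other])
                   for other in subsumers)
    )

# A new concept defined in terms of existing ones (illustrative feature names).
TRAVEL_AGENT = frozenset({"customer", "business", "type=travel"})
PARENTS = parents(TRAVEL_AGENT, TAXONOMY)
```

In this sketch the new concept is placed directly under Business, since Customer subsumes it only indirectly.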
Since both the high-level world concepts 115 and low-level system concepts 117 coexist in a single domain model 111, an important role of user interface 103 is to filter the system concepts out of the view seen by the user in query results and in the taxonomy browser.
Query Translator 107
The query language used in system 101 is based on CLASSIC, but has additional constructors that enable the user to express queries more easily. The query is formulated in terms of the concepts and objects that appear in the world view part 115 of the knowledge base. Query translator 107 translates queries expressed in the query language into CLASSIC description language expressions which are used to consult the knowledge base. Due to the limited expressive power of the description language and the need for special purpose query operators, the query language may contain elements not expressible in the description language of knowledge representation system 109. After partial translation to a description language expression, the remaining fragments of the query are translated to procedural code that is executed as part of the query evaluation.
Knowledge Representation System 109
The knowledge base is a virtual information store in the sense that the information artifacts themselves remain external to the knowledge base; the system instead stores detailed information (in terms of domain model 111) about the location of these information artifacts and how to retrieve them. Retrieval of a particular piece of information is done on demand, when it is needed to satisfy part of a query. The types of information managed in this manner include files, directories, indexes, databases, etc.
The domain model embodied in the knowledge base is logically decomposed into world view 115, system/network view 117, and information source descriptions 113. World view 115 is the set of concepts with which the user interacts and in which queries are expressed. System/network view 117 concerns low level details which, though essential for generating successful query results, are normally of no interest to the user. Information source descriptions 113 is a collection of concepts for describing information sources. These information source descriptions are expressed in terms of both world and system concepts. The purpose of encoding information source descriptions 113 in the domain model is to make it possible for CLASSIC to reason about what information sources must be consulted in order to satisfy a query.
We define system concepts comprising system/network view 117 as those concepts that describe the low-level details of information access. This includes concepts related to network communication protocols, location addressing, storage formats, index types, network topology and connectivity, etc. Since the knowledge base generally merely retrieves information instead of storing previously-retrieved information, system/network view 117 includes all those concepts relevant to determining attributes like location, retrieval methods, and content format.
Continuing in more detail, concepts within world view 115 describe things with which the user is familiar; they are the concepts that describe characteristics of information artifacts of interest to users. Concepts within information source descriptions 113 relate the concepts in world view 115 to concepts concerning the semantic content of information sources. Thus, given a query which employs concepts in world view 115, knowledge representation system 109 can employ the concepts in information source descriptions 113 to relate the concepts used in the query to actual information sources and can employ system/network view 117 to relate the concepts used in the query to an access plan which describes how to retrieve information from the sources as required to answer the query.
Access Plan Generation and Execution
When a user wishes to obtain information, the user inputs a query in system 101's query language at graphical user interface 103. System 101 then answers the query. There are several steps involved. First, query translator 107 translates the query into a form to which knowledge representation system 109 can respond. Then the translated query is analyzed in knowledge base system 109 to decide which of the external information sources are relevant to the query, and which subqueries need to be sent to each information source. This step uses world view 115 and system/network view 117. The information in system/network view 117 is expressed in a site description language which will be described in more detail later.
Knowledge base 109 uses the conceptual information from world view 115 and system/network view 117 to produce an information access description describing how to access the information required for the query in information sources 123. Knowledge base 109 provides the information access description to access plan generation and execution component 119, which formulates an access plan including the actual commands needed to retrieve the information from sources 123.
1. Plan formulation: Given the information access description, planner 119 decides on the order in which to access sources 123 and how the partial answers will be combined in order to answer the user's query. The key distinction between this step and traditional database techniques is that planner 119 can change the plan after partial answers are obtained. Replanning may of course involve inferences based on concepts from information source descriptions 113 and/or system/network view 117 and the results of the search thus far.
2. Plan materialization: The previous step produced a plan at the level of logical source accesses. This step takes these logical accesses and translates them to specific network commands. This phase has two aspects:
Format translation: the description of the sites is given at a logical level. However, to actually access the site, one must conform to a syntax of a specific query language. In this step, these translations are done.
Specific network commands are generated to access the sites. Here, information from the system/network view is taken into account. Depending on the site being accessed, the system will generate the appropriate commands for performing the access.
The translations to service and site-specific access commands are performed by Information Access Protocol Modules 121 (0 . . . n), described in the following section.
Several points should be noted about the above process:
In executing the plan, system 101 uses a work space in the computer system upon which system 101 is implemented to store its intermediate results.
After executing part of the plan, system 101 may decide to replan for the rest of the query.
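The interplay of plan execution, the work space for intermediate results, and replanning can be sketched as follows. This is a minimal illustration and not the planner of the preferred embodiment: the function execute_with_replanning, the replanning policy stop_when_answered, and the site names and data are all hypothetical.

```python
def execute_with_replanning(sources, fetch, replan):
    """Access sources in order, keep partial answers in a work space, and replan after each access."""
    workspace = []            # intermediate results
    pending = list(sources)   # current plan: ordered logical source accesses
    visited = []
    while pending:
        source = pending.pop(0)
        visited.append(source)
        workspace.extend(fetch(source))
        # The planner may reorder or drop the remaining accesses based on the results so far.
        pending = replan(pending, workspace)
    return workspace, visited

# Hypothetical source contents and a simple policy: stop once any answer is found.
DATA = {"siteA": [], "siteB": ["fare-480"], "siteC": ["fare-700"]}

def stop_when_answered(pending, workspace):
    return [] if workspace else pending

RESULTS, VISITED = execute_with_replanning(["siteA", "siteB", "siteC"], DATA.get, stop_when_answered)
```

Here the third source is never contacted, because the partial answer obtained from the second source causes the remainder of the plan to be dropped.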
Information Access Protocol Modules 121
Access to information sources is done using a variety of standard information access protocols. The purpose of these modules is to translate generic information access operations (retrieval, listing collections, searching indexes) into corresponding operations of the form expected by the information source. For many standard Internet access protocols, the translation is straightforward.
Examples of access protocols supported by these modules include several network protocols defined by Internet RFC draft standard documents, including FTP (File Transfer Protocol), Gopher, NNTP (Network News Transfer Protocol), and HTTP (Hypertext Transfer Protocol). In addition, other modules support access to local (as opposed to network-based) information repositories, such as local filesystems and databases.
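The translation of generic access operations into protocol-specific operations can be pictured as a dispatch over per-protocol modules. The sketch below is an assumption-laden illustration: the classes FtpModule and HttpModule, the table MODULES, and the function materialize are hypothetical names, and the modules only build simplified command strings (FTP RETR and LIST, an HTTP/1.0 GET request line) rather than opening connections.

```python
class FtpModule:
    """Hypothetical module that maps generic operations to FTP commands."""
    def retrieve(self, path):
        return f"RETR {path}"

    def list_collection(self, path):
        return f"LIST {path}"

class HttpModule:
    """Hypothetical module that maps retrieval to an HTTP/1.0 request line."""
    def retrieve(self, path):
        return f"GET {path} HTTP/1.0"

MODULES = {"ftp": FtpModule(), "http": HttpModule()}

def materialize(protocol, operation, target):
    """Translate a generic operation into the command expected by the given protocol."""
    handler = getattr(MODULES[protocol], operation, None)
    if handler is None:
        raise ValueError(f"{protocol} module does not support {operation}")
    return handler(target)
```

Adding support for a further protocol in this sketch amounts to registering one more module in the table, which mirrors the modular arrangement described above.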
Site Description Language
As previously pointed out, the concepts in information source descriptions 113 relate concepts in world view 115 to information sources 123. These relationships are expressed using a site description language. CLASSIC and related knowledge representation systems employ description languages which can function as site description languages, but such site description languages do not permit efficient reasoning. In a preferred embodiment, efficiency has been substantially increased by the use of a site description language which extends CLASSIC.
The following discussion of the site description language employed in the preferred embodiment employs the example below:
Consider an application in which we can obtain information about airline flights from various travel agents. We have access to fares given by specific travel agents and to telephone directory information to obtain their phone numbers. In practice, the information about price quotes and telephone listings may be distributed across different external database servers which contain different portions of the information. For example, one travel agent may deal only with domestic travel, while another may deal only with certain airlines. Some travel brokers deal only with last-minute reservations, e.g., flights originating within the next week. Similarly, directory information may be distributed by area code. In some area codes, all listings may be in one database, while others may partition residential and business customers.
The starting point for the site description language is the description language used in CLASSIC. A description language consists of three types of entities: concepts (representing unary relations), roles (binary relations) and individuals (object constants). Concepts can be defined in terms of descriptions that specify the properties that individuals must satisfy to belong to the concept. Binary relationships between objects are referred to as roles and are used to construct complex descriptions for defining concepts. Description logics vary by the type of constructors available in the language used to construct descriptions. Description logics are very convenient for representing and reasoning in domains with rich hierarchical structure. Description languages other than the one used in CLASSIC exist and may be used as starting points for site description languages. The only requirement is that the question of subsumption (i.e., does a description D1 always contain a description D2) be decidable. We denote the concepts in our representation language by D=D1, . . . , Dl.
In our example, we can have a hierarchy of concepts describing various types of telephone customers. The concept customer is a primitive concept that includes all customers and specifically the disjoint subconcepts Business and Residential. Each instance of a business customer has a role BusinessType, specifying the types of business it performs. Given these primitive concepts, we can define a concept TravelAgent by the description (AND Business (fills BusinessType "Travel")).
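The definition of TravelAgent can be illustrated by a small membership test written in Python. The sketch does not use CLASSIC syntax or its inference engine; the class Individual, the function is_travel_agent, and the sample customers are hypothetical and serve only to show what the description (AND Business (fills BusinessType "Travel")) requires of an individual.

```python
from dataclasses import dataclass, field

@dataclass
class Individual:
    name: str
    primitives: set = field(default_factory=set)   # primitive concepts asserted for the individual
    roles: dict = field(default_factory=dict)      # role name -> set of fillers

def is_travel_agent(ind):
    """Check both conjuncts: membership in Business and the filler "Travel" for BusinessType."""
    return ("Business" in ind.primitives
            and "Travel" in ind.roles.get("BusinessType", set()))

# Illustrative individuals only.
acme = Individual("Acme Travel", {"Customer", "Business"}, {"BusinessType": {"Travel"}})
bakery = Individual("Corner Bakery", {"Customer", "Business"}, {"BusinessType": {"Food"}})
smith = Individual("J. Smith", {"Customer", "Residential"})
```

Only the first individual satisfies the description; the second is a business of another type, and the third is a residential customer and therefore cannot be a business.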
One limitation of description languages is that they do not naturally model general n-ary relations. (A relation may be thought of as a table with columns and rows. An n-ary relation has n columns.) N-ary relations arise very commonly in practice, and dealing with such relations is essential to modeling external information sources that contain arbitrary relational databases. Hence our representation language augments description languages with a set of general n-ary relations E=E1, . . . , En. It should be emphasized that the general n-ary relations are not part of the description language. Hereafter, we refer to the set of relations E ∪ D as the knowledge base relations, to distinguish them from relations stored outside knowledge representation system 109. Our application domain is naturally conceptualized by the following two relations:
Quote(ag,al,src,dest,c,d), denotes that a travel agent ag quoted a price of c to travel from src to dest on airline al on date d.
Dir(cust,ac,telNo), gives the directory listing of customer cust as area code ac and phone number telNo.
A key aspect of our representation language is the ability to capture rich semantic structure using constraints, with which CLASSIC can reason efficiently. An atomic constraint is an atom either of the form D(x), where D is some concept in D, and x is a variable, or (xi θ xj) (or (xi θ a)), where xi and xj are variables, a is a constant, and θ ∈ {>, ≧, <, ≦, =, ≠}. Arbitrary constraints are formed from atomic constraints using the logical operators ∧ and ∨. CLASSIC can determine efficiently whether one constraint entails another using subsumption reasoning in the description logic. Other well-known techniques are used for implication reasoning of order constraints. For details, see the Ullman reference cited above. Any atomic constraint about which implication/subsumption reasoning can be done efficiently may be used. Constraints play a major role in information gathering and are used in several ways. First, semantic knowledge about the general n-ary relations E can be expressed by constraints over the arguments of the relations. In our example, we can specify that the first argument of the relation Quote must be an instance of the concept TravelAgent. Second, as we discuss in subsequent sections, constraints can be used to specify subsets of information that exist at external sites. For example, a travel agent may have only flights whose cost is less than $1000. Finally, as we see below, constraints are extremely useful in specifying complex queries.
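Implication reasoning for order constraints on a single variable can be illustrated with intervals. The following Python sketch is a deliberately narrow assumption: it covers only conjunctions of non-strict bounds on one numeric variable, and the names Bound, entails, and satisfiable_together are hypothetical. It does not reproduce the techniques of the Ullman reference or the subsumption reasoning of CLASSIC.

```python
from dataclasses import dataclass
import math

@dataclass(frozen=True)
class Bound:
    """Closed interval lo <= x <= hi standing in for a conjunction of order constraints on x."""
    lo: float = -math.inf
    hi: float = math.inf

def entails(c1, c2):
    """c1 entails c2 when every value allowed by c1 is allowed by c2 (non-empty intervals assumed)."""
    return c2.lo <= c1.lo and c1.hi <= c2.hi

def satisfiable_together(c1, c2):
    """The conjunction of c1 and c2 is satisfiable when the two intervals overlap."""
    return max(c1.lo, c2.lo) <= min(c1.hi, c2.hi)

query_cost = Bound(hi=500)     # cost of at most $500, as asked for by a query
site_cost = Bound(hi=1000)     # cost of at most $1000, as guaranteed by a site
luxury_only = Bound(lo=2000)   # hypothetical site holding only fares of $2000 or more
```

In this sketch the query constraint entails the site constraint, so the site may hold relevant tuples, whereas the hypothetical luxury site can be ruled out because its constraint is unsatisfiable together with the query constraint.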
Constraints may be used together with concepts and knowledge base relations to describe properties of extensions of the knowledge base relations, that is, information specified by the knowledge base relations and the properties. The information in the extension may come from the knowledge base, but most often it will come from one or more of the information sources 123. We assume that the definitions of the concepts exist in the knowledge base, although the extensions of the concepts and the relations may not be entirely present in the knowledge base. However, we assume that constraints contain only concepts whose extensions exist in the knowledge base.
Given a query (defined formally below), the knowledge base system must infer the missing portions of the extensions of relations needed to answer the query, using the information present at the external sites. For the purpose of our discussion, the knowledge base can also be viewed as an information source containing part of the extensions.
The query language that we use in our discussion combines the use of concepts from description logics and Horn rules, as described in Francesco Donini, M. Lenzerini, D. Nardi, and A. Schaerf, "A Hybrid System Integrating Datalog and Concept Languages", in Working notes of the AAAI Fall Symposium on Principles of Hybrid Reasoning, 1991. A query is essentially a relation defined in terms of a set of Horn rules, using the relations E, intermediate relations I, and constraints. Each rule is of the form
L1(x1) ∧ . . . ∧ Ln(xn) → L(x).
where each Li is either a constraint or an atom of a relation in E ∪ I, and L is a relation in I. Note that the predicates in I (and therefore, the query predicate) may be recursively defined. For example, we might be interested in the following query that retrieves phone numbers of travel agents in New York City who sell tickets from NYC to Paris for under $500, on Air France:
Quote(name,AirFrance,NYC,Paris,c,d) ∧ Dir(name,ac,telNo) ∧ (c≦$500) ∧ (ac=212) → Answer(telNo).
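To make the meaning of this rule concrete, it can be evaluated over small in-memory extensions of Quote and Dir. The Python sketch below is an illustration only: the tuples, agent names, dates, and telephone numbers are invented, and the function answer is a hypothetical stand-in for query evaluation in the knowledge representation system.

```python
# Hypothetical extensions of the knowledge base relations.
QUOTE = [
    # (agent, airline, src, dest, cost, date)
    ("Agent1", "AirFrance", "NYC", "Paris", 450, "1994-03-01"),
    ("Agent1", "AirFrance", "NYC", "Paris", 700, "1994-03-02"),
    ("Agent0", "United", "NYC", "Paris", 400, "1994-03-01"),
    ("Agent2", "AirFrance", "NYC", "Paris", 480, "1994-03-05"),
]
DIR = [
    # (customer, area_code, tel_no)
    ("Agent1", 212, "555-0101"),
    ("Agent0", 212, "555-0100"),
    ("Agent2", 800, "555-0102"),
]

def answer(quote, directory):
    """Join Quote and Dir on the agent name and apply the constraints of the rule."""
    return sorted({
        tel
        for (ag, al, src, dest, c, d) in quote
        for (cust, ac, tel) in directory
        if ag == cust
        and al == "AirFrance" and src == "NYC" and dest == "Paris"
        and c <= 500       # the cost constraint of the rule
        and ac == 212      # the area code constraint of the rule
    })
```

With this data only one telephone number qualifies: the second quote is too expensive, the third is on another airline, and the fourth agent is listed under a different area code.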
The Site Description Language SL0
We now describe the site description language in detail. Since the number of information sources 123 is likely to be very large and the cost of accessing them will be significant, it is important that the site description language permit knowledge representation system 109 to find a minimal set of relevant sites (or portions of sites).
In our discussion we assume that the external sites contain extensions of relations, denoted by R=R1, . . . , Rm. (Recall that concepts are unary relations and therefore, a site can contain the instances of a concept.) Note that a site need not explicitly store a certain relation, but only be able to compute it effectively when queried for it. Formally, SL0 is the language of Horn rules containing relation names from R, E and D, and constraints. The site relations R appear only in the antecedents of the rules. Intuitively, a set of rules in SL0 enables us to compute extensions of the knowledge base relations in E and D, given the relations in R. Given a set of rules, we add the formulas required by the predicate completion of the predicates in E (but not R!). See K. Clark, "Negation as Failure" in: Logic and Data Bases, Plenum Press, ed. J. Minker and H. Gallaire, New York, 1978, pp. 293-322. Intuitively, this means that extensions of these relations can be computed only using the given rules. Note that if we want to model incomplete information about some of the relations in E, then we can remove the predicate completion axioms for those relations. In our example, we can describe the following two travel agents:
Agent0DB(src,dest,cost,date)→Quote(Agent0,United,src,dest,cost,date).
Agent1DB(al,dest,cost,date)→Quote(Agent1,al,NYC,dest,cost,date).
Agent1DB(al,dest,cost,date)→(cost≦$1000).
Agent0 is an agent that sells tickets only on United Airlines. Agent1 specializes in cheap deals out of New York City. Note that the third rule specifies a constraint on the information in Agent1DB, and is allowed in SL0. We also have three sites containing directory information:
212Residential(name,telNo)→Dir(name,212,telNo).
212Residential(name,telNo)→Residential(name).
212Business(name,telNo)→Dir(name,212,telNo).
212Business(name,telNo)→Business(name).
800Dir(name,telNo)→Dir(name,800,telNo).
The site 212Residential (212Business) contains the residential (business) customers in the 212 area code. The site 800Dir contains all the toll-free numbers.
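The way such rules compute extensions of the knowledge base relations from the site relations can be sketched in Python. Everything in the sketch is an assumption made for illustration: the site contents are invented, each rule is encoded as a function from a site tuple to a knowledge base tuple, and the names SITES, RULES, and compute_extensions are hypothetical. The constraint rule on Agent1DB derives no tuples and is therefore only checked, not applied.

```python
# Hypothetical contents of the external site relations.
SITES = {
    "Agent0DB": [("NYC", "Paris", 650, "d1")],            # (src, dest, cost, date)
    "Agent1DB": [("AirFrance", "Paris", 480, "d2")],      # (al, dest, cost, date)
    "212Residential": [("J. Smith", "555-0001")],         # (name, telNo)
    "212Business": [("Agent1", "555-0101")],
    "800Dir": [("Agent0", "555-0100")],
}

# Each entry: (site relation, knowledge base relation, function building the consequent tuple).
RULES = [
    ("Agent0DB", "Quote", lambda s, d, c, dt: ("Agent0", "United", s, d, c, dt)),
    ("Agent1DB", "Quote", lambda al, d, c, dt: ("Agent1", al, "NYC", d, c, dt)),
    ("212Residential", "Dir", lambda n, t: (n, 212, t)),
    ("212Residential", "Residential", lambda n, t: (n,)),
    ("212Business", "Dir", lambda n, t: (n, 212, t)),
    ("212Business", "Business", lambda n, t: (n,)),
    ("800Dir", "Dir", lambda n, t: (n, 800, t)),
]

def compute_extensions(sites, rules):
    """Apply every rule to every tuple of its site relation and collect the derived tuples."""
    ext = {}
    for site, kb_relation, head in rules:
        for row in sites.get(site, []):
            ext.setdefault(kb_relation, set()).add(head(*row))
    return ext

EXT = compute_extensions(SITES, RULES)
# The constraint rule on Agent1DB: every cost stored there is at most $1000.
AGENT1_CONSTRAINT_HOLDS = all(cost <= 1000 for (_, _, cost, _) in SITES["Agent1DB"])
```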
Answering a specific query first requires that knowledge base system 109 determine which sources 123 contain information that can be used to answer the query. Having found the set of relevant sources 123, the next problem is to devise an optimal information access description. After access plan generation and execution component 119 has generated an access plan from the access description and executed the access plan to retrieve the information, knowledge representation system 109 uses the information to compute extensions to knowledge base relations and then uses the knowledge base as extended to compute the answer to the query. The challenge in this procedure is to determine the minimal portions of the site relations that are relevant to answering the query. Clearly, for some site relations the minimal portion can be empty, indicating that the site relation does not contain any relevant information.
In our example, since we are looking for a flight on AirFrance, Agent0 will be deemed irrelevant, and therefore, Agent0DB will be ignored. Similarly, the 800 directory listing database will not be queried. Moreover, since the first argument of Quote must be an instance of the concept TravelAgent, and since TravelAgent is subsumed by Business, which is disjoint from Residential, only the 212 business directory listing will be considered for the query.
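The pruning in this example can be mimicked by comparing what each site is known to guarantee with what the query requires. The Python sketch below is a simplification and not the query-tree algorithm: the dictionaries QUOTE_SITES and DIR_SITES, the set DISJOINT, and the functions relevant and relevant_sites are hypothetical, and order constraints such as the cost bound are left out.

```python
# Hypothetical summaries of the attribute values each site fixes for the tuples it yields.
QUOTE_SITES = {
    "Agent0DB": {"airline": "United"},
    "Agent1DB": {"src": "NYC"},
}
DIR_SITES = {
    "212Residential": {"area_code": 212, "concept": "Residential"},
    "212Business": {"area_code": 212, "concept": "Business"},
    "800Dir": {"area_code": 800},
}
DISJOINT = {frozenset({"Business", "Residential"})}

def relevant(site_facts, query_facts):
    """A site is irrelevant when something it fixes contradicts what the query requires."""
    for key, wanted in query_facts.items():
        if key not in site_facts:
            continue                       # the site says nothing about this attribute
        have = site_facts[key]
        if key == "concept":
            if frozenset({have, wanted}) in DISJOINT:
                return False               # disjoint concepts cannot share instances
        elif have != wanted:
            return False                   # conflicting constants
    return True

def relevant_sites(sites, query_facts):
    return sorted(name for name, facts in sites.items() if relevant(facts, query_facts))

QUOTE_QUERY = {"airline": "AirFrance", "src": "NYC", "dest": "Paris"}
# TravelAgent is subsumed by Business, so the name argument of Dir must be a Business.
DIR_QUERY = {"area_code": 212, "concept": "Business"}
```

Under these assumptions the sketch retains Agent1DB and the 212 business directory and discards the other three sites, in agreement with the discussion above.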
Finding relevant sites is a crucial problem for system 101. Economical solutions to the problem are important not only for answering queries, but also for other operations. Examples include:
Processing updates on the knowledge base requires updating relevant site relations and hence, determining the relevant sites.
Efficiently monitoring queries over time requires determining precisely which external site relations should be monitored.
Maintaining consistency among site relations again requires that we determine which sites contain information relevant to a given consistency condition.
Finding the relevant sites is done by extending the algorithm described in Alon Y. Levy and Yehoshua Sagiv, "Constraints and Redundancy in Datalog", Proceedings of the Eleventh ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, San Diego, Calif., 1992. The key observation that enables us to use that algorithm is that the language for expressing constraints (concept descriptions and order constraints) satisfies the requirements of the query-tree algorithm outlined in that paper. Finding minimal portions of the sites is done in two steps. The first step determines which portions of the knowledge base relations are needed to solve the query, and the second step determines which portions of the site relations are needed to compute the relevant portions of the knowledge base relations. The algorithm uses the query-tree, which is a tool that, given a query which is expressed in terms of certain relations, will specify which portions of the mentioned relations are relevant to the query. The first step is done by building a query-tree for the user query, in terms of the knowledge base relations, and pushing the constraints from the query to the KB relations. The second step is done by building a query-tree for each relevant KB relation (which is defined in terms of the external sites), and pushing the constraints to the external site relations.
The Site Description Language SL1
Although the site description language SL0 is quite expressive, it has some limitations. In practice, information may reside redundantly in many sources 123. Therefore, an important challenge is to find a set of site portions such that each portion is minimal and such that the overall set does not contain redundancies.
A major limitation of SL0 is that we cannot express knowledge about the sites that enables finding such a non-redundant set. For instance, suppose that in our example, the query was to find some flight from NYC to Paris on United Airlines. There is no way in SL0 to express the fact that Agent0 contains all the information about flights on United. Had we been able to express this knowledge, we could have determined that Agent0 by itself would suffice for answering this query, and hence, that knowledge of Agent1 would have been redundant.
The reason we cannot express this knowledge is that in SL0, the site relations cannot appear in the consequents of the rules. In the extended language SL1, which we describe below, we add rules which are weaker than the predicate completion axioms and which enable us to compute the knowledge base relations from the site relations. The language SL1 contains the formulas in SL0 and restricted forms of Horn rules in which the site relations appear in the consequents. Formally, in addition to the rules of SL0, knowledge about the relation R is specified by a set of rules of the form:
P1(x) ∧ C1 → R(x).
. . .
Pk(x) ∧ Ck → R(x).
where Pi is a relation in E or D, and Ci is a constraint. Furthermore, (for reasons we explain below) we restrict the rules such that the constraints are pairwise mutually exclusive, i.e., for each pair, 1 ≤ i < j ≤ k, it must be the case that Ci ∧ Cj is unsatisfiable. These rules specify that R contains complete information about the facts of Pi that satisfy Ci. Given the rules above, we add the following rules that enable us to use R to compute the extensions of the knowledge base relations:
R(x) ∧ C1 → P1(x).
. . .
R(x) ∧ Ck → Pk(x).
Intuitively, this means that if we stated that R contains all the facts of Pi that satisfy Ci, we take that also to mean that any fact in R that satisfies Ci indeed belongs to Pi. Since the Ci's are pairwise mutually exclusive, we obtain intuitive results. Note that if the constraints were not pairwise exclusive, as in rules r2 and r3, we would infer that some facts are in the intersection of the knowledge base relations.
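As an illustration of the added rules, the following sketch checks pairwise mutual exclusivity and then uses the facts of a site relation R to compute the extensions of the knowledge base relations Pi. It assumes, for simplicity, that each constraint Ci restricts a single attribute to a finite set of values. The relation names `DomesticFlight` and `InternationalFlight`, the `region` attribute, and the sample tuples are hypothetical.

```python
from itertools import combinations


def pairwise_exclusive(rules):
    """rules: KB relation name -> (attribute, set of allowed values)."""
    for (attr_i, vals_i), (attr_j, vals_j) in combinations(rules.values(), 2):
        # Constraints on different attributes can hold together, and so can
        # constraints on the same attribute that share a value.
        if attr_i != attr_j or vals_i & vals_j:
            return False
    return True


def apply_completion_rules(site_tuples, rules):
    """Apply R(x) and Ci implies Pi(x) to every tuple of the site relation R."""
    if not pairwise_exclusive(rules):
        raise ValueError("constraints must be pairwise mutually exclusive")
    extensions = {kb_relation: [] for kb_relation in rules}
    for fact in site_tuples:
        for kb_relation, (attr, allowed) in rules.items():
            if fact[attr] in allowed:
                extensions[kb_relation].append(fact)
    return extensions


# Hypothetical site relation R and SL1 rules.
site_tuples = [
    {"flight": "F1", "region": "US"},
    {"flight": "F2", "region": "EU"},
]
rules = {
    "DomesticFlight": ("region", {"US"}),
    "InternationalFlight": ("region", {"EU", "ASIA"}),
}
extensions = apply_completion_rules(site_tuples, rules)
```

Because the constraints are disjoint, each fact of R lands in at most one knowledge base relation, which is the intuitive result the restriction is meant to guarantee.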
The language SL1 has several important properties. First, it enables us to express knowledge stating that a certain site contains complete information of a certain type. Second, given arbitrary formulas in SL1 and a set of site relations, it is possible to uniquely determine the extensions of the knowledge base relations. Finally, given a query, we show that an agent can find a non-redundant site set for computing the answer to the query. That is, the agent can find a set of minimal portions of sites such that there are no redundancies among these portions. This notion is made formal as follows:
A site set is a set {(R1,C1), . . . , (Rn,Cn)}, denoting the set of facts of the site relations Ri satisfying the constraints Ci.
In our example, if we have a server with the listing of all travel agents in the greater NYC area (area codes 212 and 718), we can specify its contents as follows:
Dir(name,ac,telNo) ∧ ((ac=212) ∨ (ac=718)) ∧ TravelAgent(name) → NYTravel(name,ac,telNo).
Given our query that restricts travel agents to the 212 area code, we can use this knowledge to determine that only the NYTravel directory listing is needed.
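The reasoning in this example can be sketched as follows, under the simplifying assumption that every constraint is a set of permitted area codes. A site with completeness knowledge suffices by itself when the query's constraint implies the site's constraint, which for sets is a subset test. The function name `choose_site_set` and the site `NationalDir` are hypothetical; the result uses the (relation, constraint) pairs of the site set definition above.

```python
def choose_site_set(query_codes, complete_sites, partial_sites):
    """Return a site set as a list of (site relation, constraint) pairs."""
    # A complete site whose constraint covers the whole query is enough.
    for site, codes in complete_sites.items():
        if query_codes <= codes:
            return [(site, set(query_codes))]
    # Fallback (may contain redundancy): every site overlapping the query.
    site_set = []
    for site, codes in {**complete_sites, **partial_sites}.items():
        overlap = query_codes & codes
        if overlap:
            site_set.append((site, overlap))
    return site_set


# NYTravel is complete for travel agents in area codes 212 and 718.
complete_sites = {"NYTravel": {212, 718}}
# Hypothetical site without completeness knowledge.
partial_sites = {"NationalDir": {212, 415, 718}}

only_212 = choose_site_set({212}, complete_sites, partial_sites)
mixed = choose_site_set({212, 415}, complete_sites, partial_sites)
```

For the 212-only query the sketch returns the NYTravel portion alone. For a query that also asks for area code 415, no complete site covers it, so both sites are returned with overlapping portions.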
Allowing Specialized Sites
In the SL1 rules we described above, the site relations contain the same attributes as the knowledge base relations. Often, a site relation will be specialized with respect to a knowledge base relation and therefore will contain fewer attributes. For example, Agent0DB does not contain the name of the agent as an attribute, nor the name of the airline. To express completeness knowledge about such sites, we allow for rules of the following form with a few restrictions:
E(x,y) ∧ C → R(x).
In this rule, the site relation R projects out the arguments y of E. We impose restrictions on the rule that guarantee that the variables y can be determined uniquely, given x. One such restriction is to require that all elements of y be constants. Another is to allow the rule to specify linear functions of x that uniquely determine y. Adding the restricted completion axioms can be done as before.
As an example of using such rules, we can specify that Agent0 has all the flights of United Airlines:
Quote(Agent0,United,src,dest,cost,date)→Agent0DB(src,dest,cost,date).
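The rule above projects out two arguments of Quote that are constants, so full Quote facts can be recovered from Agent0DB facts by filling those constants back in. The sketch below shows that reconstruction for the constant case only; linear functions of x are not handled. The attribute names `agent` and `airline`, the function `expand_site_tuples`, and the sample tuple are assumptions made for illustration.

```python
def expand_site_tuples(site_tuples, kb_attributes, site_attributes, constants):
    """Rebuild knowledge base tuples from tuples of a specialized site relation."""
    missing = [a for a in kb_attributes if a not in site_attributes]
    undetermined = [a for a in missing if a not in constants]
    if undetermined:
        # The projected-out arguments must be uniquely determined.
        raise ValueError(f"cannot determine attributes: {undetermined}")
    expanded = []
    for values in site_tuples:
        row = dict(zip(site_attributes, values))
        row.update(constants)           # fill in the projected-out constants
        expanded.append(tuple(row[a] for a in kb_attributes))
    return expanded


quote_attributes = ("agent", "airline", "src", "dest", "cost", "date")
agent0db_attributes = ("src", "dest", "cost", "date")
constants = {"agent": "Agent0", "airline": "United"}

# Hypothetical Agent0DB tuple.
agent0db = [("NYC", "Paris", 450, "1994-03-01")]
quotes = expand_site_tuples(agent0db, quote_attributes, agent0db_attributes, constants)
```

If a projected-out attribute has no constant supplied, the sketch refuses to expand, mirroring the restriction that y must be uniquely determined given x.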
The importance of using a site description language in information source descriptions 113 can be seen by a comparison of system 101 with the SIMS system. Both systems must first find the relevant sites and then access them. In SIMS, both of these tasks are approached as planning problems. In system 101, at least the first task can be done by the more economical techniques of logical inference. Moreover, SIMS does not allow arbitrary n-ary relations, and therefore, the system must map each external relation to a concept in LOOM. This can be done only when the relation has a primary key (i.e., an attribute that uniquely determines the rest of the attributes of the tuple). Although one can always conceptually add another such attribute to a relation, modeling a relation in such a way is unnatural.
Conclusion
The foregoing Detailed Description has disclosed to those of skill in the art how to make and use an information retrieval system which permits conceptual queries of heterogeneous information sources which are connected by means of a communications network. Important aspects of the system disclosed herein include the incorporation of a system/network view in the knowledge base and the use of a more expressive site description language in the information source descriptions component of the knowledge base. These aspects of the system permit the system to be used to access any source of information which satisfies two properties:
the source is connected to a communications network to which the system has access; and
the source has a protocol for obtaining information which is known to the system.
Put another way, there is no requirement that a source have any kind of special adapter in order to be accessed by the system of the invention. The foregoing aspects of the system further permit the use of inference techniques instead of the more expensive planning techniques to determine which sources are to be accessed.
As is apparent from the Detailed Description, while the system of the invention is advantageously implemented using the CLASSIC knowledge base system, the principles of the invention are by no means restricted to that system, but may be implemented in other knowledge base systems as well. Further, other site description languages may be developed which incorporate the principles of the site description languages disclosed herein.
All of the above being the case, the foregoing Detailed Description is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description, but rather from the claims as interpreted according to the full breadth permitted by the patent laws.

Claims (19)

What is claimed is:
1. Information retrieval apparatus for retrieving information from a plurality of information sources, each information source being accessible by at least one of a plurality of information access protocols, the apparatus comprising:
a knowledge base responsive to a conceptual query on a knowledge representation of a domain of information, the knowledge representation including at least
a world view including a first set of concepts employed in the conceptual query and
a system view including a second set of concepts employed in accessing the plurality of information sources,
the knowledge base responding to the query by using the world view and the system view to produce an information access description describing how to access information required for the query in the plurality of information sources; and
means responsive to the information access description for employing the protocols to obtain information required to respond to the query from at least one information source in the plurality thereof and providing the obtained information to the knowledge base.
2. The apparatus set forth in claim 1 wherein:
the knowledge base further includes
information source descriptions including a third set of concepts which describe the information sources; and
the knowledge base further responds to the query by additionally using the information source descriptions to produce the information access description.
3. The apparatus set forth in claim 2 wherein:
the information source descriptions describe the information sources in terms of concepts from the first set thereof and the second set thereof.
4. The apparatus set forth in claim 3 wherein:
the information source descriptions further include descriptions of n-ary relations.
5. The apparatus set forth in claim 4 wherein:
the information source descriptions further include a constraint.
6. The apparatus set forth in claim 3 wherein:
the information source descriptions further include a constraint.
7. The apparatus set forth in claim 6 wherein:
the constraint is a subsumption constraint, wherein a variable is constrained to be subsumed in a concept.
8. The apparatus set forth in claim 7 wherein:
there is a plurality of the constraints and the plurality of constraints includes an order constraint.
9. The apparatus set forth in claim 8 wherein:
the knowledge base uses the information descriptions to produce the information access description such that the information access description describes a minimal relevant set of the information sources required for the query.
10. The apparatus set forth in claim 8 wherein:
the constraints are pairwise mutually exclusive; and
the knowledge base uses the information descriptions to produce the information access description such that the information access description describes a minimal relevant set of the information sources required for the query which includes no redundancy.
11. The apparatus set forth in claim 2 wherein:
the means responsive to the information access description includes means responsive to the obtained information and to concepts in the knowledge base for replanning how further information is to be obtained.
12. The apparatus set forth in claim 2 further comprising:
a graphical user interface for representing the world view concepts as a directed graph.
13. The apparatus set forth in claim 12 wherein:
the graphical user interface includes interactive means for altering the world view by editing the directed graph.
14. Information retrieval apparatus for retrieving information from a plurality of information sources, each information source being accessible by at least one of a plurality of information access protocols, the apparatus comprising
a knowledge base responsive to a conceptual query on a knowledge representation of a domain of information, the knowledge representation including at least
a world view including a first set of concepts employed in the conceptual query and
information source descriptions including a second set of concepts which describe the information sources in terms of concepts belonging to the world view,
the knowledge base responding to the query by using the world view and the information source descriptions to produce an information access description describing how to access information required for the query in the plurality of information sources; and
means responsive to the information access description for employing the protocols to obtain information required to respond to the query from at least one information source in the plurality thereof and providing the obtained information to the knowledge base,
the apparatus having the improvement comprising:
descriptions of relations in the information source descriptions.
15. The apparatus set forth in claim 14 wherein:
the information source descriptions further includes a constraint.
16. The apparatus set forth in claim 15 wherein:
the constraint is a subsumption constraint, wherein a variable is constrained to be subsumed in a concept.
17. The apparatus set forth in claim 16 wherein:
there is a plurality of the constraints and the plurality of constraints includes an order constraint.
18. The apparatus set forth in claim 17 wherein:
the knowledge base uses the information descriptions to produce the information access description such that the information access description describes a minimal relevant set of the information sources required for the query.
19. The apparatus set forth in claim 17 wherein:
the constraints are pairwise mutually exclusive; and
the knowledge base uses the information descriptions to produce the information access description such that the information access description describes a minimal relevant set of the information sources required for the query, the minimal relevant set including no redundancy.
US08/203,082 1994-02-28 1994-02-28 Apparatus and methods for retrieving information Expired - Lifetime US5655116A (en)

Priority Applications (7)

Application Number Priority Date Filing Date Title
US08/203,082 US5655116A (en) 1994-02-28 1994-02-28 Apparatus and methods for retrieving information
US08/347,016 US5600831A (en) 1994-02-28 1994-11-30 Apparatus and methods for retrieving information by modifying query plan based on description of information sources
EP95911924A EP0696366A4 (en) 1994-02-28 1995-02-27 Apparatus and method for retrieving information
JP7522489A JPH09503088A (en) 1994-02-28 1995-02-27 Device and method for retrieving information
US08/394,867 US5768578A (en) 1994-02-28 1995-02-27 User interface for information retrieval system
CA002161233A CA2161233A1 (en) 1994-02-28 1995-02-27 Apparatus and method for retrieving information
PCT/US1995/002338 WO1995023371A1 (en) 1994-02-28 1995-02-27 Apparatus and method for retrieving information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US08/203,082 US5655116A (en) 1994-02-28 1994-02-28 Apparatus and methods for retrieving information

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US08/347,016 Continuation-In-Part US5600831A (en) 1994-02-28 1994-11-30 Apparatus and methods for retrieving information by modifying query plan based on description of information sources

Publications (1)

Publication Number Publication Date
US5655116A true US5655116A (en) 1997-08-05

Family

ID=22752425

Family Applications (1)

Application Number Title Priority Date Filing Date
US08/203,082 Expired - Lifetime US5655116A (en) 1994-02-28 1994-02-28 Apparatus and methods for retrieving information

Country Status (1)

Country Link
US (1) US5655116A (en)





US6009422A (en) * 1997-11-26 1999-12-28 International Business Machines Corporation System and method for query translation/semantic translation using generalized query language
US8752010B1 (en) 1997-12-31 2014-06-10 Honeywell International Inc. Dynamic interface synthesizer
US6594653B2 (en) 1998-03-27 2003-07-15 International Business Machines Corporation Server integrated system and methods for processing precomputed views
US6493699B2 (en) 1998-03-27 2002-12-10 International Business Machines Corporation Defining and characterizing an analysis space for precomputed views
US6480836B1 (en) 1998-03-27 2002-11-12 International Business Machines Corporation System and method for determining and generating candidate views for a database
WO1999050762A1 (en) * 1998-03-27 1999-10-07 Informix Software, Inc. Processing precomputed views
US6601058B2 (en) 1998-10-05 2003-07-29 Michael Forster Data exploration system and method
US6327587B1 (en) 1998-10-05 2001-12-04 Digital Archaeology, Inc. Caching optimization with disk and/or memory cache management
US6665681B1 (en) * 1999-04-09 2003-12-16 Entrieva, Inc. System and method for generating a taxonomy from a plurality of documents
US20040148155A1 (en) * 1999-04-09 2004-07-29 Entrevia, Inc., A Delaware Corporation System and method for generating a taxonomy from a plurality of documents
US7113954B2 (en) 1999-04-09 2006-09-26 Entrieva, Inc. System and method for generating a taxonomy from a plurality of documents
US7401087B2 (en) 1999-06-15 2008-07-15 Consona Crm, Inc. System and method for implementing a knowledge management system
US6711585B1 (en) 1999-06-15 2004-03-23 Kanisa Inc. System and method for implementing a knowledge management system
US20070033221A1 (en) * 1999-06-15 2007-02-08 Knova Software Inc. System and method for implementing a knowledge management system
US7013300B1 (en) 1999-08-03 2006-03-14 Taylor David C Locating, filtering, matching macro-context from indexed database for searching context where micro-context relevant to textual input by user
US20070255735A1 (en) * 1999-08-03 2007-11-01 Taylor David C User-context-based search engine
US7881981B2 (en) 1999-08-03 2011-02-01 Yoogli, Inc. Methods and computer readable media for determining a macro-context based on a micro-context of a user search
WO2001052146A1 (en) * 2000-01-12 2001-07-19 Pricewaterhousecoopers Llp Multi-term frequency analysis
US20020082778A1 (en) * 2000-01-12 2002-06-27 Barnett Phillip W. Multi-term frequency analysis
US7849117B2 (en) 2000-01-12 2010-12-07 Knowledge Sphere, Inc. Multi-term frequency analysis
US7539656B2 (en) 2000-03-06 2009-05-26 Consona Crm Inc. System and method for providing an intelligent multi-step dialog with a user
US7337158B2 (en) 2000-03-06 2008-02-26 Consona Crm Inc. System and method for providing an intelligent multi-step dialog with a user
US20050055321A1 (en) * 2000-03-06 2005-03-10 Kanisa Inc. System and method for providing an intelligent multi-step dialog with a user
US20060143175A1 (en) * 2000-05-25 2006-06-29 Kanisa Inc. System and method for automatically classifying text
US20020022956A1 (en) * 2000-05-25 2002-02-21 Igor Ukrainczyk System and method for automatically classifying text
US7028250B2 (en) 2000-05-25 2006-04-11 Kanisa, Inc. System and method for automatically classifying text
US7970678B2 (en) 2000-05-31 2011-06-28 Lapsley Philip D Biometric financial transaction system and method
US8630932B1 (en) 2000-05-31 2014-01-14 Open Invention Network, Llc Biometric financial transaction system and method
US9165323B1 (en) 2000-05-31 2015-10-20 Open Innovation Network, LLC Biometric transaction system and method
US8452680B1 (en) 2000-05-31 2013-05-28 Open Invention Network, Llc Biometric financial transaction system and method
US8630933B1 (en) 2000-05-31 2014-01-14 Open Invention Network, Llc Biometric financial transaction system and method
US6836776B2 (en) 2000-06-05 2004-12-28 International Business Machines Corporation System and method for managing hierarchical objects
US6823328B2 (en) 2000-06-05 2004-11-23 International Business Machines Corporation System and method for enabling unified access to multiple types of data
US20020133392A1 (en) * 2001-02-22 2002-09-19 Angel Mark A. Distributed customer relationship management systems and methods
US20030115191A1 (en) * 2001-12-17 2003-06-19 Max Copperman Efficient and cost-effective content provider for customer relationship management (CRM) or other applications
US7206778B2 (en) 2001-12-17 2007-04-17 Knova Software Inc. Text search ordered along one or more dimensions
US20030220917A1 (en) * 2002-04-03 2003-11-27 Max Copperman Contextual search
US7178153B1 (en) 2002-05-10 2007-02-13 Oracle International Corporation Method and mechanism for implementing an access interface infrastructure
US20060085375A1 (en) * 2004-10-14 2006-04-20 International Business Machines Corporation Method and system for access plan sampling
US8027876B2 (en) 2005-08-08 2011-09-27 Yoogli, Inc. Online advertising valuation apparatus and method
US8429167B2 (en) 2005-08-08 2013-04-23 Google Inc. User-context-based search engine
US8515811B2 (en) 2005-08-08 2013-08-20 Google Inc. Online advertising valuation apparatus and method
US9449105B1 (en) 2005-08-08 2016-09-20 Google Inc. User-context-based search engine
US20090182707A1 (en) * 2008-01-10 2009-07-16 Dbix Corporation Database changeset management system and method
US8161048B2 (en) 2009-04-24 2012-04-17 At&T Intellectual Property I, L.P. Database analysis using clusters
US20100274785A1 (en) * 2009-04-24 2010-10-28 At&T Intellectual Property I, L.P. Database Analysis Using Clusters
US8595194B2 (en) 2009-09-15 2013-11-26 At&T Intellectual Property I, L.P. Forward decay temporal data analysis
US20110066600A1 (en) * 2009-09-15 2011-03-17 At&T Intellectual Property I, L.P. Forward decay temporal data analysis
US9798771B2 (en) 2010-08-06 2017-10-24 At&T Intellectual Property I, L.P. Securing database content
US9965507B2 (en) 2010-08-06 2018-05-08 At&T Intellectual Property I, L.P. Securing database content
US10719510B2 (en) 2013-02-25 2020-07-21 EMC IP Holding Company LLC Tiering with pluggable storage system for parallel query engines
US11514046B2 (en) 2013-02-25 2022-11-29 EMC IP Holding Company LLC Tiering with pluggable storage system for parallel query engines
US11288267B2 (en) 2013-02-25 2022-03-29 EMC IP Holding Company LLC Pluggable storage system for distributed file systems
US9454548B1 (en) 2013-02-25 2016-09-27 Emc Corporation Pluggable storage system for distributed file systems
US9805053B1 (en) 2013-02-25 2017-10-31 EMC IP Holding Company LLC Pluggable storage system for parallel query engines
US9898475B1 (en) 2013-02-25 2018-02-20 EMC IP Holding Company LLC Tiering with pluggable storage system for parallel query engines
US9984083B1 (en) * 2013-02-25 2018-05-29 EMC IP Holding Company LLC Pluggable storage system for parallel query engines across non-native file systems
US10915528B2 (en) 2013-02-25 2021-02-09 EMC IP Holding Company LLC Pluggable storage system for parallel query engines
US10831709B2 (en) 2013-02-25 2020-11-10 EMC IP Holding Company LLC Pluggable storage system for parallel query engines across non-native file systems
US20140330858A1 (en) * 2013-05-06 2014-11-06 Aol Inc. Systems and methods for processing geographic data
US10204139B2 (en) * 2013-05-06 2019-02-12 Verizon Patent And Licensing Inc. Systems and methods for processing geographic data
US10628417B2 (en) * 2013-12-01 2020-04-21 Paraccel Llc Physical planning of database queries using partial solutions
US20150154256A1 (en) * 2013-12-01 2015-06-04 Paraccel Llc Physical Planning of Database Queries Using Partial Solutions
US20170004410A1 (en) * 2015-07-03 2017-01-05 Christopher William Paran Standardized process to quantify the value of research manuscripts

Similar Documents

Publication Publication Date Title
US5655116A (en) Apparatus and methods for retrieving information
US5600831A (en) Apparatus and methods for retrieving information by modifying query plan based on description of information sources
Kirk et al. The information manifold
Mitra et al. A graph-oriented model for articulation of ontology interdependencies
Karvounarakis et al. Semiring-annotated data: queries and provenance
Chang et al. Mind your vocabulary: Query mapping across heterogeneous information sources
Tahani A fuzzy model of document retrieval systems
Cross Fuzzy information retrieval
US20110218990A1 (en) Data storage, retrieval, manipulation and display tools enabling multiple hierarchical points of view
Fowler et al. Information retrieval using pathfinder networks
Beneventano Semantic search engines based on data integration systems
Basu et al. Using a relational database to support explanation in a knowledge-based system
Arens et al. Query processing in an information mediator
Sidebottom Hierarchical arc consistency applied to numeric processing in constraint logic programming
Ding et al. Ontology management: survey, requirements and directions
Gurský et al. User Preference Web Search--Experiments with a System Connecting Web and User
Kimbleton et al. XNDM: An Experimental Network Data Manager.
Wang et al. Intelligent browser for TEXPROS
Aloui et al. A new approach for flexible queries using fuzzy ontologies
Damiani et al. Terminological information management in ADKMS
Hurt Nonmonotonic logic for use in information retrieval: an exploratory paper
Kokkoras et al. COMFRESH: a common framework for expert systems and hypertext
Laenens et al. Advanced knowledge-base environments for large database systems
Meghini et al. Unifying the concept of collection in digital libraries
Kudagba Framework Architecture for Querying Distributed RDF Data

Legal Events

Date Code Title Description
AS Assignment

Owner name: AMERICAN TELEPHONE AND TELEGRAPH COMPANY, NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KIRK, THOMAS;LEVY, ALON YITZCHAK;SRIVASTAVA, DIVESH;REEL/FRAME:006942/0612

Effective date: 19940407

AS Assignment

Owner name: AT&T IPM CORP., FLORIDA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:AT&T CORP.;REEL/FRAME:007528/0038

Effective date: 19950523

Owner name: AT&T CORP., NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:AMERICAN TELEPHONE AND TELEGRAPH COMPANY;REEL/FRAME:007527/0274

Effective date: 19940420

AS Assignment

Owner name: LUCENT TECHNOLOGIES INC., NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:AT&T CORP.;REEL/FRAME:008196/0181

Effective date: 19960329

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: THE CHASE MANHATTAN BANK, AS COLLATERAL AGENT, TEX

Free format text: CONDITIONAL ASSIGNMENT OF AND SECURITY INTEREST IN PATENT RIGHTS;ASSIGNOR:LUCENT TECHNOLOGIES INC. (DE CORPORATION);REEL/FRAME:011722/0048

Effective date: 20010222

FPAY Fee payment

Year of fee payment: 8

AS Assignment

Owner name: LUCENT TECHNOLOGIES INC., NEW JERSEY

Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENT RIGHTS;ASSIGNOR:JPMORGAN CHASE BANK, N.A. (FORMERLY KNOWN AS THE CHASE MANHATTAN BANK), AS ADMINISTRATIVE AGENT;REEL/FRAME:018584/0446

Effective date: 20061130

FPAY Fee payment

Year of fee payment: 12

AS Assignment

Owner name: CREDIT SUISSE AG, NEW YORK

Free format text: SECURITY INTEREST;ASSIGNOR:ALCATEL-LUCENT USA INC.;REEL/FRAME:030510/0627

Effective date: 20130130

AS Assignment

Owner name: SOUND VIEW INNOVATIONS, LLC, NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ALCATEL LUCENT;REEL/FRAME:032086/0016

Effective date: 20131223

AS Assignment

Owner name: ALCATEL-LUCENT USA INC., NEW JERSEY

Free format text: RELEASE OF SECURITY INTEREST;ASSIGNOR:CREDIT SUISSE AG;REEL/FRAME:032537/0133

Effective date: 20131223

AS Assignment

Owner name: ALCATEL-LUCENT USA INC., NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG;REEL/FRAME:033950/0261

Effective date: 20140819

AS Assignment

Owner name: ALCATEL-LUCENT USA INC., NEW JERSEY

Free format text: MERGER AND CHANGE OF NAME;ASSIGNORS:ALCATEL USA MARKETING, INC.;ALCATEL USA SOURCING, INC.;LUCENT TECHNOLOGIES, INC.;AND OTHERS;REEL/FRAME:050460/0510

Effective date: 20081017

AS Assignment

Owner name: NOKIA OF AMERICA CORPORATION, DELAWARE

Free format text: CHANGE OF NAME;ASSIGNOR:ALCATEL-LUCENT USA INC.;REEL/FRAME:050476/0085

Effective date: 20180103

AS Assignment

Owner name: ALCATEL LUCENT, FRANCE

Free format text: NUNC PRO TUNC ASSIGNMENT;ASSIGNOR:NOKIA OF AMERICA CORPORATION;REEL/FRAME:050662/0204

Effective date: 20190927