US5745746A - Method for localizing execution of subqueries and determining collocation of execution of subqueries in a parallel database - Google Patents

Info

Publication number
US5745746A
Authority
US
United States
Prior art keywords
subquery
query
determining
partitioning
outer query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
US08/672,013
Inventor
Anant D. Jhingran
Lubor J. Kollar
Timothy R. Malkemus
Sriram K. Padmanabhan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US08/672,013
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: JHINGRAN, ANANT D., MALKEMUS, TIMOTHY R., KOLLAR, LUBOR J., PADMANABHAN, SRIRAM K.
Application granted
Publication of US5745746A
Priority to US09/585,927 (USRE37965E1)
Legal status: Ceased

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2453: Query optimisation
    • G06F16/24532: Query optimisation of parallel queries
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2453: Query optimisation
    • G06F16/24534: Query rewriting; Transformation
    • G06F16/24535: Query rewriting; Transformation of sub-queries or views
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00: Data processing: database and file management or data structures
    • Y10S707/99931: Database or file accessing
    • Y10S707/99932: Access augmentation or optimizing

Abstract

A method for localizing execution of subqueries and determining collocation of execution of subqueries in a shared-nothing database. The concept of compatible partitioning is used to localize database operations in order to eliminate excess processes and communication, and thereby improve response time and throughput for the database management system. The method reduces the number of processes by reducing the number of nodes involved in processing a query and by combining multiple processes.

Description

FIELD OF THE INVENTION
This invention relates to parallel-processor database systems and more particularly to a method for localizing execution and determining collocation of execution of subqueries in a parallel database.
DESCRIPTION OF THE RELATED ART
A typical parallel processor computer system has a number of resources such as processors, memory buffers and the like. These resources can operate simultaneously, thereby greatly improving the performance of the computer when executing a task which has a number of sub-tasks that can be executed independently of each other.
Executing a task usually involves executing a number of sub-tasks, each of which in turn may have several parts. In a computer having only one processor, each step in executing each part of the sub-task is performed sequentially. In a parallel processor computer, several such operations can be performed simultaneously, but typically the parallel computer system does not have enough resources to go around. Resolving conflicting demands by the various sub-tasks for access to such resources has been a problem in the design of parallel processor computer systems, especially in the context of using such computer systems to evaluate complicated queries of a database.
Various kinds of parallel-processor database computer architectures have been proposed. Most of the proposed architectures for parallel-processor computers use a "shared-nothing" approach. A shared-nothing architecture comprises a collection of independent processors each having its own memory and disk and connected to the other processors via a high-speed communication network. In a shared-nothing database architecture, communication and synchronization overhead are critical factors in overall query performance. Shared-nothing systems are particularly well-suited to evaluate queries that can be partitioned into independent sub-problems, each of which is executed in parallel with the others.
There is a continuing need for a way to optimize query execution in a shared-nothing computer so as to make the most effective use of the various resources of the computer.
In shared-nothing database systems, the concept of "compatible partitioning" to localize database operations is a known technique to minimize inter-processor communication. For example, by partitioning tables t1 and t2 on t1.a and t2.a respectively, all communication can be avoided in computation of the JOIN "t1.a=t2.a". This result follows since a partition of t1 will only join with a partition of t2 on the same node.
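The join case described above can be illustrated with a minimal sketch. It assumes a modulo-based placement function and a four-node cluster, both hypothetical stand-ins for whatever partitioning function a real system uses; the names `node_for` and `partition` are illustrative only.

```python
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size


def node_for(value, num_nodes=NUM_NODES):
    # Stand-in partitioning function (assumption); real systems use their own.
    return value % num_nodes


def partition(rows, key, num_nodes=NUM_NODES):
    """Distribute rows (dicts) across nodes by the value of column `key`."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key], num_nodes)].append(row)
    return nodes


t1 = [{"a": i, "x": f"t1-{i}"} for i in range(10)]
t2 = [{"a": i, "y": f"t2-{i}"} for i in range(5, 15)]

t1_nodes = partition(t1, "a")
t2_nodes = partition(t2, "a")

# Join each node's partitions independently; no row crosses a node boundary.
local_join = sorted(
    (r1["a"], r1["x"], r2["y"])
    for n in range(NUM_NODES)
    for r1 in t1_nodes[n]
    for r2 in t2_nodes[n]
    if r1["a"] == r2["a"]
)

# Reference result: the same join computed over the unpartitioned tables.
global_join = sorted(
    (r1["a"], r1["x"], r2["y"]) for r1 in t1 for r2 in t2 if r1["a"] == r2["a"]
)
```

Because both tables are placed by the same function applied to the join column, the node-by-node result matches the reference join without any inter-node traffic.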
There still remains a need for an efficient way to optimize subqueries in a multi-processor or parallel computer system, and particularly in a "shared-nothing" computer system.
SUMMARY OF THE INVENTION
The present invention provides a method for localizing execution and determining collocation of execution of subqueries in a parallel database. The method according to the present invention is suitable for both subqueries that involve correlation and subqueries that do not.
The method according to the present invention reduces the system resources needed for processing a query by reducing the number of processes used when a partitioning key of any table involved in the query is specified by an equality to a constant, host-variable, IN-list, or any internal run-time computation. The method reduces the number of processes: (1) by reducing the number of nodes involved in the query; or (2) by combining multiple processes into one.
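The first reduction, involving fewer nodes when the partitioning key is fixed by an equality or an IN-list, might be sketched as follows. The modulo placement function and the name `nodes_for_predicate` are assumptions made for illustration and are not taken from the patent.

```python
NUM_NODES = 4  # hypothetical cluster size


def node_for(value, num_nodes=NUM_NODES):
    # Stand-in partitioning function (assumption).
    return value % num_nodes


def nodes_for_predicate(key_values, num_nodes=NUM_NODES):
    """Return the set of nodes that can hold qualifying rows.

    key_values: the values the partitioning key is equated to (a single
    constant or host variable is a one-element list, an IN-list has several),
    or None when the predicate does not constrain the partitioning key.
    """
    if key_values is None:
        return set(range(num_nodes))  # every node must take part
    return {node_for(v, num_nodes) for v in key_values}
```

For example, a predicate such as `a = 7` would need only node 3 under these assumptions, while an unconstrained query would still involve all four nodes.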
The method according to the present invention also uses the concept of "compatible partitioning" in shared-nothing database systems to eliminate excess processing and communication for subqueries thereby improving response time and throughput.
In a first aspect, the present invention provides a method for determining locality for execution of subqueries for queries in a relational database management system, wherein said queries comprise an outer query and a subquery having a query-subquery operator and wherein partitioning columns for the query and subquery are provided, said method comprising the steps of: (a) determining if said outer query and said subquery are compatibly partitioned; (b) if said outer query and said subquery are compatibly partitioned then for each pair of partitioning columns in said outer query and said subquery determining an equivalence class for each of said columns in said pair; (c) determining if the partitioning column for said subquery belongs to the same equivalence class as the partitioning column for said outer query; (d) determining if said query-subquery operator comprises a selected operator; and (e) if said steps (c) and (d) are true, then determining locality for said subquery so that said subquery is executable locally with respect to said outer query by the relational database management system.
In a second aspect, the present invention provides a relational database management system for use with a computer system wherein queries are entered for retrieving data from tables and wherein partitioning columns and partitioning keys are provided, said system comprising: means for processing nested queries comprising an outer query and a subquery; means for determining locality of execution of said subquery including, (a) means for determining if said outer query and said subquery are compatibly partitioned; (b) means for determining an equivalence class for each column forming a corresponding pair of partitioning columns for said outer query and said subquery; (c) means for ascertaining if the partitioning column for said subquery belongs to the same equivalence class as the partitioning column for said outer query; (d) means for determining if said query-subquery operator comprises a selected operator; and (e) means responsive to said means for ascertaining and said means for determining said selected operator for determining locality of said subquery so that said subquery is locally executable with respect to said outer query by the relational database management system.
In a third aspect, the present invention provides a computer program product for use on a computer wherein queries are entered for retrieving data from tables, wherein said queries comprise an outer query and a subquery having a query-subquery operator and wherein partitioning columns for the query and subquery are provided, said computer program product comprising: a recording medium; means recorded on said medium for instructing said computer to perform the steps of, (a) determining if said outer query and said subquery are compatibly partitioned; (b) if said outer query and said subquery are compatibly partitioned then for each pair of partitioning columns in said outer query and said subquery determining an equivalence class for each of said columns in said pair; (c) determining if the partitioning column for said subquery belongs to the same equivalence class as the partitioning column for said outer query; (d) determining if said query-subquery operator comprises a selected operator; and (e) if said steps (c) and (d) are true, then determining locality for said subquery so that said subquery is locally executable with respect to said outer query by the relational database management system.
BRIEF DESCRIPTION OF THE DRAWINGS
Reference will now be made, by way of example, to the accompanying drawing which shows a preferred embodiment of the present invention, and in which:
The FIGURE is a flow chart of a method for determining locality for execution of subqueries according to the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
A database management system (DBMS) is a system for accepting commands to store, retrieve, and delete data. A widely used and well known set of commands for use with a DBMS is the Structured Query Language (SQL). A simple example of a SQL query is:
EXAMPLE 1:
SELECT x
FROM t1
WHERE y = 'employee'
The query shown in Example 1 requests that the DBMS retrieve all column x fields from tuples in table t1 which have column y equal to 'employee'. In practical applications, the query can become quite complex. Multiple tables and multiple columns can be referenced. (In order to distinguish which column of which table is being referenced, column x of table t1 may be written as t1.x.)
One of the most powerful features of SQL is the capability of nesting SQL query expressions within the predicate in the WHERE clause. Nested SQL queries are called subqueries. With subqueries, one can compare the column expression of a query to the column expression of another query. One can also compare column expressions with subqueries whose result is a table, either by testing membership, testing if ANY row of the table has a property, or testing if ALL do. Often a query is formulated by using a subquery in the predicate. For example, to find all the employees who earn more than the average salary of the entire organization, one may write:
EXAMPLE 2:
SELECT name FROM employee e
WHERE salary>(SELECT avg(salary) FROM employee e1)
The query shown in Example 2 illustrates a non-correlated subquery. An example of a correlated subquery is provided below in Example 3.
EXAMPLE 3:
SELECT name FROM employee e
WHERE salary>(SELECT avg(salary) FROM employee e1 WHERE e1.dept=e.dept)
The query shown in Example 3 determines the names of employees who earn more than the average salary of their department.
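To make the difference between Examples 2 and 3 concrete, the following sketch runs both queries against a small in-memory SQLite table. The table contents are invented for illustration, and an ORDER BY clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Ann", "eng", 120.0),  # sample rows, invented for this sketch
        ("Bob", "eng", 100.0),
        ("Cy", "ops", 60.0),
        ("Di", "ops", 40.0),
    ],
)

# Example 2: the subquery is evaluated once; it does not reference e.
non_correlated = """
    SELECT name FROM employee e
    WHERE salary > (SELECT avg(salary) FROM employee e1)
    ORDER BY name
"""

# Example 3: the subquery references e.dept, so its value depends on the
# outer row being tested.
correlated = """
    SELECT name FROM employee e
    WHERE salary > (SELECT avg(salary) FROM employee e1 WHERE e1.dept = e.dept)
    ORDER BY name
"""

above_org_average = [row[0] for row in conn.execute(non_correlated)]
above_dept_average = [row[0] for row in conn.execute(correlated)]
```

With this data the organization-wide average is 80, so Example 2 selects Ann and Bob. The department averages are 110 and 50, so Example 3 selects Ann and Cy.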
The implementation of a subquery according to the art involves setting up processes to evaluate the subqueries, e.g. "SELECT avg(salary) . . . ", and setting up communication paths between the subquery processes and the outer query, e.g. "SELECT name FROM employee e WHERE salary > . . . ". The result of a query (and subquery) execution is a table, and the communication path is needed for the subquery result. For correlated subqueries, an additional path is needed in order to send the correlated values. An example of a correlated subquery is "SELECT avg(salary) FROM employee e1 WHERE e1.dept=e.dept", as shown above in Example 3.
The method according to the present invention utilizes compatible partitioning to localize database operations for subqueries. The method leads to fewer processes and less communication. The fewer processes ease the demand on system resources, and less communication improves response time and throughput for the DBMS.
Reference is made to the accompanying FIGURE which shows in flow chart form an overview of a method 10 according to the present invention. The present invention is described with reference to a shared-nothing database; however, the invention has wider applicability to other parallel database architectures.
An SQL command comprising an outer query and a subquery is entered to retrieve information from a database. The command is processed by the database management system (DBMS) on a database processing machine. In known manner, the command statement is parsed and the semantics of the statement are checked for compliance with grammatical/semantic rules, and then an internal representation of the command is made for the system to process the command. In one form of the invention, a method is provided to determine if the subquery can be executed locally with respect to the outer query, in order to improve the processing of the query for retrieving data quickly. The improvements are better response time and throughput, obtained by reducing the system resources needed to execute the query and the communication paths between the subquery and the outer query. For a subquery executed locally according to the method of the present invention, the result of the subquery execution will have the same partitioning as both the input outer query and the input subquery. An additional improvement is achieved because the method may be recursively applied to other query-subquery tuples.
A first step 12 in the method 10 involves determining if the outer query and the inner query are "compatibly partitioned". According to step 12 of the method, the outer query and inner query are compatibly partitioned if they use the same "Partitioning Algorithm". A Partitioning Algorithm is an algorithm which unambiguously identifies a single partition by considering only the column values of a given row in a table. The smallest subset of such column values is known as the "Partitioning Key" of the table, and the columns in the subset are known as the "Partitioning Columns" of the table. If a Partitioning Algorithm exists for a table, then the table is "Deterministically Partitioned". (The result of a query execution (and a subquery execution) is a table.) If a Partitioning Algorithm does not exist for a table, then the table is not Deterministically Partitioned. Query and subquery partitioning according to the method of the present invention involves considering the Partitioning Algorithm, i.e. Partitioning Columns, for the resulting tables.
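One way to picture the definitions above is a small descriptor attached to each query result, as in the following sketch. The class name `Partitioning`, the use of a string to identify the Partitioning Algorithm, and the requirement that both keys have the same number of columns are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Partitioning:
    # Identifier of the Partitioning Algorithm; None means no such algorithm
    # exists for the table (hypothetical encoding).
    algorithm: Optional[str]
    # Partitioning Columns, in partitioning-key order.
    columns: Tuple[str, ...] = ()

    @property
    def deterministic(self) -> bool:
        """True when the table is Deterministically Partitioned."""
        return self.algorithm is not None


def compatibly_partitioned(outer: Partitioning, sub: Partitioning) -> bool:
    """Step 12 as sketched here: both results use the same algorithm.

    The equal-key-length check is an added assumption so that columns can
    later be paired position by position.
    """
    return (
        outer.deterministic
        and sub.deterministic
        and outer.algorithm == sub.algorithm
        and len(outer.columns) == len(sub.columns)
    )
```

Under these assumptions, two results hashed by the same algorithm on t1.a and t2.a would pass the check, while a result with no Partitioning Algorithm would not.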
In other words, compatible partitioning means that any matching tuple of t1 (i.e. resulting table for the executed query) on the clause t1.a=t2.a will occur on the same node as the tuple of t2 (i.e. the resulting table for the executed subquery). The utilization of compatible partitioning according to the present method is shown by the following example.
EXAMPLE 4:
If tables t1 and t2 are partitioned on t1.a and t2.a respectively, then the subquery in
SELECT * FROM t1
WHERE t1.a IN
(SELECT a FROM t2)
can be evaluated by comparing t1.a with only the t2.a values on that node.
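The effect described in Example 4 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the three-node layout, the modulo-based `node_for` function standing in for the Partitioning Algorithm, and the sample rows are all hypothetical and are not taken from the patent.

```python
# Hypothetical setup: 3 nodes, both tables partitioned on column "a".
NUM_NODES = 3

def node_for(key):
    """Assumed Partitioning Algorithm: maps a partitioning-key value to one node."""
    return key % NUM_NODES

def partition(rows, key_col):
    """Distribute rows over the nodes according to the partitioning column."""
    nodes = [[] for _ in range(NUM_NODES)]
    for row in rows:
        nodes[node_for(row[key_col])].append(row)
    return nodes

# Illustrative data only.
t1 = [{"a": v} for v in (1, 2, 3, 4, 5, 6)]
t2 = [{"a": v} for v in (2, 4, 6, 8)]

def evaluate_globally(t1_rows, t2_rows):
    """SELECT * FROM t1 WHERE t1.a IN (SELECT a FROM t2), ignoring partitioning."""
    t2_values = {r["a"] for r in t2_rows}
    return [r for r in t1_rows if r["a"] in t2_values]

def evaluate_locally(t1_nodes, t2_nodes):
    """Same query, but each node compares its t1 rows only with its own t2 rows."""
    result = []
    for local_t1, local_t2 in zip(t1_nodes, t2_nodes):
        local_values = {r["a"] for r in local_t2}
        result.extend(r for r in local_t1 if r["a"] in local_values)
    return result

global_result = evaluate_globally(t1, t2)
local_result = evaluate_locally(partition(t1, "a"), partition(t2, "a"))
```

Because both tables use the same assumed partitioning function on column a, any t2 row that can match a given t1 row lives on the same node, so the per-node evaluation returns the same rows as the global one (possibly in a different order).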
If the outer query and the subquery are compatibly partitioned, i.e. they utilize the same Partitioning Algorithm, then the method 10 proceeds next to steps 14 and 16 to determine if the subquery is locally executable.
In steps 14 and 16, the method 10 determines if each corresponding pair of Partitioning Columns satisfies two conditions. (In the following description, a corresponding pair of Partitioning Columns refers to columns which correspond to the order of Partitioning Columns according to the partitioning keys of the outer query and the subquery.) The first condition is tested in step 14 and involves determining if both columns in the pair belong to the same Query-Subquery (QS) Equivalence Class. The second condition is tested in step 16 and involves determining if the Query-Subquery (QS) operator comprises one of the four operators: "=ANY, <>ALL, NOT IN or IN". If the outer query and subquery are compatibly partitioned (step 12) and the two conditions (steps 14 and 16) are satisfied, then according to the method of the present invention the subquery is processed or evaluated by the DBMS entirely locally in step 20 with respect to the outer query.
For step 14 of the method, the QS Equivalence Class is the list of columns each of which belongs to any of the base tables (or derived tables) belonging to the Query-Subquery. Two columns c1, c2 belong to the same QS Equivalence Class if there exists a Boolean Factor c1=c2 in either the query predicate or the subquery predicate. The method according to the present invention uses the column equivalence class to establish a one-to-one pairing between the partitioning columns of the query and the partitioning columns of the subquery. If there is a one-to-one pairing, then the subquery is executable locally with respect to the outer query as shown in step 20.
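One possible way to track QS Equivalence Classes is a union-find structure fed with the c1=c2 Boolean Factors. The sketch below is an illustration, not the patented implementation: the class name `EquivalenceClasses`, the helper `keys_pair_up`, and the string column names are hypothetical, and treating equality as transitive (c1=c2 and c2=c3 put c1 and c3 in one class) is an assumption that the text does not state explicitly.

```python
class EquivalenceClasses:
    """Union-find over column names; each set is one assumed QS Equivalence Class."""

    def __init__(self):
        self.parent = {}

    def find(self, col):
        # Unknown columns start in a class of their own.
        self.parent.setdefault(col, col)
        while self.parent[col] != col:
            # Path halving keeps the trees shallow.
            self.parent[col] = self.parent[self.parent[col]]
            col = self.parent[col]
        return col

    def union(self, c1, c2):
        """Record a Boolean Factor c1 = c2."""
        self.parent[self.find(c1)] = self.find(c2)

    def same_class(self, c1, c2):
        return self.find(c1) == self.find(c2)


def build_classes(equality_factors):
    """Build the classes from (c1, c2) pairs found in the query or subquery predicate."""
    classes = EquivalenceClasses()
    for c1, c2 in equality_factors:
        classes.union(c1, c2)
    return classes


def keys_pair_up(outer_key, sub_key, classes):
    """Positional pairing: the i-th partitioning columns must share a class."""
    if len(outer_key) != len(sub_key):
        return False
    return all(classes.same_class(o, s) for o, s in zip(outer_key, sub_key))


# Illustrative factors only: t1.a = t2.a and t2.a = t3.a.
classes = build_classes([("t1.a", "t2.a"), ("t2.a", "t3.a")])
```

With these factors, `keys_pair_up(["t1.a"], ["t3.a"], classes)` holds through the assumed transitivity, while a key on an unrelated column such as `t2.b` does not pair up.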
The method is also suitable for cases where the order of partitioning columns in the partitioning key is not significant. In such a case, the method may consider more partitioning columns of the subquery for pairing with a single partitioning column of the outer query and vice versa. To guarantee locality of execution of the subquery with respect to the outer query, the method needs to establish any one-to-one pairing between the partitioning columns of the outer query and the partitioning columns of the subquery, where each pair satisfies the two conditions.
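When the order of partitioning columns is not significant, the pairing can be searched for rather than read off by position. The following is a minimal sketch under assumptions: `find_pairing` and the `class_of` mapping (column name to an equivalence-class identifier) are hypothetical, only the equivalence-class condition is checked, and requiring equally long keys is an interpretation of "one-to-one pairing".

```python
def find_pairing(outer_cols, sub_cols, class_of):
    """Return one one-to-one pairing [(outer, sub), ...] or None if none exists.

    class_of maps each column name to an identifier of its equivalence class.
    """
    if len(outer_cols) != len(sub_cols):
        return None

    def extend(i, used, pairs):
        # All outer columns paired: a complete one-to-one pairing was found.
        if i == len(outer_cols):
            return pairs
        for j, sub in enumerate(sub_cols):
            if j not in used and class_of[outer_cols[i]] == class_of[sub]:
                found = extend(i + 1, used | {j}, pairs + [(outer_cols[i], sub)])
                if found is not None:
                    return found
        return None

    return extend(0, frozenset(), [])


# Illustrative classes only: t1.a ~ t2.y and t1.b ~ t2.x.
class_of = {"t1.a": 0, "t1.b": 1, "t2.x": 1, "t2.y": 0}
pairing = find_pairing(["t1.a", "t1.b"], ["t2.x", "t2.y"], class_of)
```

Here the subquery key lists its columns in the opposite order, yet a pairing is still found. A full check would also apply the operator condition of step 16 to each pair.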
The second condition applied in step 16 of the method 10 involves determining if the outer query's column participates in a query-subquery predicate and the subquery's column participates in the SELECT list of the subquery in such a form that only the equal values of the two columns need to be investigated to conclude validity of the predicate. In the case of a query-subquery which does not comprise one of the four QS operators "=ANY, <>ALL, NOT IN or IN", it is still possible to execute the subquery locally if uniqueness in the subquery result can be guaranteed. As shown in the FIGURE, if the second condition in step 16 is not satisfied, i.e. the QS operator does not comprise =ANY, <>ALL, NOT IN or IN, the method 10 moves to step 18. In step 18, the method 10 checks if the QS operator is "=" and then determines if the locality of the subquery result can be guaranteed. If the outcome of step 18 is TRUE, then according to the invention the subquery is locally executable in step 20. One method for determining uniqueness is by considering if any unique key on the subquery result (i.e. table) is specified by equality to a constant, hostvar or other construct. The operation of this aspect of the method is shown by the following example:
EXAMPLE 5:
SELECT * FROM t1
WHERE a=(SELECT a FROM t2 WHERE t2.c=300)
Given t2.c is a unique key on table t2
According to the present method, the unique key t2.c is recognized as guaranteeing the uniqueness of the subquery result and therefore the subquery is executed locally with respect to the outer query. On the other hand, if the "select list" for the subquery is not a simple column or constant, for example SELECT * FROM t1 WHERE a IN (SELECT avg(b) FROM t2), the method cannot guarantee uniqueness of the subquery result and therefore the subquery is not executed locally.
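The sequence of tests in steps 12 through 20 can be summarized as a small decision function. This is a sketch under assumptions, since the FIGURE is not reproduced here: the `QuerySubquery` record and its field names are hypothetical, the outcomes of steps 12 and 14 are taken as precomputed inputs, and treating a failed step 14 as "not local" is an interpretation of the text.

```python
from dataclasses import dataclass

# The four QS operators named for step 16.
LOCAL_OPERATORS = {"=ANY", "<>ALL", "NOT IN", "IN"}


@dataclass
class QuerySubquery:
    compatibly_partitioned: bool          # outcome of step 12
    keys_in_same_classes: bool            # outcome of step 14
    operator: str                         # the QS operator
    subquery_result_unique: bool = False  # used only by step 18


def is_locally_executable(qs):
    """Return True when the subquery may be executed locally (step 20)."""
    if not qs.compatibly_partitioned:   # step 12
        return False
    if not qs.keys_in_same_classes:     # step 14 (assumed to end the test on failure)
        return False
    if qs.operator in LOCAL_OPERATORS:  # step 16
        return True
    # Step 18: the "=" operator qualifies only with a guaranteed-unique result.
    return qs.operator == "=" and qs.subquery_result_unique


# Example 5: "=" with the unique key t2.c fixed by a constant.
example_5 = QuerySubquery(True, True, "=", subquery_result_unique=True)
```

Under these assumptions `is_locally_executable(example_5)` is true, whereas the same query-subquery without the uniqueness guarantee, or with an operator such as ">", is not treated as local.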
The operation of the method 10 according to the present invention is illustrated for a non-correlated subquery. A non-correlated subquery is evaluated entirely locally if the following conditions are satisfied: (1) the QS operand and the subquery result are "compatibly partitioned" and belong to the same equivalence class; and (2) the QS operator is one of the four operators =ANY, <>ALL, NOT IN or IN. The method determines compatible partitioning (step 12 in the FIGURE) using a test similar to that used for determining compatible partitioning for joins, e.g. T1.a=T2.a. The following example illustrates a non-correlated subquery.
EXAMPLE 6:
SELECT * FROM t1
WHERE t1.a IN (SELECT a FROM t2)
and given t1 and t2 are partitioned on t1.a and t2.a
For a non-correlated subquery, the method first determines if the QS operand, i.e. t1.a, and the subquery result, i.e. SELECT a FROM t2, are compatibly partitioned (step 12 in the FIGURE). The next step in the method involves determining if the QS operator is =ANY, <>ALL, NOT IN or IN (step 16 in the FIGURE). Given that t1 and t2 are partitioned on t1.a and t2.a and the QS operator is "IN", then according to the method 10 the subquery is evaluated locally by comparing the t1.a value with only the t2.a values on that node. Evaluating the subquery locally improves the response time and throughput for the DBMS because fewer system resources and communication are needed.
The method is also suitable for a subquery having more than one table, provided the subquery result is compatibly partitioned with the QS operand. For example, the following subquery includes another table t3 which is partitioned on t3.a:
EXAMPLE 7:
SELECT * FROM t1
WHERE t1.a <>ALL
(SELECT t2.a FROM t2,t3 WHERE t2.a=t3.a)
According to the method, the subquery in Example 7 is evaluated fully locally.
The operation of the method according to the present invention for a correlated subquery is shown for the following example query.
EXAMPLE 8:
SELECT * FROM t1
WHERE a IN (SELECT b FROM t2 WHERE t2.a=t1.a)
The method first checks the correlation values connecting the outer query block to the subquery block. For the query of Example 8, the subquery block has a correlation value of t1.a and the method determines that because of the correlation value, i.e. t1.a, for the subquery, the values of the outer query, i.e. SELECT * FROM t1, will only match those values of t2.b which come from the same node. Therefore, according to the method the subquery block is executable locally.
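The claim made for Example 8 can be checked on toy data. The sketch below is illustrative only: the three-node layout, the modulo placement rule standing in for the Partitioning Algorithm, and the sample rows are assumptions, with both t1 and t2 taken to be partitioned on column a.

```python
# Hypothetical setup: 3 nodes, t1 and t2 both partitioned on column "a".
NUM_NODES = 3

def partition(rows, key_col):
    """Place each row on the node chosen by an assumed modulo partitioning rule."""
    nodes = [[] for _ in range(NUM_NODES)]
    for row in rows:
        nodes[row[key_col] % NUM_NODES].append(row)
    return nodes

# Illustrative data only.
t1 = [{"a": v} for v in range(1, 7)]
t2 = [{"a": 1, "b": 1}, {"a": 2, "b": 5}, {"a": 4, "b": 4},
      {"a": 4, "b": 9}, {"a": 6, "b": 6}, {"a": 7, "b": 7}]

def qualifies(outer_row, t2_rows):
    """WHERE a IN (SELECT b FROM t2 WHERE t2.a = t1.a) for one outer row."""
    correlated = [s["b"] for s in t2_rows if s["a"] == outer_row["a"]]
    return outer_row["a"] in correlated

# Reference answer: every outer row sees the whole of t2.
global_result = [r["a"] for r in t1 if qualifies(r, t2)]

# Local evaluation: every outer row sees only the t2 rows stored on its own node.
local_result = []
for local_t1, local_t2 in zip(partition(t1, "a"), partition(t2, "a")):
    local_result.extend(r["a"] for r in local_t1 if qualifies(r, local_t2))
```

Because the correlation predicate t2.a=t1.a equates the two partitioning columns, every t2 row that can satisfy it is stored with the matching t1 row, so the local evaluation finds the same qualifying rows as the global one.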
If the correlation values do not guarantee locality, then the method proceeds as described above for a non-correlated subquery, i.e. the method connects the QS operand, through the IN, NOT IN, =ANY or <>ALL predicate, with the subquery result. The operation of this aspect of the present method is shown by the following example:
EXAMPLE 9:
SELECT * FROM t1
WHERE a IN (SELECT a FROM t2 WHERE t2.b=t1.a)
In the above example, the method determines if the correlation value, i.e. t1.a, for the subquery block guarantees locality. For this example, the correlation value t1.a does not guarantee locality, i.e. matches will not all come from the same node, and the method proceeds as for a non-correlated subquery described above. According to the method of the present invention, the QS predicate, i.e. IN, NOT IN, <>ALL, =ANY, guarantees locality and the subquery is executable locally with respect to the outer query.
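The two-stage check for correlated subqueries (correlation first, then the QS predicate) can be sketched as follows. The function `locality_source`, its arguments, and the restriction to single-column partitioning keys are all assumptions made for illustration; the patent does not give this code.

```python
# The four QS operators that can carry locality through the QS predicate.
LOCAL_OPERATORS = {"=ANY", "<>ALL", "NOT IN", "IN"}


def locality_source(outer_key, sub_key, correlation_pairs, qs_pair, qs_operator):
    """Say which predicate, if any, guarantees locality.

    outer_key, sub_key : partitioning columns (single-column keys assumed)
    correlation_pairs  : set of (outer column, subquery column) equalities
                         taken from the correlation predicates
    qs_pair            : (QS operand, subquery SELECT-list column)
    qs_operator        : the QS operator, e.g. "IN"
    """
    # Stage 1: a correlation predicate equating the two partitioning columns.
    if (outer_key, sub_key) in correlation_pairs:
        return "correlation"
    # Stage 2: fall back to the test used for a non-correlated subquery.
    if qs_operator in LOCAL_OPERATORS and qs_pair == (outer_key, sub_key):
        return "qs-predicate"
    return None


# Example 8: WHERE a IN (SELECT b FROM t2 WHERE t2.a = t1.a)
example_8 = locality_source("t1.a", "t2.a", {("t1.a", "t2.a")}, ("t1.a", "t2.b"), "IN")

# Example 9: WHERE a IN (SELECT a FROM t2 WHERE t2.b = t1.a)
example_9 = locality_source("t1.a", "t2.a", {("t1.a", "t2.b")}, ("t1.a", "t2.a"), "IN")
```

With t1 and t2 partitioned on t1.a and t2.a, Example 8 is resolved by the correlation predicate and Example 9 by the IN predicate, matching the two cases described in the text.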
It is another feature of the present method that if the subquery can be executed locally with respect to the outer query, then the result of the execution and application of the subquery produces a new query with the same partitioning as both the original outer query and the subquery. This means that the steps described above can be applied recursively to other query-subquery tuples.
The present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. Therefore, the presently discussed embodiments are considered to be illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (13)

We claim:
1. A method for determining locality for execution of subqueries for queries in a relational database management system, wherein said queries comprise an outer query and a subquery having a query-subquery operator and wherein partitioning columns for the query and subquery are provided, said method comprising the steps of:
(a) determining if said outer query and said subquery are compatibly partitioned;
(b) if said outer query and said subquery are compatibly partitioned then for each pair of partitioning columns in said outer query and said subquery determining an equivalence class for each of said columns in said pair;
(c) determining if the partitioning column for said subquery belongs to the same equivalence class as the partitioning column for said outer query;
(d) determining if said query-subquery operator comprises a selected operator; and
(e) if said steps (c) and (d) are true, determining locality for said subquery so that said subquery is locally executable with respect to said outer query by the relational database management.
2. The method as claimed in claim 1, wherein said step (d) comprises checking if said selected operator is the = operator and determining if a correlation value connecting said outer query to said subquery guarantees localized execution of the said subquery with respect to said outer query.
3. The method as claimed in claim 2, wherein said step (c) comprises determining if a column for said outer query will match on values of a column for said subquery and which come from a same node.
4. The method as claimed in claim 1, wherein said selected operator belongs to a group of operators comprising =ANY, <>ALL, NOT IN or IN.
5. The method as claimed in claim 1, wherein said step (c) comprises determining if a partitioning key for said subquery is specified by a construct.
6. The method as claimed in claim 4, wherein said construct comprises a constant.
7. The method as claimed in claim 4, wherein said construct comprises a HOSTVAR.
8. A method for determining locality for execution of subqueries in queries in a relational database management system, wherein said queries comprise an outer query and a subquery having a query-subquery operator and wherein partitioning columns for the query and subquery are provided, said method comprising the steps of:
(a) determining if said outer query and said subquery are compatibly partitioned;
(b) if said outer query and said subquery are compatibly partitioned then for each pair of partitioning columns in said outer query and said subquery determining an equivalence class for each of said columns in said pair;
(c) determining if the partitioning column for said subquery belongs to the same equivalence class as the partitioning column for said outer query;
(d) determining if said query-subquery operator belongs to a group of operators comprising =ANY, <>ALL, NOT IN or IN; and
(e) if said steps (c) and (d) are true, then concluding locality for execution of said subquery so that said subquery is executable locally with respect to said outer query by the relational database management system.
9. A relational database management system for use with a computer system wherein queries are entered for retrieving data from tables and wherein partitioning columns and partitioning keys are provided, said system comprising:
means for processing nested queries comprising an outer query and a subquery;
means for determining locality of execution of said subquery including,
(a) means for determining if said outer query and said subquery are compatibly partitioned;
(b) means for determining an equivalence class for each column forming a corresponding pair of partitioning columns for said outer query and said subquery;
(c) means for ascertaining if the partitioning column for said subquery belongs to the same equivalence class as the partitioning column for said outer query;
(d) means for determining if said query-subquery operator comprises a selected operator; and
(e) means responsive to said means for ascertaining and said means for determining said selected operator for determining locality of said subquery so that said subquery is locally executable with respect to said outer query by the relational database management system.
10. The system as claimed in claim 9, wherein said means for determining said query-subquery is responsive to a query-subquery operator belonging to a group of operators comprising =ANY, <>ALL, NOT IN or IN.
11. A computer program product for use on a computer wherein queries are entered for retrieving data from tables, wherein said queries comprise an outer query and a subquery having a query-subquery operator and wherein partitioning columns for the query and subquery are provided, said computer program product comprising:
a recording medium;
means recorded on said medium for instructing said computer to perform the steps of,
(a) determining if said outer query and said subquery are compatibly partitioned;
(b) if said outer query and said subquery are compatibly partitioned then for each pair of partitioning columns in said outer query and said subquery determining an equivalence class for each of said columns in said pair;
(c) determining if the partitioning column for said subquery belongs to the same equivalence class as the partitioning column for said outer query;
(d) determining if said query-subquery operator comprises a selected operator; and
(e) if said steps (c) and (d) are true, then determining locality for said subquery so that said subquery is locally executable with respect to said outer query by the relational database management.
12. The computer program product as claimed in claim 11, wherein step (d) comprises determining if a correlation value connecting said outer query to said subquery guarantees localized execution of said subquery with respect to said outer query.
13. The computer program product as claimed in claim 11, wherein said selected operator belongs to a group of operators comprising =ANY, <>ALL, NOT IN or IN.
US08/672,013 1995-09-27 1996-06-24 Method for localizing execution of subqueries and determining collocation of execution of subqueries in a parallel database Ceased US5745746A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US08/672,013 US5745746A (en) 1996-06-24 1996-06-24 Method for localizing execution of subqueries and determining collocation of execution of subqueries in a parallel database
US09/585,927 USRE37965E1 (en) 1995-09-27 2000-06-02 Method for localizing execution or subqueries and determining collocation of execution of subqueries in a parallel database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US08/672,013 US5745746A (en) 1996-06-24 1996-06-24 Method for localizing execution of subqueries and determining collocation of execution of subqueries in a parallel database

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US09/585,927 Reissue USRE37965E1 (en) 1995-09-27 2000-06-02 Method for localizing execution or subqueries and determining collocation of execution of subqueries in a parallel database

Publications (1)

Publication Number Publication Date
US5745746A true US5745746A (en) 1998-04-28

Family

ID=24696802

Family Applications (1)

Application Number Title Priority Date Filing Date
US08/672,013 Ceased US5745746A (en) 1995-09-27 1996-06-24 Method for localizing execution of subqueries and determining collocation of execution of subqueries in a parallel database

Country Status (1)

Country Link
US (1) US5745746A (en)

US8046362B2 (en) 2008-04-24 2011-10-25 Lexisnexis Risk & Information Analytics Group, Inc. Statistical record linkage calibration for reflexive and symmetric distance measures at the field and field value levels without the need for human interaction
US20090271424A1 (en) * 2008-04-24 2009-10-29 Lexisnexis Group Database systems and methods for linking records and entity representations with sufficiently high confidence
US8135719B2 (en) 2008-04-24 2012-03-13 Lexisnexis Risk Solutions Fl Inc. Statistical record linkage calibration at the field and field value levels without the need for human interaction
US20090271694A1 (en) * 2008-04-24 2009-10-29 Lexisnexis Risk & Information Analytics Group Inc. Automated detection of null field values and effectively null field values
US8250078B2 (en) 2008-04-24 2012-08-21 Lexisnexis Risk & Information Analytics Group Inc. Statistical record linkage calibration for interdependent fields without the need for human interaction
US8316047B2 (en) 2008-04-24 2012-11-20 Lexisnexis Risk Solutions Fl Inc. Adaptive clustering of records and entity representations
US20090271405A1 (en) * 2008-04-24 2009-10-29 Lexisnexis Risk & Information Analytics Group Inc. Statistical record linkage calibration for reflexive, symmetric and transitive distance measures at the field and field value levels without the need for human interaction
US8484168B2 (en) 2008-04-24 2013-07-09 Lexisnexis Risk & Information Analytics Group, Inc. Statistical record linkage calibration for multi token fields without the need for human interaction
US8489617B2 (en) 2008-04-24 2013-07-16 Lexisnexis Risk Solutions Fl Inc. Automated detection of null field values and effectively null field values
US20090271397A1 (en) * 2008-04-24 2009-10-29 Lexisnexis Risk & Information Analytics Group Inc. Statistical record linkage calibration at the field and field value levels without the need for human interaction
US8495077B2 (en) 2008-04-24 2013-07-23 Lexisnexis Risk Solutions Fl Inc. Database systems and methods for linking records and entity representations with sufficiently high confidence
US20090292695A1 (en) * 2008-04-24 2009-11-26 Lexisnexis Risk & Information Analytics Group Inc. Automated selection of generic blocking criteria
US8572052B2 (en) 2008-04-24 2013-10-29 LexisNexis Risk Solution FL Inc. Automated calibration of negative field weighting without the need for human interaction
US20090292694A1 (en) * 2008-04-24 2009-11-26 Lexisnexis Risk & Information Analytics Group Inc. Statistical record linkage calibration for multi token fields without the need for human interaction
US9031979B2 (en) 2008-04-24 2015-05-12 Lexisnexis Risk Solutions Fl Inc. External linking based on hierarchical level weightings
US8661026B2 (en) 2008-07-02 2014-02-25 Lexisnexis Risk Solutions Fl Inc. Entity representation identification using entity representation level information
US20100005090A1 (en) * 2008-07-02 2010-01-07 Lexisnexis Risk & Information Analytics Group Inc. Statistical measure and calibration of search criteria where one or both of the search criteria and database is incomplete
US20100010988A1 (en) * 2008-07-02 2010-01-14 Lexisnexis Risk & Information Analytics Group Inc. Entity representation identification using entity representation level information
US20100005091A1 (en) * 2008-07-02 2010-01-07 Lexisnexis Risk & Information Analytics Group Inc. Statistical measure and calibration of reflexive, symmetric and transitive fuzzy search criteria where one or both of the search criteria and database is incomplete
US8639691B2 (en) 2008-07-02 2014-01-28 Lexisnexis Risk Solutions Fl Inc. System for and method of partitioning match templates
US20100005079A1 (en) * 2008-07-02 2010-01-07 Lexisnexis Risk & Information Analytics Group Inc. System for and method of partitioning match templates
US20100005078A1 (en) * 2008-07-02 2010-01-07 Lexisnexis Risk & Information Analytics Group Inc. System and method for identifying entity representations based on a search query using field match templates
US8090733B2 (en) 2008-07-02 2012-01-03 Lexisnexis Risk & Information Analytics Group, Inc. Statistical measure and calibration of search criteria where one or both of the search criteria and database is incomplete
US8190616B2 (en) 2008-07-02 2012-05-29 Lexisnexis Risk & Information Analytics Group Inc. Statistical measure and calibration of reflexive, symmetric and transitive fuzzy search criteria where one or both of the search criteria and database is incomplete
US8572070B2 (en) 2008-07-02 2013-10-29 LexisNexis Risk Solution FL Inc. Statistical measure and calibration of internally inconsistent search criteria where one or both of the search criteria and database is incomplete
US8495076B2 (en) 2008-07-02 2013-07-23 Lexisnexis Risk Solutions Fl Inc. Statistical measure and calibration of search criteria where one or both of the search criteria and database is incomplete
US20100017399A1 (en) * 2008-07-02 2010-01-21 Lexisnexis Risk & Information Analytics Group Inc. Technique for recycling match weight calculations
US8285725B2 (en) 2008-07-02 2012-10-09 Lexisnexis Risk & Information Analytics Group Inc. System and method for identifying entity representations based on a search query using field match templates
US8484211B2 (en) 2008-07-02 2013-07-09 Lexisnexis Risk Solutions Fl Inc. Batch entity representation identification using field match templates
US8639705B2 (en) 2008-07-02 2014-01-28 Lexisnexis Risk Solutions Fl Inc. Technique for recycling match weight calculations
US9836508B2 (en) 2009-12-14 2017-12-05 Lexisnexis Risk Solutions Fl Inc. External linking based on hierarchical level weightings
US9411859B2 (en) 2009-12-14 2016-08-09 Lexisnexis Risk Solutions Fl Inc External linking based on hierarchical level weightings
US20110153650A1 (en) * 2009-12-18 2011-06-23 Electronics And Telecommunications Research Institute Column-based data managing method and apparatus, and column-based data searching method
US9501505B2 (en) 2010-08-09 2016-11-22 Lexisnexis Risk Data Management, Inc. System of and method for entity representation splitting without the need for human interaction
US9189505B2 (en) 2010-08-09 2015-11-17 Lexisnexis Risk Data Management, Inc. System of and method for entity representation splitting without the need for human interaction
US9798830B2 (en) * 2012-09-14 2017-10-24 Hitachi, Ltd. Stream data multiprocessing method
US20150149507A1 (en) * 2012-09-14 2015-05-28 Hitachi, Ltd. Stream data multiprocessing method
US9811845B2 (en) 2013-06-11 2017-11-07 Sap Se System for accelerated price master database lookup
US11138230B2 (en) * 2018-03-26 2021-10-05 Mcafee, Llc Methods, apparatus, and systems to aggregate partitioned computer database data

Similar Documents

Publication Title
US5745746A (en) Method for localizing execution of subqueries and determining collocation of execution of subqueries in a parallel database
Selinger et al. Access path selection in a relational database management system
US6032143A (en) Evaluation of existential and universal subquery in a relational database management system for increased efficiency
US5960426A (en) Database system and method for supporting current of cursor updates and deletes from a select query from one or more updatable tables in single node and mpp environments
US5557791A (en) Outer join operations using responsibility regions assigned to inner tables in a relational database
US5619692A (en) Semantic optimization of query order requirements using order detection by normalization in a query compiler system
US5590319A (en) Query processor for parallel processing in homogenous and heterogenous databases
US5778354A (en) Database management system with improved indexed accessing
US5590324A (en) Optimization of SQL queries using universal quantifiers, set intersection, and max/min aggregation in the presence of nullable columns
US6112198A (en) Optimization of data repartitioning during parallel query optimization
US4769772A (en) Automated query optimization method using both global and parallel local optimizations for materialization access planning for distributed databases
US6834279B1 (en) Method and system for inclusion hash joins and exclusion hash joins in relational databases
US5276870A (en) View composition in a data base management system
US5845274A (en) Computer program product for avoiding complete index tree traversals in sequential and almost sequential index probes
US5903893A (en) Method and apparatus for optimizing a merge-join operation across heterogeneous databases
US6401083B1 (en) Method and mechanism for associating properties with objects and instances
US6957210B1 (en) Optimizing an exclusion join operation using a bitmap index structure
US20050091210A1 (en) Method for integrating and accessing of heterogeneous data sources
US6353819B1 (en) Method and system for using dynamically generated code to perform record management layer functions in a relational database manager
Martins et al. Comparing oracle and postgresql, performance and optimization
USRE37965E1 (en) Method for localizing execution or subqueries and determining collocation of execution of subqueries in a parallel database
Kang et al. Multiple-query optimization at algorithm-level
Buron et al. Revisiting RDF storage layouts for efficient query answering
CA2159270C (en) Method for localizing execution of subqueries and determining collocation of execution of subqueries in a parallel database
US5953715A (en) Utilizing pseudotables as a method and mechanism providing database monitor information

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:JHINGRAN, ANANT D.;KOLLAR, LUBOR J.;MALKEMUS, TIMOTHY R.;AND OTHERS;REEL/FRAME:008055/0614;SIGNING DATES FROM 19951201 TO 19951211

AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KOLLAR, LUBOR J.;JHINGRAN, ANANT D.;MALKEMUS, TIMOTHY R.;AND OTHERS;REEL/FRAME:008214/0450;SIGNING DATES FROM 19951201 TO 19951211

STCF Information on status: patent grant

Free format text: PATENTED CASE

RF Reissue application filed

Effective date: 20000602

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4