US20120011144A1 - Aggregation in parallel computation environments with shared memory - Google Patents

Aggregation in parallel computation environments with shared memory

Info

Publication number
US20120011144A1
US20120011144A1 (Application US 12/978,194)
Authority
US
United States
Prior art keywords
execution threads
local hash
local
hash
hash tables
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/978,194
Inventor
Frederik Transier
Christian Mathis
Nico Bohnsack
Kai Stammerjohann
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SAP SE
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to US12/978,194 priority Critical patent/US20120011144A1/en
Assigned to SAP AG reassignment SAP AG ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BOHNSACK, NICO, MATHIS, CHRISTIAN, STAMMERJOHANN, KAI, TRANSIER, FREDERIK
Priority to EP11004931.9A priority patent/EP2469423B1/en
Publication of US20120011144A1 publication Critical patent/US20120011144A1/en
Assigned to SAP AG reassignment SAP AG ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: SANDERS, PETER, MULLER, INGO
Assigned to SAP SE reassignment SAP SE CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: SAP AG
Priority to US15/016,978 priority patent/US10127281B2/en
Priority to US15/040,501 priority patent/US10114866B2/en
Abandoned legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • G06F16/24534Query rewriting; Transformation
    • G06F16/24542Plan optimisation
    • G06F16/24544Join order optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2255Hash tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • G06F16/24532Query optimisation of parallel queries

Abstract

According to some embodiments, a data structure may be provided by separating an input table into a plurality of partitions; generating, by each of a first plurality of execution threads operating concurrently, a local hash table, each local hash table storing key—index pairs; and merging the local hash tables, by a second plurality of execution threads operating concurrently, to produce a set of disjoint result hash tables. An overall result may be obtained from the set of disjoint result hash tables. The data structure may be used in a parallel computing environment to determine an aggregation.

Description

    FIELD
  • Some embodiments relate to a data structure. More specifically, some embodiments provide a method and system for a data structure and use of same in parallel computing environments.
  • BACKGROUND
  • A number of presently developed and developing computer systems include multiple processors in an attempt to provide increased computing performance. Advances in computing performance, including for example processing speed and throughput, may be provided by parallel computing systems and devices as compared to single processing systems that sequentially process programs and instructions.
  • For parallel shared-memory aggregation processes, a number of approaches have been proposed. However, each of the previous approaches includes sequential operations and/or synchronization operations, such as locking, to avoid inconsistencies or lapses in data coherency. Thus, prior proposed solutions for parallel aggregation in parallel computation environments with shared memory either contain a sequential step or require some sort of synchronization on the data structures.
  • Accordingly, a method and mechanism for efficiently processing data in parallel computation environments and the use of same in parallel aggregation processes are provided by some embodiments herein.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is block diagram of a system according to some embodiments.
  • FIG. 2 is a block diagram of an operating environment according to some embodiments.
  • FIGS. 3A-3D are illustrative depictions of various aspects of a data structure according to some embodiments.
  • FIG. 4 is a flow diagram of a method relating to a data structure, according to some embodiments herein.
  • FIGS. 5A-5D provide illustrative examples of some data tables according to some embodiments.
  • FIG. 6 is an illustrative depiction of an aggregation flow, in some embodiments herein.
  • FIG. 7 is a flow diagram of a method relating to an aggregation flow, according to some embodiments herein.
  • DETAILED DESCRIPTION
  • In an effort to more fully and efficiently use the resources of a particular computing environment, a data structure and techniques of using that data structure may be developed to fully exploit the design characteristics and capabilities of that particular computing environment. In some embodiments herein, a data structure and techniques for using that data structure (i.e., algorithms) are provided for efficiently using the data structure disclosed herein in a parallel computing environment with shared memory.
  • As used herein, the term parallel computation environment with shared memory refers to a system or device having more than one processing unit. The multiple processing units may be processors, processor cores, multi-core processors, etc. All of the processing units can access a main memory (i.e., a shared memory architecture). All of the processing units can run or execute the same program(s). As used herein, a running program may be referred to as a thread. Memory may be organized in a hierarchy of multiple levels, where faster but smaller memory units are located closer to the processing units. The smaller and faster memory units located nearer the processing units as compared to the main memory are referred to as cache.
  • FIG. 1 is a block diagram overview of a device, system, or apparatus 100 that may be used in providing an index hash table or hash map in accordance with some aspects and embodiments herein, as well as providing a parallel aggregation based on such data structures. System 100 may be, for example, associated with any of the devices described herein and may include a plurality of processing units 105, 110, and 115. The processing units may comprise one or more commercially available Central Processing Units (CPUs), in the form of single-chip microprocessors or a multi-core processor, coupled to a communication device 120 configured to communicate via a communication network (not shown in FIG. 1) to an end client (not shown in FIG. 1). Device 100 may also include a local cache memory, such as RAM memory modules, associated with each of the processing units 105, 110, and 115. Communication device 120 may be used to communicate, for example, with one or more client devices or business service providers. System 100 further includes an input device 125 (e.g., a mouse and/or keyboard to enter content) and an output device 130 (e.g., a computer monitor to display a user interface element).
  • Processing units 105, 110, and 115 communicate with a shared memory 135 via a system bus 175. System bus 175 also provides a mechanism for the processing units to communicate with a storage device 140. Storage device 140 may include any appropriate information storage device, including combinations of magnetic storage devices (e.g., a hard disk drive), optical storage devices, and/or semiconductor memory devices for storing data and programs.
  • Storage device 140 stores a program 145 for controlling the processing units 105, 110, and 115 and a query engine application 150 for executing queries. Processing units 105, 110, and 115 may perform instructions of the program 145 and thereby operate in accordance with any of the embodiments described herein. For example, the processing units may concurrently execute a plurality of execution threads to build the index hash table data structures disclosed herein. Furthermore, query engine 150 may operate to execute a parallel aggregation operation in accordance with aspects herein in cooperation with the processing units and by accessing database 155. Program 145 and other instructions may be stored in a compressed, uncompiled and/or encrypted format. Program 145 may also include other program elements, such as an operating system, a database management system, and/or device drivers used by the processing units 105, 110, and 115 to interface with peripheral devices.
  • In some embodiments, storage device 140 includes a database 155 to facilitate the execution of queries based on input table data. The database may include data structures (e.g., index hash tables), rules, and conditions for executing a query in a parallel computation environment such as that of FIGS. 1 and 2.
  • In some embodiments, the data structure disclosed herein as being developed for use in parallel computing environments with shared memory is referred to as a parallel hash table. In some instances, the parallel hash table may also be referred to as a parallel hash map. In general, hash tables may be provided and used as index structures for data storage to enable fast data retrieval. The parallel hash table disclosed herein may be used in a parallel computation environment where multiple concurrently executing (i.e., running) threads insert and retrieve data in tables. Furthermore, an aggregation algorithm that uses the parallel hash tables herein is provided for computing an aggregate in a parallel computation environment.
  • FIG. 2 provides an illustrative example of a computation environment 100 compatible with some embodiments herein. While computation environment 100 may be compatible with some embodiments of the data structures and the methods herein, the data structures and the methods herein are not limited to the example computation environment 100. Processes to store, retrieve, and perform operations on data may be facilitated by a database system (DBS) and a database warehouse (DWH).
  • As shown in FIG. 2, DBS 210 is a server. DBS 210 further includes a database management system (DBMS) 215. DBMS 215 may comprise software (e.g., programs, instructions, code, applications, services, etc.) that controls the organization of and access to database 225 that stores data. Database 225 may include an internal memory, an external memory, or other configurations of memory. Database 225 may be capable of storing large amounts of data, including relational data. The relational data may be stored in tables. In some embodiments, a plurality of clients, such as example client 205, may communicate with DBS 210 via a communication link (e.g., a network) and specified application programming interfaces (APIs). In some embodiments, the API language provided by DBS 210 is SQL, the Structured Query Language. Client 205 may communicate with DBS 210 using SQL to, for example, create and delete tables; insert, update, and delete data; and query data.
  • In general, a user may submit a query from client 205 in the form of a SQL query statement to DBS 210. DBMS 215 may execute the query by evaluating the parameters of the query statement and accessing database 225 as needed to produce a result 230. The result 230 may be provided to client 205 for storage and/or presentation to the user.
  • One type of query is an aggregation query. As will be explained in greater detail below, a parallel aggregation algorithm, process, or operation may be used to compute SQL aggregates. In general, with reference to FIG. 2, some embodiments herein may involve client 205 wanting to group or aggregate data of a table stored in database 225 (e.g., a user at client 205 may desire to know the average salaries of the employees in all of a company's departments). Client 205 may connect to DBS 210 and issue a SQL query statement that describes and specifies the desired aggregation. DBMS 215 may create an executable instance of the parallel aggregation algorithm herein, provide it with the information needed to run the parallel aggregation algorithm (e.g., the name of a table to access, the columns to group by, the columns to aggregate, the aggregation function, etc.), and run the parallel aggregation operation or algorithm. In the process of running, the parallel aggregation algorithm herein may create an index hash map 220. The index hash map may be used to keep track of intermediate result data. An overall result comprising a result table may be computed based on the index hash map(s) containing the intermediate results. The overall parallel aggregation result may be transmitted to client 205.
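  • By way of illustration only, the following C++ sketch shows the kind of information that might be handed to an executable instance of the parallel aggregation operation described above. The names (AggregationSpec, AggFunc) and the example query are assumptions made for the sketch and are not taken from any particular implementation described herein.

    #include <string>
    #include <vector>

    enum class AggFunc { Sum, Count, Min, Max, Avg };

    // Hypothetical parameter block for one run of the parallel aggregation
    // operation: the table to access, the columns to group by, the column to
    // aggregate, and the aggregation function.
    struct AggregationSpec {
        std::string table;
        std::vector<std::string> groupBy;
        std::string aggregateColumn;
        AggFunc function;
    };

    int main() {
        // Roughly corresponds to the example query:
        //   SELECT department, AVG(salary) FROM employees GROUP BY department
        AggregationSpec spec{"employees", {"department"}, "salary", AggFunc::Avg};
        (void)spec;  // an operator instance would be configured and run with this
    }
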
  • As an extension of FIG. 2, DWHs may be built on top of DBSs. Thus, a use-case of a DWH may be similar in some respects to DBS 210 of FIG. 2.
  • The computation environment of FIG. 2 may include a plurality of processors that can operate concurrently, in parallel and include a device or system similar to that described in FIG. 1. Additionally, the computation environment of FIG. 2 may have a memory that is shared amongst the plurality of processors, for example, like the system of FIG. 1. In order to fully capitalize on the parallel processing power of such a computation environment, the data structures used by the system may be designed, developed or adapted for being efficiently used in the parallel computing environment.
  • A hash table is a fundamental data structure in computer science that is used for mapping “keys” (e.g., the names of people) to the associated values of the keys (e.g., the phone number of the people) for fast data look-up. A conventional hash table stores key—value pairs. Conventional hash tables are designed for sequential processing.
  • However, for parallel computation environments there exists a need for data structures particularly suitable for use in the parallel computing environment. In some embodiments herein, the data structure of an index hash map is provided. In some aspects, the index hash map provides a lock-free, cache-efficient hash data structure developed for parallel computation environments with shared memory. In some embodiments, the index hash map may be adapted to column stores.
  • In a departure from conventional hash tables that store key—value pairs, the index hash map herein does not store key—value pairs. The index hash map herein generates key—index pairs by mapping each distinct key to a unique integer. In some embodiments, each time a new distinct key is inserted in the index hash map, the index hash map increments an internal counter and assigns the value of the counter to the key to produce a key—index pair. The counter may provide, at any time, the cardinality of an input set of keys that have thus far been inserted in the hash map. In some respects, the key—index mapping may be used to share a single hash map among different columns (or value arrays). For example, for processing a plurality of values distributed among different columns, the associated index for the key has to be calculated just once. The use of key—index pairs may facilitate bulk insertion in columnar storages. Inserting a set of key—index pairs may entail inserting the keys in a hash map to obtain a mapping vector containing indexes. This mapping vector may be used to build a value array per value column.
  • Referring to FIGS. 3A-3D, input data is illustrated in FIG. 3A, including a key array 305. For each distinct key 315 from key array 305, the index hash map returns an index 320 (i.e., a unique integer), as seen in FIG. 3B. When all of the keys, from a column for example, have been inserted in the hash map, the mapping vector of FIG. 3C results. The entries in the mapping of FIG. 3C are the indexes that point to a value array “A” 330 illustrated in FIG. 3D. The mapping of FIG. 3C may be used to aggregate the “Kf” columns 310 shown in FIG. 3A. The result of the aggregation of column 310 is depicted in FIG. 3D at 335.
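  • The following C++ sketch illustrates, under the assumptions of FIGS. 3A-3D, an index hash map that assigns each distinct key the next value of an internal counter and returns the resulting mapping vector, which is then used to aggregate a value column into a value array. The class and variable names are illustrative only and are not taken from the embodiments.

    #include <cstddef>
    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <unordered_map>
    #include <vector>

    // Minimal index hash map: maps each distinct key to a unique integer
    // drawn from an internal counter (key-index pairs, not key-value pairs).
    class IndexHashMap {
    public:
        std::uint32_t insert(const std::string& key) {
            auto it = map_.find(key);
            if (it != map_.end()) return it->second;     // key already known
            std::uint32_t idx = counter_++;              // new distinct key
            map_.emplace(key, idx);
            return idx;
        }
        // The counter equals the number of distinct keys inserted so far.
        std::uint32_t cardinality() const { return counter_; }
    private:
        std::unordered_map<std::string, std::uint32_t> map_;
        std::uint32_t counter_ = 0;
    };

    int main() {
        std::vector<std::string> keys = {"a", "b", "a", "c", "b"};  // key array
        std::vector<double>      kf   = {1.0, 2.0, 3.0, 4.0, 5.0};  // value column

        IndexHashMap ihm;
        std::vector<std::uint32_t> mapping;                // mapping vector
        for (const auto& k : keys) mapping.push_back(ihm.insert(k));

        // Build one value array per value column using the mapping vector
        // (here, a SUM aggregation of the value column).
        std::vector<double> values(ihm.cardinality(), 0.0);
        for (std::size_t i = 0; i < kf.size(); ++i) values[mapping[i]] += kf[i];

        for (std::size_t g = 0; g < values.size(); ++g)
            std::cout << "index " << g << ": " << values[g] << "\n";
    }
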
  • To achieve maximum parallel processor utilization, the index hash maps herein may be designed to avoid locking when being operated on by concurrently executing threads by producing wide data independence. In some embodiments, index hash maps herein may be described by a framework defining a two-step process. In a first step, input data is split or separated into equal-sized blocks and the blocks are assigned to worker execution threads. These worker execution threads may produce intermediate results by building relatively small local hash tables or hash maps. Each local hash map is private to the thread that produced it. Accordingly, other threads may not see or access the local hash map produced by a given thread.
  • In a second step, the local hash maps including the intermediate results may be merged to obtain a global result by concurrently executing merger threads. When accessing and processing the local hash maps, each of the merger threads may only consider a dedicated range of hash values. The merger threads may process hash-disjoint partitions of the local hash maps and produce disjoint result hash tables that may be concatenated to build an overall result.
  • FIG. 4 is a flow diagram related to a data structure framework 400, in accordance with some embodiments herein. At S405, an input data table is separated or divided into a plurality of partitions. The size of the partitions may relate to or even be the size of a memory unit such as, for example, a cache associated with parallel processing units. In some embodiments, the partitions are equal in size. Furthermore, a first plurality of execution threads running in parallel may each generate a local hash table or hash map. Each local hash map is private to the thread that generated it.
  • The second step of the data structure framework herein is depicted in FIG. 4 at S410. At S410, the local hash maps are merged. The merging of the local hash maps produces a set of disjoint result hash tables or hash maps.
  • In some embodiments, when accessing and processing the local hash maps, each of the merger threads may only consider a dedicated range of hash values. From a logical perspective, the local hash maps may be considered as being partitioned by their hash value. One implementation may use, for example, some first bits of the hash value to form a range of hash values. The same ranges are used for all local hash maps, thus the “partitions” of the local hash maps are disjunctive. As an example, if a value “a” is in range 5 of a local hash map, then the value will be in the same range of other local hash maps. In this manner, all identical values of all local hash maps may be merged into a single result hash map. Since the “partitions” are disjunctive, the merged result hash maps may be created without a need for locks. Additionally, further processing on the merged result hash maps may be performed without locks since any execution threads will be operating on disjunctive data.
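  • As an illustration of the range assignment described above, the sketch below derives a merger thread's range from the first (most significant) bits of a key's hash value. Because all local hash maps use the same hash function and the same ranges, a given key falls into the same range in every local map, so each merger thread can merge its range without locks. The function name and the use of std::hash are assumptions of the sketch.

    #include <cstddef>
    #include <functional>
    #include <iostream>
    #include <string>

    // Returns the index of the merger thread responsible for 'key', taken from
    // the most significant bits of the hash value. numRanges is assumed to be
    // a power of two.
    std::size_t rangeOf(const std::string& key, std::size_t numRanges) {
        std::size_t h = std::hash<std::string>{}(key);
        unsigned rangeBits = 0;
        while ((std::size_t{1} << rangeBits) < numRanges) ++rangeBits;
        if (rangeBits == 0) return 0;
        return h >> (sizeof(std::size_t) * 8 - rangeBits);
    }

    int main() {
        // The same key is assigned to the same range regardless of which local
        // hash map it was inserted into.
        for (const std::string key : {"apple", "banana", "cherry"})
            std::cout << key << " -> range " << rangeOf(key, 4) << "\n";
    }
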
  • In some embodiments, the local (index) hash maps providing the intermediate results may be of a fixed size. Instead of resizing a local hash map, the corresponding worker execution thread may replace its local hash map with a new hash map when a certain load factor is reached and place the current local hash map into a buffer containing hash maps that are ready to be merged. In some embodiments, the size of the local hash maps may be sized such that the local hash maps fit in a cache (e.g., L2 or L3). The specific size of the cache may depend on the sizes of caches in a given CPU architecture.
  • In some aspects, insertions and lookups of keys may largely take place in cache. In some embodiments, over-crowded areas within a local hash map may be avoided by maintaining statistical data regarding the local hash maps. The statistical data may indicate when the local hash map should be declared full (independent of an actual load factor). In some aspects and embodiments, the size of a buffer of a computing system and environment holding local hash maps ready to be merged is a tuning parameter, wherein a smaller buffer may induce more merge operations while a larger buffer will necessarily require more memory.
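  • The sketch below illustrates one way a worker execution thread might spill a fixed-size local hash map into a buffer of maps that are ready to be merged, as described above. For brevity, the buffer is guarded by a mutex and std::unordered_map stands in for a cache-sized local hash map; both are simplifying assumptions rather than features of the embodiments.

    #include <cstddef>
    #include <iostream>
    #include <memory>
    #include <mutex>
    #include <string>
    #include <unordered_map>
    #include <vector>

    using LocalMap = std::unordered_map<std::string, double>;

    // Shared buffer of local hash maps that are ready to be merged.
    struct MergeBuffer {
        std::mutex mtx;
        std::vector<std::unique_ptr<LocalMap>> ready;
    };

    class Worker {
    public:
        Worker(MergeBuffer& buf, std::size_t capacity, double maxLoad)
            : buf_(buf), capacity_(capacity), maxLoad_(maxLoad),
              local_(std::make_unique<LocalMap>()) {}

        void add(const std::string& key, double value) {
            (*local_)[key] += value;
            // Instead of resizing, hand the full map to the buffer and start
            // a fresh one once the load-factor threshold is reached.
            if (local_->size() >= static_cast<std::size_t>(capacity_ * maxLoad_)) {
                std::lock_guard<std::mutex> g(buf_.mtx);
                buf_.ready.push_back(std::move(local_));
                local_ = std::make_unique<LocalMap>();
            }
        }
    private:
        MergeBuffer& buf_;
        std::size_t capacity_;
        double maxLoad_;
        std::unique_ptr<LocalMap> local_;
    };

    int main() {
        MergeBuffer buffer;
        Worker worker(buffer, /*capacity=*/4, /*maxLoad=*/0.75);
        for (const char* k : {"a", "b", "c", "a", "d", "e"}) worker.add(k, 1.0);
        std::lock_guard<std::mutex> g(buffer.mtx);
        std::cout << buffer.ready.size() << " local map(s) ready to merge\n";
    }
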
  • In some embodiments, a global result may be organized into bucketed index hash maps where each result hash map includes multiple fixed-size physical memory blocks. In this configuration, cache-efficient merging may be realized, as well as memory allocation being more efficient and sustainable since allocated blocks may be shared between queries. In some aspects, when a certain load factor within a global result hash map is reached during a merge operation, the hash map may be resized. Resizing a hash map may be accomplished by increasing its number of memory blocks. Resizing of a bucketed index hash map may entail knowing which entries are to be repositioned. In some embodiments, the map's hash function may be chosen such that its codomain increases by adding further least significant bits as needed during a resize operation. In an effort to avoid too many resize operations, an estimate of a final target size may be determined before an actual resizing of the hash map.
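  • A simplified sketch of a bucketed result hash map is shown below, under the assumption that the table is built from fixed-size blocks whose count doubles on resize, so that additional least significant bits of the hash value take part in addressing. Open addressing with linear probing, the block size, and the load factor are assumptions of the sketch, not details of the embodiments; a real implementation might also estimate a final target size once rather than doubling repeatedly.

    #include <cstddef>
    #include <functional>
    #include <iostream>
    #include <string>
    #include <vector>

    constexpr std::size_t kBlockSize = 1024;  // slots per fixed-size memory block

    class BucketedMap {
    public:
        // 'blocks' is assumed to be a power of two so that low-bit masking works.
        explicit BucketedMap(std::size_t blocks) { table_.resize(blocks * kBlockSize); }

        void insertOrAdd(const std::string& key, double value) {
            // The insert count is a rough stand-in for the load factor; a real
            // implementation would track distinct entries.
            if (++inserts_ > table_.size() * kMaxLoad) grow();
            put(key, value);
        }

    private:
        struct Slot { std::string key; double value = 0; bool used = false; };

        // With more blocks, more low-order bits of the hash take part in
        // addressing (the codomain of the hash function widens).
        std::size_t slotOf(const std::string& key) const {
            return std::hash<std::string>{}(key) & (table_.size() - 1);
        }

        void put(const std::string& key, double value) {
            std::size_t i = slotOf(key);
            while (table_[i].used && table_[i].key != key)
                i = (i + 1) & (table_.size() - 1);       // linear probing
            if (!table_[i].used) { table_[i].key = key; table_[i].used = true; }
            table_[i].value += value;
        }

        // Resizing doubles the number of blocks; every occupied entry must be
        // repositioned under the widened addressing.
        void grow() {
            std::vector<Slot> old;
            old.swap(table_);
            table_.resize(old.size() * 2);
            for (auto& s : old)
                if (s.used) put(s.key, s.value);
        }

        std::vector<Slot> table_;
        std::size_t inserts_ = 0;
        static constexpr double kMaxLoad = 0.7;
    };

    int main() {
        BucketedMap result(/*blocks=*/1);
        result.insertOrAdd("DE", 10.0);
        result.insertOrAdd("US", 20.0);
        result.insertOrAdd("DE", 5.0);
        std::cout << "merged three rows into the bucketed result map\n";
    }
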
  • In some embodiments, the index hash map framework discussed above may provide an infrastructure to implement parallelized query processing algorithms or operations. One embodiment of a parallelized query processing algorithm includes a hash-based aggregation, as will be discussed in greater detail below.
  • In some embodiments, a parallelized aggregation refers to a relational aggregation that groups and condenses relational data stored in tables. An example of a table that may form an input of a parallel aggregation operation herein is depicted in FIG. 5A. Table 500 includes sales data. The sales data is organized in three columns—a Product column 505, a Country column 510, and a Revenue column 515. Table 500 may be grouped and aggregated by various combinations of columns—for example, by Product and Country, by Product, or by Country. In the following discussion, the columns by which an aggregation groups the data are referred to as group columns.
  • Aggregation result tables determined by the different groupings are illustrated in FIGS. 5B-5D. Each of the result tables 520, 540, 555, and 570 contains the distinct values (groups) of the desired group columns and, per group, the aggregated value. For example, table 520 includes the results based on grouping by Product and Country. Columns 525 and 530 include the distinct Product and Country values (i.e., groups) of the desired Product and Country columns (FIG. 5A, columns 505 and 510), and the aggregated value for each distinct Product and Country group is included in column 535. Furthermore, table 540 includes the results based on grouping by Product. Column 545 includes the distinct Product values (i.e., groups) of the desired Product column (FIG. 5A, column 505) and the aggregated value for each distinct Product group. Table 555 includes the results based on grouping by Country, where columns 560 and 565 include the distinct Country values (i.e., groups) of the desired Country column (FIG. 5A, column 510) and the aggregated value for each distinct Country group.
  • In some embodiments, such as the examples of FIGS. 5A-5D, a summation function SUM is used to aggregate values. However, other aggregation functions such as, for example and not as a limitation, a COUNT, a MIN, a MAX, and an AVG aggregation function may be used. The column containing the aggregates may be referred to herein as the aggregate column. Thus, the aggregate columns in FIGS. 5A-5E are columns 535, 550, 560, and 575, respectively.
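  • As a brief illustration of the aggregation functions named above, the sketch below carries a per-group accumulator from which SUM, COUNT, MIN, MAX, and AVG can all be produced; AVG is derived from the running sum and count. The struct is an assumption of the sketch, not a required representation.

    #include <algorithm>
    #include <cstdint>
    #include <iostream>
    #include <limits>

    // One accumulator per group; any of the aggregation functions can be read
    // from it after all values of the group have been added.
    struct Accumulator {
        double sum = 0.0;
        std::uint64_t count = 0;
        double min = std::numeric_limits<double>::max();
        double max = std::numeric_limits<double>::lowest();

        void add(double v) {
            sum += v;
            ++count;
            min = std::min(min, v);
            max = std::max(max, v);
        }
        double avg() const { return count ? sum / count : 0.0; }
    };

    int main() {
        Accumulator revenue;                       // e.g., one group of FIG. 5A
        for (double v : {10.0, 20.0, 5.0}) revenue.add(v);
        std::cout << "SUM=" << revenue.sum << " AVG=" << revenue.avg() << "\n";
    }
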
  • In an effort to fully utilize the resources of parallel computing environments with shared memory, an aggregation operation should be computed and determined in parallel. In an instance the aggregation is not computed in parallel, the processing performance for the aggregation would be bound by the speed of a single processing unit instead of being realized by the multiple processing units available in the parallel computing environment.
  • FIG. 6 is an illustrative depiction of a parallel aggregation flow, according to some embodiments herein. In some aspects, the parallel aggregation flow 600 uses the index hash table framework discussed hereinabove. In the example of FIG. 6, two degrees of parallelism are depicted and are achieved by the concurrent execution of two execution threads. However, the concepts conveyed by FIG. 6 may be extended to additional degrees of parallelism, including computation environments now known and those that become known in the future. In FIG. 6, input table 605 is separated into a plurality of partitions. Input table 605 is shown divided into partitions 610 and 615. All of or a portion of table 605 may be split into partitions for parallel aggregation. Portions of table 605 not initially partitioned and processed by a parallel aggregation operation may subsequently be partitioned for parallel aggregation processing. Table 605 may be partitioned into equal-sized partitions. Partitions 610 and 615 are but two example partitions, and additional partitions may exist and be processed in the parallel aggregation operations herein.
  • In some embodiments, a first plurality of execution threads, aggregator threads, are initially running and a second plurality of execution threads are not initially running or are in a sleep state. The concurrently operating aggregator threads operate to fetch an exclusive part of table 605. Partition 610 is fetched by aggregator thread 620 and partition 615 is fetched by aggregator thread 625.
  • Each of the aggregator threads may read their partition and aggregate the values of each partition into a private local hash table or hash map. Aggregator thread 620 produces private hash map 630 and aggregator thread 625 produces local hash map 635. Since each aggregator thread processes its own separate portion of input table 605, and has its private hash map, the parallel processing of the partitions may be accomplished lock-free.
  • In some embodiments, the local hash tables may be the same size as the cache associated with the processing unit executing an aggregator thread. Sizing the local hash tables in this manner may function to avoid cache misses. In some aspects, input data may be read from table 605 to aggregate and written to the local hash tables row-wise or column-wise.
  • When a partition is consumed by an aggregator thread, the aggregator thread may fetch another, unprocessed partition of input table 605. In some embodiments, the aggregator threads move their associated local hash maps into a buffer 640 when the local hash table reaches a threshold size, initiate a new local hash table, and proceed.
  • In some embodiments, when the number of hash tables in buffer 640 reaches a threshold size, the aggregator threads may wake up a second plurality of execution threads, referred to in the present example as merger threads, and the aggregator threads may enter a sleep state. In some embodiments, the local hash maps may be retained in buffer 640 until the entire input table 605 is consumed by the aggregator threads 620 and 625. When the entire input table 605 is consumed by the aggregator threads 620 and 625, the second plurality of execution threads, the merger threads, are awakened and the aggregator threads enter a sleep state.
  • Each of the merger threads is responsible for a certain partition of all of the private hash maps in buffer 640. The particular data partition each merger thread is responsible for may be determined by assigning distinct, designated key values of the local hash maps to each of the merger threads. That is, the portion of the data for which each merger thread is responsible may be determined by “key splitting” in the local hash maps. As illustrated in FIG. 6, merger thread 1 is responsible for designated keys 665 and merger thread 2 is responsible for keys 670. Each of the merger threads 1 and 2 operates to iterate over all of the private hash maps in buffer 640, read its respective data partition as determined by the key splitting, and merge that data partition into a thread-local part hash table (or part hash map).
  • As further illustrated in FIG. 6, merger thread 1 (662) and merger thread 2 (664) each consider all of the private hash maps in buffer 640 based on the key-based partitions they are each responsible for and produce, respectively, part hash map 1 (675) and part hash map 2 (680).
  • In some embodiments, in the instance a merger thread has processed its data partition and there are additional data partitions in need of being processed, the executing merger threads may acquire responsibility for a new data partition and proceed to process the new data partition as discussed above. In the instance all data partitions are processed, the merger threads may enter a sleep state and the aggregator threads may return to an active, running state. Upon returning to the active, running state, the processes discussed above may repeat.
  • In the instance there is no more data to be processed by the aggregator threads and the merger threads, the parallel aggregation operation herein may terminate. The results of the aggregation process will be contained in the set of part hash maps (e.g., 675 and 680). In some respects, the part hash maps may be seen as forming a parallel result since the part hash maps are disjoint.
  • In some embodiments, the part hash maps may be processed in parallel. As an example, a HAVING clause may be evaluated and applied to every group, or parallel sorting and merging may be performed on the part hash maps (see the third sketch following this description).
  • An overall result may be obtained from the disjoint part hash maps by concatenating them together, as depicted in FIG. 6 at 685.
  • FIG. 7 is an illustrative example of a flow diagram 700 relating to some parallel aggregation embodiments herein. At S705, exclusive partitions of an input data table are received or retrieved for aggregating in parallel. At S710 the values of each of the exclusive partitions are aggregated. In some embodiments, the values of each partition are aggregated into a local hash map by one of a plurality of concurrently running execution threads.
  • At S715 a determination is made whether the aggregating of the partitions of the input table is complete or whether the buffer is full. In the instance additional partitions remain to be aggregated and buffer 640 is not full, whether at the end of aggregating a current partition or based on other considerations, process 700 returns to further aggregate partitions of the input data and store the aggregated values as key—index pairs in local hash tables. In the instance the aggregating of the partitions is complete or the buffer is full, process 700 proceeds to assign designated parts of the local hash tables or hash maps to a second plurality of execution threads at S720. The second plurality of execution threads work to merge the designated parts of the local hash maps into thread-local part hash maps at S725 and to produce result tables.
  • At S730, a determination is made whether the aggregating is complete. In the instance the aggregating is not complete, process 700 returns to further aggregate partitions of the input data. In the instance aggregating is complete, process 700 proceeds to S735.
  • At S735, process 700 operates to generate a global result by assembling the results obtained at S725 into a composite result table. In some embodiments, the overall result may be produced by concatenating the part hash maps of S725 to each other.
  • Each system described herein may be implemented by any number of devices in communication via any number of public and/or private networks. Two or more of the devices herein may be co-located, may be a single device, or may be located remote from one another and may communicate with one another via any known manner of network(s) and/or a dedicated connection. Moreover, each device may comprise any number of hardware and/or software elements suitable to provide the functions described herein as well as any other functions. Other topologies may be used in conjunction with other embodiments.
  • All systems and processes discussed herein may be embodied in program code stored on one or more computer-readable media. Such media may include, for example, a floppy disk, a CD-ROM, a DVD-ROM, magnetic tape, and solid-state Random Access Memory (RAM) or Read Only Memory (ROM) storage units. According to some embodiments, a memory storage unit may be associated with access patterns and may be independent of the device type (e.g., magnetic, optoelectronic, semiconductor/solid-state, etc.). Moreover, in-memory technologies may be used such that databases, etc., may be operated completely in RAM at a processor. Embodiments are therefore not limited to any specific combination of hardware and software.
  • Embodiments have been described herein solely for the purpose of illustration. Persons skilled in the art will recognize from this description that embodiments are not limited to those described, but may be practiced with modifications and alterations limited only by the spirit and scope of the appended claims.
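The three short sketches below are editorial illustrations, in Java, of the phases described above; they are simplified assumptions rather than the embodiments themselves, and all class, method, and constant names (for example LocalAggregationSketch, Row, SPILL_THRESHOLD, and the choice of a SUM aggregate) are hypothetical. The first sketch shows the local aggregation phase: aggregator threads fetch exclusive partitions from a shared work queue, aggregate lock-free into thread-private hash maps, and move a map into a shared buffer once it reaches a threshold size.

```java
import java.util.*;
import java.util.concurrent.*;

// A minimal sketch (not the patented implementation) of the local aggregation
// phase: each aggregator thread pulls exclusive partitions of the input,
// sums values into a thread-private hash map without locking, and moves the
// map into a shared buffer once it reaches a threshold size.
public class LocalAggregationSketch {

    static final int SPILL_THRESHOLD = 4;             // illustrative threshold

    record Row(String key, long value) {}             // assumed (key, value) row layout

    public static void main(String[] args) throws Exception {
        // Input "table" already split into disjoint partitions.
        List<List<Row>> partitions = List.of(
                List.of(new Row("a", 1), new Row("b", 2), new Row("a", 3)),
                List.of(new Row("b", 4), new Row("c", 5), new Row("a", 6)));

        // Work queue handing out exclusive partitions to aggregator threads.
        Queue<List<Row>> work = new ConcurrentLinkedQueue<>(partitions);
        // Shared buffer receiving spilled private hash maps.
        Queue<Map<String, Long>> buffer = new ConcurrentLinkedQueue<>();

        int aggregators = 2;
        ExecutorService pool = Executors.newFixedThreadPool(aggregators);
        for (int i = 0; i < aggregators; i++) {
            pool.submit(() -> {
                Map<String, Long> local = new HashMap<>();     // private, no locks needed
                List<Row> partition;
                while ((partition = work.poll()) != null) {    // fetch next unprocessed partition
                    for (Row r : partition) {
                        local.merge(r.key(), r.value(), Long::sum);   // SUM aggregate
                        if (local.size() >= SPILL_THRESHOLD) {
                            buffer.add(local);                 // move the full map into the buffer
                            local = new HashMap<>();           // initialize a new local hash map
                        }
                    }
                }
                if (!local.isEmpty()) buffer.add(local);       // spill the remainder
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("buffered local hash maps: " + buffer);
    }
}
```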
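The second sketch illustrates one plausible form of the key splitting and merge phase: each merger thread owns a disjoint slice of the key space (here, keys hashing to its slot number), iterates over every buffered local hash map, and merges only its own keys into a thread-local part hash map. Names such as KeySplitMergeSketch and slotOf are assumptions for illustration.

```java
import java.util.*;
import java.util.concurrent.*;

// A minimal sketch of the merge phase: merger thread i is responsible for all
// keys whose hash maps to slot i, visits every buffered local map, and merges
// matching entries into its own part hash map. Because the key ranges are
// disjoint, the part maps need no synchronization.
public class KeySplitMergeSketch {

    static int slotOf(String key, int mergers) {
        return Math.floorMod(key.hashCode(), mergers);   // illustrative "key splitting" by hash
    }

    public static void main(String[] args) throws Exception {
        // Buffered private hash maps produced by the aggregator threads.
        List<Map<String, Long>> buffer = List.of(
                Map.of("a", 4L, "b", 2L),
                Map.of("b", 4L, "c", 5L, "a", 6L));

        int mergers = 2;
        List<Map<String, Long>> partMaps = new ArrayList<>();
        for (int i = 0; i < mergers; i++) partMaps.add(new HashMap<>());

        ExecutorService pool = Executors.newFixedThreadPool(mergers);
        List<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < mergers; i++) {
            final int slot = i;
            futures.add(pool.submit(() -> {
                Map<String, Long> part = partMaps.get(slot);   // thread-local part hash map
                for (Map<String, Long> local : buffer) {       // visit every buffered local map
                    for (Map.Entry<String, Long> e : local.entrySet()) {
                        if (slotOf(e.getKey(), mergers) == slot) {
                            part.merge(e.getKey(), e.getValue(), Long::sum);
                        }
                    }
                }
            }));
        }
        for (Future<?> f : futures) f.get();                   // wait for all merger threads
        pool.shutdown();

        // Concatenating the disjoint part maps yields the overall result.
        Map<String, Long> result = new HashMap<>();
        partMaps.forEach(result::putAll);
        System.out.println(result);   // e.g. {a=10, b=6, c=5}; HashMap order is unspecified
    }
}
```

Because every key is owned by exactly one merger thread, the resulting part maps are disjoint and concatenating them (as at 685 in FIG. 6) cannot produce duplicate groups.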
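The third sketch illustrates parallel post-processing of the disjoint part hash maps, using a HAVING-style filter (here, groups whose sum exceeds 5) as the example; the predicate and the use of Java parallel streams are illustrative choices, not features of the described embodiments.

```java
import java.util.*;
import java.util.stream.*;

// A minimal sketch of post-processing the disjoint part hash maps in parallel:
// a HAVING-style predicate (SUM > 5) is applied to each group independently,
// and the filtered parts are then concatenated into one overall result.
public class HavingOnPartMapsSketch {
    public static void main(String[] args) {
        List<Map<String, Long>> partMaps = List.of(
                Map.of("a", 10L, "c", 5L),
                Map.of("b", 6L));

        Map<String, Long> result = partMaps.parallelStream()    // part maps processed in parallel
                .flatMap(part -> part.entrySet().stream())
                .filter(e -> e.getValue() > 5)                  // HAVING SUM(value) > 5
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

        System.out.println(result);   // e.g. {a=10, b=6}; map order is unspecified
    }
}
```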

Claims (23)

1. A computer implemented method, comprising:
separating an input table into a plurality of partitions;
generating, by each of a first plurality of execution threads operating concurrently, a local hash table for each of the partitions, each local hash table storing key—index pairs; and
merging the local hash tables, by a second plurality of execution threads operating concurrently, to produce a set of disjoint result hash tables.
2. The method of claim 1, wherein each distinct key is mapped to a unique integer.
3. The method of claim 1, wherein the local hash table generated by each of the first plurality of execution threads is private to the execution thread that generated it and is independent of other local hash tables.
4. The method of claim 1, wherein each of the second plurality of execution threads processes a dedicated range of hash values of all of the local hash tables.
5. The method of claim 1, wherein the local hash tables are of a fixed size.
6. The method of claim 1, further comprising concatenating the set of disjoint result hash tables to obtain an overall result.
7. A computer implemented method, comprising:
retrieving, by concurrently executing a first plurality of execution threads, disjoint partitions of an input table;
aggregating, by each of the first plurality of execution threads, values of each partition into a respective local hash table, each local hash table storing key—index pairs; and
merging the local hash tables, by a second plurality of execution threads operating concurrently, to produce a set of disjoint result hash tables, each of the second plurality of execution threads responsible for a dedicated range of hash values of all of the local hash tables.
8. The method of claim 7, wherein each distinct key is mapped to a unique integer.
9. The method of claim 7, wherein the local hash table generated by each of the first plurality of execution threads is private to the execution thread that generated it and is independent of other local hash tables.
10. The method of claim 7, wherein the plurality of first execution threads further:
retrieves another partition of the input table when a previously retrieved partition is consumed by the plurality of first execution threads; and
moves the local hash tables to a buffer when a hash table becomes a threshold size, initializes a new local hash table, and proceeds to retrieve another partition of the input table.
11. The method of claim 7, further comprising concatenating the set of disjoint result hash tables to obtain an overall result.
12. The method of claim 1, wherein the second plurality of execution threads are further responsible for another dedicated range of hash values of all of the local hash tables in an instance the second plurality of execution threads have processed all of the local hash tables and other ranges of hash values remain unprocessed.
13. A system, comprising:
a plurality of processing units;
a shared memory accessible by all of the plurality of processing units;
a database to store an input table; and
a query engine to execute a query comprising:
separating the input table into a plurality of partitions;
generating, by each of a first plurality of execution threads executing concurrently by the plurality of processing units, a local hash table for each of the partitions, each local hash table storing key—index pairs; and
merging the local hash tables, by a second plurality of execution threads executing concurrently by the plurality of processing units, to produce a set of disjoint result hash tables.
14. The system of claim 13, wherein each distinct key is mapped to a unique integer.
15. The system of claim 13, wherein the local hash table generated by each of the first plurality of execution threads is private to the execution thread that generated it and is independent of other local hash tables.
16. The system of claim 13, wherein each of the second plurality of execution threads processes a dedicated range of hash values of all of the local hash tables.
17. The system of claim 13, wherein the local hash tables are of a fixed size.
18. The system of claim 13, further comprising concatenating the set of disjoint result hash tables to obtain an overall result.
19. A system, comprising:
a plurality of processing units;
a shared memory accessible by all of the plurality of processing units;
a database to store an input table; and
a query engine to execute an aggregation query comprising:
retrieving, by concurrently executing a first plurality of execution threads by the plurality of processing units, disjoint partitions of an input table;
aggregating, by each of the first plurality of execution threads, values of each partition into a respective local hash table, each local hash table storing key—index pairs; and
merging the local hash tables, by a second plurality of execution threads executing concurrently by the plurality of processing units, to produce a set of disjoint result hash tables, each of the second plurality of execution threads responsible for a dedicated range of hash values of all of the local hash tables.
20. The system of claim 19, wherein each distinct key is mapped to a unique integer.
21. The system of claim 19, wherein the local hash table generated by each of the first plurality of execution threads is private to the execution thread that generated it and is independent of other local hash tables.
22. The system of claim 19, wherein the plurality of first execution threads further:
retrieve another partition of the input table when a previously retrieved partition is consumed by the plurality of first execution threads;
move the local hash tables to a buffer when a hash table becomes a threshold size, initialize a new local hash table, and proceed to retrieve another partition of the input table.
23. The system of claim 19, further comprising concatenating the set of disjoint result hash tables to obtain an overall result.
US12/978,194 2010-07-12 2010-12-23 Aggregation in parallel computation environments with shared memory Abandoned US20120011144A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US12/978,194 US20120011144A1 (en) 2010-07-12 2010-12-23 Aggregation in parallel computation environments with shared memory
EP11004931.9A EP2469423B1 (en) 2010-12-23 2011-06-16 Aggregation in parallel computation environments with shared memory
US15/016,978 US10127281B2 (en) 2010-12-23 2016-02-05 Dynamic hash table size estimation during database aggregation processing
US15/040,501 US10114866B2 (en) 2010-12-23 2016-02-10 Memory-constrained aggregation using intra-operator pipelining

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US36330410P 2010-07-12 2010-07-12
US12/978,194 US20120011144A1 (en) 2010-07-12 2010-12-23 Aggregation in parallel computation environments with shared memory

Publications (1)

Publication Number Publication Date
US20120011144A1 true US20120011144A1 (en) 2012-01-12

Family

ID=45439313

Family Applications (4)

Application Number Title Priority Date Filing Date
US12/978,044 Active 2031-01-06 US8370316B2 (en) 2010-07-12 2010-12-23 Hash-join in parallel computation environments
US12/978,194 Abandoned US20120011144A1 (en) 2010-07-12 2010-12-23 Aggregation in parallel computation environments with shared memory
US12/982,767 Active 2032-04-28 US9223829B2 (en) 2010-07-12 2010-12-30 Interdistinct operator
US13/742,034 Active 2032-01-25 US9177025B2 (en) 2010-07-12 2013-01-15 Hash-join in parallel computation environments

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US12/978,044 Active 2031-01-06 US8370316B2 (en) 2010-07-12 2010-12-23 Hash-join in parallel computation environments

Family Applications After (2)

Application Number Title Priority Date Filing Date
US12/982,767 Active 2032-04-28 US9223829B2 (en) 2010-07-12 2010-12-30 Interdistinct operator
US13/742,034 Active 2032-01-25 US9177025B2 (en) 2010-07-12 2013-01-15 Hash-join in parallel computation environments

Country Status (1)

Country Link
US (4) US8370316B2 (en)

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110264687A1 (en) * 2010-04-23 2011-10-27 Red Hat, Inc. Concurrent linked hashed maps
US20120166447A1 (en) * 2010-12-28 2012-06-28 Microsoft Corporation Filtering queried data on data stores
US20120254252A1 (en) * 2011-03-31 2012-10-04 International Business Machines Corporation Input/output efficiency for online analysis processing in a relational database
US20130013824A1 (en) * 2011-07-08 2013-01-10 Goetz Graefe Parallel aggregation system
WO2015077951A1 (en) * 2013-11-28 2015-06-04 Intel Corporation Techniques for block-based indexing
US9195599B2 (en) 2013-06-25 2015-11-24 Globalfoundries Inc. Multi-level aggregation techniques for memory hierarchies
US9213732B2 (en) 2012-12-28 2015-12-15 Sap Ag Hash table and radix sort based aggregation
US9292560B2 (en) 2013-01-30 2016-03-22 International Business Machines Corporation Reducing collisions within a hash table
US9311359B2 (en) 2013-01-30 2016-04-12 International Business Machines Corporation Join operation partitioning
US9317517B2 (en) 2013-06-14 2016-04-19 International Business Machines Corporation Hashing scheme using compact array tables
US9378264B2 (en) 2013-06-18 2016-06-28 Sap Se Removing group-by characteristics in formula exception aggregation
US9405858B2 (en) 2013-06-14 2016-08-02 International Business Machines Corporation On-the-fly encoding method for efficient grouping and aggregation
US9411853B1 (en) 2012-08-03 2016-08-09 Healthstudio, LLC In-memory aggregation system and method of multidimensional data processing for enhancing speed and scalability
US20160350394A1 (en) * 2015-05-29 2016-12-01 Sap Se Aggregating database entries by hashing
US9519583B1 (en) * 2015-12-09 2016-12-13 International Business Machines Corporation Dedicated memory structure holding data for detecting available worker thread(s) and informing available worker thread(s) of task(s) to execute
US9519668B2 (en) 2013-05-06 2016-12-13 International Business Machines Corporation Lock-free creation of hash tables in parallel
US9672248B2 (en) 2014-10-08 2017-06-06 International Business Machines Corporation Embracing and exploiting data skew during a join or groupby
US9836492B1 (en) * 2012-11-01 2017-12-05 Amazon Technologies, Inc. Variable sized partitioning for distributed hash tables
US9922064B2 (en) 2015-03-20 2018-03-20 International Business Machines Corporation Parallel build of non-partitioned join hash tables and non-enforced N:1 join hash tables
US10108653B2 (en) 2015-03-27 2018-10-23 International Business Machines Corporation Concurrent reads and inserts into a data structure without latching or waiting by readers
US10114866B2 (en) 2010-12-23 2018-10-30 Sap Se Memory-constrained aggregation using intra-operator pipelining
US10175894B1 (en) 2014-12-30 2019-01-08 EMC IP Holding Company LLC Method for populating a cache index on a deduplicated storage system
US10289307B1 (en) 2014-12-30 2019-05-14 EMC IP Holding Company LLC Method for handling block errors on a deduplicated storage system
US10303791B2 (en) 2015-03-20 2019-05-28 International Business Machines Corporation Efficient join on dynamically compressed inner for improved fit into cache hierarchy
US10437738B2 (en) * 2017-01-25 2019-10-08 Samsung Electronics Co., Ltd. Storage device performing hashing-based translation between logical address and physical address
US10503717B1 (en) * 2014-12-30 2019-12-10 EMC IP Holding Company LLC Method for locating data on a deduplicated storage system using a SSD cache index
US10650011B2 (en) 2015-03-20 2020-05-12 International Business Machines Corporation Efficient performance of insert and point query operations in a column store
US10831736B2 (en) 2015-03-27 2020-11-10 International Business Machines Corporation Fast multi-tier indexing supporting dynamic update
US10891234B2 (en) 2018-04-04 2021-01-12 Sap Se Cache partitioning to accelerate concurrent workloads
US11113237B1 (en) 2014-12-30 2021-09-07 EMC IP Holding Company LLC Solid state cache index for a deduplicate storage system
WO2023034328A3 (en) * 2021-08-30 2023-04-13 Data.World, Inc. Correlating parallelized data from disparate data sources to aggregate graph data portions to predictively identify entity data
US11816118B2 (en) 2016-06-19 2023-11-14 Data.World, Inc. Collaborative dataset consolidation via distributed computer networks

Families Citing this family (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8370316B2 (en) 2010-07-12 2013-02-05 Sap Ag Hash-join in parallel computation environments
US20120179669A1 (en) * 2011-01-06 2012-07-12 Al-Omari Awny K Systems and methods for searching a search space of a query
US8863146B1 (en) * 2011-04-07 2014-10-14 The Mathworks, Inc. Efficient index folding using indexing expression generated using selected pair of indices for parallel operations based on the number of indices exceeding a pre-determined threshold
WO2013124522A1 (en) 2012-02-22 2013-08-29 Nokia Corporation A system, and a method for providing a predition for controlling a system
US9009155B2 (en) * 2012-04-27 2015-04-14 Sap Se Parallel set aggregation
US9355146B2 (en) 2012-06-29 2016-05-31 International Business Machines Corporation Efficient partitioned joins in a database with column-major layout
US10387293B2 (en) * 2012-10-09 2019-08-20 Securboration, Inc. Systems and methods for automatically parallelizing sequential code
US10725897B2 (en) 2012-10-09 2020-07-28 Securboration, Inc. Systems and methods for automatically parallelizing sequential code
US9569400B2 (en) * 2012-11-21 2017-02-14 International Business Machines Corporation RDMA-optimized high-performance distributed cache
US9378179B2 (en) 2012-11-21 2016-06-28 International Business Machines Corporation RDMA-optimized high-performance distributed cache
US20140214886A1 (en) 2013-01-29 2014-07-31 ParElastic Corporation Adaptive multi-client saas database
US9600852B2 (en) * 2013-05-10 2017-03-21 Nvidia Corporation Hierarchical hash tables for SIMT processing and a method of establishing hierarchical hash tables
US9411845B2 (en) 2013-06-13 2016-08-09 Sap Se Integration flow database runtime
US9659050B2 (en) 2013-08-06 2017-05-23 Sybase, Inc. Delta store giving row-level versioning semantics to a non-row-level versioning underlying store
CN104424326B (en) * 2013-09-09 2018-06-15 华为技术有限公司 A kind of data processing method and device
US9558221B2 (en) * 2013-11-13 2017-01-31 Sybase, Inc. Multi-pass, parallel merge for partitioned intermediate pages
US9529849B2 (en) * 2013-12-31 2016-12-27 Sybase, Inc. Online hash based optimizer statistics gathering in a database
US9824106B1 (en) * 2014-02-20 2017-11-21 Amazon Technologies, Inc. Hash based data processing
US9792328B2 (en) 2014-03-13 2017-10-17 Sybase, Inc. Splitting of a join operation to allow parallelization
US9836505B2 (en) 2014-03-13 2017-12-05 Sybase, Inc. Star and snowflake join query performance
US10380183B2 (en) * 2014-04-03 2019-08-13 International Business Machines Corporation Building and querying hash tables on processors
US9684684B2 (en) 2014-07-08 2017-06-20 Sybase, Inc. Index updates using parallel and hybrid execution
US9785660B2 (en) 2014-09-25 2017-10-10 Sap Se Detection and quantifying of data redundancy in column-oriented in-memory databases
US20160378824A1 (en) * 2015-06-24 2016-12-29 Futurewei Technologies, Inc. Systems and Methods for Parallelizing Hash-based Operators in SMP Databases
US10482076B2 (en) 2015-08-14 2019-11-19 Sap Se Single level, multi-dimension, hash-based table partitioning
US10726015B1 (en) * 2015-11-01 2020-07-28 Yellowbrick Data, Inc. Cache-aware system and method for identifying matching portions of two sets of data in a multiprocessor system
US10083206B2 (en) * 2015-11-19 2018-09-25 Business Objects Software Limited Visualization of combined table data
US10528284B2 (en) 2016-03-29 2020-01-07 Samsung Electronics Co., Ltd. Method and apparatus for enabling larger memory capacity than physical memory size
US10678704B2 (en) 2016-03-29 2020-06-09 Samsung Electronics Co., Ltd. Method and apparatus for enabling larger memory capacity than physical memory size
US9983821B2 (en) 2016-03-29 2018-05-29 Samsung Electronics Co., Ltd. Optimized hopscotch multiple hash tables for efficient memory in-line deduplication application
US10496543B2 (en) 2016-03-31 2019-12-03 Samsung Electronics Co., Ltd. Virtual bucket multiple hash tables for efficient memory in-line deduplication application
US9966152B2 (en) 2016-03-31 2018-05-08 Samsung Electronics Co., Ltd. Dedupe DRAM system algorithm architecture
CN109416682B (en) 2016-06-30 2020-12-15 华为技术有限公司 System and method for managing database
US10685004B2 (en) * 2016-07-11 2020-06-16 Salesforce.Com, Inc. Multiple feature hash map to enable feature selection and efficient memory usage
US11481321B2 (en) 2017-03-27 2022-10-25 Sap Se Asynchronous garbage collection in parallel transaction system without locking
US10726006B2 (en) 2017-06-30 2020-07-28 Microsoft Technology Licensing, Llc Query optimization using propagated data distinctness
US10489348B2 (en) * 2017-07-17 2019-11-26 Alteryx, Inc. Performing hash joins using parallel processing
US10552452B2 (en) 2017-10-16 2020-02-04 Alteryx, Inc. Asynchronously processing sequential data blocks
US10558364B2 (en) 2017-10-16 2020-02-11 Alteryx, Inc. Memory allocation in a data analytics system
US10810207B2 (en) * 2018-04-03 2020-10-20 Oracle International Corporation Limited memory and statistics resilient hash join execution
US11625398B1 (en) 2018-12-12 2023-04-11 Teradata Us, Inc. Join cardinality estimation using machine learning and graph kernels
US11016778B2 (en) 2019-03-12 2021-05-25 Oracle International Corporation Method for vectorizing Heapsort using horizontal aggregation SIMD instructions
US11258585B2 (en) * 2019-03-25 2022-02-22 Woven Planet North America, Inc. Systems and methods for implementing robotics frameworks
US11797539B2 (en) * 2019-09-12 2023-10-24 Oracle International Corporation Accelerated building and probing of hash tables using symmetric vector processing
EP4028907B1 (en) 2019-09-12 2023-10-04 Oracle International Corporation Accelerated building and probing of hash tables using symmetric vector processing
US11138232B1 (en) 2020-10-15 2021-10-05 Snowflake Inc. Export data from tables into partitioned folders on an external data lake

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5230047A (en) * 1990-04-16 1993-07-20 International Business Machines Corporation Method for balancing of distributed tree file structures in parallel computing systems to enable recovery after a failure
US5850547A (en) * 1997-01-08 1998-12-15 Oracle Corporation Method and apparatus for parallel processing aggregates using intermediate aggregate values
US5884299A (en) * 1997-02-06 1999-03-16 Ncr Corporation Optimization of SQL queries involving aggregate expressions using a plurality of local and global aggregation operations
US20020051536A1 (en) * 2000-10-31 2002-05-02 Kabushiki Kaisha Toshiba Microprocessor with program and data protection function under multi-task environment
US6708178B1 (en) * 2001-06-04 2004-03-16 Oracle International Corporation Supporting B+tree indexes on primary B+tree structures with large primary keys
US6859808B1 (en) * 2001-05-31 2005-02-22 Oracle International Corporation Mapping logical row identifiers for primary B+tree-like structures to physical row identifiers
US7054872B1 (en) * 2001-05-29 2006-05-30 Oracle International Corporation Online tracking and fixing of invalid guess-DBAs in secondary indexes and mapping tables on primary B+tree structures
US20060182046A1 (en) * 2005-02-16 2006-08-17 Benoit Dageville Parallel partition-wise aggregation
US7124147B2 (en) * 2003-04-29 2006-10-17 Hewlett-Packard Development Company, L.P. Data structures related to documents, and querying such data structures
US20060271568A1 (en) * 2005-05-25 2006-11-30 Experian Marketing Solutions, Inc. Distributed and interactive database architecture for parallel and asynchronous data processing of complex data and for real-time query processing
US7216338B2 (en) * 2002-02-20 2007-05-08 Microsoft Corporation Conformance execution of non-deterministic specifications for components
US20080162409A1 (en) * 2006-12-27 2008-07-03 Microsoft Corporation Iterate-aggregate query parallelization
US20080313128A1 (en) * 2007-06-12 2008-12-18 Microsoft Corporation Disk-Based Probabilistic Set-Similarity Indexes
US20090164412A1 (en) * 2007-12-21 2009-06-25 Robert Joseph Bestgen Multiple Result Sets Generated from Single Pass Through a Dataspace
US20100010967A1 (en) * 2008-07-11 2010-01-14 Day Management Ag System and method for a log-based data storage
US20100082633A1 (en) * 2008-10-01 2010-04-01 Jurgen Harbarth Database index and database for indexing text documents
US20100217953A1 (en) * 2009-02-23 2010-08-26 Beaman Peter D Hybrid hash tables
US20110246503A1 (en) * 2010-04-06 2011-10-06 Bender Michael A High-Performance Streaming Dictionary
US20110252033A1 (en) * 2010-04-09 2011-10-13 International Business Machines Corporation System and method for multithreaded text indexing for next generation multi-core architectures

Family Cites Families (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5742806A (en) * 1994-01-31 1998-04-21 Sun Microsystems, Inc. Apparatus and method for decomposing database queries for database management system including multiprocessor digital data processing system
US6338056B1 (en) * 1998-12-14 2002-01-08 International Business Machines Corporation Relational database extender that supports user-defined index types and user-defined search
US6430550B1 (en) * 1999-12-03 2002-08-06 Oracle Corporation Parallel distinct aggregates
US6507847B1 (en) * 1999-12-17 2003-01-14 Openwave Systems Inc. History database structure for Usenet
US7174429B2 (en) * 2001-12-28 2007-02-06 Intel Corporation Method for extending the local memory address space of a processor
US6952692B1 (en) * 2002-05-17 2005-10-04 Ncr Corporation Execution of requests in a parallel database system
US7356542B2 (en) * 2003-08-22 2008-04-08 Oracle International Corporation DML statements for densifying data
US7426520B2 (en) * 2003-09-10 2008-09-16 Exeros, Inc. Method and apparatus for semantic discovery and mapping between data sources
US9183256B2 (en) * 2003-09-19 2015-11-10 Ibm International Group B.V. Performing sequence analysis as a relational join
US7260563B1 (en) * 2003-10-08 2007-08-21 Ncr Corp. Efficient costing for inclusion merge join
US8145642B2 (en) 2004-11-30 2012-03-27 Oracle International Corporation Method and apparatus to support bitmap filtering in a parallel system
US8126870B2 (en) * 2005-03-28 2012-02-28 Sybase, Inc. System and methodology for parallel query optimization using semantic-based partitioning
US20060288030A1 (en) * 2005-06-09 2006-12-21 Ramon Lawrence Early hash join
US7801912B2 (en) * 2005-12-29 2010-09-21 Amazon Technologies, Inc. Method and apparatus for a searchable data service
US20070250470A1 (en) * 2006-04-24 2007-10-25 Microsoft Corporation Parallelization of language-integrated collection operations
US8122006B2 (en) * 2007-05-29 2012-02-21 Oracle International Corporation Event processing query language including retain clause
US7966343B2 (en) * 2008-04-07 2011-06-21 Teradata Us, Inc. Accessing data in a column store database based on hardware compatible data structures
US8862625B2 (en) * 2008-04-07 2014-10-14 Teradata Us, Inc. Accessing data in a column store database based on hardware compatible indexing and replicated reordered columns
US7970872B2 (en) * 2007-10-01 2011-06-28 Accenture Global Services Limited Infrastructure for parallel programming of clusters of machines
US8005868B2 (en) * 2008-03-07 2011-08-23 International Business Machines Corporation System and method for multiple distinct aggregate queries
US8032503B2 (en) * 2008-08-05 2011-10-04 Teradata Us, Inc. Deferred maintenance of sparse join indexes
US8078646B2 (en) * 2008-08-08 2011-12-13 Oracle International Corporation Representing and manipulating RDF data in a relational database management system
US8150836B2 (en) * 2008-08-19 2012-04-03 Teradata Us, Inc. System, method, and computer-readable medium for reducing row redistribution costs for parallel join operations
US8069210B2 (en) * 2008-10-10 2011-11-29 Microsoft Corporation Graph based bot-user detection
US8620884B2 (en) * 2008-10-24 2013-12-31 Microsoft Corporation Scalable blob storage integrated with scalable structured storage
US8370316B2 (en) 2010-07-12 2013-02-05 Sap Ag Hash-join in parallel computation environments

Patent Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5230047A (en) * 1990-04-16 1993-07-20 International Business Machines Corporation Method for balancing of distributed tree file structures in parallel computing systems to enable recovery after a failure
US5850547A (en) * 1997-01-08 1998-12-15 Oracle Corporation Method and apparatus for parallel processing aggregates using intermediate aggregate values
US5884299A (en) * 1997-02-06 1999-03-16 Ncr Corporation Optimization of SQL queries involving aggregate expressions using a plurality of local and global aggregation operations
US20020051536A1 (en) * 2000-10-31 2002-05-02 Kabushiki Kaisha Toshiba Microprocessor with program and data protection function under multi-task environment
US7054872B1 (en) * 2001-05-29 2006-05-30 Oracle International Corporation Online tracking and fixing of invalid guess-DBAs in secondary indexes and mapping tables on primary B+tree structures
US6859808B1 (en) * 2001-05-31 2005-02-22 Oracle International Corporation Mapping logical row identifiers for primary B+tree-like structures to physical row identifiers
US6708178B1 (en) * 2001-06-04 2004-03-16 Oracle International Corporation Supporting B+tree indexes on primary B+tree structures with large primary keys
US7809674B2 (en) * 2001-06-04 2010-10-05 Oracle International Corporation Supporting B+tree indexes on primary B+tree structures with large primary keys
US7216338B2 (en) * 2002-02-20 2007-05-08 Microsoft Corporation Conformance execution of non-deterministic specifications for components
US7124147B2 (en) * 2003-04-29 2006-10-17 Hewlett-Packard Development Company, L.P. Data structures related to documents, and querying such data structures
US7779008B2 (en) * 2005-02-16 2010-08-17 Oracle International Corporation Parallel partition-wise aggregation
US20060182046A1 (en) * 2005-02-16 2006-08-17 Benoit Dageville Parallel partition-wise aggregation
US20060271568A1 (en) * 2005-05-25 2006-11-30 Experian Marketing Solutions, Inc. Distributed and interactive database architecture for parallel and asynchronous data processing of complex data and for real-time query processing
US20080162409A1 (en) * 2006-12-27 2008-07-03 Microsoft Corporation Iterate-aggregate query parallelization
US20080313128A1 (en) * 2007-06-12 2008-12-18 Microsoft Corporation Disk-Based Probabilistic Set-Similarity Indexes
US7610283B2 (en) * 2007-06-12 2009-10-27 Microsoft Corporation Disk-based probabilistic set-similarity indexes
US20090164412A1 (en) * 2007-12-21 2009-06-25 Robert Joseph Bestgen Multiple Result Sets Generated from Single Pass Through a Dataspace
US20100010967A1 (en) * 2008-07-11 2010-01-14 Day Management Ag System and method for a log-based data storage
US20100082633A1 (en) * 2008-10-01 2010-04-01 Jurgen Harbarth Database index and database for indexing text documents
US20100217953A1 (en) * 2009-02-23 2010-08-26 Beaman Peter D Hybrid hash tables
US20110246503A1 (en) * 2010-04-06 2011-10-06 Bender Michael A High-Performance Streaming Dictionary
US20110252033A1 (en) * 2010-04-09 2011-10-13 International Business Machines Corporation System and method for multithreaded text indexing for next generation multi-core architectures

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Wikipedia, "Barrier (computer science)" retrieved 2/19/2016. *
Wikipedia, "Parallel computing" retrieved 2/19/2016. *
Wikipedia, "Table (database)" retrieved 2/19/2016. *

Cited By (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8719307B2 (en) * 2010-04-23 2014-05-06 Red Hat, Inc. Concurrent linked hashed maps
US20110264687A1 (en) * 2010-04-23 2011-10-27 Red Hat, Inc. Concurrent linked hashed maps
US10114866B2 (en) 2010-12-23 2018-10-30 Sap Se Memory-constrained aggregation using intra-operator pipelining
US10311105B2 (en) * 2010-12-28 2019-06-04 Microsoft Technology Licensing, Llc Filtering queried data on data stores
US20120166447A1 (en) * 2010-12-28 2012-06-28 Microsoft Corporation Filtering queried data on data stores
US20120254252A1 (en) * 2011-03-31 2012-10-04 International Business Machines Corporation Input/output efficiency for online analysis processing in a relational database
US8719312B2 (en) * 2011-03-31 2014-05-06 International Business Machines Corporation Input/output efficiency for online analysis processing in a relational database
US20130013824A1 (en) * 2011-07-08 2013-01-10 Goetz Graefe Parallel aggregation system
US8700822B2 (en) * 2011-07-08 2014-04-15 Hewlett-Packard Development Company, L.P. Parallel aggregation system
US9411853B1 (en) 2012-08-03 2016-08-09 Healthstudio, LLC In-memory aggregation system and method of multidimensional data processing for enhancing speed and scalability
US9836492B1 (en) * 2012-11-01 2017-12-05 Amazon Technologies, Inc. Variable sized partitioning for distributed hash tables
US9213732B2 (en) 2012-12-28 2015-12-15 Sap Ag Hash table and radix sort based aggregation
US9292560B2 (en) 2013-01-30 2016-03-22 International Business Machines Corporation Reducing collisions within a hash table
US9311359B2 (en) 2013-01-30 2016-04-12 International Business Machines Corporation Join operation partitioning
US9317548B2 (en) 2013-01-30 2016-04-19 International Business Machines Corporation Reducing collisions within a hash table
US9665624B2 (en) 2013-01-30 2017-05-30 International Business Machines Corporation Join operation partitioning
US9519668B2 (en) 2013-05-06 2016-12-13 International Business Machines Corporation Lock-free creation of hash tables in parallel
US9317517B2 (en) 2013-06-14 2016-04-19 International Business Machines Corporation Hashing scheme using compact array tables
US9367556B2 (en) 2013-06-14 2016-06-14 International Business Machines Corporation Hashing scheme using compact array tables
US9471710B2 (en) 2013-06-14 2016-10-18 International Business Machines Corporation On-the-fly encoding method for efficient grouping and aggregation
US10592556B2 (en) 2013-06-14 2020-03-17 International Business Machines Corporation On-the-fly encoding method for efficient grouping and aggregation
US9405858B2 (en) 2013-06-14 2016-08-02 International Business Machines Corporation On-the-fly encoding method for efficient grouping and aggregation
US9378264B2 (en) 2013-06-18 2016-06-28 Sap Se Removing group-by characteristics in formula exception aggregation
US10489394B2 (en) 2013-06-18 2019-11-26 Sap Se Database query calculation using an operator that explicitly removes group-by characteristics
US9195599B2 (en) 2013-06-25 2015-11-24 Globalfoundries Inc. Multi-level aggregation techniques for memory hierarchies
WO2015077951A1 (en) * 2013-11-28 2015-06-04 Intel Corporation Techniques for block-based indexing
US10242038B2 (en) 2013-11-28 2019-03-26 Intel Corporation Techniques for block-based indexing
US9672248B2 (en) 2014-10-08 2017-06-06 International Business Machines Corporation Embracing and exploiting data skew during a join or groupby
US10489403B2 (en) 2014-10-08 2019-11-26 International Business Machines Corporation Embracing and exploiting data skew during a join or groupby
US11113237B1 (en) 2014-12-30 2021-09-07 EMC IP Holding Company LLC Solid state cache index for a deduplicate storage system
US10503717B1 (en) * 2014-12-30 2019-12-10 EMC IP Holding Company LLC Method for locating data on a deduplicated storage system using a SSD cache index
US10175894B1 (en) 2014-12-30 2019-01-08 EMC IP Holding Company LLC Method for populating a cache index on a deduplicated storage system
US10289307B1 (en) 2014-12-30 2019-05-14 EMC IP Holding Company LLC Method for handling block errors on a deduplicated storage system
US9922064B2 (en) 2015-03-20 2018-03-20 International Business Machines Corporation Parallel build of non-partitioned join hash tables and non-enforced N:1 join hash tables
US11061878B2 (en) 2015-03-20 2021-07-13 International Business Machines Corporation Parallel build of non-partitioned join hash tables and non-enforced N:1 join hash tables
US10394783B2 (en) 2015-03-20 2019-08-27 International Business Machines Corporation Parallel build of non-partitioned join hash tables and non-enforced N:1 join hash tables
US10303791B2 (en) 2015-03-20 2019-05-28 International Business Machines Corporation Efficient join on dynamically compressed inner for improved fit into cache hierarchy
US10387397B2 (en) 2015-03-20 2019-08-20 International Business Machines Corporation Parallel build of non-partitioned join hash tables and non-enforced n:1 join hash tables
US10650011B2 (en) 2015-03-20 2020-05-12 International Business Machines Corporation Efficient performance of insert and point query operations in a column store
US10108653B2 (en) 2015-03-27 2018-10-23 International Business Machines Corporation Concurrent reads and inserts into a data structure without latching or waiting by readers
US10831736B2 (en) 2015-03-27 2020-11-10 International Business Machines Corporation Fast multi-tier indexing supporting dynamic update
US11080260B2 (en) 2015-03-27 2021-08-03 International Business Machines Corporation Concurrent reads and inserts into a data structure without latching or waiting by readers
US20160350394A1 (en) * 2015-05-29 2016-12-01 Sap Se Aggregating database entries by hashing
US10055480B2 (en) * 2015-05-29 2018-08-21 Sap Se Aggregating database entries by hashing
US9519583B1 (en) * 2015-12-09 2016-12-13 International Business Machines Corporation Dedicated memory structure holding data for detecting available worker thread(s) and informing available worker thread(s) of task(s) to execute
US11816118B2 (en) 2016-06-19 2023-11-14 Data.World, Inc. Collaborative dataset consolidation via distributed computer networks
US10437738B2 (en) * 2017-01-25 2019-10-08 Samsung Electronics Co., Ltd. Storage device performing hashing-based translation between logical address and physical address
US10891234B2 (en) 2018-04-04 2021-01-12 Sap Se Cache partitioning to accelerate concurrent workloads
WO2023034328A3 (en) * 2021-08-30 2023-04-13 Data.World, Inc. Correlating parallelized data from disparate data sources to aggregate graph data portions to predictively identify entity data

Also Published As

Publication number Publication date
US8370316B2 (en) 2013-02-05
US20120011108A1 (en) 2012-01-12
US20130138628A1 (en) 2013-05-30
US20120011133A1 (en) 2012-01-12
US9177025B2 (en) 2015-11-03
US9223829B2 (en) 2015-12-29

Similar Documents

Publication Publication Date Title
US20120011144A1 (en) Aggregation in parallel computation environments with shared memory
EP2469423B1 (en) Aggregation in parallel computation environments with shared memory
US11157478B2 (en) Technique of comprehensively support autonomous JSON document object (AJD) cloud service
US10572475B2 (en) Leveraging columnar encoding for query operations
US10628419B2 (en) Many-core algorithms for in-memory column store databases
US8660985B2 (en) Multi-dimensional OLAP query processing method oriented to column store data warehouse
Papailiou et al. H 2 RDF+: High-performance distributed joins over large-scale RDF graphs
US11593323B2 (en) Parallel and efficient technique for building and maintaining a main memory CSR based graph index in a RDBMS
US11797509B2 (en) Hash multi-table join implementation method based on grouping vector
US7640257B2 (en) Spatial join in a parallel database management system
WO2013152543A1 (en) Multidimensional olap query processing method for column-oriented data warehouse
US10185743B2 (en) Method and system for optimizing reduce-side join operation in a map-reduce framework
CN104376109A (en) Multi-dimension data distribution method based on data distribution base
Zhao et al. A practice of TPC-DS multidimensional implementation on NoSQL database systems
Gu et al. Rainbow: a distributed and hierarchical RDF triple store with dynamic scalability
Tian et al. A survey of spatio-temporal big data indexing methods in distributed environment
CN108319604B (en) Optimization method for association of large and small tables in hive
US9870399B1 (en) Processing column-partitioned data for row-based operations in a database system
US20200151178A1 (en) System and method for sharing database query execution plans between multiple parsing engines
EP2469424B1 (en) Hash-join in parallel computation environments
US10706055B2 (en) Partition aware evaluation of top-N queries
Yu et al. MPDBS: A multi-level parallel database system based on B-Tree
Shi et al. HEDC++: an extended histogram estimator for data in the cloud
US11775543B1 (en) Heapsort in a parallel processing framework
CN113742346A (en) Asset big data platform architecture optimization method

Legal Events

Date Code Title Description
AS Assignment

Owner name: SAP AG, GERMANY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TRANSIER, FREDERIK;MATHIS, CHRISTIAN;BOHNSACK, NICO;AND OTHERS;REEL/FRAME:025565/0018

Effective date: 20101217

AS Assignment

Owner name: SAP AG, GERMANY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SANDERS, PETER;MULLER, INGO;SIGNING DATES FROM 20121126 TO 20121127;REEL/FRAME:029742/0678

AS Assignment

Owner name: SAP SE, GERMANY

Free format text: CHANGE OF NAME;ASSIGNOR:SAP AG;REEL/FRAME:033625/0223

Effective date: 20140707

STCB Information on status: application discontinuation

Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION