WO2013152216A1 - Balancing database workloads through migration - Google Patents

Balancing database workloads through migration

Info

Publication number
WO2013152216A1
Authority
WO
WIPO (PCT)
Prior art keywords
database
server
migration
cpu
workload
Application number
PCT/US2013/035309
Other languages
French (fr)
Inventor
Yun Chi
Vahit Hacigumus
Original Assignee
Nec Laboratories America, Inc.
Application filed by Nec Laboratories America, Inc.
Priority to JP2014554994A (JP5914699B2)
Publication of WO2013152216A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 - Techniques for rebalancing the load in a distributed system
    • G06F9/5088 - Techniques for rebalancing the load in a distributed system involving task migration

Definitions

  • One solution is to revise the objective function to be the ratio between the variance reduction of migrating t and the cost of migrating t. For example, if the migration cost for t is proportional to t's data size, the new metric is the variance reduction divided by t.dataSize.
  • A master cannot be co-located with any of its slaves, so a master at a hotspot server cannot be migrated to arbitrary servers. This has to be checked in the algorithm. That is, for a tenant t that is a candidate for migration, the system only considers target servers that do not have t's slaves on them. The system considers candidate tenants and target servers that eliminate the old hotspot and do not generate a new hotspot. If no such solution exists, the algorithm returns null.
  • The input of the algorithm is {SV_1, ..., SV_N}.
  • The solutions of the algorithm should address all the hotspots, which makes the problem much more difficult.
  • FIG. 4 shows an exemplary process for balancing database workloads among several database servers by using database migration (201).
  • the method includes migrating a set of databases from a set of existing servers to a newly available server so as to balance the workloads among servers (202) and a method of migrating databases among existing database servers so as to balance the workloads among the servers (209).
  • the process uses a metric to measure the goodness of a given database configuration (203).
  • the process also selects a sequence of database masters and slaves to migrate to the new server (204).
  • the metric considers various system resources such as CPU usage and memory usage.
  • the metric combines different factors in a unified way through a weighting strategy and a vector norm (Euclidean norm or L1 norm).
  • the method includes selecting a set of master databases and slave databases to be migrated to the newly available server.
  • the method takes into consideration factors including (a) the expected cost reduction after the migration, (b) the master-to-slave ratio at each of the servers after migration, and (c) the migration cost, which is related to the size of each database.
  • the method chooses an optimal order of migration so as to minimize the negative performance impact on the system during the process of migration.
  • the method includes migrating databases among existing database servers so as to balance the workloads among the servers (209).
  • a metric can be used to measure the goodness of a given database configuration (210).
  • the metric considers various system resources such as CPU usage and memory usage.
  • the metric combines different factors in a unified way through a weighting strategy and a vector norm (Euclidean norm or L1 norm).
  • the method includes iteratively selecting the next database to migrate and the corresponding target database server, in order to eliminate all the hot spots.
  • the method efficiently selects the next database to be migrated and the corresponding target server to be migrated to.
  • the method iteratively migrates one database at a time until all the hot spots are eliminated.
  • FIG. 6 shows an exemplary computer to perform database balancing.
  • the system may be implemented in hardware, firmware or software, or a combination of the three.
  • the invention is implemented in a computer program executed on a programmable computer having a processor, a data storage system, volatile and non-volatile memory and/or storage elements, at least one input device and at least one output device.
  • the computer preferably includes a processor, random access memory (RAM), a program memory (preferably a writable read-only memory (ROM) such as a flash ROM) and an input/output (I/O) controller coupled by a CPU bus.
  • the computer may optionally include a hard drive controller which is coupled to a hard disk and CPU bus. Hard disk may be used for storing application programs, such as the present invention, and data. Alternatively, application programs may be stored in RAM or ROM.
  • I/O controller is coupled by means of an I/O bus to an I/O interface.
  • I/O interface receives and transmits data in analog or digital form over communication links such as a serial link, local area network, wireless link, and parallel link.
  • a display, a keyboard and a pointing device may also be connected to I/O bus.
  • separate connections may be used for I/O interface, display, keyboard and pointing device.
  • Programmable processing system may be preprogrammed or it may be programmed (and reprogrammed) by downloading a program from another source (e.g., a floppy disk, CD-ROM, or another computer).
  • Each computer program is tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein.
  • the inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.

Abstract

A method for balancing database workloads among a plurality of database servers includes when a new server is available, selecting a set of master and slave databases to be migrated to the new server; and migrating the selected databases to result in a balanced new workload among all servers while minimizing migration cost; and during operation, if workload unbalance is detected in real time from a workload change in a certain database, iteratively selecting one database at a time to migrate to a different server to balance the workload.

Description

BALANCING DATABASE WORKLOADS THROUGH MIGRATION
This application is a non-provisional application of US Provisional Application Serial No. 61620151 filed 4/4/2012, the content of which is incorporated by reference.
BACKGROUND
The present invention relates to database workload balancing.
For various economic and business reasons enterprises are increasingly centralizing their backend computer systems in purpose-built data centers. Data centers typically house high concentrations and densities of such computer systems and additionally provide databases to support customer needs. Data center operators have to make the decision to purchase the server boxes up-front and then provision resources for an ever-changing workload. Further, multiple different workloads may share resources on the same physical box, and provisioning the workload requires taking into account physical constraints such as capacity constraints associated with the physical resources. The recent move towards cloud computing for data-intensive computing presents unique opportunities and challenges for data center operators.
One key challenge that data center operators face is the provisioning of resources in the data center for specific customer workloads. For example, with a new server added, each existing server "donates" a set of masters and slaves to be migrated to the server so that
• there is no master-slave co-located on the new server,
• master-to-slave ratios are about the same across all the servers,
• memory capacities and CPU capacities are as balanced as possible across all the servers,
• amount of data being moved is minimal.
Another problem is hotspot elimination. For a given configuration of servers, if there are overloaded servers (a server is overloaded if it is a memory hotspot or a CPU hotspot, or both), the system resolves the overloaded servers by migrating out one master from each overloaded server. Federated databases have been used, where relations are split manually among different servers and a unified view is exposed to the clients. However, such a solution involves manual configuration and special merging and wrapper algorithms, and does not support dynamic rebalancing.
Another solution is to add and remove slave databases based on workloads. However, because slave databases (replicas) only support read queries and all the write queries have to go to the master, such a solution can be ineffective when the workload contains many write queries. In addition, adding slaves takes a very long time, so this solution is not able to handle quick workload changes.
Other approaches have assumed that the workloads are different for a master DB and its slaves; the workload is then balanced by swapping the roles of master and slave. However, such a method is only effective when the workloads at master and slaves are greatly different and the swapping is supported by the underlying DB system. In another method, load balancing is achieved by (a) changing resource allocation among different virtual machines on the same server and (b) adding and removing servers. However, the method assumes each DB instance is wrapped within a single virtual machine and each server must contain all DBs (either all masters of all DBs, or one slave from each DB), thus limiting the applicability of the method.
SUMMARY
In one aspect, a method for balancing database workloads among a plurality of database servers includes when a new server is available, selecting a set of master and slave databases to be migrated to the new server; and migrating the selected databases to result in a balanced new workload among all servers while minimizing migration cost; and during operation, if workload unbalance is detected in real time from a workload change in a certain database, iteratively selecting one database at a time to migrate to a different server to balance the workload.
Advantages of the preferred embodiment may include one or more of the following. The system achieves workload balancing in databases by using database migration in real time. Such a solution has fast response time, involves minimal manual intervention, and can use different features (such as memory and CPU footprint) and metrics (such as workload variance, L1-norm of workload variation, master/slave ratio in a server, and so on) in the target function. The system achieves workload balancing among databases, so the overall system performance is optimized among all databases and all the servers. Faster operation is achieved. The system leverages techniques for live migration of memory-resident databases, so the operation is fast and the reaction to workload changes is quick. The system quickly finds a cost-effective database arrangement that minimizes the cost of running a workload. The system helps data center operators to provision for specific workloads while minimizing the total operating cost (TOC) of running the workload. Workload balancing can be achieved in real time through migration, and the ACID properties (atomicity, consistency, isolation, durability), which are required for most database systems, are guaranteed.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates an exemplary workload balancing achieved by migrating database masters or slaves among servers.
FIG. 2A illustrates a low variance implying balanced capacities, for N = 2.
FIG. 2B is an exemplary illustration of low variance implying balanced capacities, for N = 3.
FIG. 3 shows an illustration that the optimal tenant may not exist.
FIG. 4 is an illustration of the hotspot elimination problem.
FIG. 5 is an illustration of a memory hotspot given multiple resources, namely memory and CPU.
FIG. 6 shows an exemplary computer to perform database balancing.
DETAILED DESCRIPTION
FIG. 1 illustrates an exemplary workload balancing achieved by migrating database masters or slaves among servers. SV1-SV4 represent four servers, M1-M3 represent three database masters, and each database master has two corresponding database slaves (replicas). Thus, S11 and S12 are slaves of M1. First, when a new server is started to handle more workload, the system selects a set of master and slave databases to be migrated to the new server. The selected databases to be migrated will result in the most balanced new workload among all servers (including the new server) while using the minimum amount of migration cost. In the new server, the master-to-slave ratio is similar to those in existing servers (i.e., the new server is as typical as any existing server after the migration is completed). Second, during operation, if workload unbalance is detected in real time, e.g., due to a workload change in a certain database, our solution will iteratively select one database at a time to migrate to a different server, in order to balance the workload.
In one embodiment, M databases are hosted on N servers. Each database consists of a primary DB (master) and one or more secondary DBs (slaves), which are hosted on different servers. That is, a master DB cannot be co-located with any of its slaves on the same server, and neither can two slaves belonging to the same master be co-located on the same server.
Each database has footprints on several system resources, such as CPU usage and memory usage, which can vary over time. Each server has a maximum resource capacity beyond which the server is considered overloaded. The resources are assumed to be additive, namely the server resource usage is the sum of the resource usages of all the databases hosted on the server. Such an assumption is commonly accepted for CPU and memory usages. Preferably, the system performs live migration of DBs, e.g., the real-time migration capability from systems such as the NEC TAM. Next, terminologies are detailed:
• The system consists of N servers, SV_1, ..., SV_N.
• Server SV_i contains K_i tenants, {t_i^1, ..., t_i^{K_i}}. (Note: different servers may have different numbers of tenants.)
• Each tenant t_i^j is either a master or a slave. The master of a tenant cannot be co-located with any of its slaves on the same TAM server. Each tenant t_i^j has a footprint of memory usage t_i^j.mem and a footprint of CPU usage t_i^j.cpu.
• The memory usage and CPU usage are additive, in that for server SV_i the total memory usage is Σ_{j=1}^{K_i} t_i^j.mem (and similarly for the total CPU usage).
• The units of memory and CPU are interchangeable among servers (e.g., memory unit being Giga-Byte and CPU unit being Giga-cycles-per-second).
• Each server SV_i has a memory threshold SV_i.memTh and a CPU threshold SV_i.cpuTh beyond which, under normal conditions, usage should not go. Once usage goes beyond either the memory or the CPU threshold, we say the server becomes a hotspot (a memory hotspot or a CPU hotspot).
• Each server SV_i has a memory capacity SV_i.memCap (namely, the extra memory capacity before SV_i becomes a memory hotspot) and a CPU capacity SV_i.cpuCap. If SV_i.memCap < 0 or SV_i.cpuCap < 0, server SV_i becomes a hotspot.
For each server SV_i:
• SV_i.memCap: the memory capacity of server SV_i.
• SV_i.cpuCap: the CPU capacity of server SV_i.
• SV_i.memTh: the memory threshold of server SV_i, beyond which SV_i becomes a memory hotspot.
• SV_i.cpuTh: the CPU threshold of server SV_i, beyond which SV_i becomes a CPU hotspot.
• SV_i.tenants: the set of tenants {t_i^1, ..., t_i^{K_i}} that belong to server SV_i.
For each tenant t_i^j such that t_i^j ∈ SV_i.tenants:
• t_i^j.id: the tenant ID of t_i^j.
• t_i^j.isMaster: a Boolean flag indicating whether t_i^j is a master (or a slave).
• t_i^j.mem: the memory usage of t_i^j.
• t_i^j.cpu: the CPU usage of t_i^j.
• t_i^j.dataSize: the size of the data of t_i^j.
For N numbers {x_1, ..., x_N}, the mean μ_x and the variance σ_x² are defined as
μ_x = (1/N) Σ_{i=1}^{N} x_i,   σ_x² = (1/N) Σ_{i=1}^{N} (x_i − μ_x)².
With the above notation, a small variance implies balanced servers. If {x_1, ..., x_N} represent the resource capacities (memory or CPU) among servers, then a lower variance among {x_1, ..., x_N} implies more balanced servers. In other words, if μ_x remains unchanged, then σ_x² is minimized when x_1 = ... = x_N. Thus σ_x² is a good metric to measure the balance of resource capacity among servers.
FIG. 2A illustrates a low variance implying balanced capacities, for N = 2. With x_1 + x_2 = c, where c = 2μ_x is a constant, the system simply redistributes values between x_1 and x_2. Since x_1 + x_2 = 2μ_x, all the possible {x_1, x_2} pairs are located on the line shown in the figure.
In FIG. 2A, the equi-σ_x² lines (lines along which σ_x² takes the same value) are shown; they turn out to be circles. As can be seen, when x_1 = x_2 = μ_x, σ_x² achieves its minimal value 0; otherwise, the more unbalanced {x_1, x_2} are, the higher the variance. This also holds for N > 2. FIG. 2B is an exemplary illustration of low variance implying balanced capacities, for N = 3.
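As a quick check of this metric, the following sketch (illustrative Python, not from the patent) computes σ_x² for a balanced and an unbalanced capacity vector with the same mean; the balanced vector attains the minimal variance of 0.

```python
def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    # Population variance: (1/N) * sum((x - mu)^2), as defined above.
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

balanced   = [10.0, 10.0, 10.0, 10.0]   # equal capacities
unbalanced = [2.0, 8.0, 12.0, 18.0]     # same mean (10.0), but skewed

print(variance(balanced))    # 0.0  -- minimal: servers perfectly balanced
print(variance(unbalanced))  # 34.0 -- higher variance: less balanced
```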
Suppose SV_i is a hotspot server with resource (for example, memory) capacity x_i and SV_j is a lightly loaded server with resource capacity x_j. If the system migrates a tenant with resource usage Δ from SV_i to SV_j, the capacities become x_i + Δ and x_j − Δ, and the reduction in variance of the capacities among servers is
f(Δ) = (2Δ/N)(x_j − x_i − Δ).
Treating this variance reduction as a function of Δ and setting
df/dΔ = (2/N)(x_j − x_i − 2Δ) = 0,
the best tenant to migrate should have resource usage Δ* = (x_j − x_i)/2, and as a result of the migration, the variance changes by −(x_j − x_i)²/(2N).
As the target server SV_j, the solution chooses the server with the largest resource capacity. However, tenant resource usages are discrete values, and it is not guaranteed that a tenant with resource usage of exactly Δ* can be found, as illustrated in FIG. 3, where the optimal Δ* is not achievable because the nearest tenant resource usages are Δ_1 and Δ_2. FIG. 3 shows this illustration that the optimal tenant may not exist.
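Since only discrete tenant sizes exist, a natural selection rule is to take the migratable tenant whose usage is closest to the ideal Δ* = (x_j − x_i)/2. A minimal sketch (Python; the function name and inputs are illustrative assumptions, not from the patent):

```python
def best_tenant_to_migrate(x_hot, x_light, tenant_usages):
    """Pick the tenant on the hotspot server whose resource usage is
    closest to the ideal Delta* = (x_light - x_hot) / 2, where x_hot and
    x_light are the remaining capacities of the hotspot server and the
    lightly loaded server, respectively."""
    delta_star = (x_light - x_hot) / 2.0
    return min(tenant_usages, key=lambda u: abs(u - delta_star))

# Hotspot capacity 2, lightly loaded capacity 10 -> Delta* = 4;
# the nearest available tenant usage is 3.
print(best_tenant_to_migrate(2.0, 10.0, [1.0, 3.0, 7.0]))  # 3.0
```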
Next, the process of new server migration is discussed. With a new server added, each existing server "donates" a set of masters and slaves to be migrated to the server so that
• there is no master-slave co-located on the new server,
• master-to-slave ratios are about the same across all the servers,
• memory capacities and CPU capacities are as balanced as possible across all the servers,
• amount of data being moved is minimal.
Pseudo-code for one exemplary Method 1 to migrate DBs to the new server is as follows:

Method 1: greedy migration to a new server
input: the existing servers {SV_1, ..., SV_N} and the newly added server SV_{N+1}
output: the chosen tenants, each migrated to SV_{N+1}
1: while SV_{N+1} has not received its share of the load do
2:   for i ← 1 to N do
3:     find the tenant t in SV_i whose migration to SV_{N+1} most reduces the variance of resource capacities among all N+1 servers, subject to the master/slave co-location constraints
4:   end for
5:   migrate the best candidate among the N servers to SV_{N+1}
6: end while
7: return the updated configuration {SV_1, ..., SV_{N+1}}
A single resource X can be picked for illustration purposes, where X = {x_1, ..., x_N} are the resource usages of the existing N servers {SV_1, ..., SV_N}. Now, a new server SV_{N+1} is brought into the system. After each existing server donates some masters and slaves to the new server, let the new resource usages on the N+1 servers be Y = {y_1, ..., y_N, y_{N+1}}. Then the new mean in resource usage is
μ_Y = (1/(N+1)) Σ_{i=1}^{N+1} y_i = N·μ_X/(N+1),
and it is not affected by the exact configuration of tenants, since migration merely moves usage between servers.
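Because migration only moves usage between servers, the total usage is conserved, so μ_Y = N·μ_X/(N+1) regardless of which tenants are donated. A small illustrative check (Python; the numbers are made up):

```python
N = 4
x = [12.0, 8.0, 10.0, 10.0]   # usages of the N existing servers
mu_x = sum(x) / N              # 10.0

# Any donation scheme: each server moves some usage to SV_{N+1}.
donated = [3.0, 1.0, 2.0, 2.0]
y = [xi - d for xi, d in zip(x, donated)] + [sum(donated)]

mu_y = sum(y) / (N + 1)
print(mu_y)                    # 8.0 == N * mu_x / (N + 1)
```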
Method 1 can be used and invoked a number of times. The problem is different in that SV_{N+1} was initially empty. So, to fill up the empty SV_{N+1} quickly, the greedy nature of Method 1 may favor migrating large tenants from existing servers, and therefore may result in only large tenants on SV_{N+1} and, at the same time, leave large "holes" in the existing server capacities (i.e., some servers have large tenants migrated out while others have no tenants migrated out).
Based on the above consideration, Method 2 ignores the variance at the new server. That is, at each iteration, the method searches for a tenant t such that by migrating t into the new server SV_{N+1}, the variance reduction among the existing N servers (with respect to μ_Y instead of μ_X) is maximized:
t* = argmax_t [ (y_i − μ_Y)² − ((y_i − t.usage) − μ_Y)² ],
where SV_i is the server currently hosting t.
Method 2: migration to a new server
input: the servers {SV_1, ..., SV_N}, the new server SV_{N+1}, a flag reqMaster, and a count reqNum
output: reqNum tenants (masters if reqMaster is true, slaves otherwise) migrated to SV_{N+1}
1:     Q ← an empty priority queue, ordered by variance reduction
2-5:   compute the average resource usages μ_mem and μ_cpu over the N+1 servers
6-9:   for each server SV_i, pick its locally best candidate tenant (the one maximizing variance reduction, honoring reqMaster and the co-location constraints) and insert it into Q
10:    for k ← 1 to reqNum do
11-14:   pop the best candidate from Q and migrate it to SV_{N+1}
15-24:   update the best candidate tenant of each affected server and refresh Q
25:    return the updated configuration {SV_1, ..., SV_{N+1}}
As shown above, Method 2 can be used for new server migration. In Method 2, among {SV_1, ..., SV_N} there are |M| masters and |S| slaves. The method then selects |M|/(N+1) masters and |S|/(N+1) slaves to migrate to the new server SV_{N+1}. To select |M|/(N+1) masters, Method 2 is called with parameters reqMaster = true and reqNum = |M|/(N+1); to select |S|/(N+1) slaves, Method 2 is called with parameters reqMaster = false and reqNum = |S|/(N+1). Q is a priority queue, which sorts the candidates by variance reduction. In Method 2:
• Lines 2-5 compute the average resource usages (with the new server SV_{N+1} included).
• Lines 6-9 pick the best candidate tenant for migration from each server. Note that this candidate can be decided by each server locally, without consulting other servers.
• Lines 11-14 move the best candidate among the N servers to SV_{N+1}.
• Lines 15-24 update the best candidate tenant of each server. If slaves are being migrated, the slave newly moved to SV_{N+1} may invalidate the best candidate on other servers, so re-computation is needed.
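A highly simplified sketch of this loop (single resource, no master/slave distinction, no priority queue, hypothetical data model) might look like:

```python
def fill_new_server(servers, req_num):
    """Sketch of the Method 2 loop under simplifying assumptions: one
    resource, no master/slave distinction, no priority queue.  `servers`
    is a list of {tenant: usage} dicts (hypothetical data model).
    Returns the ordered migration plan [(server_index, tenant), ...]."""
    n = len(servers)
    mu_y = sum(sum(s.values()) for s in servers) / (n + 1)
    loads = [sum(s.values()) for s in servers]
    plan = []
    for _ in range(req_num):
        best, best_red = None, float("-inf")
        for i, s in enumerate(servers):
            for t, u in s.items():
                # variance reduction of moving t off server i, around mu_Y
                red = (loads[i] - mu_y) ** 2 - (loads[i] - u - mu_y) ** 2
                if red > best_red:
                    best, best_red = (i, t), red
        if best is None:
            break                        # no tenants left to move
        i, t = best
        loads[i] -= servers[i].pop(t)    # hypothetical configuration update
        plan.append(best)
    return plan

plan = fill_new_server([{"a": 9.0, "b": 1.0}, {"c": 5.0, "d": 5.0}], req_num=1)
```

Note how the greedy step avoids the very large tenant "a": moving it would overshoot μ_Y and increase, not decrease, the variance.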
Next, the problem of hotspot elimination is discussed. For a given configuration of servers, if there are overloaded servers (a server SV_i is overloaded if it is a memory hotspot with SV_i.memCap < 0, a CPU hotspot with SV_i.cpuCap < 0, or both), the overloaded servers are resolved by migrating out one master from each overloaded server. This problem is illustrated in FIG. 1.
There are two resources (memCap, cpuCap) instead of a single resource X. This affects two things: (1) how a hotspot is defined and (2) how the balance among servers is measured. For (1), a server SV_i is a hotspot if either SV_i.memCap < 0 or SV_i.cpuCap < 0. In either case, the hotspot has to be eliminated. This is illustrated in FIG. 5, which is an illustration of a memory hotspot given multiple resources, namely memory and CPU.
For (2), a simple solution can be used by combining the variance of memory capacity and that of CPU capacity:
σ² = c_mem · σ²_memCap + c_cpu · σ²_cpuCap,

where c_mem and c_cpu are parameters for calibrating the different units between memory and CPU (GB vs. G-cycle-per-sec).
In addition, c_mem and c_cpu can be used to control the emphasis on the balance of memory and that of CPU. For example, CPU usage may be more bursty and its balance more important than that of memory usage, because memory usage is relatively stable. A special case is to set

c_mem = 1 / σ²_memCap and c_cpu = 1 / σ²_cpuCap.

By doing this, the system essentially normalizes both memCap and cpuCap to 1 and treats the balance of memory and CPU as equally important. This consideration is reflected in lines 7, 8 and 11 of Method 3.
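The combined metric and its normalizing special case can be sketched as follows (hypothetical capacity values; Python's statistics.pvariance plays the role of σ²):

```python
from statistics import pvariance

def combined_variance(mem_caps, cpu_caps):
    """sigma^2 = c_mem * var(memCap) + c_cpu * var(cpuCap), using the
    normalizing special case c = 1 / var(.), which weighs the balance of
    memory and CPU equally (hypothetical capacity values)."""
    var_mem, var_cpu = pvariance(mem_caps), pvariance(cpu_caps)
    c_mem = 1.0 / var_mem if var_mem else 0.0
    c_cpu = 1.0 / var_cpu if var_cpu else 0.0
    return c_mem * var_mem + c_cpu * var_cpu

# With the normalizing choice, every non-degenerate configuration scores 2.0:
# each resource contributes exactly one unit of "imbalance".
score = combined_variance([4.0, 8.0, 6.0], [1.5, 3.0, 2.5])
```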
Because different tenants have different data sizes, the costs of migrating them differ. One solution is to revise the objective function as the ratio between the variance reduction of migrating t and the cost of migrating t. For example, if the migration cost for t is proportional to t's data size, the new metric is

Δσ²(t) / t.dataSize,

where Δσ²(t) is the variance reduction if t is migrated.
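This cost-adjusted objective can be sketched as a one-line scoring function (hypothetical numbers; the cost model is the proportional-to-data-size assumption):

```python
def migration_score(var_reduction, data_size_gb):
    """Cost-aware objective (sketch): variance reduction per GB moved,
    under the assumption that migration cost is proportional to the
    tenant's data size."""
    return var_reduction / data_size_gb

# A small tenant with a modest reduction can outrank a huge tenant
# with a larger absolute reduction.
assert migration_score(6.0, 2.0) > migration_score(20.0, 10.0)
```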
A master cannot be co-located with any of its slaves, so a master at a hotspot server cannot be migrated to arbitrary servers. This has to be checked in the algorithm. That is, for a tenant t that is a candidate for migration, the system only considers target servers that do not host t's slaves. The system considers candidate tenant and target server pairs that eliminate the old hotspot and do not generate a new hotspot. If no such solution exists, the algorithm returns null.
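The co-location constraint can be sketched as a feasibility filter (names and data model are hypothetical):

```python
def feasible_targets(master, candidate_servers, slave_locations):
    """Placement filter (sketch, hypothetical names): a master may only
    migrate to a server that hosts none of its own slaves.
    `slave_locations` maps a master to the set of server ids holding
    its slaves."""
    banned = slave_locations.get(master, set())
    return {sv for sv in candidate_servers if sv not in banned}

# master "t1" has a slave on server 2, so only servers 1 and 3 qualify
assert feasible_targets("t1", {1, 2, 3}, {"t1": {2}}) == {1, 3}
```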
The input of the algorithm is {SV_1, ..., SV_N}. Among the servers there may be more than one hotspot, and so the solution of the algorithm should address all the hotspots, which makes the problem much more difficult. A local greedy heuristic is used, where the hotspots are eliminated one by one. That is, (1) each server hotspot is addressed by assuming it is the only hotspot, (2) the system configuration is hypothetically updated after each hotspot is addressed, (3) the next hotspot is then addressed on the hypothetically updated system configuration, and (4) the algorithm declares failure if it cannot reach a final configuration where all hotspots are eliminated.
Note that this is a heuristic because (1) the hotspots must be addressed in some fixed order, and (2) each hotspot is resolved greedily, whereas a less greedy solution might make the next hotspot easier to eliminate.
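The four-step local greedy loop can be sketched as follows; `plan_one` stands in for the per-hotspot greedy search, and the single-capacity data model is a simplification of the two-resource case:

```python
def eliminate_hotspots(servers, plan_one):
    """Sketch of the local greedy loop: (1) treat each hotspot as if it
    were the only one, (2) hypothetically update the configuration,
    (3) move on to the next hotspot, (4) fail if any step has no move.

    `servers` maps server id -> free capacity (negative = hotspot); a
    single capacity value simplifies the two-resource case.  `plan_one`
    stands in for the per-hotspot greedy search and returns
    (migrated_load, target_server) or None."""
    config = dict(servers)
    moves = []
    for hot in [s for s, cap in servers.items() if cap < 0]:
        step = plan_one(config, hot)
        if step is None:
            return None                  # declare failure
        load, target = step
        config[hot] += load              # hypothetical update
        config[target] -= load
        if config[hot] < 0 or config[target] < 0:
            return None                  # move left or created a hotspot
        moves.append((hot, target, load))
    return moves

# one hotspot (server 1); the stand-in planner moves 3.0 units to server 2
moves = eliminate_hotspots({1: -2.0, 2: 5.0, 3: 1.0},
                           lambda cfg, hot: (3.0, 2))
```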
FIG. 4 shows an exemplary process for balancing database workloads among several database servers by using database migration (201). The method includes migrating a set of databases from a set of existing servers to a newly available server so as to balance the workloads among servers (202) and a method of migrating databases among existing database servers so as to balance the workloads among the servers (209). The process uses a metric to measure the goodness of a given database configuration (203). The process also selects a sequence of database masters and slaves to migrate to the new server (204). In 205, the metric considers various system resources such as CPU usage and memory usage. In 206, the metric combines different factors in a unified way through a weighting strategy and a vector norm (Euclidean norm or L1 norm).
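The weighting-plus-vector-norm combination of 206 can be sketched as follows (weights and deviation values are hypothetical):

```python
def imbalance(deviations, weights, norm="l2"):
    """Unified metric sketch: weight each per-resource imbalance term,
    then combine with a vector norm (Euclidean/L2 or L1).  Deviation
    and weight values are hypothetical."""
    scaled = [w * abs(d) for w, d in zip(weights, deviations)]
    if norm == "l1":
        return sum(scaled)
    return sum(s * s for s in scaled) ** 0.5   # Euclidean norm

assert abs(imbalance([3.0, 4.0], [1.0, 1.0]) - 5.0) < 1e-12          # L2
assert abs(imbalance([3.0, 4.0], [1.0, 1.0], norm="l1") - 7.0) < 1e-12
```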
Turning now to 204, the method includes selecting a set of master databases and slave databases to be migrated to the newly available server. In 207, while making the selection, the method takes into consideration factors including (a) the expected cost reduction after the migration, (b) the master-to-slave ratio at each of the servers after migration, and (c) the migration cost that is related to the size of each database. In 208, the method chooses an optimal order of migration so as to minimize the negative performance impact on the system during the process of migration.
The method includes migrating databases among existing database servers so as to balance the workloads among the servers (209). A metric can be used to measure the goodness of a given database configuration (210). In 212, the metric considers various system resources such as CPU usage and memory usage. In 213, the metric combines different factors in a unified way through a weighting strategy and a vector norm (Euclidean norm or L1 norm).
In an alternative, in 211, the method includes iteratively selecting the next database to migrate and the corresponding target database server, in order to eliminate all the hot spots. In 214, the method efficiently selects the next database to be migrated and the corresponding target server to be migrated to. In 215, the method iteratively migrates one database at a time until all the hot spots are eliminated.
[Method 3 (pseudocode): input servers {SV_1, ..., SV_N}, possibly containing multiple hotspots; output an ordered set of tenants to migrate and the corresponding target servers, or null if the hotspots cannot all be eliminated.]
FIG. 6 shows an exemplary computer to perform database balancing. The system may be implemented in hardware, firmware or software, or a combination of the three. Preferably, the invention is implemented in a computer program executed on a programmable computer having a processor, a data storage system, volatile and non-volatile memory and/or storage elements, at least one input device and at least one output device.
By way of example, a block diagram of a computer to support the system is discussed next in FIG. 6. The computer preferably includes a processor, random access memory (RAM), a program memory (preferably a writable read-only memory (ROM) such as a flash ROM) and an input/output (I/O) controller coupled by a CPU bus. The computer may optionally include a hard drive controller which is coupled to a hard disk and CPU bus. Hard disk may be used for storing application programs, such as the present invention, and data. Alternatively, application programs may be stored in RAM or ROM. I/O controller is coupled by means of an I/O bus to an I/O interface. I/O interface receives and transmits data in analog or digital form over communication links such as a serial link, local area network, wireless link, and parallel link. Optionally, a display, a keyboard and a pointing device (mouse) may also be connected to I/O bus. Alternatively, separate connections (separate buses) may be used for I/O interface, display, keyboard and pointing device. Programmable processing system may be preprogrammed or it may be programmed (and reprogrammed) by downloading a program from another source (e.g., a floppy disk, CD-ROM, or another computer).
Each computer program is tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein. The inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.
The system has been described herein in considerable detail in order to comply with the patent statutes and to provide those skilled in the art with the information needed to apply the novel principles and to construct and use such specialized components as are required. However, it is to be understood that the invention can be carried out by specifically different equipment and devices, and that various modifications, both as to the equipment details and operating procedures, can be accomplished without departing from the scope of the invention itself.

Claims

What is claimed is:
1. A method for balancing database workloads among a plurality of database servers,
comprising: when a new server is available, selecting a set of master and slave databases to be migrated to the new server and migrating the selected databases to result in a balanced new workload among all servers while minimizing migration cost; and during operation, if a workload unbalance is detected in real-time from a workload change, iteratively selecting one database at a time to migrate to a different server to balance the workload.
2. The method of claim 1, comprising applying a metric to measure a goodness of a given database configuration.
3. The method of claim 2, wherein the metric comprises system resources including processor (CPU) usage and memory usage.
4. The method of claim 2, wherein the metric comprises different factors in a unified way through a weighting strategy and a vector norm.
5. The method of claim 1, wherein the selecting a set of master databases and slave databases for migration further comprises considering one or more factors including (a) the expected cost reduction after the migration, (b) the master-to-slave ratio at each of the servers after migration, (c) the migration cost that is related to the size of each database.
6. The method of claim 1, comprising choosing an optimal order of migration to minimize impact on the system during migration.
7. The method of claim 1, comprising iteratively selecting a next database to migrate and a corresponding target database server to eliminate a hot spot.
8. The method of claim 1, comprising iteratively migrating one database at a time until all hot spots are eliminated.
9. The method of claim 1, comprising determining a metric for a given database configuration including a mean and a variance, wherein the variance is determined as: σ² = c_mem · σ²_memCap + c_cpu · σ²_cpuCap, where c_mem and c_cpu are parameters for calibrating different units between memory and a CPU.
10. The method of claim 9, comprising setting c_mem = 1 / σ²_memCap and c_cpu = 1 / σ²_cpuCap.
11. A system for balancing database workloads among a plurality of database servers, comprising: a processor; code executable by the processor when a new server is available, including instructions for selecting a set of master and slave databases to be migrated to the new server; and migrating the selected databases to result in a balanced new workload among all servers while minimizing migration cost; and code executable by the processor during operation, including instructions for detecting if workload unbalance occurs in real time and iteratively selecting one database at a time to migrate to a different server to balance the workload.
12. The system of claim 11, comprising code for applying a metric to measure a goodness of a given database configuration.
13. The system of claim 12, wherein the metric comprises system resources including processor (CPU) usage and memory usage.
14. The system of claim 12, wherein the metric combines different factors in a unified way through a weighting strategy and a vector norm.
15. The system of claim 11, wherein the code for selecting a set of master databases and slave databases for migration further comprises considering one or more factors including (a) the expected cost reduction after the migration, (b) the master-to-slave ratio at each of the servers after migration, (c) the migration cost that is related to the size of each database.
16. The system of claim 11, comprising code for choosing an optimal order of migration to minimize impact on the system during migration.
17. The system of claim 11, comprising code for iteratively selecting a next database to migrate and a corresponding target database server to eliminate a hot spot.
18. The system of claim 11, comprising code for iteratively migrating one database at a time until all hot spots are eliminated.
19. The system of claim 11, comprising code for determining a metric for a given database configuration including a mean and a variance, wherein the variance is determined as: σ² = c_mem · σ²_memCap + c_cpu · σ²_cpuCap, where c_mem and c_cpu are parameters for calibrating different units between memory and a CPU.
20. The system of claim 19, comprising code for setting c_mem = 1 / σ²_memCap and c_cpu = 1 / σ²_cpuCap.
PCT/US2013/035309 2012-04-04 2013-04-04 Balancing database workloads through migration WO2013152216A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2014554994A JP5914699B2 (en) 2012-04-04 2013-04-04 Database workload balancing through migration

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261620151P 2012-04-04 2012-04-04
US61/620,151 2012-04-04

Publications (1)

Publication Number Publication Date
WO2013152216A1 true WO2013152216A1 (en) 2013-10-10

Family

ID=49301057

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/035309 WO2013152216A1 (en) 2012-04-04 2013-04-04 Balancing database workloads through migration

Country Status (2)

Country Link
JP (1) JP5914699B2 (en)
WO (1) WO2013152216A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6791834B2 (en) * 2017-11-30 2020-11-25 株式会社日立製作所 Storage system and control software placement method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030069903A1 (en) * 2001-10-10 2003-04-10 International Business Machines Corporation Database migration
US20070130566A1 (en) * 2003-07-09 2007-06-07 Van Rietschote Hans F Migrating Virtual Machines among Computer Systems to Balance Load Caused by Virtual Machines
US20080243864A1 (en) * 2007-03-28 2008-10-02 Ciena Corporation Methods and systems for a network element database migration service
US20110099403A1 (en) * 2009-10-26 2011-04-28 Hitachi, Ltd. Server management apparatus and server management method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5494915B2 (en) * 2009-04-01 2014-05-21 日本電気株式会社 Replication system, master server, replica server, replication method, and program


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170318083A1 (en) * 2016-04-27 2017-11-02 NetSuite Inc. System and methods for optimal allocation of multi-tenant platform infrastructure resources
US10659542B2 (en) * 2016-04-27 2020-05-19 NetSuite Inc. System and methods for optimal allocation of multi-tenant platform infrastructure resources
US10754704B2 (en) 2018-07-11 2020-08-25 International Business Machines Corporation Cluster load balancing based on assessment of future loading
US11178065B2 (en) 2019-08-07 2021-11-16 Oracle International Corporation System and methods for optimal allocation of multi-tenant platform infrastructure resources
US11736409B2 (en) 2019-08-07 2023-08-22 Oracle International Corporation System and methods for optimal allocation of multi-tenant platform infrastructure resources

Also Published As

Publication number Publication date
JP2015513333A (en) 2015-05-07
JP5914699B2 (en) 2016-05-11


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13772651

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2014554994

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13772651

Country of ref document: EP

Kind code of ref document: A1