US8407447B2 - Dynamically reallocating computing components between partitions - Google Patents

Dynamically reallocating computing components between partitions

Info

Publication number
US8407447B2
Authority
US
United States
Prior art keywords
computing component
partition
management processor
computing
management
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US12/639,335
Other versions
US20110145540A1 (en)
Inventor
Kenneth C. Duisenberg
Loren M. Koehler
Ivan Farkas
Stephen B. Lyle
Rajeev Grover
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hewlett Packard Enterprise Development LP
Original Assignee
Hewlett Packard Development Co LP
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hewlett Packard Development Co LP filed Critical Hewlett Packard Development Co LP
Priority to US12/639,335 priority Critical patent/US8407447B2/en
Assigned to HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P. reassignment HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: FARKAS, IVAN, KOEHLER, LOREN M., LYLE, STEPHEN B., DUISENBERG, KENNETH C., GROVER, RAJEEV
Publication of US20110145540A1 publication Critical patent/US20110145540A1/en
Application granted granted Critical
Publication of US8407447B2 publication Critical patent/US8407447B2/en
Assigned to HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP reassignment HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.
Active legal-status Critical Current
Adjusted expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources

Abstract

Systems, methods and computing components are provided for dynamically reallocating a plurality of computing components among one or more logical partitions. A first computing component that is allocated to a first partition may have a management processor. A second computing component may be allocated to a second partition. The management processor of the first computing component may be configured to reallocate the first computing component to a third partition without affecting the second computing component.

Description

BACKGROUND
A “cluster computing system” is a group of linked computing components working together closely so that in many respects they form a single computer. A plurality of computing components such as blade computers may be used together to form a cluster computing system. To this end, multiple blade computers may be inserted into a blade enclosure. The blade enclosure may provide resources, such as power sources, networking, cooling elements (e.g., fans) and backplanes, that are shared among multiple blade computers. Sharing these resources among multiple blade computers improves overall utilization.
In a system of multiple computing components, such as a blade enclosure hosting a plurality of blade computers, each computing component may include a management processor (sometimes referred to as a baseboard management controller or “BMC”) that is charged with various management tasks on that computing component. These management tasks may include monitoring temperature, cooling fan speeds, power mode, operating system status, and the like. The management processor of each blade may monitor these parameters and exchange information about these parameters with various outside entities, such as a blade enclosure controller. These communications may occur in many instances using the Intelligent Platform Management Interface (IPMI) protocols.
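As a rough illustration of this kind of monitoring, the sketch below models a management processor that polls a few local sensors and packages the readings for an enclosure controller. It is a minimal sketch only: the sensor names, values, and report layout are assumptions, and a real BMC would exchange this data over IPMI rather than through the Python calls shown here.

```python
# Hypothetical model of a blade's management processor (BMC) polling local
# sensors and packaging the readings for an enclosure controller. Sensor
# names, values, and the report layout are illustrative assumptions; a real
# BMC exchanges this data via standardized IPMI commands, not these calls.
from dataclasses import dataclass


@dataclass
class SensorReading:
    name: str      # e.g. "inlet_temp_c" or "fan1_rpm" (hypothetical names)
    value: float


class ManagementProcessor:
    def __init__(self, blade_id: str):
        self.blade_id = blade_id

    def poll_sensors(self) -> list:
        # A real BMC would read hardware registers; fixed values stand in here.
        return [
            SensorReading("inlet_temp_c", 24.5),
            SensorReading("fan1_rpm", 5200.0),
            SensorReading("power_on", 1.0),
        ]

    def report(self) -> dict:
        # Package readings for an outside entity such as an enclosure controller.
        return {
            "blade": self.blade_id,
            "sensors": {r.name: r.value for r in self.poll_sensors()},
        }


print(ManagementProcessor("BLADE1").report())
```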
A plurality of computing components may be allocated among one or more logical partitions. Each logical partition may operate as an independent cluster computing system that has available to it the software and hardware resources of every computing component that forms part of the logical partition. For example, a first subset of blade computers in a blade enclosure may be allocated to a first partition, where they will cooperate to form a first cluster computing system. Likewise, a second subset of blade computers in the enclosure may be allocated to a second partition, where they will cooperate to form a second cluster computing system. Management of resources of each blade may become difficult where each blade includes a separate management processor.
Management of partitions often is controlled by an entity outside of the plurality of computing components, such as by a processor on the blade enclosure. Situations may arise where one or more computing components of a partition are to be allocated to a different partition. In such cases, either all computing components of the system may be rebooted, or a centralized processor, such as a processor on a blade enclosure, may shut down computing components of individual partitions without affecting computing components of other partitions. However, in the latter case, reallocation is dependent on the centralized processor.
BRIEF DESCRIPTION OF THE DRAWINGS
For a detailed description of embodiments of the disclosure, reference will now be made to the accompanying drawings in which:
FIG. 1 depicts, schematically, a system that includes a plurality of computing components (in this case, blade computers) that are deployed within an enclosure and allocated among a plurality of logical partitions according to an embodiment of the invention;
FIG. 2 depicts an example method of allocating a plurality of computing components among one or more logical partitions according to an embodiment of the invention;
FIG. 3 depicts an example method of a management processor of a first computing component reallocating the first computing component from one partition to another without affecting computing components of other partitions;
FIG. 4 depicts, schematically, a configuration of example computing components among example partitions before the steps of FIG. 3 are performed; and
FIG. 5 depicts, schematically, the configuration of example computing components among example partitions after the steps of FIG. 3 are performed.
DETAILED DESCRIPTION
Systems, methods and computing components are provided for allocating a plurality of computing components among one or more logical partitions and reallocating computing components between partitions without affecting computing components of other partitions. Rather than an outside or centralized entity (e.g., an enclosure controller) controlling and/or maintaining partitioning, embodiments of the present disclosure implement peer-controlled partitioning. Control of dynamic repartitioning is delegated to the plurality of computing components themselves, and specifically, the management processors of the computing components.
FIG. 1 depicts an example system 10 that includes an enclosure 14 in which a plurality of computing components 16 are deployed. In this exemplary system 10, each computing component 16 is a blade and enclosure 14 is a blade enclosure. However, it should be understood that the principles of the present disclosure may be applied in other environments where a plurality of computing components 16 are deployed together.
Enclosure 14 may include an enclosure controller 22 that may be configured to interact with the plurality of computing components 16. Enclosure controller 22 may be configured to execute instructions contained in firmware code (e.g., blade management firmware) that is stored in enclosure memory 24 available to enclosure controller 22. Enclosure memory 24 may come in various volatile and non-volatile forms, and in many instances, is flash memory (e.g., EEPROM).
Enclosure 14 may include resources that are to be shared among the plurality of computing components 16 (hereafter referred to as “shared resources”), such as power sources, networking, cooling elements (e.g., fans) and a backplane 26. Such an arrangement is more efficient than clustering together a group of independent computer systems, each of which includes its own resources such as a fan and power supply.
In addition to shared resources provided by enclosure 14, each individual computing component 16 may include its own resources, including memory (e.g., RAM and ROM), as well as input/output devices such as power and reset controls and one or more network interfaces (“NAC”) 27. Each computing component 16 may also include a management processor 28 (also referred to as a “baseboard management processor” or “BMC”) that manages resources of the particular computing component 16. In traditional systems, users may interact with enclosure controller 22 to communicate directly with management processor 28 of each computing component 16.
Each computing component 16 may also include a separate central processing unit 30 (“CPU”), as shown in this example, although it is not required (e.g., in a blade 18 with a single processor that performs both management functions and processing functions). The CPU 30 provides the “brains” of each computing component 16, executing instructions to provide computer programs such as operating systems or applications to users.
As mentioned above, the plurality of computing components 16 may be allocated among a plurality of logical partitions 32. When multiple computing components 16 are combined to form a logical partition, each computing component's CPU 30 cooperates with the CPUs 30 of the other computing components 16 of the partition 32 to form a cluster computing system.
In FIG. 1, for example, there are four computing components 16, each a blade, that are allocated among two partitions 32 of two computing components 16 each. Although two partitions are shown, other numbers of logical partitions may be implemented using a plurality of computing components 16. Moreover, it should be understood that a logical partition 32 may include any number of computing components 16, including a single computing component 16.
Even though multiple computing components 16 may be used together to form a single partition 32 and, thus, a single cluster computing system, a management processor 28 of each computing component 16 nevertheless may continue to be accessible. For example, each management processor 28 may continue to communicate with and be accessible through enclosure controller 22 (or another outside entity that is configured to exchange IPMI communications). If a logical partition 32 is formed from a relatively large number of computing components 16, managing resources of each individual computing component 16 through the computing component's management processor 28 may be difficult and resource intensive.
Rather than interacting with a management processor 28 of each computing component 16 of a partition 32 on an individual basis, embodiments described herein assign a role to each management processor of a partition. Each management processor of the partition may then cooperate with other management processors of the partition to control resources of the partition.
As used herein, “resources of the partition” may include hardware and software resources of each computing component 16 of a partition 32, such as memory, I/O interfaces, management interfaces and so forth. Resources of the partition may also include hardware resources that are shared among computing components 16 of a partition, such as the “shared resources” described above. Resources of the partition may also include software resources that are shared or distributed among computing components 16, such as an operating system of a cluster computing system formed from a plurality of computing components 16 of a partition 32.
In one example of cooperation among management processors to control resources of a partition 32, control of all management processors 28 of the partition 32 is delegated to a single management processor 28, referred to herein as the “primary management processor” 38. Each of the other management processors 28 of the partition 32 is relegated to the role of a secondary or “auxiliary” management processor 40, in which it becomes subservient to the primary management processor 38 of the same partition.
Because the primary management processor 38 of a partition 32 acts on behalf of all management processors 28 of the partition 32, it is unnecessary to relinquish control of the partitioning to an outside entity (such as enclosure controller 22). Instead, by adopting the roles described herein, management processors 28 may themselves allocate and control partitioning of the plurality of computing components 16 among a plurality of logical partitions 32.
The plurality of computing components 16 may allocate themselves among the plurality of logical partitions 32 according to a partition description received at each management processor 28 from a domain management controller 42, or “DMC.” DMC 42 may be software executing on one of the management processors 28 (typically on a first blade), as indicated by the dashed lines surrounding DMC 42 in FIG. 1. Alternatively, DMC 42 may be a separate hardware processor of a blade, or even a separate computing component 16 of enclosure 14. DMC 42 may control distribution of partition descriptions to the entire enclosure 14.
A partition description may include information identifying the plurality of logical partitions 32, the computing components 16 that are to be assigned to each logical partition, and the role (primary or auxiliary) that is to be assigned to each computing component's management processor 28 within the partition.
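A minimal sketch of what such a partition description might contain in code form follows, based only on the elements listed above (the partitions, the components assigned to each, and the role of each component's management processor). The class and field names are assumptions for illustration; the patent does not specify a concrete data format.

```python
# Sketch of a partition description: partitions, member components, and the
# primary/auxiliary role of each component's management processor. Names and
# structure are assumptions, not a format defined by the patent.
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    AUXILIARY = "auxiliary"


@dataclass
class MemberAssignment:
    component_id: str   # e.g. "BLADE1"
    role: Role          # role of this component's management processor


@dataclass
class PartitionDescription:
    # Maps a partition identifier to the components assigned to it.
    partitions: dict


# Example loosely mirroring FIG. 1 (two partitions of two blades each);
# the names of the second partition's blades are assumed.
description = PartitionDescription(partitions={
    "partition-1": [MemberAssignment("BLADE1", Role.PRIMARY),
                    MemberAssignment("BLADE2", Role.AUXILIARY)],
    "partition-2": [MemberAssignment("BLADE3", Role.PRIMARY),
                    MemberAssignment("BLADE4", Role.AUXILIARY)],
})
```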
DMC 42 may receive a partition description from any number of sources. For example, it may receive a partition description from a network server, from removable media made available to enclosure 14 (e.g., a USB jump drive, CD or DVD), or via manual input at enclosure controller 22 (or elsewhere). In some instances, DMC 42 may receive a partition description from a portable device that is used to physically and/or electronically interconnect each computing component 16 of a partition.
Once DMC 42 receives a partition description, it may forward the description to each management processor 28 of the plurality of computing components 16 in enclosure 14. Each management processor 28 may then assume a role of primary or auxiliary management processor within its partition based upon the received partition description.
A management processor 28 that assumes the role of primary management processor 38 of a partition 32 (also referred to as “monarch” or “master”) may be a liaison between outside entities and resources associated with the partition 32. Resources associated with a partition 32 may include resources of each computing component 16 of the partition (e.g., I/O, memory), shared resources provided by enclosure 14 that are associated specifically with that partition 32, and software executing within the partition 32. For example, a primary management processor 38 of a partition 32 may serve as a source for all management interfaces of a partition 32. An example of software executing within the partition 32 is an operating system that is executing on CPUs 30 of the computing components 16 of the partition 32.
To serve as an exclusive liaison for a partition, primary management processor 38 of a partition may enable one or more external interfaces 44. An enabled external interface 44 may serve as an interface from an outside entity to any number of resources associated with the partition. External interface 44 may be a user interface to an operating system executing on one or more computing components 16 of the partition. External interface 44 may additionally or alternatively be a management interface for interacting directly with the primary management processor 38. External interface 44 may also be a NAC 27 on the same computing component 16 as the primary management processor 38; in such cases, primary management processor 38 may be configured to establish a network connection on behalf of the entire partition. In sum, external interface 44 may be a communication pathway from outside entities to primary management processor 38 that is used to manage resources associated with the partition.
An example of this is seen in FIG. 1. Each computing component 16 has two external interfaces 44; one that connects to backplane 26 and the other that is a NAC 27. Management processor 28 of the computing component 16 labeled BLADE 1 has assumed a role of primary management processor 38 of the first partition 34. Accordingly, BLADE 1 has both of its external interfaces 44 enabled.
All other management processors 28 in a partition 32 other than the primary management processor 38 may assume a role of auxiliary management processor 40 (also referred to as a “slave”). Auxiliary management processors 40 of a partition 32 may be subservient to the primary management processor 38 of the same partition 32.
For example, in FIG. 1, management processor 28 of the computing component 16 labeled BLADE 2 has assumed a role of an auxiliary management processor 40 of the first partition 34. Accordingly, BLADE 2 has both its external interfaces 44 disabled. This is indicated by the dashed lines in the connection to backplane 26 and the shaded area in the case of the NAC 27. To be “disabled” does not necessarily mean a physical and/or electrical connection is severed. Rather, the external interface 44 is made unavailable (e.g., invisible) to users of system 10.
Because their external interfaces 44 are disabled, auxiliary management processors 40 may be configured to interact with entities that are external to the system 10, or even with enclosure controller 22, exclusively through the primary management processor 38 of their partition 32 using an internal interface 46 established with the primary management processor 38. Internal interfaces 46 may be invisible and/or inaccessible to all entities except for the primary management processor 38 of the partition.
Auxiliary management processor 40 may send and receive communications containing various information, such as device health information or power/reset commands, through internal interface 46. In other words, external communications directed to or from any management processor of a partition may be routed through the primary management processor 38.
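The sketch below illustrates this routing pattern under stated assumptions: the primary management processor is the only externally visible endpoint, and it relays requests such as a device-health query to the auxiliary management processor of the target blade over an internal interface. The class and method names are hypothetical.

```python
# Illustrative routing sketch (hypothetical names): external requests reach
# only the primary management processor, which forwards them to the target
# blade's auxiliary management processor over an internal interface.
class AuxiliaryProcessor:
    def __init__(self, blade_id: str):
        self.blade_id = blade_id

    def handle_internal(self, request: str) -> str:
        # Reached only via the internal interface from the primary.
        if request == "device_health":
            return f"{self.blade_id}: health OK"
        raise ValueError(f"unsupported request: {request}")


class PrimaryProcessor:
    def __init__(self, blade_id: str):
        self.blade_id = blade_id
        self.auxiliaries = {}   # blade_id -> AuxiliaryProcessor

    def register(self, aux: AuxiliaryProcessor) -> None:
        self.auxiliaries[aux.blade_id] = aux

    def handle_external(self, target_blade: str, request: str) -> str:
        # Outside entities address the primary; it forwards the request to
        # the target blade's auxiliary management processor when needed.
        if target_blade == self.blade_id:
            return f"{self.blade_id}: handled locally"
        return self.auxiliaries[target_blade].handle_internal(request)


primary = PrimaryProcessor("BLADE1")
primary.register(AuxiliaryProcessor("BLADE2"))
print(primary.handle_external("BLADE2", "device_health"))
```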
FIG. 2 depicts a method of allocating a plurality of computing components 16 among a plurality of logical partitions 32. Although these steps are shown in a particular order, it should be understood that these steps may be performed in any number of sequences. In step 100, partition description information is distributed (e.g., by DMC 42) to the management processor 28 of each computing module 16 of the plurality of computing modules 16 of enclosure 14. As noted above, DMC 42 may have received the partition description from an entity outside of enclosure 14 prior to step 100.
Next, at step 102, each management processor 28 reads the partition description to determine in which partition that management processor's computing module 16 belongs, as well as that management processor's role in the partition.
If, at step 102, the management processor 28 determines that it is a primary management processor 38, then the management processor 28 may enable an external interface 44 (if the external interface 44 is disabled) at step 104. At step 106, the primary management processor 38 thereafter provides exclusive access to resources associated with the partition of which it is a member.
For example, if a system administrator wishes to ascertain the device health of a particular computing component 16 of the partition, the administrator may direct his or her device health request to the primary management processor 38 of the partition through an enabled external interface 44. Primary processor 38 may then communicate with the target computing component's management processor 28, which presumably is an auxiliary management processor 40, through an internal interface 46 to obtain the target component's device health.
If, at step 102, the management processor 28 determines that it is an auxiliary management processor 40, then it may disable its external interface 44 (if it is enabled) at step 108. At step 110, an internal interface is established between the auxiliary management processor 40 and the primary management processor 38 of the same partition. Thereafter, communications between an outside entity and the auxiliary management processor 40 are directed through the primary management processor 38 of the same partition.
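A hedged sketch of this role-assumption flow is given below. The partition description is simplified to a dictionary and the interface-toggling helpers are placeholders standing in for firmware behavior; none of this is taken verbatim from the patent.

```python
# Sketch of the FIG. 2 flow: each management processor reads the distributed
# partition description, determines its partition and role, and enables or
# disables its interfaces accordingly. Format and helpers are assumptions.
def enable_external_interfaces(blade_id: str) -> None:
    print(f"{blade_id}: external interfaces enabled")            # step 104


def disable_external_interfaces(blade_id: str) -> None:
    print(f"{blade_id}: external interfaces disabled")           # step 108


def establish_internal_interface(blade_id: str, partition: str) -> None:
    print(f"{blade_id}: internal interface to primary of {partition}")  # step 110


def assume_role(blade_id: str, description: dict) -> None:
    # Step 102: look up this blade's partition and role in the description.
    partition, role = description[blade_id]
    if role == "primary":
        enable_external_interfaces(blade_id)
        # Step 106: this processor now fronts all resources of the partition.
    else:
        disable_external_interfaces(blade_id)
        establish_internal_interface(blade_id, partition)


# Example loosely following FIG. 1: BLADE1 primary, BLADE2 auxiliary.
description = {"BLADE1": ("partition-1", "primary"),
               "BLADE2": ("partition-1", "auxiliary")}
for blade in description:
    assume_role(blade, description)
```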
Computing components may be configured to dynamically reallocate themselves between partitions, rather than have the reallocation controlled by an outside entity (e.g., enclosure controller 22). For example, a system of computing components may include a first computing component that is allocated to a first partition and a second computing component that is allocated to a second partition. A domain management processor may distribute a partition description to the management processors of each computing component that indicates that the first computing component is to be reallocated to a third partition, while the second computing component is to remain part of the second partition.
The management processor of the first computing component may be configured to reallocate the first computing component to the third partition without affecting the second computing component. Meanwhile, the management processor of the second computing component may receive the partition description and determine that it does not indicate that the second computing component should switch partitions. Accordingly, the management processor of the second computing component may not reboot or interrupt a central processing unit of the second computing component.
FIG. 3 depicts an example method of reallocating at least some computing components of a system to different partitions without affecting other computing components. Although these steps are shown in a particular order, it should be understood that these steps may be performed in any number of sequences. FIGS. 4 and 5 depict, schematically, the configuration of example computing components among example partitions before and after the steps of FIG. 3 are performed, respectively.
In step 200, a partition description may be distributed to management processors of first, second, third and fourth computing components, 300, 302, 304 and 306, respectively. The partition description may indicate that: the first computing component 300 is to be reallocated from a first partition 400 to a second partition 402; the second computing component 302 is to remain part of a third partition 404; a third computing component 304 is to be reallocated from the first partition 400 to a fourth partition 406; and a fourth computing component 306 is to be reallocated from a fifth partition 408 to the same second partition 402 to which the first computing component 300 is being reallocated.
In step 202, the first computing component 300 may be dynamically reallocated from the first partition 400 to the second partition 402 without affecting other computing components of other partitions. For example, a management processor of the first computing component 300 may reboot a central processing unit of the first computing component 300 and cause the central processing unit to join the second partition 402 once the reboot is complete.
The third computing component 304 began as part of the same first partition 400 as the first computing component 300. A management processor of the third computing component 304 receives the partition description in step 200. The management processor of the third computing component 304 may then reboot a central processing unit of the third computing component 304 and cause it to join the fourth partition 406 once the reboot is complete, in step 204.
The fourth computing component 306 began as part of the fifth partition 408. A management processor of the fourth computing component 306 receives the partition description in step 200. The management processor of the fourth computing component 306 may then reboot a central processing unit of the fourth computing component 306 and cause it to join the same second partition 402 as the first computing component 300, once the reboot is complete, in step 206.
Meanwhile, at step 200, the management processor of the second computing component 302 may receive the partition description and determine that the second computing component 302 will not be swapping partitions. Accordingly, the management processor of the second computing component 302 may not cause a central processing unit of the second computing component 302 to reboot, and the second computing component 302 may continue to operate as part of the third partition 404 without interruption.
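The decision each management processor makes in FIGS. 3-5 can be summarized in the short sketch below, offered as an assumption-laden illustration rather than the patented implementation: a component reboots and rejoins only when its own partition assignment changes, so every other component continues uninterrupted.

```python
# Sketch of the FIGS. 3-5 reallocation decision, assuming the partition
# description can be reduced to a mapping of component id to partition id.
# Keys use the component reference numerals 300-306; "partition-1" through
# "partition-5" stand in for partitions 400-408 in the figures.
def apply_reallocation(component_id: str, current: dict, new: dict) -> str:
    if new[component_id] == current[component_id]:
        # No change: this component's CPU is neither rebooted nor interrupted.
        return f"{component_id}: stays in {current[component_id]}, no reboot"
    # Assignment changed: this component's management processor reboots its
    # CPU and has it join the new partition once the reboot completes.
    return (f"{component_id}: reboots, leaves {current[component_id]}, "
            f"joins {new[component_id]}")


current = {"300": "partition-1", "302": "partition-3",
           "304": "partition-1", "306": "partition-5"}
new     = {"300": "partition-2", "302": "partition-3",
           "304": "partition-4", "306": "partition-2"}
for component in current:
    print(apply_reallocation(component, current, new))
```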
The disclosure set forth above may encompass multiple distinct embodiments with independent utility. The specific embodiments disclosed and illustrated herein are not to be considered in a limiting sense, because numerous variations are possible. The subject matter of this disclosure includes all novel and nonobvious combinations and subcombinations of the various elements, features, functions, and/or properties disclosed herein. The following claims particularly point out certain combinations and subcombinations regarded as novel and nonobvious. Other combinations and subcombinations of features, functions, elements, and/or properties may be claimed in applications claiming priority from this or a related application. Such claims, whether directed to a different embodiment or to the same embodiment, and whether broader, narrower, equal, or different in scope to the original claims, also are regarded as included within the subject matter of the present disclosure.
Where the claims recite “a” or “a first” element or the equivalent thereof, such claims include one or more such elements, neither requiring nor excluding two or more such elements. Further, ordinal indicators, such as first, second or third, for identified elements are used to distinguish between the elements, and do not indicate a required or limited number of such elements, and do not indicate a particular position or order of such elements unless otherwise specifically stated.

Claims (20)

We claim:
1. A system of computing components, comprising:
a plurality of logical partitions that operate independently from one another and each independently share software and hardware resources between computing components within each of the logical partitions;
a first computing component with a first management processor, the first computing component being allocated to a first logical partition; and
a second computing component with a second management processor, the second computing component being allocated to a second logical partition;
wherein the first management processor of the first computing component is configured to reallocate the first computing component to a third logical partition without affecting the second computing component, wherein affecting a computing component includes interrupting the computing component; and
wherein the first and second management processors are configured to receive a partition description that:
indicates whether the first and second management processors have a primary or auxiliary role among other management processors in each respective logical partition; and
disables an external interface associated with a management processor with an auxiliary role and enables an external interface associated with a management processor with a primary role.
2. The system of claim 1, wherein the management processor is configured to allocate the first computing component to the third logical partition in response to receiving a partition description.
3. The system of claim 2, wherein the partition description describes the first computing component being allocated to the third logical partition and the second computing component being allocated to the second logical partition.
4. The system of claim 2, wherein the first computing component includes a central processing unit, and wherein allocating the first computing component to the third logical partition includes the management processor rebooting the central processing unit.
5. The system of claim 2, further comprising a domain management processor that is configured to distribute the partition description to the management processor of the first computing component and a management processor of the second computing component.
6. The system of claim 5, wherein the domain management processor is executing on a predetermined management processor of a computing component of the system.
7. The system of claim 1, further comprising:
a third computing component with a management processor, the third computing component being allocated to the first logical partition;
wherein the management processor of the third computing component is configured to reallocate the third computing component to a fourth logical partition without affecting the second computing component.
8. The system of claim 1, further comprising:
a third computing component with a management processor, the third computing component being allocated to a fourth logical partition;
wherein the management processor of the third computing component is configured to reallocate the third computing component to the third logical partition without affecting the second computing component.
9. The system of claim 1, wherein the management processor is configured to:
assume a role among management processors of one of the plurality of logical partitions; and
cooperate with other management processors of one of the logical partitions to control resources of the logical partition.
10. A system, comprising:
a plurality of logical partitions that operate independently from one another and each independently share software and hardware resources between computing components within each of the logical partitions;
a first computing component that includes a first management processor allocated to a first logical partition among the plurality of logical partitions, wherein the management processor is configured to:
reallocate the first computing component from the first logical partition to a second logical partition without affecting a second computing component that is allocated to a third logical partition, wherein affecting a computing component includes interrupting the computing component, and wherein the management processor of the first computing component and a management processor of the second computing component are configured to receive a partitioning description that:
indicates whether the first and second management processors have a primary or auxiliary role among other management processors in each respective partition; and
disables an external interface associated with a management processor with an auxiliary role and enable an external interface associated with a management processor with a primary role.
11. The computing component of claim 10, wherein the management processor is configured to allocate the computing component to the second logical partition in response to receiving a partition description.
12. The computing component of claim 11, wherein the partition description describes the computing component being allocated to the second logical partition and the second computing component being allocated to the third logical partition.
13. The computing component of claim 10, wherein the first computing component includes a central processing unit, and wherein allocating the first computing component to the second logical partition includes the management processor rebooting the central processing unit.
14. The computing component of claim 10, wherein the management processor is configured to:
assume a role among management processors of one of the plurality of logical partitions; and
cooperate with other management processors of one of the logical partitions to control resources of the logical partition.
15. A method of dynamically reallocating computing components between partitions, comprising:
operating a plurality of logical partitions independently from one another, wherein the software and hardware resources are shared between computing components within each of the logical partitions;
distributing a partition description to management processors of a first computing component that is allocated to a first logical partition and a second computing component that is allocated to a second logical partition, wherein the partition description:
indicates that the first computing component is to be allocated to a third logical partition and the second computing component is to be allocated to the second logical partition and whether the management processors of the first and second computing components have a primary or auxiliary role among other management processors in each respective partition; and
disables an external interface associated with a management processor with an auxiliary role and enables an external interface associated with a management processor with a primary role; and
reallocating, by the management processor of the first computing component, the first computing component from the first partition to the third partition without affecting the second computing component, wherein affecting a computing component includes interrupting the computing component.
16. The method of claim 15, wherein the first computing component includes a central processing unit, and wherein reallocating the first computing component to the third logical partition includes the management processor of the first computing component rebooting the central processing unit.
17. The method of claim 15, further comprising:
distributing the partition description to a management processor of a third computing component that is allocated to the first logical partition; and
reallocating, by the management processor of the third computing component, the third computing component to a fourth logical partition without affecting the second computing component.
18. The method of claim 15, further comprising:
distributing the partition description to a management processor of a third computing component that is allocated to a fourth logical partition; and
reallocating, by the management processor of the third computing component, the third computing component from the fourth logical partition to the third logical partition without affecting the second computing component.
19. The method of claim 15, further comprising:
providing access to resources of each logical partition exclusively at a primary management processor of the partition.
20. The method of claim 15, wherein affecting the computing component includes rebooting the computing component.
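The reallocation flow recited in claims 10-20 can be pictured with a short, purely illustrative sketch. The following Python is not from the patent; it is a minimal model, using invented names (PartitionDescription, Assignment, Role, ManagementProcessor, reboot_local_cpu), of how each blade's management processor might act on a distributed partition description: assume a primary or auxiliary role, enable or disable its external interface accordingly, and reboot only its own CPU when its component moves to a different partition.

```python
# Hypothetical sketch only -- names and structure are assumptions, not the
# patent's actual implementation.
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Role(Enum):
    PRIMARY = "primary"      # owns the partition's external interface
    AUXILIARY = "auxiliary"  # external interface disabled


@dataclass
class Assignment:
    partition_id: int
    role: Role


@dataclass
class PartitionDescription:
    # Maps each computing component (e.g., blade) ID to its target
    # partition and the role its management processor should assume.
    assignments: Dict[str, Assignment] = field(default_factory=dict)


class ManagementProcessor:
    def __init__(self, component_id: str, current_partition: int):
        self.component_id = component_id
        self.current_partition = current_partition
        self.role = Role.AUXILIARY
        self.external_interface_enabled = False

    def apply(self, description: PartitionDescription) -> None:
        """Act only on this component's entry; other components' entries are
        applied by their own management processors, so nothing here
        interrupts (reboots or resets) any other blade."""
        target = description.assignments.get(self.component_id)
        if target is None:
            return  # description does not concern this component

        # Enable the external interface only for the primary role.
        self.role = target.role
        self.external_interface_enabled = (target.role == Role.PRIMARY)

        # Moving to a different partition requires rebooting the local CPU
        # so it joins the new partition's shared resources.
        if target.partition_id != self.current_partition:
            self.current_partition = target.partition_id
            self.reboot_local_cpu()

    def reboot_local_cpu(self) -> None:
        # Placeholder for the firmware call that resets this blade's CPU.
        print(f"{self.component_id}: rebooting CPU into partition "
              f"{self.current_partition}")


# Usage: distribute one description to every management processor; only the
# blade whose partition changes is rebooted.
if __name__ == "__main__":
    desc = PartitionDescription(assignments={
        "blade-1": Assignment(partition_id=3, role=Role.PRIMARY),    # moves 1 -> 3
        "blade-2": Assignment(partition_id=2, role=Role.AUXILIARY),  # stays in 2
    })
    for mp in (ManagementProcessor("blade-1", 1), ManagementProcessor("blade-2", 2)):
        mp.apply(desc)
```

Because each management processor applies only its own entry in the description, a blade whose assignment does not change is never rebooted or otherwise interrupted, which corresponds to reallocating one component "without affecting" another.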
US12/639,335 2009-12-16 2009-12-16 Dynamically reallocating computing components between partitions Active 2031-05-04 US8407447B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/639,335 US8407447B2 (en) 2009-12-16 2009-12-16 Dynamically reallocating computing components between partitions

Publications (2)

Publication Number Publication Date
US20110145540A1 (en) 2011-06-16
US8407447B2 (en) 2013-03-26

Family

ID=44144209

Family Applications (1)

Application Number Title Priority Date Filing Date
US12/639,335 Active 2031-05-04 US8407447B2 (en) 2009-12-16 2009-12-16 Dynamically reallocating computing components between partitions

Country Status (1)

Country Link
US (1) US8407447B2 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8429162B1 (en) * 2011-09-27 2013-04-23 Amazon Technologies, Inc. Facilitating data redistribution in database sharding
US10496507B2 (en) * 2017-09-21 2019-12-03 American Megatrends International, Llc Dynamic personality configurations for pooled system management engine

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6725317B1 (en) * 2000-04-29 2004-04-20 Hewlett-Packard Development Company, L.P. System and method for managing a computer system having a plurality of partitions
US7103639B2 (en) 2000-12-05 2006-09-05 Hewlett-Packard Development Company, L.P. Method and apparatus for processing unit synchronization for scalable parallel processing
US6971002B2 (en) 2001-08-09 2005-11-29 International Business Machines Corporation Method, system, and product for booting a partition using one of multiple, different firmware images without rebooting other partitions
US7146515B2 (en) 2002-06-20 2006-12-05 International Business Machines Corporation System and method for selectively executing a reboot request after a reset to power on state for a particular partition in a logically partitioned system
US7565398B2 (en) * 2002-06-27 2009-07-21 International Business Machines Corporation Procedure for dynamic reconfiguration of resources of logical partitions
US7150022B2 (en) * 2003-05-27 2006-12-12 Microsoft Corporation Systems and methods for the repartitioning of data
WO2005036367A2 (en) 2003-10-08 2005-04-21 Unisys Corporation Virtual data center that allocates and manages system resources across multiple nodes
WO2005036405A1 (en) 2003-10-08 2005-04-21 Unisys Corporation Computer system para-virtualization using a hypervisor that is implemented in a partition of the host system
US7340579B2 (en) 2004-11-12 2008-03-04 International Business Machines Corporation Managing SANs with scalable hosts
US20070027948A1 (en) 2005-06-23 2007-02-01 International Business Machines Corporation Server blades connected via a wireless network
US20090228685A1 (en) * 2006-04-27 2009-09-10 Intel Corporation System and method for content-based partitioning and mining

Similar Documents

Publication Publication Date Title
US20230208731A1 (en) Techniques to control system updates and configuration changes via the cloud
US11354053B2 (en) Technologies for lifecycle management with remote firmware
US10951458B2 (en) Computer cluster arrangement for processing a computation task and method for operation thereof
Gupta et al. HPC-aware VM placement in infrastructure clouds
US7051215B2 (en) Power management for clustered computing platforms
CN101142553B (en) OS agnostic resource sharing across multiple computing platforms
US20080043769A1 (en) Clustering system and system management architecture thereof
US9477286B2 (en) Energy allocation to groups of virtual machines
JP2005032242A (en) Monitoring system and monitoring method of utilization of resource, and performance of application
JP2007094611A (en) Computer system and its boot control method
CN104714846A (en) Resource processing method, operating system and equipment
US7783807B2 (en) Controlling resource transfers in a logically partitioned computer system
US8407447B2 (en) Dynamically reallocating computing components between partitions
US7200700B2 (en) Shared-IRQ user defined interrupt signal handling method and system
US8239539B2 (en) Management processors cooperating to control partition resources
EP4124932A1 (en) System, apparatus and methods for power communications according to a cxl.power protocol
US20160073543A1 (en) Zoneable power regulation
US20110213838A1 (en) Managing at least one computer node
US9933826B2 (en) Method and apparatus for managing nodal power in a high performance computer system
KR20230067755A (en) Memory management device for virtual machine
EP2693718B1 (en) Information processing system, collecting program, and collecting method

Legal Events

Date Code Title Description
AS Assignment

Owner name: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P., TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DUISENBERG, KENNETH C.;KOEHLER, LOREN M.;FARKAS, IVAN;AND OTHERS;SIGNING DATES FROM 20091207 TO 20091214;REEL/FRAME:023703/0444

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP, TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.;REEL/FRAME:037079/0001

Effective date: 20151027

FPAY Fee payment

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8