WO1993023806A1 - Method and apparatus for reducing memory wearout in a computer system - Google Patents

Method and apparatus for reducing memory wearout in a computer system Download PDF

Info

Publication number
WO1993023806A1
Authority
WO
WIPO (PCT)
Prior art keywords
translation
computer memory
input addresses
addresses
data
Prior art date
Application number
PCT/US1992/006713
Other languages
French (fr)
Inventor
Lishing Liu
William Robert Reohr
David Tjeng-Ming Shen
Original Assignee
International Business Machines Corporation
Priority date
Filing date
Publication date
Application filed by International Business Machines Corporation filed Critical International Business Machines Corporation
Priority to EP92917787A priority Critical patent/EP0640228A1/en
Priority to JP5520163A priority patent/JPH07503564A/en
Publication of WO1993023806A1 publication Critical patent/WO1993023806A1/en


Classifications

    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11C - STATIC STORES
    • G11C29/00 - Checking stores for correct operation; Subsequent repair; Testing stores during standby or offline operation
    • G11C29/70 - Masking faults in memories by using spares or by reconfiguring
    • G11C29/88 - Masking faults in memories by using spares or by reconfiguring with partially good memories
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 - Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 - Addressing or allocation; Relocation
    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11C - STATIC STORES
    • G11C8/00 - Arrangements for selecting an address in a digital store

Abstract

Device and method for managing storage of data in the storage locations of a computer memory by periodically modifying a translation scheme so that input addresses are altered in a systematic and predetermined manner such that storage of data is evenly distributed among computer memory storage locations.

Description

METHOD AND APPARATUS FOR REDUCING MEMORY WEAROUT IN A COMPUTER SYSTEM
TECHNICAL FIELD
This invention relates generally to memories of computer systems and, more particularly, to reducing or preventing wearout of storage locations and circuits in computer memory systems.
BACKGROUND OF THE INVENTION
Basic computer storage hierarchies include main storage, cache storage and auxiliary storage (for example, disks). Main storage is typically implemented with either static RAM (SRAM) or dynamic RAM (DRAM) array chips; and main storage stores data that is directly accessible by program software via a corresponding physical address. Cache storage is utilized as a fast buffer for data accessing by the central processing unit (CPU) in order to compensate for the long lead-delays associated with main memory accesses.
Breakdown or failure of any one circuit or conductor of a memory chip or system can cause the entire memory chip to fail. One cause of failure of a memory circuit is wearing out of the circuit. Many factors contribute to the wearout of memory circuits. For example, the phenomenon of electromigration is a contributing factor to circuit wearout. Electromigration is the movement of atoms caused by collisions with energetic electrons. This motion is due to an exchange of momentum between the electrons and the atoms. Over a period of time, voids can form in the metal conductor as a result of electromigration. Voids in a metal conductor cause an increase in current density. This can be explained by the fact that the size of the conductor is reduced as a result of the voids, but the remaining conductor must still carry the same quantity of energetic electrons. Such a current density increase accentuates conductor atom movement, thus accelerating metal void growth in the conductor. Eventually, a complete open circuit in the metal conductor may result, possibly causing a breakdown of the entire memory chip.
In order to reduce or prevent conductor wearout, current density guidelines for long term reliability are established by memory circuit designers. Essentially, the guidelines dictate how thick a wire must be so as to adequately support a certain current over a given period of time, that period of time being the expected lifetime of the chip. Reliability guidelines are set so that a memory chip will not likely wear out over its useful lifetime.
When memory chips are designed, worst case current density numbers are determined for each circuit. The worst case numbers assume that the same circuit switches every memory cycle. Since memory locations are used non-uniformly relative to each other, the chip designer must design memory chips based on the worst case scenario. Under some system applications, certain physical address locations may be used preferentially over others. As a result, the memory circuits which correspond to the preferentially used locations will fire/switch more frequently than the other memory circuits. Accordingly, the problem remains that the preferentially used memory circuits have a higher duty cycle than the other memory circuits, and thus these memory circuits have a greater likelihood of failure. A solution to this problem is to make circuit usage more uniform, so that the worst case current density specification for each circuit can be reduced. Ultimately, thinner wires can be used so as to improve both the physical density and the speed of the circuit for the same lifetime assumptions.
Further, software programs have a natural tendency to access memory locations in a manner which is highly skewed toward relatively small numbers of locations. For example, studies of execution traces have shown that in a main memory having a total size of 16 million bytes, the most frequently accessed 1000 blocks (each block having a size of 128 bytes) of the main memory accounted for more than 90% of total CPU accesses to the main memory. Similar studies have also shown highly skewed access patterns to small numbers of blocks in caches, and further studies have reported similar results for auxiliary storage accesses. For example, the so-called 80/20 Rule states that roughly 80% of accesses to disks are concentrated in 20% of the memory locations on the disks.
The skewed nature of memory accesses may cause undesirable consequences for the long term reliability of storage systems. This is especially true in storage technologies which are particularly vulnerable to wearout conditions. For example, flash memory technology is currently popular in implementing nonvolatile storage (e.g., main memory). A cell in a flash memory chip generally has a lifetime of 100,000 to 1,000,000 accesses before a wearout condition occurs. Although different memory technologies may have better lifetime characteristics than flash memory, it is generally true that long term reliability can be improved by reducing average access frequencies to memory cells. Highly skewed patterns of data access are inherently undesirable in this regard, since smaller portions of memory cells tend to be more subject to wearout risks due to more frequent accesses.
Another potential cause of memory reliability problems is power shorts or opens at the metal level due to heat generated from current. Heat accumulation may deteriorate the dielectric material. Uneven memory access patterns to certain memory locations tend to increase the occurrences of such undesirable effects.
There have been proposals for reducing wearout possibilities of memory by spreading memory accesses more uniformly. For example, in U.S. Patent No. 4,922,456 to Naddor et al., a technique is described to spread accesses to a double buffer for a nonvolatile memory design in order to evenly distribute wearout possibilities. However, Naddor et al. deals with specific rotation techniques for accessing a circular buffer, which serves as a temporary buffer for memory writes, and does not deal with reducing the skew of accesses to the memory itself.
Moreover, it is generally known in the art to provide mapping or translating of CPU-specified addresses to internal addresses which correspond to particular physical storage locations in memory. For example, such mapping has been performed for the purpose of virtual-to-real address translation through software page tables, which allows multiple programs to dynamically share real main memory under the control of operating systems. Another example of such mapping is a configuration table for a large memory, which allows the memory to be reconfigured by large chunks (e.g., 4 megabyte blocks) and offers the capability of isolating failed chunks from being accessed. However, such mapping methods have not been performed for the purpose of providing more uniform memory circuit usage.
SUMMARY OF THE INVENTION
Although skewed patterns of memory accesses are inherent in the nature of software programs, the present invention solves the above problem by achieving more even accesses to memory cells by periodically altering the manner in which data is stored in memory. These techniques offer the opportunity to improve long term memory reliability by breaking or altering the normal patterns in which memory cells are used.
Thus, in accordance with the present invention, there is provided a device and method for managing storage of data in the storage locations of a computer memory. The device includes translation means for receiving input addresses and altering the input addresses in accordance with a translation scheme so as to develop output addresses. Each of the input addresses and each of the output addresses corresponds to a computer memory storage location. Further, controlling means for developing and sending control signals to the translation means effectively controls the translation scheme of the translation means so that the altering of the input addresses by the translation means is conducted in a systematic and predetermined manner such that storage of data is evenly distributed among computer memory storage locations.
The method in accordance with the invention comprises receiving input addresses from a source, each of the input addresses comprising segments; providing a translation scheme for altering the input addresses in accordance therewith; periodically modifying the translation scheme in a systematic and predetermined manner so that storage of data is evenly distributed among computer memory storage locations; altering the input addresses in accordance with the translation scheme for developing output addresses; and outputting the output addresses.
BRIEF DESCRIPTION OF THE DRAWINGS
These and other objects, features, aspects and advantages will be more readily apparent and better understood from the following detailed description of the invention, in which:
FIG. 1 is a block diagram in accordance with a preferred embodiment of the present invention;
FIG. 2 is a specific example of translation circuitry in accordance with a first preferred embodiment of the present invention;
FIGS. 3A-D show rotation of data in memory locations in accordance with the invention;
FIG. 4 is another embodiment of translation circuitry;
FIGS. 5A-C show how data can be preserved when changing translation schemes;
FIG. 6 is a prior art cache memory;
FIG. 7 is the cache memory of FIG. 6 with the present invention incorporated therein; and
FIG. 8 is a block diagram of another embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention will first be discussed herein relative to computer memory systems in general, that is, any computer memory system, and then the invention will be further discussed as particularly incorporated in a cache storage system. However, it should be fully understood that the scope of the invention is not limited to cache storage. In this regard, those skilled in the art will appreciate and realize from the detailed description provided hereinbelow that the present invention can be applied to improve long term reliability in other memory systems, such as registers, magnetic disk storage, semiconductor disk storage, etc.
Referring initially to FIG. 1, in the course of performing various computing operations, a computer component, such as a central processing unit (CPU) 15, will write data to memory 20 or retrieve data from memory 20. Memory 20 is representative of any computer memory and can be a complete storage hierarchy, such as main storage, or it can be merely a portion or segment of a memory chip. Generally, memory 20 is organized into a plurality of distinct individually addressable locations. Each location has a unique corresponding address by which it can be accessed by the CPU 15 for fetching the contents stored in the location or for writing data to the location. More specifically, during each write operation, the CPU 15 specifies an address which corresponds to the location to which data is to be written; and during each read operation, the CPU 15 specifies an address which corresponds to the location from which data is to be fetched.
In accordance with the present invention, translation circuitry 25 is connected between the CPU 15 and memory 20, and a translation controller 30 is connected to the translation circuitry 25. The translation circuitry 25 can be provided as an additional layer of mapping in conjunction with any translation or mapping circuitry that may already exist in a computer memory system, or if no such mapping circuitry exists in the computer memory system, then the translation circuitry 25 can be the only layer of mapping that exists in the system.
Any one of a number of commercially available programmable microcontrollers are capable of being utilized for the translation controller 30. Since such microcontrollers are well known to those skilled in the art, no further explanation will be provided and it should be understood that these micro¬ controllers are readily programmable to perform the functions of the translation controller 30 as described hereinbelow.
Generally speaking, after the CPU 15 specifies an input address to be accessed, the translation circuitry 25 receives the input address and alters or translates the input address in accordance with a pre-specified translation scheme so as to develop an output address. Further, the translation controller 30 sends control signals to the translation circuitry 25 so as to intelligently regulate the translation scheme of the translation circuitry 25. The translation scheme can be periodically varied by the translation controller 30 as desired in order to facilitate balancing of workload or accessing among all memory locations. The specific period over which the translation scheme must be varied by the translation controller 30 depends on the particular type of memory 20 being balanced. In this regard, the translation scheme can be dynamically varied over any time period as desired, such as every second, minute, day, week, etc.
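The following is a minimal sketch of this data path in Python: translation circuitry alters every CPU-specified input address according to whichever scheme the translation controller has most recently installed, so the same input address reaches different physical locations in different time periods. The class and function names are illustrative and not taken from the patent; the concrete schemes shown are only one possibility.

class TranslationCircuitry:
    def __init__(self):
        self.scheme = lambda addr: addr      # identity mapping until told otherwise

    def translate(self, input_addr):
        return self.scheme(input_addr)       # output address -> physical location

class TranslationController:
    """Periodically installs a new scheme (e.g., every second, minute, or day)."""
    def __init__(self, circuitry, schemes):
        self.circuitry = circuitry
        self.schemes = schemes
        self.period = 0

    def next_period(self):
        self.period += 1
        self.circuitry.scheme = self.schemes[self.period % len(self.schemes)]

# Four illustrative schemes for a 2-bit address space.
schemes = [lambda a: a, lambda a: a ^ 0b01, lambda a: a ^ 0b10, lambda a: a ^ 0b11]
circuitry = TranslationCircuitry()
controller = TranslationController(circuitry, schemes)
for _ in range(4):
    physical = circuitry.translate(0b01)     # the same input address each period
    controller.next_period()                 # the controller then changes the scheme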
Numerous techniques and circuitry can be implemented for carrying out the translation scheme, and a multitude of translation schemes can be employed for achieving the desired results in accordance with the invention. As such, the examples specified herein are included for illustrative purposes only, and should not be interpreted as limitations of the invention. As one example, the translation circuitry 25 used to implement a translation scheme comprises a plurality of selectable inverters which systematically invert some or all of the bits of the addresses being specified by the CPU 15. As shown in FIG. 2, the selectable inverters may be EXCLUSIVE-OR gates 35. FIG. 2 is an example of the translation circuitry 25 required for a 2-bit address and, for illustrative purposes only, translation by the EXCLUSIVE-OR gates 35 for 2-bit addresses will now be described in greater detail with reference also to FIG. 3.
In the example, "i" represents an input address, and "o" represents an output address. The input address is the address specified by the CPU 15 which is to be accessed, and the output address is the address that is developed after the input address is translated by the translation circuitry 25. The output address corresponds to the physical storage location in the memory 20. Since each bit of the input address is received by a respective EXCLUSIVE-OR gate 35, in this example the translation circuitry 25 comprises two EXCLUSIVE-OR gates 35. Control signals from the translation controller 30 are also received by the EXCLUSIVE-OR gates 35. In order to allow each EXCLUSIVE-OR gate 35 to be separately and independently controlled, each EXCLUSIVE-OR gate 35 receives its own control signal from the translation controller 30. CBit 0 represents the control signal sent to the EXCLUSIVE-OR gate 35 corresponding to Bit 0 of the 2-bit address, and CBit 1 represents the control signal sent to the EXCLUSIVE-OR gate 35 corresponding to Bit 1 of the 2-bit address. Thus, the translation controller 30 can control whether or not a bit of the input address is inverted by sending either a high or low signal to the corresponding EXCLUSIVE-OR gate 35.
As shown in FIGS. 3A-D, four separate time periods, TIME1-TIME4, each with its own distinct translation scheme or mapping state, will be described. Each box 40-55 represents a physical storage location in the memory 20, and the bits within each box represent the 2-bit input address being specified by the CPU 15 for accessing.
TABLE I illustrates the control signals sent from the translation controller 30 to the EXCLUSIVE-OR gates 35 of the translation circuitry 25 during each of the four time periods TIME1-TIME4.
TABLE I

          CBit 0   CBit 1
Time1       0        0
Time2       0        1
Time3       1        0
Time4       1        1
During Time1, since the control signals CBit 0 and CBit 1 are both low (0), each bit of the translated output address will be equal to the corresponding bit of the input address, as follows:

Bit 0o = Bit 0i
Bit 1o = Bit 1i

Accordingly, during Time1, input address 00 and output address 00 are equal, and each corresponds to the same physical location 40; input address 01 and output address 01 are equal, and each corresponds to the same physical location 45; input address 10 and output address 10 are equal, and each corresponds to the same physical location 50; and input address 11 and output address 11 are equal, and each corresponds to the same physical location 55.
During Time2, the control signal CBit 0 sent to the EXCLUSIVE-OR gate 35 of Bit 0 is low (0) and the control signal CBit 1 sent to the EXCLUSIVE-OR gate 35 of Bit 1 is high (1). Thus, in accordance with the logic of the EXCLUSIVE-OR gates 35, Bit 0 of the translated output address will be equal to Bit 0 of the input address, and Bit 1 of the translated output address will be equal to the complement of Bit 1 of the input address, as follows:

Bit 0o = Bit 0i
Bit 1o = NOT(Bit 1i)
Thus, during Time2, input address 00 is translated to output address 01, and this input address 00 corresponds to physical location 45; input address 01 is translated to output address 00, and this input address 01 corresponds to physical location 40; input address 10 is translated to output address 11, and this input address 10 corresponds to physical location 55; and input address 11 is translated to output address 10, and this input address 11 corresponds to physical location 50.
During Time3, the control signal CBit 0 sent to the EXCLUSIVE-OR gate 35 of Bit 0 is high (1) and the control signal CBit 1 sent to the EXCLUSIVE-OR gate 35 of Bit 1 is low (0). Thus, in accordance with the logic of the EXCLUSIVE-OR gates 35, Bit 0 of the translated output address will be equal to the complement of Bit 0 of the input address, and Bit 1 of the translated output address will be equal to Bit 1 of the input address, as follows:

Bit 0o = NOT(Bit 0i)
Bit 1o = Bit 1i
Thus, during Time3, input address 00 is translated to output address 10, and this input address 00 corresponds to physical location 50; input address 01 is translated to output address 11 and, this input address 01 corresponds to physical location 55; input address 10 is translated to output address 00, and this input address 10 corresponds to physical location 40; and input address 11 is translated to output address 01, and this input address 11 corresponds to physical location 45.
During Time4, the control signal CBit 0 sent to the EXCLUSIVE-OR gate 35 of Bit 0 is high (1) and the control signal CBit 1 sent to the EXCLUSIVE-OR gate 35 of Bit 1 is high (1). Thus, in accordance with the logic of the EXCLUSIVE-OR gates 35, Bit 0 of the translated output address will be equal to the complement of Bit 0 of the input address, and Bit 1 of the translated output address will be equal to the complement of Bit 1 of the input address, as follows:

Bit 0o = NOT(Bit 0i)
Bit 1o = NOT(Bit 1i)
Thus, during Time4, input address 00 is translated to output address 11, and this input address 00 corresponds to physical location 55; input address 01 is translated to output address 10, and this input address 01 corresponds to physical location 50; input address 10 is translated to output address 01, and this input address 10 corresponds to physical location 45; and input address 11 is translated to output address 00, and this input address 11 corresponds to physical location 40.
As such, the physical locations in memory which correspond to the input addresses specified by the CPU 15 have been rotated by altering over each time period, TIME1-TIME4, the translating of the input addresses to output addresses. In other words, the same input address specified by the CPU 15 will access a different physical location during each time period.
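A short sketch reproducing the TABLE I behavior is given below. It assumes the two control signals are packed into a single integer mask with CBit 0 as the high-order bit, so that a high control bit complements the corresponding address bit; the function and variable names are illustrative only.

def xor_translate(input_addr, control_bits):
    """EXCLUSIVE-OR each address bit with its control bit."""
    return input_addr ^ control_bits

# Control signals for Time1..Time4 as in TABLE I: (CBit 0, CBit 1) packed as a mask.
control_by_period = {
    "Time1": 0b00,  # CBit0=0, CBit1=0 : output equals input
    "Time2": 0b01,  # CBit0=0, CBit1=1 : Bit 1 complemented
    "Time3": 0b10,  # CBit0=1, CBit1=0 : Bit 0 complemented
    "Time4": 0b11,  # CBit0=1, CBit1=1 : both bits complemented
}

for period, mask in control_by_period.items():
    mapping = {i: xor_translate(i, mask) for i in range(4)}
    print(period, {f"{i:02b}": f"{o:02b}" for i, o in mapping.items()})
# Over the four periods every input address visits every one of the physical
# locations 40, 45, 50 and 55 of FIG. 3, so accesses are spread evenly.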
Furthermore, although FIG. 2 and the example hereinabove for a 2-bit address describe a corresponding number of EXCLUSIVE-OR gates and address bits, it should be understood that it may not be necessary for there to be an EXCLUSIVE-OR gate for each address bit. In this regard, in larger memories, such as 1 Megabit memories or larger, it may be impractical for there to be one EXCLUSIVE-OR gate for each address bit. In such larger memories, it may only be practical and necessary to have EXCLUSIVE-OR gates connected to approximately 3 or 4 high-order address bits, thereby allowing for 8-16 different mapping states.
As shown in FIG. 4, another example of translation circuitry 25 that can be used to implement another translation scheme comprises a circular shift register 60 which systematically shifts the bits of the addresses being specified by the CPU 15.
The shift register 60 can be any conventional, commercially available register and may comprise a plurality of latches 65, each latch 65 corresponding to and receiving a respective bit of the input address specified by the CPU 15. The 4-bit input address, 0110, and four time periods, PERIOD1-PERIOD4, will be used to illustrate how the same input address, 0110, is translated by the shift register 60 so as to develop four distinct output addresses, each corresponding to a different physical location in memory 20.
In response to a load control signal received from the translation controller 30, each latch 65 of the shift register 60 is loaded with a bit of the input address. The register 60 then performs a shift operation in accordance with the shift control signal received from the translation controller 30. The resulting translated output address is then accessed in memory 20. In this example, during PERIOD1, no shift is performed and the translated output address is equal to the input address; and during PERIOD2, PERIOD3 and PERIOD4, the shift register 60 performs one, two and three shifts, respectively. The results are illustrated in TABLE II below.
TABLE II

                         BIT 0   BIT 1   BIT 2   BIT 3
INPUT ADDRESS:             0       1       1       0

            SHIFT        TRANSLATED OUTPUT ADDRESS
                         BIT 0   BIT 1   BIT 2   BIT 3
PERIOD1:      0            0       1       1       0
PERIOD2:      1            0       0       1       1
PERIOD3:      2            1       0       0       1
PERIOD4:      3            1       1       0       0
It should be understood that an input address of any length can be translated using a shift register, and input addresses of greater length would merely necessitate using a register having more latches so as to accommodate the bits of the input address. Further, if the input address is excessively long, it may not be necessary or practical to shift all bits of such an input address to adequately accomplish the goals of the present invention. In this regard, in such situations, it may only be necessary to shift, for example, 4 or 5 bits of the input addresses so as to suitably develop translated output addresses.
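A brief sketch of the shift-register scheme follows. It assumes a right circular shift over the four address bits, which reproduces TABLE II; the shift direction and the helper name rotate_right are assumptions for illustration.

def rotate_right(bits, shift):
    """Circularly shift a list of address bits to the right by `shift` positions."""
    n = len(bits)
    shift %= n
    if shift == 0:
        return list(bits)
    return bits[-shift:] + bits[:-shift]

input_bits = [0, 1, 1, 0]          # BIT 0 .. BIT 3 of the 4-bit input address 0110
for period, shift in enumerate([0, 1, 2, 3], start=1):
    print(f"PERIOD{period}: shift={shift} ->", rotate_right(input_bits, shift))
# PERIOD1 -> [0,1,1,0], PERIOD2 -> [0,0,1,1],
# PERIOD3 -> [1,0,0,1], PERIOD4 -> [1,1,0,0], matching TABLE II.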
As noted hereinabove, the translation scheme can be changed over any time period as desired. The specific time period over which the translation scheme should be changed depends on the particular design and reliability requirements of the memory system. However, it should be realized that if a translation scheme is changed while stored data is still needed, then any data stored under a previous translation scheme could not be correctly accessed by the CPU 15. This is because input addresses specified by the CPU 15 would no longer correspond to the same output addresses, i.e., the translation of input addresses to output addresses would be different, and the corresponding physical locations in memory would then also be different. Although some memory systems are used for temporary storage only, such as cache memory, changing the translation scheme can become a problem in other memory systems which are required to permanently store programs or blocks of data. Such permanently stored programs must be preserved when changing from one translation scheme and time period to another translation scheme and time period. For example, an operating system program residing on a hard disk drive requires such preservation.
With reference now to FIGS. 5A-C, in memory systems having such permanently stored programs or data, a temporary buffer 70 can be utilized to re-translate the program into a location in memory which is in accordance with the new translation scheme to be implemented. For example, if a new translation scheme and time period require the data in memory location 75 to be transferred to memory location 90, and vice versa, then the data in location 90 can be copied or transferred to the temporary buffer 70, as shown in FIG. 5A.
The temporary buffer 70 can be any unused memory space, such as cache memory, a portion of main memory, an external disk, etc., or it can be portions of memory chips dedicated for such a purpose. As shown in FIG. 5B, the data stored in location 75 can then be written to location 90 or, more specifically, the data in location 75 can be written over the data in location 90. Next, as shown in FIG. 5C, the data in the temporary buffer 70 (the data originally stored in location 90) can be written to location 75. Thus, the data previously stored in location 75 is now stored in location 90, and the data previously stored in location 90 is now stored in location 75. This data can then be correctly accessed in accordance with the new translation scheme and time period.
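A minimal sketch of the FIGS. 5A-C sequence is shown below, using a Python dictionary as a stand-in for the memory and a local variable as the temporary buffer 70; the location numbers follow the figures, and the data values are placeholders.

def swap_via_buffer(memory, loc_a, loc_b):
    buffer = memory[loc_b]         # FIG. 5A: copy the data in location 90 to the buffer
    memory[loc_b] = memory[loc_a]  # FIG. 5B: write the data in location 75 over location 90
    memory[loc_a] = buffer         # FIG. 5C: write the buffered data to location 75

memory = {75: "data-A", 90: "data-B"}
swap_via_buffer(memory, 75, 90)
assert memory == {75: "data-B", 90: "data-A"}
# The swapped data is now reachable under the new translation scheme.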
The present invention will now be discussed with specific application to a cache memory. First, a brief description of a cache memory will be provided so that operation of the invention in the cache can be presented in proper context. For simplicity of illustration a direct-mapped (1-way associative) cache 95 is shown in FIG. 6. It is assumed that all cache accesses from the CPU are in the unit of an 8-byte doubleword (DW).
In a typical high performance computer each access to main memory can be completed in no fewer than 20-30 machine cycles. In order to compensate for such long lead-delay for main memory accesses, cache storages have been widely utilized as fast buffers for CPU data accessing. In many computer systems, caches are implemented purely as hardware layers of a storage subsystem and are transparent to the program software.
A cache is generally implemented as a table of fixed-size memory blocks (e.g., 128 bytes) and comprises two portions, namely, a directory portion 100 and an array portion 105. The cache array 105 stores the actual data blocks and the cache directory 100 describes the contents (e.g., addresses and other status tags) of the data blocks in the array 105. When a data block accessed by the CPU is missing in the cache, which is often referred to as a cache miss condition, a new block entry may be inserted in the cache, which in turn may trigger the replacement of another already existing block from the cache.
Upon the creation of a new block entry in the cache, the data of the block is generally retrieved from a lower level of the storage hierarchy (e.g., main storage). Cache replacement is usually managed via least-recently-used (LRU) based algorithms. The probability (cache hit ratio) of a cache hit, which is the condition that a CPU data access hits a valid block entry in the cache, is normally in the range of 93-98% in modern computers. Therefore, in a computer with a cache, the cache itself intercepts and satisfies most of the data access requests (reads and writes) from the CPU.
The CPU specifies an input address 110, such as a 32-bit address (bits 0-31), to be accessed in the cache 95. The cache array 105 may be viewed as a 1-dimensional table with 1K (1024) entries, each entry containing a block (line) of data of size 128 bytes. Hence the total size of this cache is 128K bytes. For each access address to the cache 95 from the CPU, bits 25-31 of the address 110 are the line offset (i.e., the byte address within the 128-byte line), and bits 15-24 of the address 110 are used to select one of the 1K cache entries.
Since this is a direct-mapped cache, there can only be one correct line entry. The cache directory 100 stores address identifiers for all the line entries, and the cache array 105 stores all the data bits. For simplicity of illustration a single array chip is assumed to hold all the cache data bits. Each directory entry or record 115 has an ADDR field holding 15 address bits and a validity tag (V-bit). A cache entry will be considered valid only when the associated V-bit is high (1).
Bits 15-24 of the address 110 are used to select the line entry in both the directory 100 and the array 105. The selected directory entry is read out to compare logic circuitry 120 that compares bits 0-14 of address 110 to the ADDR field of the directory record 115, and also checks the V-bit of the record 115. When the compare logic circuitry 120 determines that bits 0-14 of the address 110 match the ADDR field of the directory record 115 and the V-bit of the record 115 is high, then the compare logic circuitry 120 sends a "HIT" signal to the CPU. If either condition is not met, then the compare logic circuitry 120 sends a "MISS" signal to the CPU.
At the cache array 105, bits 15-24 are used to select the 128 byte line 125, and bits 25-28 are used to select one of the 16 DW's in the selected line 125 for output to the CPU. The DW output is considered valid only when a "HIT" condition is received by the CPU from the compare logic circuitry 120. Upon a cache "MISS" condition, a copy of the required line needs to be fetched from main storage and loaded into the selected cache line. The directory information (address bits 0-14 and the V-bit) must then be properly updated to reflect the new line status.
Referring now to FIG. 7, in accordance with the present invention, the translation circuitry 25 and translation controller 30 can be implemented in the cache 95 so as to alter the line selection bits, bits 15-24, of the input address 110. Thus, selection of the line entry in the cache directory 100 and cache array 105 will be altered in accordance with a translated address. The translated address is developed in accordance with the translation scheme specified by the translation circuitry 25 and translation controller 30.
In other words, the particular line entry specified by bits 15-24 of the input address 110 will not be selected for accessing. Rather, bits 15-24 of the input address 110 will be modified by the translation circuitry 25 so as to develop a translated output address, and it is the translated output address that is used for selection for accessing.
It should be understood that the translation circuitry 25 and translation controller 30 illustrated in connection with the cache 95 can comprise the same circuitry as shown hereinabove in connection with memory systems in general. Further, the translation schemes outlined hereinabove can also be used in the cache 95, or any other suitable translation scheme can be employed for translating bits 15-24 of the input address 110.
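The following sketch shows how such a translation fits into the FIG. 6 address layout. The patent numbers bits from the most significant end (bit 0 is the MSB of the 32-bit address), so bits 0-14 are the tag, bits 15-24 the line-select index, and bits 25-31 the byte offset within a 128-byte line. Only the 10 index bits pass through the translation circuitry; the XOR mask used here is one possible scheme, and the function names are illustrative.

def split_address(addr32):
    tag = (addr32 >> 17) & 0x7FFF        # bits 0-14  (15 bits, compared against ADDR)
    index = (addr32 >> 7) & 0x3FF        # bits 15-24 (10 bits, selects 1 of 1K lines)
    offset = addr32 & 0x7F               # bits 25-31 (7 bits, byte within the line)
    return tag, index, offset

def translated_lookup_fields(addr32, xor_mask):
    """Return (tag, translated index, offset); only the index is altered."""
    tag, index, offset = split_address(addr32)
    return tag, index ^ (xor_mask & 0x3FF), offset

# With mask 0 the mapping is the conventional one of FIG. 6; a new mask from the
# translation controller moves every CPU address to a different line entry in
# both the directory 100 and the array 105.
tag, line, offset = translated_lookup_fields(0x12345678, xor_mask=0x155)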
In the case of utilization of the invention in cache storage, each time the translation scheme is changed, existing data contained in the cache may be purged or invalidated after main memory is updated with writes from the CPU. This invalidation process is particularly simple for a cache implemented with a so-called write-through scheme. The write-through scheme automatically causes each data write from the CPU to be written through to the main memory, in addition to being written to the cache.
Although in the above illustrations a single mapping is utilized for all accesses to a memory, it is possible to divide a single memory into more than one partition, with a separate mapping for each individual portion. Referring now to FIG. 8, memory 20 is divided into four portions 160-175. Each portion 160-175 has its own translation scheme, and each translation scheme is implemented by a respective portion 140-155 of the translation circuitry 25.
For each access from the CPU, a single memory portion is selected. One example of portion selection is based on 2 bits in the access address that are not altered by the translation. Each of the translation circuitry portions 140-155 receives separate control signals (e.g., an XOR mask) from the translation controller 30. For illustration purposes we assume that each of the translation circuitry portions 140-155 produces a mapped output address from the input address (from the CPU) by XOR'ing it with a bit-vector mask received from the translation controller 30. With this arrangement, data allocation may be reorganized on an individual portion basis.
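A sketch of this partitioning applied to the 10-bit line-select field is given below: the two high-order index bits (address bits 15-16) pick one of four portions and are never altered, while each portion's remaining 8 index bits are remapped with that portion's own XOR mask. The mask values and names are illustrative assumptions.

def translate_partitioned_index(index10, masks):
    partition = (index10 >> 8) & 0x3          # address bits 15-16, left unchanged
    low = index10 & 0xFF                      # the index bits that may be remapped
    return partition, (partition << 8) | (low ^ masks[partition])

masks = [0x00, 0x3C, 0xA5, 0xFF]              # one mask per portion 160-175
partition, new_index = translate_partitioned_index(0x2B7, masks)
# When the controller later changes masks[partition], only that partition's
# cache contents need to be invalidated; the other three quarters stay valid.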
In practice, reorganizing the mapping and the data allocation of a portion of the memory is less disruptive to performance than reorganizing the whole memory. For example, FIG. 8 may be viewed as an on-chip partition within the cache array chip 105 of FIG. 7. The partition selection is via bits 15-16 of the input address 110 from the CPU, and bits 15-16 are never altered by the translation circuitry 25. With this arrangement, the cache mapping may be reorganized on an individual partition basis periodically. Upon each mapping reorganization, only the cache contents of the associated partition need to be invalidated. One advantage is that the cache can still perform effectively as a fast memory buffer with three quarters of its contents remaining correct after the reorganization.

While the invention has been described in terms of specific embodiments, it is evident in view of the foregoing description that numerous alternatives, modifications and variations will be apparent to those skilled in the art. Thus, the invention is intended to encompass all such alternatives, modifications and variations which fall within the scope and spirit of the invention and the appended claims.

Claims

What is claimed is:
1. A device for managing storage of data in the storage locations of a computer memory, comprising: translation means for receiving input addresses and altering said input addresses in accordance with a translation scheme so as to develop output addresses, wherein each of said input addresses comprises a plurality of segments, and wherein each of said input addresses and each of said output addresses corresponds to a computer memory storage location; and controlling means for developing and sending control signals to said translation means for periodically modifying the translation scheme so that altering of said input addresses by said translation means is conducted in a systematic and predetermined manner such that storage of data is evenly distributed among computer memory storage locations.
2. A device according to claim 1, wherein the translation scheme is modified after elapsing of a preset time period.
3. A device according to claim 1 or 2, wherein said translation means alters all input addresses in accordance with a single translation scheme throughout a time period.
4. A device according to claim 1 being further adapted to manage storage of data in a computer memory which comprises a plurality of portions, each portion including at least one computer memory storage location, and wherein a plurality of translation schemes are provided, such that said translation means alters input addresses corresponding to a first portion of computer memory in accordance with a first translation scheme, and said translation means alters input addresses corresponding to a second portion of computer memory in accordance with a second translation scheme.
5. A device according to any one of claims 1-4, wherein data is written to the computer memory storage locations which correspond to the developed output addresses.
6. A device according to any one of the previous claims, wherein said translation means receives input addresses from a central processing unit.
7. A device according to any one of the previous claims, wherein said translation means alters selected segments of said input addresses for developing output addresses.
8. A device according to any one of the previous claims, wherein said translation means includes inversion means for inverting selected segments of said input addresses.
9. A device according to claim 8, wherein said inversion means comprises EXCLUSIVE-OR gates.
10. A device according to any one of the previous claims, wherein said translation means includes a circular shift register.
11. A device according to any of claims 1-3 or 5-10, wherein a plurality of translation schemes are provided, said device further comprising: means for reloading data previously stored in accordance with a first translation scheme to computer memory storage locations which are in accordance with a second translation scheme, said first translation scheme being modified for developing said second translation scheme.
12. A device according to claim 11, wherein said means for reloading data includes a temporary buffer for temporarily holding data to be reloaded.
13. A method for managing storage of data in the storage locations of a computer memory, comprising the steps of: receiving input addresses from a source, each of said input addresses comprising segments; providing a translation scheme for altering said input addresses in accordance therewith; periodically modifying the translation scheme in a systematic and predetermined manner so that storage of data is evenly distributed among computer memory storage locations; altering said input addresses in accordance with the translation scheme for developing output addresses; and outputting said output addresses.
14. A method according to claim 13, wherein the step of periodically modifying the translation scheme is conducted after elapsing of a preset time period.
15. A method according to claim 13 or 14, wherein all input addresses are altered in accordance with a single translation scheme throughout a time period.
16. A method according to claim 13 being further adapted to manage storage of data in a computer memory which comprises a plurality of portions, each portion including at least one computer memory storage location, and wherein a plurality of translation schemes are provided, such that input addresses corresponding to a first portion of computer memory are altered in accordance with a first translation scheme, and input addresses corresponding to a second portion of computer memory are altered in accordance with a second translation scheme.
17. A method according to any one of claims 13-16, further comprising the step of writing data to computer memory storage locations which correspond to the developed output addresses.
18. A method according to any one of claims 13-17, wherein the source from which input addresses are received is a central processing unit.
19. A method according to any one of claims 13-18, wherein the step of altering said input addresses comprises altering selected segments of said input addresses.
20. A method according to any one of claims 13-19, wherein the translation scheme includes inverting selected segments of said input addresses.
21. A method according to claim 20, wherein the inverting of selected segments of said input addresses is conducted using EXCLUSIVE-OR gates.
22. A method according to any one of claims 13-15 or 17-21, further comprising the step of reloading data previously stored in computer memory storage locations in accordance with a first translation scheme to computer memory storage locations in accordance with a second translation scheme, said first translation scheme being modified for developing said second translation scheme.
PCT/US1992/006713 1992-05-12 1992-08-11 Method and apparatus for reducing memory wearout in a computer system WO1993023806A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP92917787A EP0640228A1 (en) 1992-05-12 1992-08-11 Method and apparatus for reducing memory wearout in a computer system
JP5520163A JPH07503564A (en) 1992-05-12 1992-08-11 Method and apparatus for reducing memory wear in computer systems

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US88195492A 1992-05-12 1992-05-12
US881,954 1992-05-12

Publications (1)

Publication Number Publication Date
WO1993023806A1 true WO1993023806A1 (en) 1993-11-25

Family

ID=25379565

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1992/006713 WO1993023806A1 (en) 1992-05-12 1992-08-11 Method and apparatus for reducing memory wearout in a computer system

Country Status (3)

Country Link
EP (1) EP0640228A1 (en)
JP (1) JPH07503564A (en)
WO (1) WO1993023806A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2293034A (en) * 1994-09-09 1996-03-13 Hyundai Electronics Ind Address input buffer with signal converter
WO1999016079A1 (en) * 1997-09-24 1999-04-01 Siemens Aktiengesellschaft METHOD TO BE APPLIED WHEN EEPROMs ARE USED AS PROGRAMME MEMORIES
FR2787216A1 (en) * 1998-12-11 2000-06-16 Bull Cp8 PROCESS FOR STORING AND OPERATING INFORMATION UNITS IN A SECURITY MODULE, AND ASSOCIATED SECURITY MODULE
WO2002041151A1 (en) * 2000-11-16 2002-05-23 Gemplus Method and device for making secure data processing
CN113039529A (en) * 2018-11-15 2021-06-25 美光科技公司 Address obfuscation for memory
WO2021221911A1 (en) * 2020-05-01 2021-11-04 Micron Technology, Inc. Balancing data for storage in a memory device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014530422A (en) * 2011-10-27 2014-11-17 ▲ホア▼▲ウェイ▼技術有限公司 Method and buffer system for controlling buffer mapping
JP6213040B2 (en) * 2013-08-19 2017-10-18 富士通株式会社 Semiconductor memory device and method for controlling semiconductor memory device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4064558A (en) * 1976-10-22 1977-12-20 General Electric Company Method and apparatus for randomizing memory site usage
DE3130546C1 (en) * 1981-08-01 1983-04-07 TE KA DE Felten & Guilleaume Fernmeldeanlagen GmbH, 8500 Nürnberg Operation of EAROM ensuring equal loading of elements or cells - cyclically transferring data between locations to increase operational life

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4064558A (en) * 1976-10-22 1977-12-20 General Electric Company Method and apparatus for randomizing memory site usage
DE3130546C1 (en) * 1981-08-01 1983-04-07 TE KA DE Felten & Guilleaume Fernmeldeanlagen GmbH, 8500 Nürnberg Operation of EAROM ensuring equal loading of elements or cells - cyclically transferring data between locations to increase operational life

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
IBM TECHNICAL DISCLOSURE BULLETIN vol. 30, no. 10, March 1988, NEW YORK US pages 420 - 423 'Improved Cache Mapping Apparatus' *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2293034A (en) * 1994-09-09 1996-03-13 Hyundai Electronics Ind Address input buffer with signal converter
GB2293034B (en) * 1994-09-09 1997-07-16 Hyundai Electronics Ind Address input buffer with signal converter
WO1999016079A1 (en) * 1997-09-24 1999-04-01 Siemens Aktiengesellschaft METHOD TO BE APPLIED WHEN EEPROMs ARE USED AS PROGRAMME MEMORIES
US7003673B1 (en) 1998-12-11 2006-02-21 Cp8 Technology Method for storing and operating on data units in a security module and associated security module
WO2000036511A1 (en) * 1998-12-11 2000-06-22 Bull Cp8 Method for storing and operating data units in a security module and associated security module
FR2787216A1 (en) * 1998-12-11 2000-06-16 Bull Cp8 PROCESS FOR STORING AND OPERATING INFORMATION UNITS IN A SECURITY MODULE, AND ASSOCIATED SECURITY MODULE
WO2002041151A1 (en) * 2000-11-16 2002-05-23 Gemplus Method and device for making secure data processing
US8006058B2 (en) 2000-11-16 2011-08-23 Gemalto Sa Method and securing electronic device data processing
CN113039529A (en) * 2018-11-15 2021-06-25 美光科技公司 Address obfuscation for memory
US11853230B2 (en) 2018-11-15 2023-12-26 Micron Technology, Inc. Address obfuscation for memory
WO2021221911A1 (en) * 2020-05-01 2021-11-04 Micron Technology, Inc. Balancing data for storage in a memory device
US11262937B2 (en) 2020-05-01 2022-03-01 Micron Technology, Inc. Balancing data for storage in a memory device
US11733913B2 (en) 2020-05-01 2023-08-22 Micron Technology, Inc. Balancing data for storage in a memory device

Also Published As

Publication number Publication date
EP0640228A1 (en) 1995-03-01
JPH07503564A (en) 1995-04-13

Similar Documents

Publication Publication Date Title
US5813031A (en) Caching tag for a large scale cache computer memory system
US6772316B2 (en) Method and apparatus for updating and invalidating store data
EP1196850B1 (en) Techniques for improving memory access in a virtual memory system
US9152569B2 (en) Non-uniform cache architecture (NUCA)
JP3239218B2 (en) Cache management system
EP0729102B1 (en) Cachability attributes for virtual addresses in virtually and physically indexed caches
US8255630B1 (en) Optimization of cascaded virtual cache memory
US6493812B1 (en) Apparatus and method for virtual address aliasing and multiple page size support in a computer system having a prevalidated cache
US6874077B2 (en) Parallel distributed function translation lookaside buffer
US6993628B2 (en) Cache allocation mechanism for saving elected unworthy member via substitute victimization and imputed worthiness of substitute victim member
US11580029B2 (en) Memory system, computing system, and methods thereof for cache invalidation with dummy address space
US6996679B2 (en) Cache allocation mechanism for saving multiple elected unworthy members via substitute victimization and imputed worthiness of multiple substitute victim members
US10831377B2 (en) Extended line width memory-side cache systems and methods
TWI393050B (en) Memory device and method with on-board cache system for facilitating interface with multiple processors, and computer system using same
JPS59114658A (en) Management of data memory space
US5317704A (en) Storage relocating method and hierarchy storage system utilizing a cache memory
US6157980A (en) Cache directory addressing scheme for variable cache sizes
US20130238856A1 (en) System and Method for Cache Organization in Row-Based Memories
US5649143A (en) Apparatus and method for providing a cache indexing scheme less susceptible to cache collisions
WO1993023806A1 (en) Method and apparatus for reducing memory wearout in a computer system
US6510493B1 (en) Method and apparatus for managing cache line replacement within a computer system
US7865691B2 (en) Virtual address cache and method for sharing data using a unique task identifier
US6324633B1 (en) Division of memory into non-binary sized cache and non-cache areas
Kim et al. Memory systems
JPH05127996A (en) Cache invalidation control system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 1992917787

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1992917787

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1992917787

Country of ref document: EP