CA1165893A - Error-correcting system - Google Patents

Error-correcting system

Info

Publication number
CA1165893A
Authority
CA
Canada
Prior art keywords
error
memory
bit
data
logic circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
CA000359377A
Other languages
French (fr)
Inventor
Genzo Nagano
Masao Takahashi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C29/00 Checking stores for correct operation; Subsequent repair; Testing stores during standby or offline operation
    • G11C29/70 Masking faults in memories by using spares or by reconfiguring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1012 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using codes or arrangements adapted for a specific type of error
    • G06F11/1024 Identification of the type of error

Abstract

ABSTRACT OF THE DISCLOSURE
An error-correcting system is disclosed, which system is located between a memory and a central processing unit. The system is comprised of a relief bit memory, an ECC (Error Correction Code) logic circuit, a switching circuit and a correction controlling circuit. The ECC logic circuit detects the occurrence of a soft error and a hard error. When a hard error occurs in the memory, the defective memory cell thereof is switched to the relief bit memory. Accordingly, data to be written into the memory or the relief bit memory is switched by means of the switching circuit. Similarly, data to be read from the memory or the relief bit memory is also switched by the switching circuit. The data to be stored in the relief bit memory is validated by means of the ECC logic circuit and the switching circuit. Further, the (n+1)-bit soft and hard errors are reduced to n-bit soft and hard errors by means of the ECC logic circuit and the switching circuit.

Description

DESCRIPTION

TITLE OF THE INVENTION
An Error-Correcting System
TECHNICAL FIELD
The present invention relates to an error-correcting system and, more particularly, relates to an error-correcting system for correcting errors occurring in a main memory of a computer system.
BACKGROUND ART
In recent years, most memories have been fabricated by using semiconductor memory devices, so that memories having very large capacities can easily be obtained and, also, very low price memories can be realized. Generally, the larger the capacities of the memories become, the more errors occur in the memories, and accordingly, in such very large capacity memories, it is very important to supervise the occurrences of errors produced in the memories. In order to supervise the occurrences of errors, an ECC (Error Correction Code) logic circuit has been proposed. The ECC logic circuit usually cooperates with the main memory of the computer system, so as to automatically correct n-bit errors (n is a positive integer) occurring in the memory and, also, to detect an occurrence of (n+1)-bit errors in the memory.
At the present time, extremely large capacity memories are about to be put to practical use. For example, in a RAM (Random Access Memory), the bit density thereof is being advanced from 64 K(Kilo)bits to 256 Kbits. Accordingly, the 256 Kbits RAM will soon be widely put to practical use.
However, when the bit density increases to such a very high bit density as, for example, 256 Kbits, a certain problem arises. The problem is the occurrence of a so-called soft error. The term "soft error" is a recent term, because the phenomenon of soft error was found only a few years ago. The reason such soft error phenomenon is created is now considered to be as follows. When the bit density of the memory is increased, the memory must be fabricated by using very fine conductors distributed over a great number of memory cells. Accordingly, electric charges, to be stored in parasitic capacitors distributed along said very fine conductors, become very small. Such very small electric charges are liable to be dissipated by external forces due to, for example, an application of radioactive rays, above all alpha rays, to the high density memory cells. That is, if the alpha rays pass through one of the memory cells, the logic of the data (electric charges) stored in the memory cell, through which the alpha rays have passed, may easily be inverted into the opposite data logic. Thus, the above mentioned soft errors often occur in a high density memory device. It should be noted that the soft error is a single-bit error which occurs at random in the memory device and, also, does not occur repeatedly at the same memory cell.
Although the term "soft error" is a recent term, the term "hard error" is already widely known. The hard error is created due to an occurrence of trouble in, or wear out of, the memory device. Such hard error occurs repeatedly in the same memory cell and, also, the logic of the data in the memory cell having the hard error is fixed to be either logic "1" or logic "0". We know, through experience, that the hard error may occur at a rate of 200 through 250 FIT, which means that any hard error occurs with a probability of 200 through 250 × 10^-9 per hour. It is a well known fact that, usually, some of the hard errors, for example, about 30% of the hard errors among all the errors occurring during a unit of time, are hard errors that are spread over bits of a plurality of addresses in one memory device or over bits of all the addresses contained in one memory device.
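As a rough illustration of the unit, the following sketch converts a FIT figure into a per-hour failure probability, using the standard definition of one FIT as one failure per 10^9 device-hours. The function names and the device count of 72 are hypothetical choices made for the example and are not taken from this disclosure.

```python
# Hedged sketch: FIT (failures in time) arithmetic.
# One FIT is one failure per 10**9 device-hours.
FIT_HOURS = 1e9


def fit_to_rate_per_hour(fit):
    """Failure probability per device per hour for a given FIT figure."""
    return fit / FIT_HOURS


def mean_hours_between_failures(fit, devices=1):
    """Mean time between failures, in hours, for a population of devices."""
    return FIT_HOURS / (fit * devices)


if __name__ == "__main__":
    # 72 devices is an assumed population (e.g. one device per bit of a word).
    for fit in (200, 250):
        print(fit, "FIT ->", fit_to_rate_per_hour(fit), "per hour;",
              mean_hours_between_failures(fit, devices=72), "hours between failures")
```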
Further, it is important to know that, in the memory device having an extremely high bit density, for example 256 Kbits, the frequency of occurrences of the soft errors may be much higher than the frequency of occurrences of the hard errors, by more than about one thousand times. Consequently, the following problems will be produced.
(A) Firstly, although certain fixed data have been written in respective memory cells through write operations, soft errors may occur at two bit positions in each of some words as time elapses after said write operation. However, in this case, the above mentioned ECC logic circuit cannot correct such 2-bit errors. This is because, as previously mentioned, the ECC logic circuit functions so as to correct n-bit errors and also detect an occurrence of (n+1)-bit errors. Generally, the positive number n is determined to be 1, that is n=1, from the point of view of the use of hardware in the computer system, and accordingly, the ECC logic circuit usually cannot correct the above mentioned 2-bit soft errors, but merely detect the occurrence thereof.
(B) Secondly, when the hard errors occur and are spread over bits of a plurality of addresses, it is concluded that each of a very large number of words contains a 1-bit error therein at the same time. Taking as an example a RAM device having 64 Kbits, there is a possibility that a 1-bit error occurs in each of 64,000 words at most. In this case, if each of the words is composed of 8 bytes, such 1-bit error is spread all over the words of the memory device having a capacity of 0.5 M(Mega)bytes. If this occurs in the memory device, since, as previously mentioned, the frequency of occurrences of the soft errors is very high, the 2-bit errors will frequently be created in some bits comprising the 0.5 Mbytes, after about ten-odd hours from the time of said write operations. It should be noted that such 2-bit errors cannot be corrected by means of the ECC logic circuit.
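The behaviour described in (A) and (B), correcting one bit but only detecting two, can be sketched with a small single-error-correcting, double-error-detecting code. The sketch below uses an extended Hamming (8,4) code as a stand-in; the disclosure does not specify which code the ECC logic circuit uses, and the function names are hypothetical.

```python
# Hedged sketch: a toy SEC-DED code (extended Hamming (8,4)), n = 1.
# Position 0 holds the overall parity; positions 1, 2 and 4 hold check bits.

def encode(data):
    """Encode four data bits into an 8-bit codeword."""
    c = [0] * 8
    c[3], c[5], c[6], c[7] = data
    c[1] = c[3] ^ c[5] ^ c[7]
    c[2] = c[3] ^ c[6] ^ c[7]
    c[4] = c[5] ^ c[6] ^ c[7]
    c[0] = sum(c[1:]) % 2          # makes the overall parity even
    return c


def decode(word):
    """Return (status, word); status is 'ok', 'corrected' or 'double'."""
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos        # XOR of the positions holding a 1
    parity_odd = sum(word) % 2
    if parity_odd:                 # odd parity: treated as a single-bit error
        fixed = list(word)
        fixed[syndrome] ^= 1       # syndrome 0 means the parity bit itself
        return "corrected", fixed
    if syndrome:                   # even parity, nonzero syndrome: two errors
        return "double", None
    return "ok", list(word)


if __name__ == "__main__":
    good = encode([1, 0, 1, 1])
    one = list(good); one[5] ^= 1
    two = list(good); two[2] ^= 1; two[7] ^= 1
    print(decode(one)[0], decode(two)[0])
```

A 2-bit error leaves the overall parity even, so the decoder can tell that something is wrong but has no way of locating either bit.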
Regarding the above mentioned 2-bit errors, the 2-bit errors may be classified into two modes. In a first mode, the 2-bit errors are composed of both a first 1-bit soft error and a second 1-bit soft error. In a second mode, the 2-bit errors are composed of both a 1-bit hard error and a 1-bit soft error. With regard to the first mode, it is easy for the ECC logic circuit to correct such 2-bit soft errors. This is because, as previously mentioned, the soft error does not occur repeatedly in the same memory cell. Accordingly, if the ECC logic circuit detects each soft error and corrects the same, thereafter, the data at the corresponding memory cell can be corrected by writing the corrected data again into the same memory cell. That is, when the ECC logic circuit detects such soft error during an ordinary memory accessing operation, the ECC logic circuit can correct the same by rewriting the corrected data again into the same memory cell. Alternately, it may also be possible for the ECC logic circuit to first scan the memory device with a certain period and read the data of all the addresses thereof sequentially, and then, if the soft error is found, rewrite the corrected data again into the same memory cell.
However, in the above mentioned second mode, including both a 1-bit hard error and a 1-bit soft error, it is impossible for the ECC logic circuit to correct the errors by the same rewriting operation of data as explained above with regard to the first mode, including only soft errors. Therefore, if such errors of the second mode occur in the memory device, such errors are left as they are until the next periodical maintenance is carried out. The effect of leaving such errors as they are is not serious for a computer system having a relatively small capacity main memory. However, the effect of leaving such errors as they are is very bad for a computer system having a very large capacity main memory, such as a main memory having more than 8 through 10 Mbytes. This is because, in such very large capacity main memory, the above mentioned 2-bit errors of the second mode may occur very frequently. Thus, such 2-bit errors must not be disregarded, so that high reliability can be maintained in the large capacity computer system.
DISCLOSURE OF INVENTION
An object of the present invention is, therefore, to provide an error-correcting system which can correct the aforesaid 2-bit errors of the second mode, including both hard and soft errors. Further, it is also an object of the present invention to create an error-correcting system which can be made of very simple hardware.
According to the present invention, there is provided an error-correcting system which is comprised of a first means for discriminating whether an error is a soft error or a hard error, a second means for storing data of a memory cell having a hard error, a third means for, when a hard error is found, switching the corresponding memory cell to said second means, and a fourth means for effecting a validation operation with respect to data stored in said second means. The error-correcting system further includes a fifth means for, when (n+1)-bit errors are found during the validation operation, reducing the (n+1)-bit errors to n-bit errors by inverting the logic of the output data from said second means, and further, carrying out a rewriting operation, in the second means, of data which is corrected by means of the ECC logic circuit.
BRIEF DESCRIPTION OF DRAWING
Fig. 1 is a schematic block diagram of a computer system in which an error-correcting system of the present invention is employed;
Fig. 2A schematically illustrates the arrangement of a memory array of a conventional memory, having no relief memory;
Fig. 2B schematically illustrates the arrangement of memory arrays of a memory, having a relief memory, according to the present invention;
Fig. 3 illustrates details of one example of the error-correcting system 12, illustrated in Fig. 1, according to the present invention;
Figs. 4A, 4B and 4C depict a flow chart used for explaining the operation of the error-correcting system 12, according to the present invention, and;
Fig. 5 is a block diagram depicting another example of the arrangement of gates 35, 36 and 37, shown in Fig. 3.
BEST MODE FOR CARRYING OUT THE INVENTION
The present invention will become more apparent from the detailed description of the preferred embodiments presented below, with reference to the accompanying drawings.
Referring to Fig. 1, a computer system 10 is mainly comprised of a memory, such as a main memory 11, an error-correcting system 12 and a CPU (Central Processing Unit) 13. The error-correcting system 12 functions as a 2-bit error-correcting system. The system 12 cooperates with the CPU 13, mainly via signal lines L1 and L2, and also via signal lines L8 and L9. The line L1 transfers write data to the memory 11, while the line L2 transfers read data from the memory 11. The 2-bit error-correcting system 12 is constructed by means of a relief bit memory 121, an ECC logic circuit 122, a switching circuit 123 and a correction controlling circuit 124. The relief bit memory 121 is incorporated with the memory 11.
The ECC logic circuit 122 operates to correct the 1-bit error and also to detect the occurrence of the 2-bit errors. The ECC logic circuit 122 transfers the write data, from the CPU 13, together with the error-correction code, to the memory 11 or 121, via signal line L3, the circuit 123 and signal line L5. On the other hand, the ECC logic circuit 122 receives the read data, together with the error-correction code, from the memory 11 or 121, via signal line L6, the circuit 123 and signal line L4, and then, removes the error-correction code from the read data, so as to transfer only the read data, via the signal line L2, to the CPU 13. At the same time, the ECC logic circuit 122 produces a 1-bit or 2-bit error-detection signal, and also, when an occurrence of a 1-bit error is detected, the circuit 122 produces an error-position signal. The error-position signal indicates an address at which the hard error has occurred in the memory 11. The error-detection signal is transferred from the circuit 122 to both the correction controlling circuit 124 and the CPU 13, via signal line L9. On the other hand, the error-position signal is transferred from the circuit 122 to both the circuit 124 and the CPU 13, via signal line L8.
Then the correction controlling circuit 124 supplies a relief bit selection signal and a control signal (mentioned hereinafter) to the switching circuit 123, via signal line L7. In this figure, it should be understood that typical and conventional address input signal lines, write control signal lines and read control signal lines are not illustrated; however, these lines are directly connected between the CPU 13 and the memories 11 and 121.
The arrangement of the memory arrays of both the memory 11 and the relief memory 121, according to the present invention, will be clarified with reference to Fig. 2B, by comparing them with the arrangement of the memory array of a conventional memory, having no relief memory, illustrated in Fig. 2A. In Fig. 2A, a data 21 is composed of, for example, 0th through 63rd bits of data, because each word is composed of 8 bytes. An error-correction code (ecc) 22 of the 64th through 71st bits is added to the data 21. Thus, each completed word is composed of 72 (64+8) bits.
In Fig. 2B, the memory 11 (see Fig. 1) is comprised of, for example, a 0.5 Mbytes memory. The 0.5 Mbytes memory is composed of 64K words, each of which is composed of 72 bits. Accordingly, a specified one of the words, for example, a word 23, illustrated by a dotted line, is a unit of data to be processed at every operation performed in the ECC logic circuit 122 (see Fig. 1).
The relief bit memory 121 (see Fig. 1) of the present invention is comprised of one or more relief bit memories. In this example, the 0th through 71st bits of the memory 11 are classified into six groups, G0 (0th through 11th bits), G1 (12th through 23rd bits), G2 (24th through 35th bits), G3 (36th through 47th bits), G4 (48th through 59th bits) and G5 (60th through 71st bits). Accordingly, six relief bit memories 121-0 through 121-5 are allotted to the respective groups G0 through G5. Each of the relief bit memories (121-0 through 121-5) is constructed so as to have a width of 1 bit (see "1 BIT" in Fig. 2B) and a length of 64 Kbits (see "64KBITS" in Fig. 2B). Consequently, since the memory 11 has addresses corresponding to the respective 64K words, each of the relief bit memories 121-0 through 121-5 can relieve one bit position, along all 64K words, contained in the corresponding one of the groups G0 through G5. Thus, each of the relief bit memories 121-0 through 121-5 functions to relieve the hard error bit of memory 11, created in the corresponding one of the groups G0 through G5.
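The grouping above amounts to simple index arithmetic, which can be sketched as follows. The function names are hypothetical, and "64K" is assumed here to mean 65,536 words (the text elsewhere rounds this to 64,000).

```python
# Hedged sketch: mapping a bit of the 72-bit word to its group and relief memory.
WORD_BITS = 72                          # 64 data bits + 8 error-correction code bits
DATA_BITS = 64
GROUPS = 6                              # G0 through G5
BITS_PER_GROUP = WORD_BITS // GROUPS    # 12 bits per group
WORDS = 64 * 1024                       # assumption: 64K = 65,536 addresses


def group_of(bit):
    """Group index (0 for G0 ... 5 for G5) that a word bit belongs to."""
    if not 0 <= bit < WORD_BITS:
        raise ValueError("bit index outside the 72-bit word")
    return bit // BITS_PER_GROUP


def line_within_group(bit):
    """Position of the bit inside its group, 0 through 11."""
    return bit % BITS_PER_GROUP


def relief_memory_name(bit):
    """Reference numeral of the relief bit memory serving this bit."""
    return f"121-{group_of(bit)}"


def data_capacity_bytes():
    """Data capacity of memory 11, excluding the error-correction code bits."""
    return WORDS * DATA_BITS // 8


if __name__ == "__main__":
    print(group_of(40), line_within_group(40), relief_memory_name(40))
    print(data_capacity_bytes())
```

Because each relief bit memory is one bit wide, only one bit position per group can be relieved at a time under this layout.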
Details of the error-correcting system 12 illustrated in Fig. 1, are illustrated in Fig. 3. In Fig. 3, regions 122, 123 and 124, indicated by long dash lines, are respectively identical to the ECC logic circuit 122, the switching circuit 123 and the correction controlling circuit 124, illustrated in Fig. 1. Further, signal lines (L3-0, L3-1 through L3-11), (L4-0, L4-1 through L4-11), (L5-0, L5-1 through L5-11), (L6-0, L6-1 through L6-11) and (L7-G0, L7-D) are respectively distributed along the signal lines L3, L4, L5, L6 and L7, illustrated in Fig. 1. It should be understood that the arrangement of Fig. 3 is illustrated with regard to one of the relief bit memories 121-0 through 121-5 (see Fig. 2B), for example, the memory 121-0, which is allotted to the group G0 (see Fig. 2B). Therefore, if the error-correcting system 12 (see Fig. 1) includes k relief bit memories (k is a positive integer), k arrangements, each identical to that of Fig. 3, must be employed in the system 12. In Fig. 3, memory 11-G0 of the group G0 contains therein twelve memory bits (0), (1) through (11), which correspond to the 0th through 11th bits shown in Fig. 2B. The memory 11-G0 of the group G0 cooperates with the relief bit memory 121-0. The write data, transferred from the ECC logic circuit 122 via the signal lines L3-0, L3-1 through L3-11, pass through the switching circuit 123 and are supplied to the memory 11-G0, via the signal lines L5-0, L5-1 through L5-11. When one of AND gates 30-0, 30-1 through 30-11 is opened by one of the relief bit selection signals, transferred on the signal lines L7-0, L7-1 through L7-11, one of said write data is supplied to the relief bit memory 121-0 via the AND gate, which is opened by the relief bit selection signal, and via an OR gate 31.
The relief bit selection signal, transferred on the lines L7-0, L7-1 through L7-11, is produced from the ECC logic circuit 122, in order to switch the memory bit ((0), (1) through (11)) to the relief bit memory 121-0, when the circuit 122 has detected a bit error during a read operation of the read data, transferred on the lines L4-0, L4-1 through L4-11. If the bit error occurs in the memory bit (1), the relief bit selection signal of logic "1" is supplied via the line L7-1. On the other hand, during a read operation, the read data are supplied to the CPU (see 13 in Fig. 1), via OR gates 34-0, 34-1 through 34-11, the signal lines L4-0, L4-1 through L4-11 and the ECC logic circuit 122. In this case, each of the OR gates 34-0, 34-1 through 34-11 produces either data from the corresponding memory bits ((0), (1) through (11)) or data from the relief bit memory 121-0, via OR gate 37.
If, for example, the memory bit (1) contains an error bit, the relief bit selection signal of logic "1" is supplied via the line L7-1, and accordingly, AND gate 32-1 is not opened, but AND gate 33-1 is opened. Therefore, the data from the memory 121-0 is selected and supplied, via the OR gate 37, to the OR gate 34-1. Thus, the data of the memory bit (1), containing a bit error, is prevented from being transmitted from the OR gate 34-1. Since the relief bit selection signal (error-position signal) of logic "1" is supplied only to the line L7-1, the AND gates 32-0 through 32-11, other than 32-1, are opened. Accordingly, the data, other than the data of memory bit (1), are supplied from the memory 11-G0.
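The gate-level behaviour of the switching circuit 123 for one group can be sketched in software as follows. This is a minimal model under stated assumptions: bits are the integers 0 and 1, the function names are hypothetical, the relief output is taken as it appears after the OR gate 37, and the gate numerals in the comments follow the description of Fig. 3.

```python
# Hedged sketch: read and write paths of the switching circuit for group G0.
GROUP_BITS = 12


def read_path(memory_bits, relief_bit, select):
    """Return the 12 bits placed on lines L4-0 through L4-11.

    select[i] is the relief bit selection signal on line L7-i.
    """
    out = []
    for m, s in zip(memory_bits, select):
        from_memory = m & (1 - s)        # AND gate 32-i, open when not selected
        from_relief = relief_bit & s     # AND gate 33-i, open when selected
        out.append(from_memory | from_relief)   # OR gate 34-i
    return out


def write_path(write_bits, select):
    """Return (bits sent to memory 11-G0, bit sent to relief memory 121-0)."""
    relief_in = 0
    for w, s in zip(write_bits, select):
        relief_in |= w & s               # AND gates 30-i feeding OR gate 31
    return list(write_bits), relief_in


if __name__ == "__main__":
    stored = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0]   # memory bit (1) assumed defective
    select = [0, 1] + [0] * 10                       # logic "1" on line L7-1 only
    print(read_path(stored, 0, select))
    print(write_path(stored, select))
```

With no selection line raised, the read path returns the memory bits unchanged and the relief memory receives logic "0".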
Regarding the relief bit memory 121-0, it is necessary for the memory 121-0 to store therein correct data. That is, the correct data must be the same as the data which would have been stored in one of the memory bits ((0), (1) through (11)), if no hard error had occurred at the memory bit. Accordingly, the validation operation must be carried out after the switching operation, from the memory bit ((0), (1) through (11)) to the relief bit memory, is completed, in order to store the corrected data in the relief bit memory. The validation operation can be achieved automatically by means of the ECC logic circuit 122. When the bit error occurs in, for example, the memory bit (1), the memory bit (1) is switched to the relief bit memory 121-0. Then, the data, composed of data of memory bits (0) through (11), except for (1), and the relief bit memory 121-0, are produced from the OR gates 34-0, 34-1 through 34-11. The above mentioned data and also the corresponding data of groups G1 through G5, in order to form a completed word (see 23 in Fig. 2B), are supplied to the ECC logic circuit 122. In this case, the logic of the data of memory 121-0 is not known. However, the ECC logic circuit 122 can determine whether or not the data of the memory 121-0 is correct by utilizing the error correction code. Further, the logic circuit 122 can correct the data of the memory 121-0, if the initially stored data of the memory 121-0 is not correct, by utilizing said error correction code. Thereafter, the corrected data is rewritten into the memory 121-0, and the validation operation is completed.
As previously mentioned, since the frequency of occurrence of the soft errors is much higher than that of the hard errors, 2-bit errors will often occur during the validation operation. If such a case occurs, the validation operation cannot be completed because the ECC logic circuit cannot correct 2-bit errors, but merely detect the occurrence thereof. However, according to the present invention, the error-correcting system 12 can complete said validation operation even though such 2-bit errors occur. The reason for this is as follows. When 2-bit errors occur during the validation operation, it is very probable that one of said 2-bit errors is an error which is produced from the relief bit memory. Therefore, if the output data from the relief bit memory is forced to be logic "1" or "0" sequentially, with the aid of the CPU 13 (Fig. 1) or a conventional service processor (not shown), it is very probable that said 2-bit errors are reduced to a 1-bit error, which can then be corrected through the 1-bit correction operation by the ECC logic circuit 122. With reference to Fig. 3, when 2-bit errors occur during the validation operation, the correction controlling circuit 124 is given the error position signal from the ECC logic circuit 122. If the error position signal indicates the memory bit (1) of the memory 11-G0, a gate control signal of logic "1" is supplied through signal line L7-G0. Therefore, an AND gate 35 is closed and, simultaneously, an AND gate 36 is opened. Next, the output data from the OR gate 37 is forced to be logic "1" or "0", supplied from the signal line L7-D. As previously mentioned, during the validation operation, since one of said 2-bit errors is very liable to be induced by the output data from the OR gate 37, the ECC logic circuit 122 can reduce the 2-bit errors to a 1-bit error by suitably providing either logic "1" or "0" from the OR gate 37. Thereafter, corrected data can be rewritten in the memory bits (0) through (11), other than (1), and also in the relief bit memory 121-0.
Regarding the arrangement of the gates 35, 36 and 37 another arrangement can also be employed, as will be explained hereinafter with reference to Fig. 5.
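The forcing arrangement of the gates 35, 36 and 37 described above behaves as a two-input selector, which can be sketched as a truth function. The function and argument names are hypothetical; bits are the integers 0 and 1.

```python
# Hedged sketch: gates 35, 36 and 37 as a selector between the relief memory
# output and the externally forced value on line L7-D.

def or_gate_37(relief_out, gate_control, forced_value):
    """Output of OR gate 37.

    gate_control is the signal on line L7-G0; forced_value is the signal on L7-D.
    """
    pass_relief = relief_out & (1 - gate_control)   # AND gate 35, closed when control is 1
    pass_forced = forced_value & gate_control       # AND gate 36, open when control is 1
    return pass_relief | pass_forced


if __name__ == "__main__":
    for relief in (0, 1):
        for control in (0, 1):
            for forced in (0, 1):
                print(relief, control, forced, "->", or_gate_37(relief, control, forced))
```

With the gate control signal at logic "0" the stored relief data passes through unchanged; with it at logic "1" the stored data is ignored entirely.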
The operation of the error-correcting system 12 will be more apparent from the flow chart depicted in Figs. 4A, 4B and 4C. When one or more bit errors occur in the memory 11 (Fig. 1), the ECC logic circuit 122 (Figs. 1 and 3) detects the bit errors (see the step representing "MEMORY ERROR DETECTION"). Then, the ECC logic circuit 122 discriminates whether or not the error is a 1-bit error (see the step representing "IS IT 1-BIT ERROR?"). If the result of this step is "NO", the error is a 2-bit error. Since the ECC logic circuit 122 cannot correct such a 2-bit error, it produces an alarm (see the step representing "GENERATE ALARM"). Contrary to this, if the result is "YES", the ECC logic circuit corrects the 1-bit error and, thereafter, rewrites the corrected data into the corresponding memory cell which has produced the bit error (see the step representing "CORRECT 1-BIT ERROR AND REWRITE CORRECTED DATA"). Then, the ECC logic circuit again reads the data from the memory 11, including the data of said memory cell, at the corresponding address (see the step representing "READ DATA AGAIN"). Thereafter, the ECC logic circuit detects the occurrence of errors in the data which has been rewritten (see the second step representing "IS IT 1-BIT ERROR?"). It should be remembered that, since the soft error does not occur repeatedly in the same memory cell, if said 1-bit error has been induced by a soft error, such 1-bit soft error is erased by the rewriting operation. Conversely, if said 1-bit error has been induced not by a soft error but by a hard error, such 1-bit error is still maintained regardless of the rewriting operation. That is, if the result of the second "IS IT 1-BIT ERROR?" step is "YES", it is concluded that the 1-bit error has been induced by a hard error. On the other hand, if the result is "NO", it is concluded that there may be either 2 or more bit errors or no error. A discrimination of this is achieved in the step representing "ARE THEY 2 OR MORE BIT ERRORS?". If the result of this step is "NO", it is determined that the absence of error is the result of a correction of a soft error. It is then indicated, in a following step, that the soft error has been corrected. Thus, a process [I], defined by the steps described above, can be called a process for discriminating whether the error is induced by a hard error or a soft error.
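The process [I] can be sketched as a short routine: correct, rewrite, read again, and see whether the error survives. In the sketch below the ECC logic circuit is replaced by a stand-in checker that simply compares against a known-good word; that checker, the `Word` class and all names are assumptions made for illustration only.

```python
# Hedged sketch: discriminating a soft error from a hard error (process [I]).

class Word:
    """One memory word whose cells may be stuck at a fixed logic value."""

    def __init__(self, bits, stuck=None):
        self.bits = list(bits)
        self.stuck = dict(stuck or {})    # position -> stuck logic value

    def read(self):
        return [self.stuck.get(i, b) for i, b in enumerate(self.bits)]

    def write(self, bits):
        self.bits = list(bits)            # stuck cells still read back their fixed value


def make_checker(golden):
    """Stand-in for the ECC logic circuit (assumption): counts wrong bits."""
    def check(word):
        errors = sum(a != b for a, b in zip(word, golden))
        if errors == 0:
            return "ok", list(word)
        if errors == 1:
            return "single", list(golden)   # a 1-bit error is correctable
        return "multi", None                # 2 or more bits: detect only
    return check


def classify(word, check):
    status, corrected = check(word.read())
    if status == "ok":
        return "no error"
    if status == "multi":
        return "alarm"                      # GENERATE ALARM
    word.write(corrected)                   # CORRECT 1-BIT ERROR AND REWRITE
    status, _ = check(word.read())          # READ DATA AGAIN
    if status == "single":
        return "hard error"                 # the error survived the rewrite
    if status == "multi":
        return "alarm"
    return "soft error corrected"


if __name__ == "__main__":
    golden = [1, 0, 1, 1]
    check = make_checker(golden)
    print(classify(Word([1, 1, 1, 1]), check))
    print(classify(Word(golden, stuck={2: 0}), check))
```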
When the result of the second "IS IT 1-BIT ERROR?" step is "YES", the previously mentioned switching operation, from the memory 11 to the relief bit memory 121 (Fig. 1), is achieved (see the step representing "SWITCHING TO RELIEF BIT MEMORY"). Thus, in a process [II], a defective memory cell, having a hard error, is relieved by the corresponding relief bit memory. Thereafter, a process [III], for achieving the previously mentioned validation operation, is carried out. The process [III] is necessary for storing correct data in the corresponding relief bit memory. The words "correct data" mean data which would have been stored in the memory cell if this memory cell had no hard error.

In the process [III], first, the output data of the relief bit memory is forced to be logic "0" (see the step representing "SET LOGIC "0" AT OUTPUT OF RELIEF BIT MEMORY"). The logic "0" is provided from the aforesaid signal line L7-D illustrated in Fig. 3. Then, the data, at the corresponding address including the defective memory cell, is read by the ECC logic circuit (see the step representing "READ DATA"). It should be remembered that, since soft errors occur very frequently, compared to the occurrences of hard errors, a soft error may occur during the process [III], that is, the validation operation. Accordingly, the following step is important. If such a soft error overlaps the 1-bit hard error, a 2-bit error will occur. If the result of the step representing "DO 2-BIT ERRORS OCCUR?" is "NO", it is concluded that no such soft error has occurred. Then, the 1-bit hard error is corrected and the corrected data is rewritten into the corresponding relief bit memory (see the step representing "CORRECT 1-BIT ERROR AND REWRITE CORRECTED DATA").
Returning to the "DO 2-BIT ERRORS OCCUR?" step, if the result of this step is "YES", it is concluded that 2-bit errors have occurred during the validation operation. At this stage, one of the 2-bit errors may be derived from the fixed logic "0" set at the output of the relief bit memory and the other thereof may be induced by a soft error. Then, the output data of the relief bit memory is changed from logic "0" to logic "1" (see the following step). The logic "1" is provided from the aforesaid signal line L7-D illustrated in Fig. 3. Then the data, at the corresponding address including the defective memory cell, is read by the ECC logic circuit (see the second step representing "READ DATA"). In this read operation, if the ECC logic circuit detects that there are no 2-bit errors (see "NO" of the second step representing "DO 2-BIT ERRORS OCCUR?"), it is concluded that the data to be stored in the corresponding relief bit memory is logic "1". Thus, the 2-bit errors are reduced to a 1-bit error. In this case, this 1-bit error is determined to be a soft error. Therefore, this 1-bit soft error can easily be erased by the rewrite operation of a further step, which is identical to that of the step representing "CORRECT 1-BIT ERROR AND REWRITE CORRECTED DATA".
Contrary to the above, if the result of the second "DO 2-BIT ERRORS OCCUR?" step is "YES", it is concluded that the 2-bit errors, which have been detected, may be induced by 2-bit soft errors or a combination of a 1-bit soft error and a 1-bit hard error occurring in one of the memory cells other than the memory cell which is now relieved by the relief bit memory. In this case, it is impossible for the ECC logic circuit to correct such 2-bit errors. In the error-correcting system of the present invention, even though the ECC logic circuit itself cannot correct 2-bit errors, the system can correct one of the 2-bit errors by selectively setting the logic "1" or "0" at the output of the relief bit memory. If the other of the 2-bit errors is a soft error, this soft error can easily be erased by said rewrite operation of the corrected data. Returning to the second "DO 2-BIT ERRORS OCCUR?" step, if the result of this step is "YES", the process is directed to the "GENERATE ALARM" step and the alarm is generated.
The above-mentioned sequence defined by the steps ~ through ~l or the steps ~ through ~ and ~ through ~9) , is achieved sequentially with respect to all the addresses, for example 64,000 times (refer to 64K WORDS in Fig. 2B) via a step ~ , representing "DOES VALIDATION FINISH?" and a step ~ , representing "RENEW MEMORY ADDRESS". If the result of the step ~i is "YES", the validation operation is completely finished and a step starts. In the step @, (representing "RELEASE RELIEF BIT MEMORY"), the fixed logic "0" (step ~0~ ) or the fixed logic "1" (step ~ ) is released. That is, referring to Fig. 3, the logic of the signal line L7-G0 is changed from "1" to "0". After releasing the relief bit memory, the step ~ takes place, so that a normal read and write operation will be commenced.
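The address sweep and the final release of the relief bit memory can be sketched as a simple loop. This is a minimal illustration, not the control circuit itself: the class ReliefBitControl, the function run_validation and the callback validate_address are hypothetical names, and stopping the sweep at the first alarm is an assumption that the text does not state.

```python
class ReliefBitControl:
    """Hypothetical model of the gate control line (L7-G0 in Fig. 3)."""

    def __init__(self):
        self.gate_control = 1      # "1": fixed logic is applied during validation

    def release(self):
        self.gate_control = 0      # "0": relief bit memory released for normal use


def run_validation(num_words, validate_address, control):
    """Validate every address in turn.

    validate_address(address) returns True when the word was validated and
    False when 2-bit errors remain. Returns None on success, or the address
    at which the alarm was raised.
    """
    for address in range(num_words):           # "RENEW MEMORY ADDRESS"
        if not validate_address(address):
            return address                      # alarm (assumed to end the sweep)
    control.release()                           # "RELEASE RELIEF BIT MEMORY"
    return None


# Example with a small memory in which every word validates.
control = ReliefBitControl()
alarm_address = run_validation(16, lambda address: True, control)
```

For the 64K-word memory of Fig. 2B, num_words would be the number of words served by the relief bit memory.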
The arrangement of the gates 35, 36 and 37, illustrated in Fig. 3, can be replaced by another arrangement, as illustrated in Fig. 5. According to the arrangement of the gates 35, 36 and 37, the signal of logic "1" or "0" is supplied externally via the signal line L7-D. However, according to the arrangement of Fig. 5, such a signal of logic "1" or "0" is not necessary, because the identical signal of logic "1" or "0" can be the output data of the relief bit memory 121-0 itself. That is, such a signal can be produced from an EOR (Exclusive OR) gate 51, which receives both the output data from the memory 121-0 of Fig. 3 and a gate control signal from the signal line L7-G0 of Fig. 3. The output of the gate 51 is applied to the signal line L10 of Fig. 3. The EOR gate 51 can be made of inverters 52, 53, AND gates 54, 55 and OR gate 56. If the gate control signal is logic "0", the output data itself of the memory 121-0 is produced from the OR gate 56 via the AND gate 54. In this case, if the ECC logic circuit 122 (Figs. 1 and 3) still detects 2-bit errors, then the gate control signal is changed from logic "0" to "1". Then the inverted output data of the memory 121-0 is produced from the OR gate 56 via the inverter 52 and the AND gate 55.
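The behaviour of the EOR gate 51 can be checked with a short truth-table sketch. The text states that the inverter 52 and the AND gate 55 carry the inverted data; the assumption made here is that the inverter 53 inverts the gate control signal feeding the AND gate 54. The function name eor_gate is hypothetical.

```python
def eor_gate(memory_out, gate_control):
    """AND-OR form of the exclusive OR, following the gate numbering of Fig. 5."""
    inv_52 = 1 - memory_out            # inverted output data of the relief bit memory
    inv_53 = 1 - gate_control          # inverted gate control (assumed wiring)
    and_54 = memory_out & inv_53       # passes the data itself when control is "0"
    and_55 = inv_52 & gate_control     # passes the inverted data when control is "1"
    return and_54 | and_55             # OR gate 56


# Truth table: (memory output, gate control) -> signal applied to line L10
truth_table = {(d, g): eor_gate(d, g) for d in (0, 1) for g in (0, 1)}
```

With the gate control at "0" the table reproduces the stored bit, and with the gate control at "1" it reproduces the inverted bit, so no external logic "1" or "0" is needed.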
As explained in detail with reference to the accompanying drawings, although the n-bit hard and soft error-correcting and (n+1)-bit hard and soft error-detecting ECC logic circuit is employed, an (n+1)-bit hard and soft error-correcting system can be realized.

Claims (9)

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:
1. An error-correcting system located between a memory and a central processing unit and provided with an ECC (Error Correction Code) logic circuit, which can correct n (n is positive integer) -bit errors and detect (n+1)-bit errors, characterized in that said error-correcting system is comprised of:
a first means for discriminating whether an error, occurring in said memory, is a soft error or a hard error;
a second means for storing data regarding a memory cell of said memory, which memory cell produces a hard error;
a third means for, when a hard error is found by using said first means, switching the defective memory cell of said memory to said second means;
a fourth means for effecting a validation operation with respect to data to be stored in said second means; and
a fifth means for, when (n+1)-bit errors are found during said validation operation, reducing the (n+1)-bit errors to n-bit errors by means of the ECC logic circuit.
2. An error-correcting system as set forth in claim 1, wherein said first means producing corrected data regarding a soft or hard error by means of said ECC logic circuit, then rewriting the corrected data in the memory cell at which a soft or hard error occurs, and next reading the rewritten data therefrom again, and if the ECC logic circuit determines that the second read data still includes an error, the first means discriminates the occurrence of the hard error and, on the other hand, if the ECC logic circuit determines that the second read data includes no error, the first means discriminates the occurrence of the soft error.
3. An error-correcting system as set forth in claim 1, wherein said second means is made of one or more relief bit memories which can relieve the defective memory cells of said memory.
4. An error-correcting system as set forth in claim 3, wherein each of k words comprising said memory, is divided into a plurality of groups, and the relief bit memories are allotted to these groups, respectively, each of the relief bit memories being composed of 1 bit in width and k bits in length.
5. An error-correcting system as set forth in claim 1, wherein said third means is made of a switching circuit comprised of first and second switching gates, both arranged in bit-to-bit fashion, each of the first switching gates receives, at its first input, the write data from the ECC logic circuit, and at its second input, an error-position indicating signal produced from the ECC logic circuit, and one of the first switching gates, specified by the error-position indicating signal, can transfer the write data to the relief bit memory and, on the other hand, each of said second switching gates receives, at its first input, the read data from the memory, at its second input, the data stored in said relief bit memory, and at its third input, said error-position indicating signal, and one of the second switching gates, specified by the error-position indicating signal, can transfer the read data to the ECC logic circuit.
6. An error-correcting system as set forth in claim 1, wherein said fourth means is made of a correction controlling circuit which cooperates with said ECC logic circuit and controls said third means, the fourth means functions to store corrected data, corresponding to a hard error, in said second means.
7. An error-correcting system as set forth in claim 1, wherein said fifth means sets logics "0" and "1" sequentially at the output of said second means, until the (n+1)-bit errors are reduced to n-bit errors with the aid of the ECC logic circuit.
8. An error-correcting system as set forth in claim 7, wherein said logics "0" and "1" are externally supplied to the output of said second means.
9. An error-correcting system as set forth in claim 7, wherein said logics "0" and "1" are produced by the logic of the output data itself of said second means or the logic of the inverted output data itself of said second means, by means of an EOR (Exclusive OR) gate.
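The discrimination recited in claim 2 (rewrite the corrected data, read it again, and check whether the error persists) can be sketched as follows. This is an illustrative model only: the MemoryWord class, its stuck dictionary modelling stuck-at cells, and the has_error callback standing in for the ECC logic circuit are all assumptions.

```python
class MemoryWord:
    """A memory word whose cells may be stuck at a fixed value (hard fault)."""

    def __init__(self, bits, stuck=None):
        self.stuck = dict(stuck or {})     # position -> stuck-at value
        self.bits = []
        self.write(bits)

    def write(self, bits):
        self.bits = list(bits)
        for pos, value in self.stuck.items():
            self.bits[pos] = value         # a stuck cell ignores the written value

    def read(self):
        return list(self.bits)


def classify_error(word, corrected, has_error):
    """Rewrite the corrected data, read again, and classify the error."""
    word.write(corrected)                  # rewrite operation
    if has_error(word.read()):             # second read still shows an error
        return "hard"
    return "soft"


good = [1, 0, 1, 1]
check = lambda bits: bits != good          # stand-in for the ECC error check

soft_word = MemoryWord(good)
soft_word.bits[2] ^= 1                     # transient flip, cell itself is healthy
soft_result = classify_error(soft_word, good, check)

hard_word = MemoryWord(good, stuck={0: 0}) # cell 0 is stuck at logic 0
hard_result = classify_error(hard_word, good, check)
```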
CA000359377A 1979-08-31 1980-08-29 Error-correcting system Expired CA1165893A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP54111152A JPS6051749B2 (en) 1979-08-31 1979-08-31 Error correction method
JP111152/79 1979-08-31

Publications (1)

Publication Number Publication Date
CA1165893A true CA1165893A (en) 1984-04-17

Family

ID=14553772

Family Applications (1)

Application Number Title Priority Date Filing Date
CA000359377A Expired CA1165893A (en) 1979-08-31 1980-08-29 Error-correcting system

Country Status (7)

Country Link
US (1) US4394763A (en)
EP (1) EP0034188B1 (en)
JP (1) JPS6051749B2 (en)
AU (1) AU530666B2 (en)
CA (1) CA1165893A (en)
DE (1) DE3072083D1 (en)
WO (1) WO1981000641A1 (en)

Families Citing this family (98)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS57150197A (en) * 1981-03-11 1982-09-16 Nippon Telegr & Teleph Corp <Ntt> Storage circuit
US4450559A (en) * 1981-12-24 1984-05-22 International Business Machines Corporation Memory system with selective assignment of spare locations
US4458349A (en) * 1982-06-16 1984-07-03 International Business Machines Corporation Method for storing data words in fault tolerant memory to recover uncorrectable errors
US4562576A (en) * 1982-08-14 1985-12-31 International Computers Limited Data storage apparatus
US4506364A (en) * 1982-09-30 1985-03-19 International Business Machines Corporation Memory address permutation apparatus
JPS59117800A (en) * 1982-12-25 1984-07-07 Fujitsu Ltd One-bit error processing system of buffer storage
JPS59165300A (en) * 1983-03-10 1984-09-18 Fujitsu Ltd Memory fault correcting system
US4608687A (en) * 1983-09-13 1986-08-26 International Business Machines Corporation Bit steering apparatus and method for correcting errors in stored data, storing the address of the corrected data and using the address to maintain a correct data condition
US4625312A (en) * 1983-10-06 1986-11-25 Honeywell Information Systems Inc. Test and maintenance method and apparatus for investigation of intermittent faults in a data processing system
US4544850A (en) * 1983-12-05 1985-10-01 Gte Automatic Electric Incorporated Race condition mediator circuit
US4646312A (en) * 1984-12-13 1987-02-24 Ncr Corporation Error detection and correction system
US4654847A (en) * 1984-12-28 1987-03-31 International Business Machines Apparatus for automatically correcting erroneous data and for storing the corrected data in a common pool alternate memory array
JPS61264599A (en) * 1985-05-16 1986-11-22 Fujitsu Ltd Semiconductor memory device
JPS623499A (en) * 1985-06-28 1987-01-09 Mitsubishi Electric Corp Semiconductor memory device
US4710934A (en) * 1985-11-08 1987-12-01 Texas Instruments Incorporated Random access memory with error correction capability
US4719627A (en) * 1986-03-03 1988-01-12 Unisys Corporation Memory system employing a low DC power gate array for error correction
JPS6324428A (en) * 1986-07-17 1988-02-01 Mitsubishi Electric Corp Cache memory
US5128943A (en) * 1986-10-24 1992-07-07 United Technologies Corporation Independent backup mode transfer and mechanism for digital control computers
JPS63245529A (en) * 1987-03-31 1988-10-12 Toshiba Corp Register saving and restoring device
US4980850A (en) * 1987-05-14 1990-12-25 Digital Equipment Corporation Automatic sizing memory system with multiplexed configuration signals at memory modules
US4955024A (en) * 1987-09-14 1990-09-04 Visual Information Technologies, Inc. High speed image processing computer with error correction and logging
US4931990A (en) * 1987-11-19 1990-06-05 Bruce C. Perkin Hardened bubble memory circuit
JPH0290816A (en) * 1988-09-28 1990-03-30 Hitachi Ltd Method and circuit for correcting error
US5199033A (en) * 1990-05-10 1993-03-30 Quantum Corporation Solid state memory array using address block bit substitution to compensate for non-functional storage cells
FR2751083B1 (en) * 1996-07-12 1998-10-30 Sextant Avionique METHOD AND DEVICE FOR QUANTIFYING THE IMPACT OF COSMIC RADIATION ON ELECTRONIC MEMORY EQUIPMENT
JP3871471B2 (en) 1999-07-12 2007-01-24 松下電器産業株式会社 ECC circuit mounted semiconductor memory device and inspection method thereof
KR100829848B1 (en) * 2000-06-23 2008-05-16 소니 가부시끼 가이샤 Reproducing apparatus and reproducing method for record medium, data output controlling method, data outputting method, error detecting method, and data outputting and reproducing method
US7069494B2 (en) * 2003-04-17 2006-06-27 International Business Machines Corporation Application of special ECC matrix for solving stuck bit faults in an ECC protected mechanism
US7392347B2 (en) * 2003-05-10 2008-06-24 Hewlett-Packard Development Company, L.P. Systems and methods for buffering data between a coherency cache controller and memory
US7320100B2 (en) * 2003-05-20 2008-01-15 Cray Inc. Apparatus and method for memory with bit swapping on the fly and testing
US7184916B2 (en) * 2003-05-20 2007-02-27 Cray Inc. Apparatus and method for testing memory cards
US7278083B2 (en) * 2003-06-27 2007-10-02 International Business Machines Corporation Method and system for optimized instruction fetch to protect against soft and hard errors
US7336102B2 (en) * 2004-07-27 2008-02-26 International Business Machines Corporation Error correcting logic system
US7642813B2 (en) * 2004-07-27 2010-01-05 International Business Machines Corporation Error correcting logic system
WO2006039556A2 (en) * 2004-10-02 2006-04-13 Wms Gaming Inc. Gaming device with error correcting memory
JP4734003B2 (en) * 2005-03-17 2011-07-27 富士通株式会社 Soft error correction method, memory control device, and memory system
US7831882B2 (en) 2005-06-03 2010-11-09 Rambus Inc. Memory system with error detection and retry modes of operation
US9459960B2 (en) 2005-06-03 2016-10-04 Rambus Inc. Controller device for use with electrically erasable programmable memory chip with error detection and retry modes of operation
US8205146B2 (en) * 2005-07-21 2012-06-19 Hewlett-Packard Development Company, L.P. Persistent error detection in digital memory
US7562285B2 (en) 2006-01-11 2009-07-14 Rambus Inc. Unidirectional error code transfer for a bidirectional data link
JP2007257791A (en) * 2006-03-24 2007-10-04 Fujitsu Ltd Semiconductor storage device
US7292950B1 (en) * 2006-05-08 2007-11-06 Cray Inc. Multiple error management mode memory module
CN103280239B (en) 2006-05-12 2016-04-06 苹果公司 Distortion estimation in memory device and elimination
WO2007132456A2 (en) 2006-05-12 2007-11-22 Anobit Technologies Ltd. Memory device with adaptive capacity
KR101202537B1 (en) 2006-05-12 2012-11-19 애플 인크. Combined distortion estimation and error correction coding for memory devices
US20070271495A1 (en) * 2006-05-18 2007-11-22 Ian Shaeffer System to detect and identify errors in control information, read data and/or write data
US8352805B2 (en) 2006-05-18 2013-01-08 Rambus Inc. Memory error detection
US7975192B2 (en) 2006-10-30 2011-07-05 Anobit Technologies Ltd. Reading memory cells using multiple thresholds
WO2008068747A2 (en) 2006-12-03 2008-06-12 Anobit Technologies Ltd. Automatic defect management in memory devices
US8151166B2 (en) 2007-01-24 2012-04-03 Anobit Technologies Ltd. Reduction of back pattern dependency effects in memory devices
US8369141B2 (en) 2007-03-12 2013-02-05 Apple Inc. Adaptive estimation of memory cell read thresholds
WO2008139441A2 (en) * 2007-05-12 2008-11-20 Anobit Technologies Ltd. Memory device with internal signal processing unit
US8234545B2 (en) 2007-05-12 2012-07-31 Apple Inc. Data storage with incremental redundancy
US8259497B2 (en) * 2007-08-06 2012-09-04 Apple Inc. Programming schemes for multi-level analog memory cells
US8174905B2 (en) 2007-09-19 2012-05-08 Anobit Technologies Ltd. Programming orders for reducing distortion in arrays of multi-level analog memory cells
US8527819B2 (en) 2007-10-19 2013-09-03 Apple Inc. Data storage in analog memory cell arrays having erase failures
WO2009063450A2 (en) 2007-11-13 2009-05-22 Anobit Technologies Optimized selection of memory units in multi-unit memory devices
US8225181B2 (en) * 2007-11-30 2012-07-17 Apple Inc. Efficient re-read operations from memory devices
US8209588B2 (en) 2007-12-12 2012-06-26 Anobit Technologies Ltd. Efficient interference cancellation in analog memory cell arrays
US8456905B2 (en) * 2007-12-16 2013-06-04 Apple Inc. Efficient data storage in multi-plane memory devices
US8156398B2 (en) 2008-02-05 2012-04-10 Anobit Technologies Ltd. Parameter estimation based on error correction code parity check equations
US8230300B2 (en) 2008-03-07 2012-07-24 Apple Inc. Efficient readout from analog memory cells using data compression
US8400858B2 (en) 2008-03-18 2013-03-19 Apple Inc. Memory device with reduced sense time readout
US8493783B2 (en) 2008-03-18 2013-07-23 Apple Inc. Memory device readout using multiple sense times
US8498151B1 (en) 2008-08-05 2013-07-30 Apple Inc. Data storage in analog memory cells using modified pass voltages
US8949684B1 (en) 2008-09-02 2015-02-03 Apple Inc. Segmented data storage
US8169825B1 (en) 2008-09-02 2012-05-01 Anobit Technologies Ltd. Reliable data storage in analog memory cells subjected to long retention periods
US8482978B1 (en) 2008-09-14 2013-07-09 Apple Inc. Estimation of memory cell read thresholds by sampling inside programming level distribution intervals
US8239734B1 (en) 2008-10-15 2012-08-07 Apple Inc. Efficient data storage in storage device arrays
US8713330B1 (en) 2008-10-30 2014-04-29 Apple Inc. Data scrambling in memory devices
US8208304B2 (en) 2008-11-16 2012-06-26 Anobit Technologies Ltd. Storage at M bits/cell density in N bits/cell analog memory cell devices, M>N
US8248831B2 (en) 2008-12-31 2012-08-21 Apple Inc. Rejuvenation of analog memory cells
US8397131B1 (en) 2008-12-31 2013-03-12 Apple Inc. Efficient readout schemes for analog memory cell devices
US8924661B1 (en) 2009-01-18 2014-12-30 Apple Inc. Memory system including a controller and processors associated with memory devices
US8228701B2 (en) * 2009-03-01 2012-07-24 Apple Inc. Selective activation of programming schemes in analog memory cell arrays
US8832354B2 (en) 2009-03-25 2014-09-09 Apple Inc. Use of host system resources by memory controller
US8259506B1 (en) 2009-03-25 2012-09-04 Apple Inc. Database of memory read thresholds
US8238157B1 (en) 2009-04-12 2012-08-07 Apple Inc. Selective re-programming of analog memory cells
US8479080B1 (en) 2009-07-12 2013-07-02 Apple Inc. Adaptive over-provisioning in memory systems
US8495465B1 (en) 2009-10-15 2013-07-23 Apple Inc. Error correction coding over multiple memory pages
US8677054B1 (en) 2009-12-16 2014-03-18 Apple Inc. Memory management schemes for non-volatile memory devices
US8694814B1 (en) 2010-01-10 2014-04-08 Apple Inc. Reuse of host hibernation storage space by memory controller
US8572311B1 (en) 2010-01-11 2013-10-29 Apple Inc. Redundant data storage in multi-die memory systems
US8694853B1 (en) 2010-05-04 2014-04-08 Apple Inc. Read commands for reading interfering memory cells
US8572423B1 (en) 2010-06-22 2013-10-29 Apple Inc. Reducing peak current in memory systems
US8595591B1 (en) 2010-07-11 2013-11-26 Apple Inc. Interference-aware assignment of programming levels in analog memory cells
US9104580B1 (en) 2010-07-27 2015-08-11 Apple Inc. Cache memory for hybrid disk drives
US8767459B1 (en) 2010-07-31 2014-07-01 Apple Inc. Data storage in analog memory cells across word lines using a non-integer number of bits per cell
US8856475B1 (en) 2010-08-01 2014-10-07 Apple Inc. Efficient selection of memory blocks for compaction
US8493781B1 (en) 2010-08-12 2013-07-23 Apple Inc. Interference mitigation using individual word line erasure operations
US8694854B1 (en) 2010-08-17 2014-04-08 Apple Inc. Read threshold setting based on soft readout statistics
US9021181B1 (en) 2010-09-27 2015-04-28 Apple Inc. Memory management for unifying memory cell conditions by using maximum time intervals
US9235466B2 (en) * 2012-07-03 2016-01-12 Samsung Electronics Co., Ltd. Memory devices with selective error correction code
CN103942119A (en) * 2013-12-26 2014-07-23 杭州华为数字技术有限公司 Method and device for processing memory errors
CN111819547A (en) 2018-03-26 2020-10-23 拉姆伯斯公司 Command/address channel error detection
US11269720B2 (en) * 2019-08-11 2022-03-08 Winbond Electronics Corp. Memory storage apparatus and data access method
US11556416B2 (en) 2021-05-05 2023-01-17 Apple Inc. Controlling memory readout reliability and throughput by adjusting distance between read thresholds
US11847342B2 (en) 2021-07-28 2023-12-19 Apple Inc. Efficient transfer of hard data and confidence levels in reading a nonvolatile memory

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3656107A (en) * 1970-10-23 1972-04-11 Ibm Automatic double error detection and correction apparatus
US3917933A (en) * 1974-12-17 1975-11-04 Sperry Rand Corp Error logging in LSI memory storage units using FIFO memory of LSI shift registers
US3949208A (en) * 1974-12-31 1976-04-06 International Business Machines Corporation Apparatus for detecting and correcting errors in an encoded memory word
JPS51137335A (en) * 1975-05-22 1976-11-27 Yoshihiro Toma Faulty memory permissible control system
JPS5381036A (en) * 1976-12-27 1978-07-18 Hitachi Ltd Error correction-detection system
JPS592057B2 (en) * 1979-02-07 1984-01-17 株式会社日立製作所 Error correction/detection method
US4255808A (en) * 1979-04-19 1981-03-10 Sperry Corporation Hard or soft cell failure differentiator
US4319356A (en) * 1979-12-19 1982-03-09 Ncr Corporation Self-correcting memory system
US4359771A (en) * 1980-07-25 1982-11-16 Honeywell Information Systems Inc. Method and apparatus for testing and verifying the operation of error control apparatus within a memory

Also Published As

Publication number Publication date
AU530666B2 (en) 1983-07-21
EP0034188B1 (en) 1988-03-16
WO1981000641A1 (en) 1981-03-05
US4394763A (en) 1983-07-19
JPS6051749B2 (en) 1985-11-15
EP0034188A4 (en) 1984-08-10
JPS5637896A (en) 1981-04-11
AU6226180A (en) 1981-03-19
DE3072083D1 (en) 1988-04-21
EP0034188A1 (en) 1981-08-26

Legal Events

Date Code Title Description
MKEX Expiry

Effective date: 20010417