US5394530A - Arrangement for predicting a branch target address in the second iteration of a short loop

Publication number
US5394530A
Authority
US
United States
Prior art keywords
branch
address
instruction address
branch target
instruction
Prior art date
Legal status
Expired - Fee Related
Application number
US08/199,970
Inventor
Mayumi Kitta
Current Assignee
NEC Corp
Original Assignee
NEC Corp

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38: Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3802: Instruction prefetching
    • G06F9/3804: Instruction prefetching for branches, e.g. hedging, branch folding
    • G06F9/3806: Instruction prefetching for branches, e.g. hedging, branch folding using address prediction, e.g. return stack, branch history buffer

Abstract

An improved technique for predicting a branch target address using a branch history table (BHT) is disclosed. The BHT stores a plurality of pairs of a branch address and a corresponding branch target address. In order to eliminate or effectively reduce the loss of machine cycles in branch target address prediction, prefetched address data is compared with an incoming new branch instruction address before that new address is applied to the BHT for the purpose of updating it. When coincidence is detected, a selector selects the new branch target address before it is applied to the BHT. The selected new branch target address is fed to an instruction address prefetch register.

Description

This application is a continuation of application Ser. No. 07/852,063, filed Mar. 16, 1992, now abandoned.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to an arrangement for predicting a branch target address using a branch history table (BHT) in a digital data processing system, and more specifically to such an arrangement by which a branch target address can be predicted irrespective of a time duration for which the BHT is renewed or updated.
2. Description of the Prior Art
In most pipeline processors, branch instructions are resolved in an execution unit. Accordingly, there are several cycles of delay between the decoding of a branch instruction and its execution/resolution. In an attempt to overcome the potential loss of these cycles, it is known in the art to guess, using a BHT, as to which instruction specified by a branch target address is to be applied to the execution unit.
Before turning to the present invention it is deemed advantageous to briefly discuss a known technique with reference to FIGS. 1-4.
FIG. 1 is a block diagram showing schematically a known arrangement of the type to which the present invention is applicable. As shown in FIG. 1, a system controller 10 is operatively coupled with a processing unit 12, a main memory 14 and an input/output (I/O) controller 16.
A task controller 18, provided in the processing unit 12, applies an initial instruction address of a given job to an instruction prefetch address register 20 through a line (1). It is assumed that a cache memory 22 has already stored the whole or part of the instruction array (sequence) of the above-mentioned given job which is applied from the task controller 18 via a line (2). The cache memory 22 is supplied with the prefetched address from the address register 20 and searches for an instruction specified by the prefetched address.
Upon a cache hit, the cache memory 22 applies the corresponding instruction to the task controller 18 and also to a microprogram memory 30, both of which form part of an execution unit 32. The microprogram memory 30 has previously stored a plurality of microprograms for executing an instruction applied thereto of the given job. The execution unit 32 further comprises a buffer memory 34 and an execution circuit 36. The buffer memory 34 stores operand data in this case, while the execution circuit 36 runs or carries out the microprograms using the operand data within the buffer memory 34 under the control of the task controller 18.
In the case of the cache hit, the cache memory 22 issues a selector control signal SC-1 which is applied to a selector 26 via a line 24. An adder 28 is provided to increment the instruction address relating to the cache hit by a predetermined number of bytes in order to derive the next instruction address. In more specific terms, upon occurrence of the cache hit, the control signal SC-1 allows the selector 26 to apply the output of the adder 28 to the register 20, and thus the next instruction address data is stored in the instruction prefetch address register 20.
Conversely, in the case of a cache miss, the contents of the cache memory 22 are renewed in a manner well known in the art.
As indicated above, the reduction of branch penalty (viz., loss of cycles) is attempted through the use of history focussed on instruction prefetching. A BHT utilizes the address of the instruction array (viz., stream) being prefetched for accessing the table. If a taken branch was previously encountered at that address, the BHT indicates so and, in addition, provides the target address of the branch on its previous execution. This target address is used to redirect instruction prefetching because of the likelihood that the branch will repeat its past behaviour. The advantage of such an approach is that it has the potential of eliminating all delays associated with branches.
As shown in FIG. 1, a BHT section 40 includes a BHT which is established within a branch instruction address (BIA) array memory 42 and a branch target address (BTA) array memory 44. The BHT section 40 further includes a comparator 46. The BIA array memory 42 stores a plurality of branch instruction addresses, while the BTA array memory 44 stores a plurality of branch target addresses which correspond, on a one-to-one basis, to the counterparts stored in the memory 42. The instruction prefetch address register 20 supplies the two memories 42, 44 with a prefetched instruction address via a line 43. The comparator 46 is provided to compare the prefetched instruction address from the register 20 and the output (viz., branch instruction address) of the memory 42. If the comparator 46 detects coincidence of the two instruction addresses applied (viz., a hit), it outputs a selector control signal SC-2 indicative of the hit over a line 45 and thus allows the selector 26 to supply the register 20 with the corresponding branch target which is derived from the memory 44 through a line 47. Following this, the instruction specified by the address of the branch target is searched for at the cache memory 22. Conversely, in the event that the comparator 46 notes a miss hit, it issues the control signal SC-2 representing same and thus inhibits the application of the output of the memory 44 to the register 20.
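The lookup path just described can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the circuit of FIG. 1: the class name `BranchHistoryTable`, the table size, and the direct-mapped indexing in `_index` are hypothetical, since the text does not say how the array memories 42, 44 are addressed. The 8-byte increment is borrowed from the example of FIG. 2A.

```python
from typing import List, Optional

WORD_BYTES = 8  # assumed prefetch increment of adder 28 (the 8-byte case of FIG. 2A)


class BranchHistoryTable:
    """Toy model of BHT section 40: paired BIA (42) and BTA (44) arrays."""

    def __init__(self, entries: int = 16) -> None:
        self.entries = entries
        self.bia: List[Optional[int]] = [None] * entries  # branch instruction addresses
        self.bta: List[Optional[int]] = [None] * entries  # branch target addresses

    def _index(self, address: int) -> int:
        # Hypothetical direct-mapped indexing; the source does not specify one.
        return (address // WORD_BYTES) % self.entries

    def lookup(self, prefetch_address: int) -> Optional[int]:
        """Model of comparator 46: return the stored target on a hit, else None."""
        i = self._index(prefetch_address)
        if self.bia[i] == prefetch_address:
            return self.bta[i]
        return None

    def update(self, branch_address: int, target_address: int) -> None:
        """Model of the task controller writing a pair via lines (3), (4)."""
        i = self._index(branch_address)
        self.bia[i] = branch_address
        self.bta[i] = target_address


def next_prefetch_address(bht: BranchHistoryTable, current: int) -> int:
    """Model of selector 26: BHT target on a hit, adder 28 output otherwise."""
    target = bht.lookup(current)
    if target is not None:
        return target
    return current + WORD_BYTES


if __name__ == "__main__":
    bht = BranchHistoryTable()
    print(hex(next_prefetch_address(bht, 0x100)))  # no history yet: sequential prefetch
    bht.update(0x100, 0x80)
    print(hex(next_prefetch_address(bht, 0x100)))  # history present: redirected prefetch
```

The sketch treats a lookup and an update as separate, instantaneous calls, which is exactly the simplification that hides the timing problem discussed with reference to FIG. 4.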
Writing a new piece of branch information into the BHT (viz., updating of BHT) is implemented under the control of the task controller 18. In more specific terms, when the task controller 18 detects that the execution circuit 36 fails to execute the branched instruction due to the failure of the branch target address prediction, the task controller 18 updates the BHT by writing a more likely pair of branch address and the corresponding branch target address into the memories 42, 44 via lines (3), (4).
The operations of the BHT section 40 will be further discussed with reference to FIGS. 2A, 2B, 3 and 4.
FIG. 2A is a diagram schematically showing an instruction sequence A0→BR(Branch)→A1→A2→A3→A4→ stored in the cache memory 22. It is assumed that these six instructions are derived from the cache memory 22 in groups, each of which is one word (8 bytes) long and includes two instructions as illustrated. Accordingly, the instruction addresses at the left side are depicted by "a", "a+8", "a+16", wherein the address "a" is the initial address of the instruction sequence in question. It is understood that the adder 28 increments the address applied thereto by 8 bytes in this particular case. The instruction group(s) derived from the cache memory 22 are stored in a suitable buffer (not shown in the accompanying drawings) and then the instructions are sequentially applied to the execution unit 32.
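The address layout of FIG. 2A can be worked through numerically. The sketch below assumes 4-byte instructions (implied by two instructions per 8-byte word, and consistent with the branch address "a+4" used later in the text) and assumes that "a" is word-aligned. The concrete value chosen for `a` and the helper names are hypothetical.

```python
WORD_BYTES = 8                 # one prefetched word, per FIG. 2A
INSTRUCTIONS_PER_WORD = 2      # two instructions per word, per FIG. 2A
INSTRUCTION_BYTES = WORD_BYTES // INSTRUCTIONS_PER_WORD  # assumed: 4 bytes each


def layout(base: int, names: list) -> dict:
    """Assign consecutive instruction addresses starting at the base address."""
    return {name: base + i * INSTRUCTION_BYTES for i, name in enumerate(names)}


def word_address(address: int) -> int:
    """Address of the 8-byte word containing the instruction (base assumed aligned)."""
    return address - address % WORD_BYTES


if __name__ == "__main__":
    a = 0x1000  # hypothetical value for the initial address "a"
    addresses = layout(a, ["A0", "BR", "A1", "A2", "A3", "A4"])
    for name, address in addresses.items():
        print(name, "a+%d" % (address - a), "word a+%d" % (word_address(address) - a))
```

Under these assumptions the branch BR sits at "a+4" inside the word at "a", and the six instructions occupy the words at "a", "a+8" and "a+16".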
FIG. 2B is a flow-chart depicting a routine which executes the above-mentioned instructions A0-A4 at steps 50A-50F. For the sake of a better understanding, FIG. 2B shows the addresses of the instructions A0-A4 in the cache memory 22. As shown, a very small branch loop 52 is established between steps 50A and 50B.
FIG. 3 is a diagram showing pipelined operations including five stages denoted by IF, DC, AD, OF and EX. A line extending from the stage AD to the stage IF corresponds to the line (1) via which the initial instruction address (viz., "a") is applied to the prefetch address register 20. The stages IF, DC, AD, OF and EX implement the following operations:
(a) IF: Instruction prefetch at block 20;
(b) DC: Instruction decode at block 30;
(c) AD: Address generation at block 18;
(d) OF: Operand fetch at blocks 18, 34; and
(e) EX: Instruction execution at blocks 18, 36.
In the case where the miss hit of the branch instruction address at the comparator 46 of the BHT section 40 is found at the pipeline stage EX, the task controller 18 updates the BHT by writing thereinto the most likely branch instruction address and the branch target thereof.
FIG. 4 is a timing chart which characterizes the prior art operations at the pipeline stages IF, DC, AD, OF and EX shown in FIG. 3. It is assumed that the BHT section 40 fails to find the branch instruction address "a+4" (viz., a miss hit) at time clock T0. Accordingly, the execution circuit 36 is unable to determine a branch target at time clock T5. In this instance, the operations implemented at all the stages IF-EX are canceled or rendered invalid. Thus, the task controller 18 carries out, at time clock T6, the updating by writing the branch instruction address "a+4" into the memory 42 and also writing the branch target "a" into the memory 44 via the lines (3), (4). This means that the miss hit occurs again at the BHT section 40 at time clock T6, because the updating is implemented during the same time clock T6.
The operations during time clocks T6-T11 are exactly identical with those during time clocks T0-T5. That is, the execution circuit 36 is again unable to determine the branch target address "a" at time clock T11. At the next time clock T12, the comparator 46 detects the hit and thus the target address "a" is applied to the instruction prefetch address register 20. Therefore, the execution circuit 36 is now able to execute the branch operation at time clock T17. In the above-mentioned case, there exists a 4-cycle loss during T12-T15 (LOSS B) in addition to a 4-cycle loss during T6-T9 (LOSS A). This kind of problem is frequently encountered when the execution unit 32 executes a program sequence including short loops as indicated in FIG. 2B. This arises from the fact that the next branch instruction must be executed before the updating of the BHT is completed.
Summing up, the above-mentioned prior art has encountered the problem that a cycle loss such as that indicated by LOSS B is inevitably present in the case where the program sequence to be executed includes the aforesaid short type of branch loop.
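The clock arithmetic of the FIG. 4 discussion can be reproduced with a very small model. It is a sketch under assumptions, not a cycle-accurate pipeline: it assumes BR reaches EX five clocks after the word at "a" is prefetched (T0 to T5), that a mispredicted branch triggers both the refetch and the BHT write in the following clock, and that a written entry is visible to the array lookup only from the clock after the write. The `bypass` flag is a hypothetical switch that models a same-clock comparison against the incoming update, of the kind the first embodiment (FIG. 6) is described as providing.

```python
BRANCH_LATENCY = 5  # assumed clocks from prefetching the word at "a" to BR in EX (T0 -> T5)


def first_predicted_execution(bypass: bool) -> int:
    """Return the clock at which BR first executes with a correct prediction."""
    fetch_clock = 0
    write_clock = None  # clock during which the BHT update is being written
    while True:
        if write_clock is None:
            hit = False                      # no history at all yet
        elif fetch_clock > write_clock:
            hit = True                       # the written entry is visible in the arrays
        else:
            # Same clock as the write: only a comparison against the incoming
            # update can produce a hit.
            hit = bypass and fetch_clock == write_clock
        ex_clock = fetch_clock + BRANCH_LATENCY
        if hit:
            return ex_clock
        if write_clock is None:
            write_clock = ex_clock + 1       # update issued right after the failed EX
        fetch_clock = ex_clock + 1           # refetch from the branch target "a"


if __name__ == "__main__":
    print("prior art:", first_predicted_execution(bypass=False))
    print("same-clock comparison:", first_predicted_execution(bypass=True))
```

With these assumptions the model returns clock 17 without the same-clock comparison and clock 11 with it, matching the T17 and T11 figures given in the text for FIG. 4 and FIG. 6 respectively.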
SUMMARY OF THE INVENTION
In view of the above drawback it is an object of the present invention to provide an arrangement by which machine cycle loss due to the branch address miss hit can effectively be reduced.
More specifically, an aspect of the present invention comes in an arrangement for predicting a branch target address in a digital data processing system, including: a prefetching unit to prefetch an instruction address of a given instruction sequence; a first memory which stores a plurality of branch instruction addresses, the first memory being coupled to receive the prefetched instruction address, which is utilized to output a branch instruction address coincident therewith from the first memory, the first memory being subject to updating wherein a new branch instruction address is written thereinto; a second memory for storing a plurality of branch target addresses which correspond to the branch instruction addresses memorized in the first memory on a one-to-one basis, the second memory being coupled to receive the prefetched instruction address, which is utilized to derive the branch target address corresponding to the branch instruction address coincident with the prefetched instruction address in the first memory, the second memory being subject to updating wherein a new branch target address is written thereinto; a comparator provided to compare an instruction address outputted from the first memory with the prefetched instruction address applied from the prefetching unit, the comparison result being used to control the instruction address prefetching; a comparing unit arranged to compare the prefetched instruction address applied from the prefetching unit with said new branch target address and to generate a match/miss match signal indicative of whether the prefetched instruction address applied from the prefetching unit and the new branch target address match or miss match, the match/miss match signal being used to control the instruction address prefetching at the prefetching unit; and a means arranged to receive the branch target address derived from the second memory and said new branch target address, the means selecting said new branch target address in the event that the match/miss match signal from the comparing unit indicates a match, and selecting the branch target address derived from said second memory in the event that the match/miss match signal from the comparing unit indicates a miss match, the output of the means being applied to the prefetching unit.
BRIEF DESCRIPTION OF THE DRAWINGS
The features and advantages of the present invention will become more clearly appreciated from the following description taken in conjunction with the accompanying drawings in which like portions are denoted by like reference numerals and in which:
FIG. 1 is a block diagram showing the prior art arrangement discussed in the opening paragraphs of the instant disclosure;
FIG. 2A is a table illustrating an instruction sequence stored in the cache memory shown in FIG. 1;
FIG. 2B is a flow chart for illustrating a routine which executes the instructions listed in FIG. 2A;
FIG. 3 is a diagram showing pipelined operations including five stages IF, DC, AD, OF and EX;
FIG. 4 is a timing chart which depicts the type of operation which invites the above mentioned machine cycle loss problem;
FIG. 5 is a block diagram showing the arrangement which characterizes a first embodiment of the present invention;
FIG. 6 is a timing chart similar to that shown in FIG. 4 which shows the manner in which the operations are performed with the first embodiment of the present invention; and
FIGS. 7 and 8 are block diagrams showing the arrangements which characterize second and third embodiments of the present invention, respectively.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
A first embodiment of the present invention will be discussed with reference to FIGS. 5 and 6.
FIG. 5 is a block diagram showing an arrangement of a BHT section 40A which characterizes the first embodiment. The section 40A differs, in terms of the arrangement, from the counterpart 40 (FIG. 1) in that the section 40A further includes a comparator 60 and a selector 62.
The comparator 60 is provided with two inputs, one of which is coupled to receive the prefetched instruction address via the line 43 while the other input is directly coupled to the line (3) for receiving a new branch instruction address applied from the task controller 18 for the purpose of updating the BHT. The comparator 60 has its output coupled to the selector 26 via the line 45 and also to the selector 62 for controlling the operation thereof. The selector 62 is arranged to receive a branch target address outputted from the BTA array memory 44 and a new branch target address applied from the task controller 18 for updating the BHT via the line (4). The operations of the remaining blocks 42, 44 and 46 of FIG. 5 have already been discussed in the opening paragraphs and accordingly further descriptions thereof will be omitted for brevity.
FIG. 6 is a timing chart which characterizes the operations of the first embodiment at the pipeline stages IF, DC, AD, OF and EX shown in FIG. 3.
As in the prior art, it is assumed that the BHT section 40A fails to search for the branch instruction address "a+4" at time clock T0. Accordingly, the execution circuit 36 is unable to determine a branch target at time clock T5. In this instance, the operations implemented at all the stages IF-EX are canceled or rendered invalid. Thus, the task controller 18 carries out, at time clock T6, updating the BHT by writing a new data including the branch instruction address "a+4" into the memory 42 and also writing a new data including the branch target address "a" in this particular case.
In accordance with the first embodiment, the comparator 60 is able to detect, during time clock T6, the coincidence between the prefetched instruction address applied from the register 20 and the new branch instruction address "a+4". Thus, the comparator 60 issues, during the same time clock T6, the control signal indicative of the comparison hit (viz., coincidence) and applies it to the selector 62. Accordingly, the selector 62 is allowed to pass the new branch target address "a" to the selector 26 via the line 47 at the next time clock T6. It is therefore understood that: (a) the execution circuit 36 executes the branch operation BR at time clock T11 and (b) LOSS B (FIG. 4) inherent in the prior art can be eliminated according to the first embodiment.
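By way of a non-limiting illustration only, the bypass performed by the comparator 60 and the selector 62 may be sketched in software as follows. The class name BhtSection40A, the method names, the pending argument and the use of a single dictionary in place of the array memories 42 and 44 are assumptions made for this sketch and do not appear in the disclosure; the arrangement itself is hardware, and no clock timing is modeled.

```python
from typing import Dict, Optional, Tuple


class BhtSection40A:
    """Toy software model of the BHT section 40A (hypothetical names throughout)."""

    def __init__(self) -> None:
        # Assumption: one dict stands in for both arrays. Keys play the role of
        # the address array memory 42, values the role of the BTA array memory 44.
        self.table: Dict[int, int] = {}

    def write(self, branch_addr: int, target_addr: int) -> None:
        """Complete the BHT update with the new address data."""
        self.table[branch_addr] = target_addr

    def lookup(self, prefetch_addr: int,
               pending: Optional[Tuple[int, int]] = None) -> Optional[int]:
        """Return a predicted branch target address, or None on a miss.

        `pending` models the new (branch instruction address, branch target
        address) pair present on the lines (3) and (4) while the update is
        still in progress.
        """
        if pending is not None:
            new_branch_addr, new_target_addr = pending
            # Comparator 60: prefetched address versus new branch instruction address.
            if prefetch_addr == new_branch_addr:
                # Selector 62 passes the new branch target address.
                return new_target_addr
        # Otherwise the ordinary path through the stored table is used.
        return self.table.get(prefetch_addr)
```

In this sketch, a lookup of "a+4" made while the pair ("a+4", "a") is still being written returns "a" at once, whereas the same lookup without the pending pair misses until write() has completed.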
Reference is made to FIG. 7 which shows, in block diagram form, an arrangement of a second embodiment of the present invention. As compared with the first embodiment of FIG. 5, the second embodiment further includes first and second buffer groups 64, 66. The first buffer group 64 comprises four buffers 64(1)-64(4). Similarly, the second buffer group 66 includes four buffers 66(1)-66(4). The second embodiment is well suited for the case wherein four time clocks are required to write the above-mentioned new address data into the array memories 42, 44. It should be noted that the number of buffers in each of the buffer groups 64, 66 is not limited to four (4) and may be changed to any number depending on the computer system design.
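As a hedged illustration of the second embodiment, the two buffer groups may be modeled as a pair of shift registers that hold the new address data while the write into the array memories is outstanding. The class name DelayedUpdatePath, the clock() method, and the assumption that an entry advances exactly one buffer per time clock and is written when it leaves the last buffer are all choices made for this sketch; the disclosure states only that the buffers are provided and that their number may vary.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class DelayedUpdatePath:
    """Toy model of the buffer groups 64 and 66 (hypothetical names)."""

    def __init__(self, depth: int = 4) -> None:
        # Index 0 models buffer 64(1)/66(1); the last index models 64(n)/66(n).
        self.addr_buffers: Deque[Optional[int]] = deque([None] * depth, maxlen=depth)
        self.target_buffers: Deque[Optional[int]] = deque([None] * depth, maxlen=depth)
        # Assumption: one dict stands in for the array memories 42 and 44.
        self.table: Dict[int, int] = {}

    def clock(self, new_entry: Optional[Tuple[int, int]] = None) -> None:
        """Advance one time clock, optionally accepting new address data."""
        # The entry in the last buffer is written into the array memories.
        last_addr = self.addr_buffers[-1]
        last_target = self.target_buffers[-1]
        if last_addr is not None and last_target is not None:
            self.table[last_addr] = last_target
        addr, target = new_entry if new_entry is not None else (None, None)
        # appendleft on a full bounded deque discards the rightmost element,
        # which shifts every remaining entry one buffer further along.
        self.addr_buffers.appendleft(addr)
        self.target_buffers.appendleft(target)
```

With the default depth of four, an entry accepted on one call to clock() reaches the table on the fourth call after that, which mirrors the four time clocks mentioned above under the stated assumptions.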
Reference is now made to FIG. 8, wherein a third embodiment of the present invention is illustrated in block diagram form. The third embodiment further includes four comparators 68(1)-68(4) and a selector 70 in addition to the arrangement of the second embodiment. It is clearly understood that when one of the comparators 60 and 68(1)-68(4) detects coincidence between the two input data thereof, the selector 70 selects the corresponding new address data to be applied to the address array memory 44 for the updating operation. As in the second embodiment, the number of buffers in each of the buffer groups 64, 66 is not limited to four. The third embodiment is very advantageous in that it includes all the features referred to in connection with the first and second embodiments, although its hardware arrangement is somewhat more complex than theirs.
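The selection performed in the third embodiment may be sketched, purely for illustration, as a search over the new address data still in flight, followed by the ordinary table lookup. The function name predict_target, the Entry type and the list representation of the buffers are assumptions of this sketch. The rule that the newest matching entry takes priority is likewise an assumption, since the disclosure does not say how simultaneous matches are resolved. The sketch follows claims 3 and 6 in supplying the selected new branch target address to the prediction path.

```python
from typing import Dict, List, Optional, Tuple

# (branch instruction address, branch target address), or None for an empty slot.
Entry = Optional[Tuple[int, int]]


def predict_target(prefetch_addr: int,
                   incoming: Entry,
                   buffered: List[Entry],
                   table: Dict[int, int]) -> Optional[int]:
    """Return the predicted branch target address, or None on a miss.

    `incoming` models the new address data on the lines (3) and (4), which is
    checked by the comparator 60. `buffered` models the contents of the buffer
    pairs 64(i)/66(i), each checked by one comparator 68(i). `table` stands in
    for the array memories 42 and 44.
    """
    # Assumption: candidates are examined newest first and the first hit wins.
    for entry in [incoming, *buffered]:
        if entry is not None and entry[0] == prefetch_addr:
            # The matching stage's new branch target address is selected.
            return entry[1]
    # No in-flight entry matched, so the stored table is consulted.
    return table.get(prefetch_addr)
```

Under these assumptions, a prefetched address "a+4" is predicted to branch to "a" whether the pair ("a+4", "a") is still on the update lines, held in any of the buffers, or already written into the table.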
It will be understood that the above disclosure is representative of only a few possible embodiments of the present invention and that the concept on which the invention is based is not specifically limited thereto.

Claims (6)

What is claimed is:
1. A device for predicting a branch target address, comprising:
a prefetch unit operative to prefetch instruction addresses of a given instruction sequence;
a memory for storing a branch history table, said memory having a first memory section for storing a plurality of branch instruction addresses and a second memory section for storing a plurality of branch target addresses, each of said branch target addresses corresponding to one of said branch instruction address on a one to one basis, said branch history table receiving prefetched branch instruction addresses from said prefetch unit;
an execution unit executing instructions under pipeline control, said executing unit supplying a new branch instruction address and a corresponding new branch target address to said branch history table when a prediction of a branch target address fails upon execution of a branch instruction corresponding to said branch target address in said execution unit;
a first comparator coupled to said prefetch unit and said branch history table, said first comparator comparing a prefetched branch instruction address from said prefetch unit with a branch instruction address from said first memory section and outputting a first control signal when said prefetched branch instruction address from said prefetch unit matches said branch instruction address from said first memory section;
a second comparator coupled to said prefetch unit and said executing unit, said second comparator comparing a branch instruction address from said prefetch unit with said new branch instruction address being supplied to said branch history table and being used to update said branch history table and outputting a second control signal when said branch instruction address from said prefetch unit matches said new branch instruction address being supplied to said branch history table and being used to update said branch history table; and
a selector receiving a branch target address from said second memory section and said new branch target address being supplied to said branch history table from said execution unit for outputting one of said branch target address from said second memory section and said new branch target address being supplied to said branch history table from said execution unit in response to said second control signal, wherein said prefetch unit selects said output from said selector in response to said first control signal.
2. A device as recited in claim 1 further comprising a first and second plurality of buffers sequentially connected between said execution unit and said branch history table for temporarily holding said new branch instruction address being supplied to said branch history table and said corresponding new branch target address being supplied to said branch history table, respectively.
3. A device as recited in claim 2 further comprising:
a plurality of comparators corresponding on a one to one basis to said first plurality of buffers and connected to compare a new branch instruction address from said buffer with a single branch instruction address from said prefetch unit, each of said comparators outputting a control signal indicating a match therebetween; and
a second selector connected to receive a new branch target address from each of said second plurality of buffers and to select one of said new branch target addresses from each of said second plurality of buffers in response to a control signal from said plurality of comparators and to supply said selected output to said selector.
4. An arrangement for predicting a branch target address in a digital data processing system, comprising:
first means for prefetching an instruction address of a given instruction sequence;
second means for storing a plurality of branch instruction addresses, said second means being coupled to receive a prefetched instruction address from said first means, said prefetched instruction address also being utilized to index a branch instruction address, said second means being supplied with a new branch instruction address upon failure of a branch target prediction;
third means for storing a plurality of branch target addresses corresponding to said branch instruction addresses memorized in said second means on a one to one basis, said third means being coupled to receive said prefetched instruction address which is utilized to derive said branch target address corresponding to said branch instruction address coincident with said prefetched instruction address in said second means, said third means being supplied with a new branch target address when said second means is updated;
fourth means being arranged to compare an instruction address outputted from said second means with said prefetched instruction address applied from said first means, and for outputting a comparison result of said fourth means as a control signal to said first means to control instruction address prefetching by said first means;
fifth means being arranged to compare a prefetched instruction address applied from said first means with said new branch instruction address being supplied to said second means and for generating a match/miss-match signal indicative of whether said prefetched instruction address applied from said first means and said new branch instruction address match or miss-match; and
sixth means being arranged to receive a branch target address from said third means and said new branch target address being supplied to said third means and being coupled to receive said match/miss-match signal from said fifth means, said sixth means selecting said new branch target address when said match/miss-match signal from said fifth means indicates a match, and selecting said branch target address from said third means when said match/miss-match signal from said fifth means indicates a miss-match, said output of said sixth means being applied to said first means.
5. An arrangement as claimed in claim 4, further comprising:
a plurality of first buffers arranged in series for delaying application of said new branch instruction address to said second means; and
a plurality of second buffers arranged in series for delaying application of said new branch target address to said third means.
6. An arrangement as claimed in claim 5, further comprising:
a plurality of comparators each of which is arranged to compare a prefetched instruction applied from said first means with said new branch instruction address delayed by one or more of the first buffers; and
a selector being coupled to said second plurality of buffers and said plurality of comparators for selecting a new branch target address in response to a match being detected by one of said plurality of comparators and inputting said selected new branch target address to said sixth means.
US08/199,970 1991-03-15 1994-02-22 Arrangement for predicting a branch target address in the second iteration of a short loop Expired - Fee Related US5394530A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US08/199,970 US5394530A (en) 1991-03-15 1994-02-22 Arrangement for predicting a branch target address in the second iteration of a short loop

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP37443391 1991-03-15
JP3-74433 1991-03-15
US85206392A 1992-03-16 1992-03-16
US08/199,970 US5394530A (en) 1991-03-15 1994-02-22 Arrangement for predicting a branch target address in the second iteration of a short loop

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US85206392A Continuation 1991-03-15 1992-03-16

Publications (1)

Publication Number Publication Date
US5394530A true US5394530A (en) 1995-02-28

Family

ID=26582585

Family Applications (1)

Application Number Title Priority Date Filing Date
US08/199,970 Expired - Fee Related US5394530A (en) 1991-03-15 1994-02-22 Arrangement for predicting a branch target address in the second iteration of a short loop

Country Status (1)

Country Link
US (1) US5394530A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4430706A (en) * 1980-10-27 1984-02-07 Burroughs Corporation Branch prediction apparatus and method for a data processing system
US4984154A (en) * 1982-11-17 1991-01-08 Nec Corporation Instruction prefetching device with prediction of a branch destination address
US4991080A (en) * 1986-03-13 1991-02-05 International Business Machines Corporation Pipeline processing apparatus for executing instructions in three streams, including branch stream pre-execution processor for pre-executing conditional branch instructions
US5142634A (en) * 1989-02-03 1992-08-25 Digital Equipment Corporation Branch prediction
US5313634A (en) * 1992-07-28 1994-05-17 International Business Machines Corporation Computer system branch prediction of subroutine returns
US5317702A (en) * 1989-05-25 1994-05-31 Nec Corporation Device for effectively controlling a branch history table for an instruction prefetching system even if predictions are branch unsuccessful

Cited By (71)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5835967A (en) * 1993-10-18 1998-11-10 Cyrix Corporation Adjusting prefetch size based on source of prefetch address
US5642500A (en) * 1993-11-26 1997-06-24 Fujitsu Limited Method and apparatus for controlling instruction in pipeline processor
US5666505A (en) * 1994-03-11 1997-09-09 Advanced Micro Devices, Inc. Heuristic prefetch mechanism and method for computer system
US5878255A (en) * 1995-06-07 1999-03-02 Advanced Micro Devices, Inc. Update unit for providing a delayed update to a branch prediction array
US6604190B1 (en) * 1995-06-07 2003-08-05 Advanced Micro Devices, Inc. Data address prediction structure and a method for operating the same
US5875324A (en) * 1995-06-07 1999-02-23 Advanced Micro Devices, Inc. Superscalar microprocessor which delays update of branch prediction information in response to branch misprediction until a subsequent idle clock
US5634103A (en) * 1995-11-09 1997-05-27 International Business Machines Corporation Method and system for minimizing branch misprediction penalties within a processor
US5734881A (en) * 1995-12-15 1998-03-31 Cyrix Corporation Detecting short branches in a prefetch buffer using target location information in a branch target cache
US5822577A (en) * 1996-05-01 1998-10-13 International Business Machines Corporation Context oriented branch history table
US5887174A (en) * 1996-06-18 1999-03-23 International Business Machines Corporation System, method, and program product for instruction scheduling in the presence of hardware lookahead accomplished by the rescheduling of idle slots
WO1998002800A1 (en) * 1996-07-16 1998-01-22 Advanced Micro Devices, Inc. A delayed update register for an array
US5867699A (en) * 1996-07-25 1999-02-02 Unisys Corporation Instruction flow control for an instruction processor
US6119221A (en) * 1996-11-01 2000-09-12 Matsushita Electric Industrial Co., Ltd. Instruction prefetching apparatus and instruction prefetching method for processing in a processor
US6871275B1 (en) * 1996-12-12 2005-03-22 Intel Corporation Microprocessor having a branch predictor using speculative branch registers
CN1093658C (en) * 1997-03-26 2002-10-30 国际商业机器公司 Branch history table with branch pattern field
US5935238A (en) * 1997-06-19 1999-08-10 Sun Microsystems, Inc. Selection from multiple fetch addresses generated concurrently including predicted and actual target by control-flow instructions in current and previous instruction bundles
US5964869A (en) * 1997-06-19 1999-10-12 Sun Microsystems, Inc. Instruction fetch mechanism with simultaneous prediction of control-flow instructions
US6044222A (en) * 1997-06-23 2000-03-28 International Business Machines Corporation System, method, and program product for loop instruction scheduling hardware lookahead
US6230260B1 (en) 1998-09-01 2001-05-08 International Business Machines Corporation Circuit arrangement and method of speculative instruction execution utilizing instruction history caching
US6412059B1 (en) * 1998-10-02 2002-06-25 Nec Corporation Method and device for controlling cache memory
US6275918B1 (en) 1999-03-16 2001-08-14 International Business Machines Corporation Obtaining load target operand pre-fetch address from history table information upon incremented number of access indicator threshold
US7421572B1 (en) 1999-09-01 2008-09-02 Intel Corporation Branch instruction for processor with branching dependent on a specified bit in a register
US7991983B2 (en) 1999-09-01 2011-08-02 Intel Corporation Register set used in multithreaded parallel processor architecture
US7546444B1 (en) 1999-09-01 2009-06-09 Intel Corporation Register set used in multithreaded parallel processor architecture
GB2363487B (en) * 1999-10-21 2002-09-04 Samsung Electronics Co Ltd Branch predictor using branch prediction accuracy history
GB2363487A (en) * 1999-10-21 2001-12-19 Samsung Electronics Co Ltd Branch predictor using a branch prediction accuracy history
US7085915B1 (en) 2000-02-29 2006-08-01 International Business Machines Corporation Programmable prefetching of instructions for a processor executing a non-procedural program
US20010020267A1 (en) * 2000-03-02 2001-09-06 Kabushiki Kaisha Toshiba Pipeline processing apparatus with improved efficiency of branch prediction, and method therefor
US20070234009A1 (en) * 2000-08-31 2007-10-04 Intel Corporation Processor having a dedicated hash unit integrated within
US7743235B2 (en) 2000-08-31 2010-06-22 Intel Corporation Processor having a dedicated hash unit integrated within
US7681018B2 (en) 2000-08-31 2010-03-16 Intel Corporation Method and apparatus for providing large register address space while maximizing cycletime performance for a multi-threaded register file set
US20020166042A1 (en) * 2001-05-01 2002-11-07 Yoav Almog Speculative branch target allocation
US7134005B2 (en) 2001-05-04 2006-11-07 Ip-First, Llc Microprocessor that detects erroneous speculative prediction of branch instruction opcode byte
US20020194460A1 (en) * 2001-05-04 2002-12-19 Ip First Llc Apparatus, system and method for detecting and correcting erroneous speculative branch target address cache branches
US20020188833A1 (en) * 2001-05-04 2002-12-12 Ip First Llc Dual call/return stack branch prediction system
US7707397B2 (en) 2001-05-04 2010-04-27 Via Technologies, Inc. Variable group associativity branch target address cache delivering multiple target addresses per cache line
US7398377B2 (en) 2001-05-04 2008-07-08 Ip-First, Llc Apparatus and method for target address replacement in speculative branch target address cache
US20050114636A1 (en) * 2001-05-04 2005-05-26 Ip-First, Llc. Apparatus and method for target address replacement in speculative branch target address cache
US20050132175A1 (en) * 2001-05-04 2005-06-16 Ip-First, Llc. Speculative hybrid branch direction predictor
US20020194464A1 (en) * 2001-05-04 2002-12-19 Ip First Llc Speculative branch target address cache with selective override by seconday predictor based on branch instruction type
US7200740B2 (en) 2001-05-04 2007-04-03 Ip-First, Llc Apparatus and method for speculatively performing a return instruction in a microprocessor
US20050268076A1 (en) * 2001-05-04 2005-12-01 Via Technologies, Inc. Variable group associativity branch target address cache delivering multiple target addresses per cache line
US7165169B2 (en) 2001-05-04 2007-01-16 Ip-First, Llc Speculative branch target address cache with selective override by secondary predictor based on branch instruction type
US20020194461A1 (en) * 2001-05-04 2002-12-19 Ip First Llc Speculative branch target address cache
US20050044343A1 (en) * 2001-07-03 2005-02-24 Ip-First, Llc. Apparatus and method for selectively accessing disparate instruction buffer stages based on branch target address cache hit and instruction stage wrap
US7234045B2 (en) 2001-07-03 2007-06-19 Ip-First, Llc Apparatus and method for handling BTAC branches that wrap across instruction cache lines
US20050198481A1 (en) * 2001-07-03 2005-09-08 Ip First Llc Apparatus and method for densely packing a branch instruction predicted by a branch target address cache and associated target instructions into a byte-wide instruction buffer
US7203824B2 (en) 2001-07-03 2007-04-10 Ip-First, Llc Apparatus and method for handling BTAC branches that wrap across instruction cache lines
US7159098B2 (en) 2001-07-03 2007-01-02 Ip-First, Llc. Selecting next instruction line buffer stage based on current instruction line boundary wraparound and branch target in buffer indicator
US7162619B2 (en) 2001-07-03 2007-01-09 Ip-First, Llc Apparatus and method for densely packing a branch instruction predicted by a branch target address cache and associated target instructions into a byte-wide instruction buffer
US20050198479A1 (en) * 2001-07-03 2005-09-08 Ip First Llc Apparatus and method for handling BTAC branches that wrap across instruction cache lines
US20060010310A1 (en) * 2001-07-03 2006-01-12 Ip-First, Llc. Apparatus and method for handling BTAC branches that wrap across instruction cache lines
US7437724B2 (en) 2002-04-03 2008-10-14 Intel Corporation Registers for data transfers
US7159097B2 (en) 2002-04-26 2007-01-02 Ip-First, Llc Apparatus and method for buffering instructions and late-generated related information using history of previous load/shifts
US20040030866A1 (en) * 2002-04-26 2004-02-12 Ip-First, Llc Apparatus and method for buffering instructions and late-generated related information using history of previous load/shifts
US20040139301A1 (en) * 2003-01-14 2004-07-15 Ip-First, Llc. Apparatus and method for killing an instruction after loading the instruction into an instruction queue in a pipelined microprocessor
US7165168B2 (en) 2003-01-14 2007-01-16 Ip-First, Llc Microprocessor with branch target address cache update queue
US7143269B2 (en) 2003-01-14 2006-11-28 Ip-First, Llc Apparatus and method for killing an instruction after loading the instruction into an instruction queue in a pipelined microprocessor
US20040139281A1 (en) * 2003-01-14 2004-07-15 Ip-First, Llc. Apparatus and method for efficiently updating branch target address cache
US7185186B2 (en) 2003-01-14 2007-02-27 Ip-First, Llc Apparatus and method for resolving deadlock fetch conditions involving branch target address cache
US20040139292A1 (en) * 2003-01-14 2004-07-15 Ip-First, Llc. Apparatus and method for resolving deadlock fetch conditions involving branch target address cache
US7152154B2 (en) 2003-01-16 2006-12-19 Ip-First, Llc. Apparatus and method for invalidation of redundant branch target address cache entries
US7178010B2 (en) 2003-01-16 2007-02-13 Ip-First, Llc Method and apparatus for correcting an internal call/return stack in a microprocessor that detects from multiple pipeline stages incorrect speculative update of the call/return stack
US20040143709A1 (en) * 2003-01-16 2004-07-22 Ip-First, Llc. Apparatus and method for invalidation of redundant branch target address cache entries
US20040143727A1 (en) * 2003-01-16 2004-07-22 Ip-First, Llc. Method and apparatus for correcting an internal call/return stack in a microprocessor that detects from multiple pipeline stages incorrect speculative update of the call/return stack
US20050076193A1 (en) * 2003-09-08 2005-04-07 Ip-First, Llc. Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence
US7631172B2 (en) 2003-09-08 2009-12-08 Ip-First, Llc Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence
US7237098B2 (en) 2003-09-08 2007-06-26 Ip-First, Llc Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence
US20070083741A1 (en) * 2003-09-08 2007-04-12 Ip-First, Llc Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence
US20100050164A1 (en) * 2006-12-11 2010-02-25 Nxp, B.V. Pipelined processor and compiler/scheduler for variable number branch delay slots
US8959500B2 (en) * 2006-12-11 2015-02-17 Nytell Software LLC Pipelined processor and compiler/scheduler for variable number branch delay slots

Similar Documents

Publication Publication Date Title
US5394530A (en) Arrangement for predicting a branch target address in the second iteration of a short loop
US5687349A (en) Data processor with branch target address cache and subroutine return address cache and method of operation
US4811215A (en) Instruction execution accelerator for a pipelined digital machine with virtual memory
US5805877A (en) Data processor with branch target address cache and method of operation
US5530825A (en) Data processor with branch target address cache and method of operation
EP0689131B1 (en) A computer system for executing branch instructions
US4476525A (en) Pipeline-controlled data processing system capable of performing a plurality of instructions simultaneously
JP3542021B2 (en) Method and apparatus for reducing set associative cache delay by set prediction
US5125083A (en) Method and apparatus for resolving a variable number of potential memory access conflicts in a pipelined computer system
JP2744890B2 (en) Branch prediction data processing apparatus and operation method
US6081887A (en) System for passing an index value with each prediction in forward direction to enable truth predictor to associate truth value with particular branch instruction
US5535346A (en) Data processor with future file with parallel update and method of operation
EP0213842A2 (en) Mechanism for performing data references to storage in parallel with instruction execution on a reduced instruction-set processor
EP0394624B1 (en) Multiple sequence processor system
US20010047467A1 (en) Method and apparatus for branch prediction using first and second level branch prediction tables
JPH0334024A (en) Method of branch prediction and instrument for the same
JPH0820950B2 (en) Multi-predictive branch prediction mechanism
JPH0557616B2 (en)
US6157999A (en) Data processing system having a synchronizing link stack and method thereof
WO1990003001A1 (en) Pipeline structures and methods
JP2000029701A (en) Method and system for fetching discontinuous instruction in single clock cycle
US5935238A (en) Selection from multiple fetch addresses generated concurrently including predicted and actual target by control-flow instructions in current and previous instruction bundles
US5964869A (en) Instruction fetch mechanism with simultaneous prediction of control-flow instructions
EP1109095B1 (en) Instruction prefetch and branch prediction circuit
US5794027A (en) Method and apparatus for managing the execution of instructons with proximate successive branches in a cache-based data processing system

Legal Events

Code Title Description
FPAY Fee payment (Year of fee payment: 4)
FEPP Fee payment procedure (Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY)
FPAY Fee payment (Year of fee payment: 8)
REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
STCH Information on status: patent discontinuation (Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362)
FP Lapsed due to failure to pay maintenance fee (Effective date: 2007-02-28)