Discuss Pipeline Hazards and Explain How Pipelining Is Implemented

February 12, 2018  •  Essay  •  1,506 Words (7 Pages)

Q1: Discuss pipeline hazards and explain how pipelining is implemented.

Pipelining is an implementation technique in which multiple instructions are overlapped in execution; it is used in scalar and superscalar processors alike. The processor's pipeline is divided into stages (Fetch, Decode, Execute, Store/Write back), and the stages are connected one to the next to form a pipe (Prabhu, n.d.). Pipeline hazards are situations that prevent the next instruction in the instruction stream from executing during its designated clock cycle.

Pipelining is implemented in much the same way as an assembly line: each stage is timed by the clock, and an instruction advances by one stage per clock cycle.

The idea of pipelining, as described by Englander (2009), is that as each instruction completes a step, the following instruction moves into the stage just vacated. Thus, when the first instruction is completed, the next one is already one stage short of completion. If there are many steps in the fetch-execute cycle, several instructions can be at various points in the cycle at once.

The method is like an automobile assembly line, where several cars are in different degrees of production at the same time. It still takes the same amount of time to complete one instruction cycle (or one car), but the pipelining technique results in a large overall increase in the average number of instructions performed in a given time. Wienand (2016) explains pipelining as a hose filled with marbles: as marbles are put into the hose one at a time (one per clock cycle), marbles (the results) are pushed out at the other end of the same hose.
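The throughput gain described above can be put into numbers with a minimal sketch. It assumes an ideal pipeline (no hazards, one clock cycle per stage) and uses the four stages named earlier in this essay; the function names are hypothetical and chosen only for illustration.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction finishes every stage before the next starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed in an ideal pipeline: fill the pipe once, then finish one per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


def pipeline_diagram(num_instructions: int, stages=("F", "D", "E", "W")) -> list[str]:
    """Return one text row per instruction; each column is one clock cycle."""
    total = pipelined_cycles(num_instructions, len(stages))
    rows = []
    for i in range(num_instructions):
        cells = ["."] * total
        # Instruction i enters the first stage in column i and advances one stage per cycle.
        for s, name in enumerate(stages):
            cells[i + s] = name
        rows.append(" ".join(cells))
    return rows


if __name__ == "__main__":
    for row in pipeline_diagram(3):
        print(row)
    # 100 instructions on a 4-stage machine: 400 cycles unpipelined, 103 pipelined.
    print(unpipelined_cycles(100, 4), pipelined_cycles(100, 4))
```

Each instruction still spends four cycles in the machine, which matches the point that one car takes just as long to build; the gain comes only from the overlap.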

Hazards occur in a pipelined microprocessor when multiple instructions are in execution at the same time and conflict with one another (Cheng, 2013).

There are three classes of hazards: structural, data, and control:

  • Structural Hazards. They arise from resource conflicts when the hardware cannot support all possible combinations of instructions in simultaneous overlapped execution (Prabhu, n.d.).

The most common case is when two instructions access memory at the same time. One instruction may need to access memory as part of its Execute or Write back phase while another instruction is being fetched. If instructions and data reside in the same memory, the two accesses cannot proceed together, and one instruction must be stalled until the other has finished its memory access (Ques10, 2015).

  • Data Hazards. These arise when an instruction depends on the result of a previous instruction in a way that is exposed by the overlapping of instructions in the pipeline (Prabhu, n.d.). An example is a pair of instructions in which the second uses a value produced by the first:

A=3+A

B=A*4

For the above sequence, the second instruction needs the value of ‘A’ computed in the first instruction. As a result, the second instruction is said to depend on the first. If the execution is done in a pipelined processor, it is highly likely that the interleaving of these two instructions can lead to incorrect results due to data dependency between the instructions. Thus, the pipeline needs to be stalled as and when necessary to avoid errors. (Ques10, 2015)
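The stalling described for this sequence can be sketched as a small model. It rests on assumptions that are not in the sources cited here: a four-stage, in-order pipeline (Fetch, Decode, Execute, Write back), operands read in Decode, results written in Write back, no forwarding hardware, and a value that becomes readable only in the cycle after it is written back. `Instr` and `stall_cycles` are hypothetical names.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    text: str
    dest: str                # register or variable written by the instruction
    srcs: tuple[str, ...]    # registers or variables read by the instruction


# Assumed timing: Decode at cycle d, Execute at d + 1, Write back at d + 2,
# so the result can first be read by a Decode in cycle d + 3.
CYCLES_UNTIL_READABLE = 3


def stall_cycles(program: list[Instr]) -> list[int]:
    """Return, per instruction, how many cycles it waits in Decode for its operands."""
    ready_cycle: dict[str, int] = {}   # name -> first cycle its latest value can be read
    next_decode = 2                    # first instruction: Fetch in cycle 1, Decode in cycle 2
    stalls = []
    for instr in program:
        # In-order issue: decode no earlier than our slot, and not before every source is ready.
        decode = max([next_decode] + [ready_cycle.get(s, 0) for s in instr.srcs])
        stalls.append(decode - next_decode)
        ready_cycle[instr.dest] = decode + CYCLES_UNTIL_READABLE
        next_decode = decode + 1
    return stalls


if __name__ == "__main__":
    program = [
        Instr("A = 3 + A", dest="A", srcs=("A",)),
        Instr("B = A * 4", dest="B", srcs=("A",)),
    ]
    for instr, stall in zip(program, stall_cycles(program)):
        print(f"{instr.text}: {stall} stall cycle(s)")
```

Under these assumptions the second instruction waits two cycles. Placing one independent instruction between the two reduces its wait to one cycle, which is why reordering instructions can hide part of the delay.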

  • Control Hazards. They arise from the pipelining of branches and other instructions that change the program counter (PC) (Prabhu, n.d.). A pipeline normally fetches instructions from consecutive memory locations, so a problem arises when one of the instructions branches to some other memory location. The instructions already fetched from the consecutive locations are then invalid and need to be removed (also called flushing the pipeline). This induces a stall until new instructions are fetched from the memory address specified in the branch instruction.

The time lost in this way is called the branch penalty. Dedicated hardware is often incorporated in the fetch unit to identify branch instructions and compute branch addresses as soon as possible, reducing the resulting delay (Ques10, 2015).
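The cost of the branch penalty can be estimated with the usual cycles-per-instruction (CPI) calculation, sketched below. The function name, parameter names, and the numbers in the usage example are hypothetical and are not taken from the cited sources; the sketch assumes every flush costs the same fixed number of cycles.

```python
def effective_cpi(base_cpi: float,
                  branch_fraction: float,
                  flush_fraction: float,
                  branch_penalty_cycles: float) -> float:
    """Average cycles per instruction once pipeline flushes are taken into account.

    base_cpi              -- CPI of the pipeline with no control hazards (1.0 if ideal)
    branch_fraction       -- fraction of executed instructions that are branches
    flush_fraction        -- fraction of those branches that cause a pipeline flush
    branch_penalty_cycles -- cycles lost each time the pipeline is flushed
    """
    return base_cpi + branch_fraction * flush_fraction * branch_penalty_cycles


if __name__ == "__main__":
    # Illustrative, made-up workload: 20% branches, 60% of them flush, 2 cycles lost per flush.
    cpi = effective_cpi(1.0, 0.20, 0.60, 2)
    print(f"effective CPI: {cpi:.2f}")
    print(f"slowdown versus an ideal pipeline: {cpi / 1.0:.2f}x")
```

Resolving branch addresses earlier in the fetch unit lowers `branch_penalty_cycles`, which is the term that dedicated branch hardware attacks.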

Q2: Describe the role of cache and virtual memory, and the principles of memory management.

Cache memory is fast memory, typically embedded inside the CPU alongside the registers and processor logic, and it is one of the most important elements of the CPU architecture. There is a limit to how big it can be: even the largest caches are tens of megabytes, rather than the gigabytes or terabytes of main memory or hard disk. Cache memory is very fast, with the fastest level taking as little as one cycle to access. There are usually multiple levels of cache, called L1, L2 and L3, each larger and slower than the one before: L1 is the fastest and smallest, L2 is bigger and slower, and L3 more so (Wienand, 2016). Cache runs at a speed much closer to the CPU's clock speed than main memory does, which minimizes wasted cycles (Stone, n.d.). Because cache is much smaller than RAM and sits closer to the processor, it can be accessed faster and can significantly increase processor performance. Many modern computers have clock rates in the billions of cycles per second (GHz). The higher the clock rate, the more cycles each access to main memory costs, and so the more the processor benefits from cache. Cache can be external to the CPU, sitting between the RAM and the CPU, or it can be incorporated as part of the CPU.
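The benefit of a cache hierarchy is often summarised as the average memory access time (AMAT): each level contributes its own hit time plus its miss rate multiplied by the cost of going one level further out. The sketch below assumes per-level (local) miss rates; the function name and all latencies and miss rates in the usage example are invented for illustration.

```python
def average_memory_access_time(levels: list[tuple[float, float]],
                               memory_latency: float) -> float:
    """Average access time in cycles for a multi-level cache hierarchy.

    levels         -- (hit_time_cycles, local_miss_rate) per cache level, L1 first
    memory_latency -- cycles needed to reach main memory after the last level misses
    """
    cost = memory_latency
    # Work inward from main memory: each level adds its hit time and passes
    # a fraction (its miss rate) of accesses on to the next level out.
    for hit_time, miss_rate in reversed(levels):
        cost = hit_time + miss_rate * cost
    return cost


if __name__ == "__main__":
    # Hypothetical hierarchy: L1, L2, L3, then main memory at 200 cycles.
    hierarchy = [(4, 0.10), (12, 0.40), (40, 0.50)]
    print(f"with caches:    {average_memory_access_time(hierarchy, 200):.1f} cycles")
    print(f"without caches: {average_memory_access_time([], 200):.1f} cycles")
```

With these made-up figures the average access costs a small fraction of a trip to main memory, even though most of the data still lives there.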

Virtual memory is a memory management capability of the operating system that uses hardware and software to let the computer compensate for physical memory shortages by temporarily transferring data from random access memory (RAM) to disk storage. The virtual address space is enlarged by keeping active memory in RAM and inactive memory on hard disk drives (HDDs), which together form contiguous addresses that hold both the application and its data.

Virtual memory provides a mapping, controlled by the processor and the operating system, that converts virtual addresses into physical addresses. Virtual addresses are not unique, because each program is given its own set, whereas the physical addresses used by the memory system are unique. The translation works by dividing memory into pages. A page table held in memory contains the translation from virtual page numbers to physical page numbers. Since this page table is itself in memory, and memory performance is a problem, a faster means of finding translations is needed. A translation look-aside buffer (TLB) acts as a cache for the page table. The processor first looks in the TLB for a translation. If one is available, it uses it and proceeds to look in the cache. If not, the processor signals the operating system to look in the page table for the translation. Once the translation is found, the operating system updates the TLB so that the next request for the same translation is found in the TLB and is fast (Cox, n.d.).
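The lookup order described above (TLB first, then the page table, then a TLB update) can be sketched as follows. Several details are assumptions made for illustration and do not come from Cox: a 4 KiB page size, a tiny TLB with least-recently-used replacement, a page table modelled as a plain dictionary, and the names `AddressTranslator` and `translate`.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # bytes per page; an assumed value


class AddressTranslator:
    """Toy model of virtual-to-physical translation with a small TLB."""

    def __init__(self, page_table: dict[int, int], tlb_entries: int = 2):
        self.page_table = page_table      # virtual page number -> physical page number
        self.tlb: OrderedDict[int, int] = OrderedDict()
        self.tlb_entries = tlb_entries
        self.tlb_hits = 0
        self.tlb_misses = 0

    def translate(self, virtual_address: int) -> int:
        # Split the address into a page number and an offset within the page.
        vpn, offset = divmod(virtual_address, PAGE_SIZE)
        if vpn in self.tlb:
            # Fast path: the translation is already cached in the TLB.
            self.tlb_hits += 1
            self.tlb.move_to_end(vpn)     # mark as most recently used
            ppn = self.tlb[vpn]
        else:
            # Slow path: consult the page table, then remember the result in the TLB.
            self.tlb_misses += 1
            if vpn not in self.page_table:
                raise KeyError(f"page fault: virtual page {vpn} is not mapped")
            ppn = self.page_table[vpn]
            self.tlb[vpn] = ppn
            if len(self.tlb) > self.tlb_entries:
                self.tlb.popitem(last=False)   # evict the least recently used entry
        return ppn * PAGE_SIZE + offset


if __name__ == "__main__":
    translator = AddressTranslator({0: 5, 1: 9})
    print(translator.translate(4100))   # virtual page 1, offset 4: TLB miss
    print(translator.translate(4104))   # same page again: TLB hit
    print(translator.tlb_hits, translator.tlb_misses)
```

The second access to the same page skips the page table entirely, which is the saving the TLB exists to provide.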

...

...
