Glossary
This glossary provides a brief discussion of some of the terms used in this website and in Computer Organization and Architecture. Some terms have a hyperlink to a more detailed discussion.
Absolute address
The absolute address, or direct address, of an operand is its actual address in memory.
Abstraction
The dictionary defines abstraction as “a general term or idea” or “the act of considering something as a general characteristic free from actual instances”. In computer science, abstraction is concerned with separating meaning from implementation. In everyday life we could say that the word “go” is an abstraction because it hides the specific instance (go by walking, cycling, riding, flying). Abstraction is an important tool because it separates what we do from how we do it. This is an important concept because it helps us to build complex systems by decomposing them into subtasks or activities.
Access time
One of the key parameters of memory is its access time, which is the time taken to locate a cell in the memory and then to read or write the data. If the access time of all memory cells is approximately equal, the memory is called random access memory. If you have to access memory elements one by one in order to find a given location, the access time depends on the location of the data and the memory is called sequential access (disk, tape, and optical).
Active-
A signal is called active-
Address
Information is stored in memory in consecutive locations. Each location has a unique address that defines its place in the memory. Information is retrieved from memory by supplying an address and then reading the data at that location.
Address register
The 68K has eight 32-bit address registers, A0 to A7.
Address space
In mathematics the term space defines an abstract region encompassing all possible
members of that space. This term has been extended to computing and means all the
addresses that can be generated. For example, if a microcontroller has a 16-bit address bus, its address space consists of 2^16 = 65,536 addresses.
Addressing mode
An addressing mode represents the way in which the location of an operand is expressed. Typical addressing modes are literal, absolute (memory direct), register indirect, and indexed.
ALU
Arithmetic and logical unit, ALU. This is normally defined as the part of the computer
where basic data processing takes place; for example, addition, multiplication and
Boolean operations. Of course, in a modern computer such processing is very much
distributed to many ALU-
Architecture
The structure of a computer. This term is used in different ways by different people. However, architecture or instruction set architecture is generally used to describe the structure of a computer at the register and instruction set level. The term organization or microarchitecture is used to describe how the instruction set architecture is implemented.
Amdahl’s law
Amdahl’s law relates the performance increase of a computer to the number of processors operating in parallel and to the fraction of the work that can be executed in parallel. Essentially, it tells you that if you have a program that can be executed almost entirely in parallel, then adding more processors/cores speeds things up proportionately. On the other hand, if you can’t parallelize a program because a substantial fraction must be executed serially, then adding more and more processors is pointless. Some say that Amdahl’s law has served to hinder progress in parallel processing and that it gives an unduly negative view of parallelism. Another view is that the size of problems is expanding so fast (i.e., massive data sets) that increasing parallelism is effective.
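As a sketch of the formula behind the law (the function name and the figures below are illustrative, not from this text), the speedup obtained with n processors when a fraction p of the work can be parallelized is 1/((1 − p) + p/n):

# Amdahl's law: speedup when a fraction p of the work runs on n parallel processors
# and the remaining (1 - p) must run serially.
def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# Even with 1000 processors, a 5% serial fraction limits the speedup to about 20x.
print(amdahl_speedup(0.95, 1000))   # roughly 19.6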
Arbitration
When two or more entities compete for a resource, a mechanism to decide who wins
has to be implemented. This mechanism is called arbitration. For example, at a four-
Architecture
Architecture is another word for “design” or “structure”. Computer architecture indicates the structure of a computer. The structure of a computer can be viewed in different ways. The programmer sees a computer in terms of what it does; that is, in terms of its instruction set and the data it manipulates. The computer engineer sees the computer in terms of how it is constructed and how it operates. The computer engineer’s view of a computer is normally called its organization and the term architecture is used to describe the programmer’s view of the computer.
Assembler directive
An assembler directive is a statement in an assembly language program that provides the assembler with some information it requires to assemble a source file into an object file. Typical assembler directives tell the assembler where to locate a program and data in memory and equate symbolic names to actual values. The most important assembler directives are: AREA, END, ALIGN, DCB, DCW, SPACE (ARM), and ORG, EQU, DS, DC (68K).
Assemble time
Assemble time describes events that take place during the assembly of a program. That is, assemble time contrasts with run time. For example, symbolic expressions in the assembly language are evaluated at assemble time and not at run time. Note that an assemble time operation is performed once only. For example, the expression ARRAY DS.B ROWS*COLS+1 is evaluated by the assembler and the result of ROWS*COLS+1 is used to reserve the appropriate number of bytes of storage. By way of contrast, the address of the source operand 8+[A0]+[D3] in the instruction MOVE 8(A0,D3),D6 is evaluated at run time, because the contents of registers A0 and D3 are not known at assemble time and because the contents of these registers may change as the program is executed.
Asserted
Digital systems use a high voltage and a low voltage to represent the two logical states. Sometimes a high level represents the active or true state, and sometimes a low level represents the active state. To avoid confusion, we use the term asserted to mean that a signal is put in a true or active state.
Associative
Computer memory is generally accessed by providing the address (location) of the data to be retrieved or stored. Associative memory operates on a very different principle. Data is retrieved by matching a key with all memory elements simultaneously. Stored data does not have an address and the location of data in an associative store is of no importance. For example, consider the following key:data pairs casa:house, chica:girl, huevo:egg, amigo:friend. If you enter the key ‘casa’ into such a memory it is matched against all entries at the same time, but only the pair casa:house responds, returning ‘house’.
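A software analogy of the idea (the list and function below are illustrative, not a description of real hardware) searches every entry for a matching key; in an actual associative memory all the comparisons happen at once:

# Content-addressable lookup: every stored entry is compared with the search key.
entries = [("casa", "house"), ("chica", "girl"), ("huevo", "egg"), ("amigo", "friend")]

def associative_lookup(key):
    # In hardware the comparisons are done in parallel; here they are sequential.
    return [data for (k, data) in entries if k == key]

print(associative_lookup("casa"))   # ['house']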
Asymmetric Multiprocessing
A multiprocessor is considered asymmetric if its architecture with respect to its processors and memory is asymmetric. For example, the various processors may be of different types with different architectures, or they may be organized in different parallel structures (some in a cluster, some on a bus, etc.). The memory may be partitioned between the various processors, and not all processors may run the same operating system.
Asynchronous
Events that are not related to a clock (or to other events) are called asynchronous. For example, a lightning strike, or a telephone call, or striking a keyboard are all asynchronous events. An asynchronous event can cause problems for the designer because the event you are capturing may be changing at the very moment of capture. Asynchronous circuits are arranged so that the completion of one event triggers the next. It is difficult to design reliable asynchronous circuits.
Atomic operation
An atomic operation is one that cannot be interrupted or subdivided.
Autoindexing
Any variable that is automatically incremented after use is said to be autoindexed or autoincremented. If the value is automatically decremented after use, it is said to be autoindexed or autodecremented. Because some systems perform the indexing before the variable is used, the terms preincrementing, predecrementing, postincrementing and postdecrementing are also used to describe the direction of the increment and when it takes place. Autoindexing is largely used to support data structures like stacks and to step through data structures such as tables, vectors and arrays. Autoindexing is normally provided as part of an addressing mode (it is mostly associated with CISC processors, but the ARM processor has a full complement of autoindexing addressing modes).
Atomic
An operation is said to be atomic if, once execution starts, it must be executed to completion without intervention. Atomic operations are used in situations where two or more processes may request a resource. Without atomic operations processor P may request a resource and find it free. Then processor Q may also request the resource and also find it free. Consequently, both P and Q grab the resource and deadlock may result. An atomic operation allows a processor to test a resource and then claim it with a guarantee that no other process can interrupt between the test (is it free) and the allocation. Typically, an atomic operation is of the form test and set; that is, test a word in memory and if clear then set a bit.
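A minimal sketch of test and set in Python, using a lock to stand in for the hardware guarantee of indivisibility (the names here are hypothetical, not from this text):

import threading

flag_lock = threading.Lock()
resource_claimed = False          # the "word in memory" being tested and set

def test_and_set():
    # Test the flag and set it as one indivisible step: no other thread can run
    # between the test ("is it free?") and the allocation ("it is now mine").
    global resource_claimed
    with flag_lock:
        old_value = resource_claimed
        resource_claimed = True
        return old_value          # False means the caller has claimed the resource

if not test_and_set():
    print("resource claimed")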
Bit
A bit is a Binary DigIT and can be in one of two states, normally called 0 and 1.
A bit is the smallest unit of information that can be processed by a computer. Computer
operations are normally applied to all the bits of a word, although certain instructions
are able to operate on an individual bit in a word. For example, you may be able
to set bit 12 of a 32-bit word.
Bit field
Digital computers are word-
Branch
A branch is a computer instruction that is able to change the flow of control; that is, it may force the computer to continue executing code at another point in the program (rather than the next instruction in sequence). An unconditional branch always forces a jump to the target address (i.e., the next instruction in sequence is not executed). It is used to terminate a block of code and to return to a common point in a program. A conditional branch either continues execution or jumps to the target address depending on whether a condition specified by the instruction (e.g., carry set, zero set) is true or false.
Branch prediction
The branch is an instruction that forces a change in the flow of control; that is, it jumps to a new location in memory, the target address. Branch instructions can be conditional, and the branch is taken (a jump) or not taken (continue on to the next instruction) depending on the outcome of a test. In pipelined machines, taking a branch means that instructions that have been fetched in advance will no longer be executed, and the pipeline has to be flushed – an inefficient process. Branch prediction uses several techniques (usually based on the past behavior of the program) to decide whether a branch will be taken or not, and whether to begin fetching instructions at the target address in anticipation.
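One widely used dynamic scheme (not spelled out in this entry, so treat the following as a generic illustration) is a two-bit saturating counter that predicts a branch from its recent history:

# Two-bit saturating-counter predictor: states 0-1 predict "not taken",
# states 2-3 predict "taken"; each resolved branch nudges the counter
# one step towards the actual outcome.
class TwoBitPredictor:
    def __init__(self):
        self.state = 2                        # start in "weakly taken"

    def predict(self):
        return self.state >= 2                # True means predict taken

    def update(self, taken):
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

p = TwoBitPredictor()
print(p.predict())                            # True
p.update(False); p.update(False)
print(p.predict())                            # False after two not-taken branches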
Breakpoint
A breakpoint is a point in a program at which the execution of the program is stopped and the contents of the processor's registers are displayed on the console. Breakpoints are used during the debugging of programs. Software breakpoints are normally implemented by replacing an instruction in a program with a code that forces an exception.
Bubble
A bubble or a pipeline stall is an empty slot in a pipeline (i.e., no useful work takes place during a clock cycle) because either an operand or a functional unit is not ready.
Buffer
A buffer is a mechanism that is used to control the flow of data. The buffer is a
short-
Burst mode
Some processes appear as a rapid sequence of actions followed by a quiet period.
For example, some DRAM memory operates in a burst mode where four or more bytes
of data are supplied one-
Bus
A bus is a data highway that links two or more parts of a computer together. A computer
may implement several entirely different buses. A high-
Bus Cycle
A clock cycle is the smallest event in most processor systems. Only hardware systems designers are interested in clock cycles. A bus cycle involves a read or write access to external memory or some other transaction involving the bus and may take several clock cycles. A split bus transaction may involve a bus access, giving up the bus to another device, and then completing the access that was begun and not finished.
Byte
A byte is a unit of 8 bits. In general, computers process units of information that are an integer multiple of 8 bits; for example 64 bits.
Cache coherence
In multiprocessor systems individual processors have their own cache memories. Consequently, two processors may have their own individual copies of a data element (which also exists in main store). It is important that when one processor updates its own cache, the same element is either updated in all its other locations or its other copies are declared invalid. Cache coherence is the name given to the mechanisms that keep all cached copies in step.
Cache memory
Memory is slower than the processor, which forces a computer to wait for data and
instructions to be fetched from memory. Cache memory is a small amount of very high-
Chipset
The two major components of a computer are the CPU or microprocessor and the DRAM chips (dynamic memory containing programs and data). It takes a lot of additional logic circuitry to construct a practical working computer. In the early years of the microcomputer, the motherboard contained the many integrated circuits required to interface the CPU to memory, provide input/output, and so on. Eventually, chip manufacturers put most of these functions, or glue logic as it was called, into single chips. Collectively, these chips became known as chipsets. Thus, a chipset is a collection of individual chips that furnishes most of the additional circuitry needed to turn a microprocessor into a computer.
CISC
The term complex instruction set computer, CISC, was coined to contrast with RISC.
CISC computers generally have register-
Clock
The clock is a circuit that provides a constant stream of pulses that are used to
trigger operations in a digital system. Modern high-
Comment
A comment field in either a low-
Conditional
A conditional operation or instruction is executed only if certain conditions are met. A typical conditional instruction might be, “If the last operation yielded the result zero then jump to a new point in the program”. Conditional operations allow a computer to select between two possible courses of action depending on the outcome of a previous operation.
Computer organization
The term computer organization generally means the structure, organization, or implementation of a computer; that is, it describes how a particular instruction set architecture (ISA) is implemented in terms of digital logic, registers, and control units. Any given ISA can have an infinite number of organizations. For example, each time a company like Intel brings out a new processor, it is usually an existing ISA with a new organization that increases its performance. The term microarchitecture is, today, synonymous with computer organization.
Coprocessor
A coprocessor is an auxiliary processor designed to perform certain functions not implemented by the CPU itself. Typical coprocessors are floating-point units and memory management units. You could also design coprocessors to handle graphics operations or string handling. When a coprocessor is fitted, it may appear to the programmer as an extension of the CPU itself, or special coprocessor instructions may be needed to access it.
CPI
Cycles per instruction, CPI, is a metric of computer performance because it indicates how many clock cycles an instruction takes, on average, to execute. If all computers were clocked at the same speed and all programs had exactly the same number of instructions, the average CPI of a computer would be a good measure of its performance. Since computers are clocked at different rates, have different instruction sets, and different internal organizations, the CPI rating of a computer is largely useless as a metric for comparing machines. However, a designer working on a given computer may wish to use the available resources (circuits on the chip) to reduce bottlenecks and therefore improve the average CPI.
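The relationship the CPI metric captures can be written as execution time = instruction count × CPI ÷ clock frequency; a small sketch with made-up figures:

# Classic performance equation: time = instructions * cycles-per-instruction / clock rate.
def execution_time(instruction_count, cpi, clock_hz):
    return instruction_count * cpi / clock_hz

# e.g. one billion instructions at an average CPI of 1.5 on a 2 GHz clock
print(execution_time(1e9, 1.5, 2e9))   # 0.75 seconds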
CPU
The term CPU means central processing unit and describes the part of a computer responsible for fetching and executing instructions. The ALU is normally regarded as being part of the CPU (along with registers, buses and the control unit). Memory is not normally thought of as part of a CPU. Some use CPU as meaning microprocessor. However, in today’s world the term CPU is becoming meaningless because microprocessors are far more sophisticated than in the days when the term CPU was coined. A modern microprocessor may have several autonomous cores, each core might have its own local cache memory, and a second level cache may be shared between the cores.
Data dependency
The emphasis in modern processing is on parallelism; that is, the overlapping of operations. Consider a sequence like
A = B + C
D = E x A
P = D + X
You cannot execute these operations in parallel or out of order because each one depends on the previous calculation – hence data dependency.
Destination
When a computer operation such as addition takes place, the result of the calculation is stored in either a memory location or a register. The result is called the destination operand.
Disambiguation
Superscalar processor using out-
Disassembler
A disassembler reads object code (i.e., binary code) and converts it into mnemonic form. That is, it performs the inverse function of an assembler. However, most disassemblers are unable to provide symbolic addresses and the disassembled code uses absolute addresses expressed in hexadecimal form.
Dyadic
An operator is said to be dyadic if it requires two operands. The following operators
are dyadic: AND, OR, EOR/XOR, addition, subtraction, multiplication, division. Operations
with a single operand, such as 1/x or -x, are called monadic.
Dynamic
The adjective dynamic means changing or varying. When used to describe a class of
memory, dynamic RAM refers to a memory where data is stored as a charge on a leaky
capacitor and gradually lost unless the data is periodically refreshed. In computer
operations like shifting, dynamic refers to the ability to change the number of shifts
at run time as the program is being executed. This facility is possible because the
number of places to shift is held in a user-
Edge-
Digital systems operate with signals that have two states: electrically low or electrically
high. For most of the time, digital systems are concerned only with whether a signal
is in a high state or a low state. However, the input circuit of some systems can
be configured to detect the change of state of a signal rather than its actual value.
Such an input is called edge-
Endian
This is a word that comes from Jonathan Swift’s Gulliver’s Travels where a war took
place between little enders and big enders. The ideological difference between these
groups determined whether they opened their hard-boiled eggs at the big end or the little end.
Traditionally, the von Neumann machine is said to operate on a two-
Firmware
The term firmware is intended to fall between hardware (the physical structure of the computer) and software (the programs that run on it). Firmware is code that is not normally changed and which is used to control the operation of the system; for example, the BIOS is considered part of the firmware. Firmware is periodically updated by manufacturers, just like operating systems. Note that some might regard the microcode that’s built into the structure of some microprocessors as an example of firmware.
Flip-
Computers are constructed from two classes of circuit. The combinational circuit
composed of gates has an output that is determined only by the logic values at its
inputs and its Boolean transfer function. The flip-
Frame pointer, fp
A subroutine might require work space for any temporary values it creates. By using
the top of the stack as a work space, it is possible to make the program re-entrant.
Geometric mean
There are many ways of averaging a set of numbers; for example, by taking the arithmetic average. The geometric mean of n numbers is defined as the nth root of the product of those n numbers. For example, the arithmetic mean of 1, 2, 3, 4 is ¼ x (1 + 2 + 3 + 4) = 2.5 and the geometric mean is (1 x 2 x 3 x 4)^(1/4) = 24^(1/4) ≈ 2.2134. The geometric mean is used to average the results of individual computer benchmarks by SPEC when measuring the performance of a computer. This use of the geometric mean is considered inappropriate by some computer scientists.
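The two averages mentioned above can be computed directly (a small sketch; math.prod requires Python 3.8 or later):

import math

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    # nth root of the product of the n values
    return math.prod(values) ** (1.0 / len(values))

print(arithmetic_mean([1, 2, 3, 4]))   # 2.5
print(geometric_mean([1, 2, 3, 4]))    # about 2.2134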
Handshake
When two devices (or systems or circuits) communicate with each other, the term handshake implies a signal returned by one device to acknowledge an event. For example, if you store data in memory, the memory may return a data acknowledge handshake to indicate that the data has been received.
Heat
Heat is where all energy ends up. The power supplied to your computer from the line
or from a battery all ends up as heat. In a high-
Hierarchy
The dictionary definition of hierarchy is “a body of persons or entities graded according to rank or level of authority”. In the computer realm, hierarchy indicates the relationship between the components of a system in terms of some parameter such as speed or complexity. For example, memory hierarchy refers to the categorization of memory (usually) in terms of speed. In this hierarchy, registers are at the top and optical storage or even magnetic tape at the bottom.
Hot switching
Normally a system should be powered down before it is reconfigured by changing its hardware. Until relatively recently, that was the norm for all digital electronics. Hot switching allows devices to be connected or removed without the need to power down and then power up again. For example, USB devices and some disk drive interfaces are hot switchable or hot pluggable. In order to implement hot switching you have to ensure that electrical contacts are made in an orderly fashion (i.e., the correct sequence of signals needed to prevent spurious operation) and that there is a suitable automatic software startup process.
Illegal instruction
If an instruction is n bits long, there are 2^n possible instructions. Not all these
possible op-codes correspond to valid instructions; an attempt to execute one of the unused op-codes is treated as an illegal instruction and normally causes an exception.
ILP
Instruction level parallelism, ILP, describes a number of techniques used to accelerate the performance of processors. It covers techniques ranging from pipelining (overlapping the instruction execution) to superscalar processing (using multiple execution units).
Immediate
An immediate operand is one that forms part of an instruction and (in the ARM world) is indicated by the prefix '#'. For example, the immediate operand 5 in the instruction MOV r1,#5, is part of the instruction. When the instruction is executed, the operand 5 is immediately available, since the CPU does not have to read memory again to obtain it. Immediate addressing is sometimes referred to as literal addressing, because the operand is a literal or a constant.
Indirect addressing
In indirect addressing the address of the required operand is not provided directly by the instruction. Instead, the instruction tells the processor where to find the address. That is, indirect addressing gives the address of the address. The ARM, MIPS, and 68K all use register indirect addressing in which the address of an operand is in a register. This addressing mode is specified by enclosing the address register in round brackets (68K) or square brackets (ARM). For example, LDR r1,[r2] means copy the contents of the memory location whose address is in r2 into r1. Indirect addressing is required to access data structures such as arrays, tables, and lists.
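A software analogy of register indirect addressing (the memory list and register names below are illustrative, not from this text):

# Register indirect addressing, in the style of ARM's LDR r1,[r2]: register r2
# holds the address of the operand, and the operand itself is fetched from memory.
memory = [0] * 16
r2 = 8                  # r2 contains the address of the operand
memory[r2] = 42
r1 = memory[r2]         # "LDR r1,[r2]" - load r1 with the word whose address is in r2
print(r1)               # 42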
Interlocked
A handshake or other operation is described as interlocked only if each step in a process must occur before the next step takes place. For example, in an input operation a data available signal, DAV, is transmitted. This must be followed by a data accepted message, DAC. Then the data available message must be terminated before the data accepted message is terminated; that is, DAV is negated before DAC.
Interrupt
An interrupt is a request for attention by an external device such as a mouse, keyboard or printer. When the computer detects an interrupt, it saves the current status, executes the program that deals with the interrupt, and then restores the current status. In this way, an interrupt can be thought of as “a program that is jammed between two consecutive instructions by a peripheral that wants attention from the computer”.
IOPS
IOPS or input/output operations per second is an indication of the speed of a storage device. IOPS are frequently used to provide a benchmark for the performance of solid state drives, SSDs.
ISA
The ISA, or instruction set architecture, describes the programmer’s abstract view of a computer; for example, it includes the register set, instruction set, and addressing modes. It is the specification of the computer (in terms of the operations it can carry out but not in terms of the speed or performance of the computer).
K
The symbol K is reserved for use as the unit of absolute temperature, degrees Kelvin,
in the SI units of measurement. The lower-
Note that the symbol b is used for bits and B for bytes; for example, 8 Kb indicates 2^13 bits and 8 KB indicates 2^13 bytes or 2^16 bits.
Latency
Latency describes the waiting time between a clock (or other trigger) and an event. For example, if you press the shutter button of a camera and it takes an image 150 ms later, then the shutter latency is 150 ms. In computing latency is often used to indicate the time required to access data on a hard disk, or to set up a communications link, or to access data in a DRAM. For example, a memory device may have a 50 ns latency and then be able to provide data every 10 ns after that.
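Using the figures in the example above (50 ns initial latency, then a word every 10 ns), the time to read a burst of consecutive words would be (a sketch; the function name is made up):

# 50 ns initial latency, then one word every 10 ns.
def burst_time_ns(words, latency_ns=50, cycle_ns=10):
    return latency_ns + (words - 1) * cycle_ns

print(burst_time_ns(1))   # 50 ns for the first word
print(burst_time_ns(4))   # 80 ns for a four-word burst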
Literal
In the world of ISAs and assembly language, a literal is an actual value that is to be used in an instruction, as opposed to a value that is in memory or a register; the term literal means the same as immediate (because the value is part of an instruction and does not have to be retrieved). A typical instruction using a literal is ADD r0,r1,#4 where the literal 4 is added to the contents of register r1 and the result put in r0. Note that the literal is immediately ready in the instruction, whereas registers r0 and r1 have to be accessed.
Local variable
In computer programming a local variable has a scope and duration (lifetime) that
extends over only a particular function or subroutine. For example, if you call a function,
space may be allocated for local variables required by that function. Once a return
is made, those variables cease to exist and their memory space is (normally) reclaimed.
At the level of ISAs and assembly language programming, local variables are created
either by automatically allocating new registers to the function, or by creating
work space on the top of the current stack. Local variables are intimately related
to a computer’s instruction set and stack-
Memory-mapped input/output
Most computer architectures have only memory space and lack any input-
Memory wall
This is the big bad wolf of computing. To make computers faster and faster, all elements must get faster together, otherwise there will eventually be a bottleneck. In computing, that bottleneck is memory. Processors have been increasing in speed at a remarkable rate whereas the speed of memory has been increasing modestly. For example, in 1975 a processor might take eight clock cycles to execute an instruction and memory could provide data in approximately the same time. Today, a processor may be able to execute 100 instructions in the time it takes to make a single memory access. This represents an increasingly serious bottleneck that has come to be known as the memory wall.
Microcode
The term microcode is not to be confused with programming or microprocessors. It
is the lowest level code in a computer and is normally not user-
Moore’s Law
In the early days of computing it was observed that progress was so steady that the number of devices fabricated on silicon chips doubled every eighteen months or so. This observation became known as Moore’s Law. Over the decades, the exponential increase in the number of devices per chip has continued giving an element of legitimacy to Moore’s Law. Moore’s Law is sometimes used to refer to the ever decreasing cost of processor chips and the ever increasing speed of chips (although such extensions of Moore’s Law are strictly incorrect). Today, people are recognizing that Moore’s law is beginning to run out of steam as physical (atomic) limitations on the design of circuits are being approached.
Multicore
Computers once used a single processor. Then they were able to use two or four processors on the same motherboard. Today, the major semiconductor manufacturers are able to implement several processors on a single chip, which frees the user from interconnection problems and complexity. The term multicore processor was coined to indicate a device with several internal processors. It is a multiprocessor on a chip.
Multiplication
Multiplication is important because the multiplication of two m-
Nonuniform memory access
Multiprocessor systems come in two flavors: uniform memory access (UMA) and nonuniform
memory access (NUMA). NUMA-
Non-volatile memory
Traditional main store is fabricated with semiconductor DRAM which has an access
time of the order of 50 ns. Unfortunately, DRAM is volatile because the data is lost
when you power it down. You have to save data to disk on power down and load it on
power up (the booting process). This is time consuming. Non-
NOP
A no operation instruction is an instruction that does not affect the state of the machine other than to advance the program counter; that is, it has no effect. Some of the purposes of NOP instructions are to support synchronization, avoid the effects of hazards in pipelines, and even to provide known delays in the code.
Open-
This term describes a type of transistor gate used in output circuits. Its characteristic
is that an open-
Out-of-order execution
Superscalar processors execute multiple instructions at the same time. In out-
Parallel
The term parallel indicates that some actions or operations are carried out at the same time. For example, parallel processing means that a program is divided into parts and the various parts can be distributed across several processors to speed up computation. It is also used in data transmission to indicate that a group of bits are transmitted simultaneously over several data paths (in contrast with serial transmission where bits are transmitted one after another along a single data path).
Pipelined
A sequence of operations can be performed one-
Performance
No one knows what this word means. But that doesn’t stop advertising executives using it. However, Chapter does attempt to describe how people have gone about measuring the speed of a computer. Performance is generally regarded as how fast a computer is; that is, how long it takes to execute a task.
Pointer
A pointer is a variable whose value is an address; it is one of the most powerful
features of an instruction set architecture. Pointers are normally held in registers,
which are variously called pointer registers, address registers, and index registers.
Because a pointer is a variable, the element pointed at may be changed by modifying
the pointer. Consequently, pointers can be used to access arrays, tables, lists and
vectors. RISC processors like the ARM processor provide only pointer-
Predicated
The expression predicated execution indicates a system where an instruction may or may not be executed when it is fetched from memory. In general, all instructions are executed when their turn comes. However, in predicated execution, a predicate is tested. If the predicate is true, the instruction is executed. If it is not true, the instruction is ignored or squashed or annulled. The ARM implements predicated execution by making execution dependent on the state of the condition code bits. The Itanium implements predicated computing by associating one of 32 predicate registers with an instruction; if the corresponding predicate bit is true the instruction is executed. Predicated computing is also called gotoless computing because it can avoid the need for branch instructions.
Prediction
Prediction in computing means exactly the same as it does in real life.
Program counter
Traditionally, the program counter (PC) is defined as the register that contains the address of the next instruction to be executed. After use, the program counter is incremented to point to the next instruction. A change of the flow of control is implemented by changing the value in the program counter to point to a different point in the program. Most computers do not allow the programmer to directly access the program counter – the ARM processor is a notable exception. In practice, this simple definition of a program counter is less than accurate today, because pipelining (instruction overlap) means that the program counter is not pointing at the next instruction (which is almost meaningless when several instructions are in various stages of execution).
Processor status
The processor status defines “what a computer is up to”. It consists of the program counter (the location of the next instruction), the status word (the outcome of the previous instruction) and the contents of its registers. The importance of a processor’s status is simple. If you capture or save the current status, you can use the processor for a different task. If you then restore the processor status, it can continue as if nothing had happened.
RAID
A Redundant Array of Independent Disks defines a collection of two or more disk drives that appears to the computer as a single drive. The purpose of a RAID system is to provide a higher level of reliability by distributing data across disks or to provide more storage by combining physical disks into a larger logical disk. RAID software and hardware is now built into all high performance PC motherboards. The user can choose the specific RAID configuration (storage capacity, performance, reliability) to suit his or her own needs.
Register
A register is a storage element that holds a single word (i.e., string of bits).
Registers are used to store temporary data in a computer and have “names” such as
PC (program counter) or MAR (memory address register). Some registers are invisible
to the programmer and cannot be directly accessed by computer instructions – these
registers are used internally to control the operation of the computer. Some registers
are user-
Register renaming
A programmer may use a register as temporary storage and then later use the same register to store something else. In superscalar processing this presents a problem because the second use of the register has to wait until the first use has been completed. In register renaming, the second use of the register is given a different name and the two operations can now take place in parallel without a name conflict.
Register windowing
This is a mechanism used to increase the apparent number of registers in a processor’s ISA. See windows.
RISC
A convenient term (reduced instruction set computer) for a generic class of ISAs.
In particular, RISC machines are register-
RTL notation
An algebraic notation used to define operations taking place at the level of registers in a computer (i.e., the transfer of data between functional units). A back arrow is used to indicate the transfer of data and square brackets indicate the contents of a storage element; for example,
[PC] ← 1000      Load the PC register with the number 1000.
[400] ← [MBR]    Copy the contents of the MBR register to memory location 400.
SSD
SSD or solid state drive is an innovation dating back to about 2010. Instead of storing
secondary data by magnetizing the surface of a rotating platter, the SSD
uses flash semiconductor technology to store information without moving parts. SSD
is fast, compact, low-
Semantics
Semantics is the study of meaning and, in the computer world, semantics indicates
what a program means or does. The semantics of a program describes its outcome. A
semantic error indicates an error that prevents a program from doing what it is intended to do;
for example, if you write a program to add up the first 100 numbers and it adds up
the first 101 numbers, that is a semantic error. In other words, it’s an error of
logic rather than an error of grammar (i.e., an error caused by incorrectly writing
down an operation due to a spelling mistake or punctuation error). The notion of
a semantic error is important in high-
Semiconductor
There have been three electronic revolutions. The first was the invention of electronics itself and of passive devices (resistors, capacitors, inductors, relays), which gave us control systems and other primitive electrical systems. The second revolution was the thermionic vacuum tube, which gave us electronics, amplification, radio, television and electronic computers. The third revolution gave us the semiconductor in the 1950s, which performed the same operation as the vacuum tube but in a fraction of the space. This gave us modern electronics and the microprocessor. Semiconductor technology relies on two aspects of matter. A semiconductor can be used to control the flow of electrons by using tiny traces of impurity atoms in the bulk semiconductor (usually silicon). Just as importantly, it is possible to manufacture complicated semiconductor devices containing millions of transistors using photographic techniques.
Serial
The term serial indicates a sequence of actions that take place one after another, in contrast with a parallel process where multiple actions take place at the same time. Ultimately, the performance of computers is strongly dependent on the fraction of a job or task that cannot be executed in parallel and must be executed serially.
Setup time
When a signal (e.g., the data from a peripheral or data from memory) is sampled or
captured, it is normally latched into a flip-
Short vector
A vector is sometimes defined as a quantity with a magnitude and a direction. An n-
SIMD
Single instruction multiple data, SIMD, is a term employed by Flynn to categorize
one class of parallelism in which one instruction acts on multiple data elements.
All mainstream high-
Sign extension
Most processors represent negative integers by their two's complement. One of the
properties of a two's complement number is that a number in m bits can be represented
in m+1 bits by replicating the most significant bit. For example, if the two 4-
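A sketch of the idea in Python (the helper below is illustrative, not from this text):

def sign_extend(value, from_bits, to_bits):
    # Replicate the most significant (sign) bit into the new high-order bit positions.
    sign = (value >> (from_bits - 1)) & 1
    if sign:
        value |= ((1 << (to_bits - from_bits)) - 1) << from_bits
    return value

print(format(sign_extend(0b1010, 4, 8), '08b'))   # 11111010: -6 in 4 bits is still -6 in 8 bits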
SMP
One form of multiprocessor architecture is called symmetric multiprocessing, SMP,
because the processors and memory are symmetric with respect to each other; for example,
three identical processors connected to a common bus and common memory. The processors
are controlled by a single operating system. The majority of small multiprocessor
systems fit into this category. SMP systems are not very scalable (i.e., you can’t
simply add more and more processors) and NUMA (non-
Split cache
Cache memory is fast memory holding frequently accessed information. A single cache holds both instructions and data. A split cache uses two independent caches; one for data and one for instructions. Consequently, both data and instructions can be accessed in parallel. Moreover, data and instruction caches have different characteristics and each cache can be optimized for its intended role.
Static
The adjective static means fixed or unchanging. When used to describe
a class of memory, static RAM, it refers to a memory where data remains stored until
it is either overwritten or the memory powered down (if it is volatile). In computer
operations like shifting, static means that the number of shifts is fixed when the
program is written and cannot be changed during program execution. Static branch
prediction mechanisms predict the outcome of a branch before the program runs (e.g.,
based on the op-
Source
When a computer operation takes place, the computer operates on data to create a result. The data used by the computer is called the source operand. Some instructions such as add or multiply require two source operands.
Stack
The stack is a fundamental type of data structure which is also called a last-in, first-out (LIFO) queue.
Stack pointer
The stack pointer, usually in a register, contains the address of the top of the stack. The stack pointer is instrumental in implementing subroutines, procedures and functions. CISC processors generally have a dedicated stack pointer. RISC processors allow the user to select which register will be the stack pointer – but conventions exist to make it easy for all readers to follow code; for example, ARM programmers use r13 as the stack pointer.
Status register
A processor’s status register contains information that defines its current operating mode. This information generally consists of two parts: condition code and operating status. The condition codes reflect the Z (zero), V (overflow), C (carry), and N (negative) bits following the last time they were updated. Condition code bits are used by conditional branch instructions; for example, BEQ (branch on zero). The system status bits define the overall or global operating mode of the processor. Typically, status bits control the way in which a processor responds to interrupts and exceptions.
Superscalar
A superscalar processor uses multiple execution units to accelerate the performance of a processor. Both RISC and CISC processors may use superscalar techniques; that is, superscalar technology is not associated with any particular ISA. Superscalar technology is responsible for reducing the average number of clock cycles per instruction below the minimum of 1.0 that can be achieved by pipelining alone. The key to superscalar performance is the ability to take a group of instructions and to assign them to multiple execution units while handling the conflicts for resources (functional resources such as adders as well as registers).
Synchronous
Most digital systems are synchronous in the sense that all operations take place
at a clock pulse (at the rising or falling edge). The circuit designer has to ensure
that all the signals generated by one clock pulse are stable (ready) before the next
clock pulse. How fast you can clock a computer is related to how rapidly internal
operation can be completed. For example, if the worst-
Track
In recording media, a track is a data structure in which information is stored. On
a magnetic disk, a track is a concentric ring which is, itself, subdivided into units
called sectors. On an optical disk (CD, DVD, Blu-
Transmission line
A bus in a backplane behaves like a transmission line (a term from electronics). A transmission line has certain characteristics determined by its physical dimensions and the material between the two signal paths. These characteristics are the speed of propagation along the bus and what happens when a signal (pulse) reaches the end of a bus. A poorly designed transmission line may have intermittent behavior if reflected signals travel along the bus from end to end.
Tristate gate
A tristate or three-state gate has three output states: active-high, active-low, and a high-impedance state in which the output is effectively disconnected from the circuit.
Two’s complement
Because numbers in a computer are stored as strings of bits, a negative number cannot
be represented by putting a minus sign in front of it. The most popular way of representing
negative numbers is by means of their two’s complement. In m bits the two’s complement
of N is defined as 2^m – N, which is generated by inverting all the m bits of N and
adding 1. Two’s complement is used because the subtraction of X – Y is performed
by adding the two’s complement of Y to X (you don’t need a subtractor to perform
subtraction because you add the easily formed complement). Two’s complement is not
used to represent negative numbers in floating-point formats.
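A small sketch of the definition and of subtraction by adding the complement (names and figures are illustrative, not from this text):

def twos_complement(n, m):
    # Two's complement of N in m bits: 2**m - N (equivalently, invert the bits and add 1).
    return (2 ** m - n) % (2 ** m)

print(format(twos_complement(6, 8), '08b'))       # 11111010, the 8-bit representation of -6

# Subtraction by addition of the complement: X - Y = X + twos_complement(Y) modulo 2**m
x, y, m = 9, 6, 8
print((x + twos_complement(y, m)) % 2 ** m)       # 3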
Virtual Memory
This is the memory that can be addressed by the computer. It is abstract in the sense that it takes no account of where the actual data is. The advantage of virtual memory is that it frees the programmer (operating system) from worrying about where to put data or how to assign memory locations to data. Moreover, it allows the DRAM main store and disk memory to appear to the computer as one seamless storage unit. A memory management unit is needed to translate virtual addresses into physical addresses that correspond to actual data locations. The memory management unit works with the operating system to move data from disk to DRAM whenever a virtual address corresponds to a location on disk.
Volatile memory
Memory is said to be volatile if its data is lost when it is powered down. All semiconductor
DRAM is volatile (which is why you have to wait so long when powering down and booting
up). Magnetic memory (hard drives), optical storage, and flash memory are all non-volatile.
VLIW
The very long instruction word processor, VLIW, uses an instruction that is far longer than that used by most computers because the instruction specifies multiple actions in one instruction or bundle. VLIW processors require multiple execution units in order to execute these multiple operations in parallel. A VLIW processor is superscalar in the sense that several operations are executed at once. However, the implication of superscalar is that the processor reads several instructions from the instruction stream and then decides which can be executed in parallel. The VLIW approach requires a compiler or programmer to select the instructions that can be executed in parallel when the code is written or compiled. You could also say that the notion of superscalar does not affect a processor’s ISA (any given ISA may be superscalar or not superscalar). However, the VLIW approach very much defines a processor’s ISA.
von Neumann
The term provides us with a double fiction! Historically, von Neumann machine is
the term used to describe the stored program computer in which data and programs
are stored in the same memory. A von Neumann machine is characterized by having
a two-
Weighting
In the positional notation, the value of each digit depends on its weighting. Consider the decimal number 276. The weighting of the 6 is one; the weighting of the 7 is ten, and the weighting of the 2 is one hundred. The weightings used in this text are 10 (decimal), 2 (binary), and 16 (hexadecimal). These weightings are sometimes referred to as natural weightings because they define the positional notation system employed in conventional arithmetic. However, you can define other weightings. For example, if I defined a weighting of 6,3,5, the number 123 would be equal to 1 x 6 + 2 x 3 + 3 x 5 = 27.
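The same calculation can be written out directly (a sketch; the function name is made up):

def weighted_value(digits, weights):
    # Value of a digit string under an arbitrary positional weighting.
    return sum(d * w for d, w in zip(digits, weights))

print(weighted_value([2, 7, 6], [100, 10, 1]))   # 276 under the natural decimal weighting
print(weighted_value([1, 2, 3], [6, 3, 5]))      # 27 under the 6, 3, 5 weighting above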
Window
In the context of RISC architectures, a window is a term applied to a set of registers associated with a specific instance of a procedure (or subroutine). The number of registers in both RISC and CISC processors is limited by the number of bits in an instruction word that can be assigned to register selection; for example, MIPS uses 5 bits to select one of 32 registers and ARM uses 4 bits to select one of 16 registers. Register windows allow the use of more registers by associating a set of registers with each new procedure call. The available registers are typically divided into four groups: global (can be accessed from all procedures), local (can be accessed only from the current procedure), and input and output registers (shared with a parent or a child procedure, respectively). This mechanism increases the number of apparent registers and provides local workspace. Unfortunately, when all available registers (windows) are in use, calling a new procedure requires the dumping of an old window to memory, and this takes time – so much time that windowing has fallen out of favor. It is implemented by SPARC processors.
Write back
When data is written to a cache memory, it must also be written to main memory since the copy in cache is a temporary copy. In order to improve system performance some cache systems do not update main memory when cache is written to; they only update memory when a cache line is ejected from the cache. This is called write back.
Write through
When data is written to a cache memory, it must also be written to main memory since the copy in cache is a temporary copy. If data is written to both cache and to main memory in parallel, the process is called write through. It is slower than write back because it ties up memory for the duration of the write cycle. However, it is more reliable, because with write back any data not yet copied to main memory would be lost if the system crashed.
Word
The fundamental unit of data processed by a computer is the word. As microprocessors have evolved, the typical word size has increased steadily. Computers have been designed with word sizes of 4, 6, 8, 16, 32, 64, and 128 bits (this list is not complete). Today, most processors have word sizes of 2, 4, or 8 bytes corresponding to 16, 32, or 64 bits. Computers are usually byte addressed; that is, each memory address differs from the previous one by 1, and an address corresponds to a specific byte. Because words are a multiple of bytes in length (e.g., 4 bytes), consecutive word addresses differ by 4 and word addresses are 0, 4, 8, … This means that you have to remember to increment pointers by 4, not 1.
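A small illustration of byte addressing with 32-bit words (the figures here are illustrative only):

# With byte addressing and 4-byte words, consecutive word addresses differ by 4.
word_size = 4
word_addresses = [i * word_size for i in range(5)]
print(word_addresses)        # [0, 4, 8, 12, 16]
# So a pointer stepping through an array of words is incremented by 4, not 1.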
Zoned storage
When data is stored on hard drives that rotate at a constant angular velocity (i.e., a constant rpm), the number of bits that can be stored along an outer track is greater than the number that can be stored around an inner track. The physical size of a bit (the region of magnetization holding a bit) should be as small as possible to ensure best use of the recording area. In order to achieve this, the number of bits on inner tracks is lower than the number of bits on outer tracks. In some disks, tracks are grouped into zones and the number of bits per track is constant within a zone but varies from zone to zone. This technique improves storage efficiency.