ARMs for the Poor: Selecting a Processor for Teaching Computer Architecture
Abstract – Teachers of computer architecture and organization courses have to choose a target processor to illustrate the basic principles of instruction set design. In this paper we suggest that it is time to choose the ARM processor architecture, which is markedly different from those used in most current courses.
A specific computer architecture is required as a vehicle to teach about registers, addressing modes, instruction types, and so on. Resorting to a hypothetical teaching machine reduces the student’s learning burden and makes their learning curve shallow, but failing to introduce them to the complexities they will encounter in the real world can destroy their motivation.
Teachers are concerned not only with covering a body of knowledge; they must motivate
students and create a sense of excitement. In a discipline as rapidly changing as
computer science, only those students who can adapt to change are likely to thrive
over the four or more decades of their career. This paper explains why the ARM architecture
is an excellent vehicle for teaching computer architecture; in particular, its predicated
execution, inclusion of shifting in all data-
Computer Architecture Curriculum
Computer architecture is a key component of degree courses in computer science; in particular, the joint ACM IEEE Computer Society Computing Curriculum spells out what should be included in the core curriculum for computer architecture [1] [2]. There is a widespread consensus on the content of computer architecture courses, although, in the UK, there is a growing tendency to combine architecture with computer networks or operating systems because of the way in which curricula overlap.
Table 1 lists the key components of the computer architecture curriculum proposed by the recently revised CC2001 report. Most of the topics refer to elements of the computer system other than the CPU itself. Table 2 expands the curriculum and includes the learning objectives for the CPU [1]. Note that no specific computer is specified and the individual teacher is free to choose a suitable example.
Table 2 demonstrates that the intent of the curriculum is to cover the underlying
principles of the operation of the computer and not the details of either its low-
TABLE 1 PROPOSED ARCHITECTURE IN THE REVISED CURRICULUM 2001
TABLE 2 PROPOSED ARCHITECTURE IN THE REVISED CURRICULUM 2001
Computer architecture [core]
Learning objectives:
The Professor’s Dilemma
In practice, computer architecture is taught by real professors to real students,
and that rather complicates matters. A glance at typical computer architecture texts
[4]-
Why do professors make life difficult for themselves by using CPUs that were made by engineers wanting to maximize market penetration and company profits? Why don’t they specify a simple, hypothetical teaching machine and make it easier to teach the subject? Some professors do invent their own machines; I do myself for the first two weeks of the course. Most do not. Professors I have spoken to say that students do not want to use hypothetical hardware because they feel it is unrealistic and does not give a true picture of the real world they will soon be entering. Moreover, I find that students prefer to use hardware like the PC because they feel familiar with it. When I based my courses on the 68K microprocessor, it was well received by students in the days of the Apple Mac that incorporated it. When the 68K was dropped by Apple, students became less enthusiastic.
The Professor ARMed
The most visible role of a professor is to teach a student a given body of knowledge and to examine the quantity and quality of their knowledge. The real job of the professor is to instill in the student a love of the subject [6]. Without that, it’s difficult to transform the student into an autonomous learner who will work independently and continue to build on the course after its end.
I decided to change the processor I use to teach computer architecture from the Motorola 68K to the ARM family. The principal reasons for making the change are that the ARM covers the requirements of existing curricula, is easy to learn, and has an elegant and sophisticated architecture. Moreover, it is widely found in real systems.
ARM – the Background
Microprocessors used as vehicles to teach computer architecture are often mainstream
industry-
Advanced RISC Machines Ltd. was founded in the UK in 1990 and changed its name to
ARM Ltd. Unusually, ARM does not manufacture microprocessors. It is an IP (intellectual
property) company that designs systems and licenses other companies to make them;
for example, ARM microprocessors are manufactured by Intel, Texas Instruments, and
Samsung. Indeed, ARM 32-
ARM in a Nutshell
As there is insufficient space here to discuss all the ARM’s attributes, we list its key features and highlight how they differ from other processors, pointing out the pedagogical advantages.
Register set: The ARM has 16 general-
Instruction Set: The ARM has a RISC load/store (register-
Instruction types: The ARM has a conventional integer data-
Subroutine call: Two subroutine call mechanisms are widely in use. CISC processors
are stack-
Shadow registers: Shadowing, where two physical memory locations share the same
logical name, is an important concept. For example, the 68K has two physical stack
pointers with the same name. One stack pointer is visible to the user programmer
and one to the operating system. By using different physical pointers, an application
program can’t corrupt the operating system stack. Shadowed registers allow the professor
to mine a rich vein of system security and reliability. The ARM has several shadowed
registers and the physical instance is determined by the interrupt and exception
handling mechanism. When the ARM is interrupted, a new bank of shadowed registers
is switched in. This allows an interrupt handler to access a clean set of registers
and avoid saving pre-
Literals: All computers provide a means of loading a literal (immediate value).
The ARM deals with literals in a unique way by providing a 12-
Shift instructions: The ARM implements a zero-
Highlights of the ARM Instruction Set
Although we can’t cover all the ARM’s architectural features, there are three that are particularly important from a teaching point of view because they illustrate interesting and innovative features – some of which are excellent vehicles for engaging in class discussions with the students. These are: the shift, predicated execution, and addressing modes.
A shift operation moves a string of bits by one or more positions left or right. The difference between shift operations depends on:
The ARM implements shift instructions but in an entirely unusual way. The computer
architect is engaged in an eternal struggle to minimize the time taken to perform
operations. A designer’s ultimate goal is the zero-
ASL r0,#4 ;shift contents of register r0 left 4 places
ADD r1,r1,r0 ;Add the contents of r0 to r1
The time taken to execute this code is two cycles. The ARM implements shifts ingeniously
by shifting the second operand during a data-
ADD r1,r1,r0, ASL #3 ;shift r0 left before adding
and implements [r1] = [r1] + 8 * [r0]. The shift and addition are performed in a single cycle. To perform a shift without data processing, a shift can be placed in the data path of a move operation; that is,
MOV r1,r1,ASL #3 ;shift r1 left before moving to r1.
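The effect of folding the shift into the second operand can be modelled in a few lines of C. This is a minimal sketch rather than a description of the hardware: the function names add_shifted and mov_shifted are hypothetical, registers are modelled as 32-bit unsigned values, and the shift count is assumed to lie between 0 and 31.

```c
#include <stdint.h>

/* Model of  ADD r1,r1,r0, ASL #n : the second operand is shifted
   left on its way to the adder, so r1 = r1 + (r0 << n).
   Assumption: n is in the range 0..31. */
uint32_t add_shifted(uint32_t r1, uint32_t r0, unsigned n) {
    return r1 + (r0 << n);      /* arithmetic wraps modulo 2^32 */
}

/* Model of  MOV r1,r1, ASL #n : a shift with no other processing. */
uint32_t mov_shifted(uint32_t r1, unsigned n) {
    return r1 << n;
}
```

For example, add_shifted(100, 5, 3) models [r1] = 100 + 8 * 5 and returns 140, while mov_shifted(3, 3) returns 24.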
There is quite a pedagogical significance in this operation. A little thought and
ingenuity in the design of the ARM’s architecture has significantly increased performance
without incurring a lot of additional logic. This demonstrates that tried-
The Delights of Predication
When teaching computer architecture it is important to let students know exactly
why you are using a particular architecture out of the many available. The aspect
of the ARM that I find most appealing is its predicated execution ability where an
instruction is executed if, and only if, certain conditions are met. Typical architectures
used in teaching lack predicated execution and each op-
A suffix can be applied to an ARM op-
if ((x == 0) && (y < 5)) p = p + 1;
A conventional assembly language program uses two conditional tests and branches, as in the following (illustrative) code:
CMP r1,#0 ;test x in r1
BNE exit ;if not zero then exit
CMP r2,#5 ;compare y in r2 with 5
BGE exit ;if greater than 4 then exit
ADD r3,r3,#1 ;increment p in r3
exit ;exit point
Now consider the use of predicated code.
CMP r1,#0 ;test x in r1
CMPEQ r2,#5 ;if zero then test y < 5
ADDLT r3,r3,#1 ;if y<5 then p=p+1
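In this case only three instructions are necessary and the execution time is three clock cycles, whereas the previous version requires up to five cycles. Note that the sequence relies on x being non-negative: when x is negative the CMPEQ is skipped, the flags left by CMP r1,#0 still satisfy LT, and the ADDLT executes.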
From a teaching point of view, the ARM’s predicated execution has the following advantages:
It demonstrates an alternative way of implementing a conditional operation.
It provides an introduction to branchless computing, which reduces the severe cost of branches in heavily pipelined computers.
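Students can trace the three-instruction predicated sequence with a small C model of the condition flags. The sketch below is illustrative only: the Flags type and the functions cmp, cond_eq, cond_lt, predicated, and reference are hypothetical names, and only the two conditions used in the example (EQ and LT) are modelled.

```c
#include <stdint.h>

/* Hypothetical model of the N, Z, C, V condition flags. */
typedef struct { int n, z, c, v; } Flags;

/* CMP a,b : set the flags from a - b and discard the result. */
Flags cmp(int32_t a, int32_t b) {
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t r = ua - ub;
    Flags f;
    f.n = (int)(r >> 31);                        /* sign of result   */
    f.z = (r == 0);                              /* result is zero   */
    f.c = (ua >= ub);                            /* no borrow        */
    f.v = (int)(((ua ^ ub) & (ua ^ r)) >> 31);   /* signed overflow  */
    return f;
}

int cond_eq(Flags f) { return f.z; }             /* EQ: Z set        */
int cond_lt(Flags f) { return f.n != f.v; }      /* LT: N differs V  */

/* CMP r1,#0 / CMPEQ r2,#5 / ADDLT r3,r3,#1 */
int32_t predicated(int32_t x, int32_t y, int32_t p) {
    Flags f = cmp(x, 0);
    if (cond_eq(f)) f = cmp(y, 5);   /* CMPEQ executes only if Z set */
    if (cond_lt(f)) p = p + 1;       /* ADDLT executes only if LT    */
    return p;
}

/* The high-level statement the sequence is intended to implement. */
int32_t reference(int32_t x, int32_t y, int32_t p) {
    if (x == 0 && y < 5) p = p + 1;
    return p;
}
```

For non-negative x the two functions agree: with p = 10, both return 11 for (x, y) = (0, 3) and 10 for (0, 7) and (5, 3). The model also makes a useful discussion point visible: for x = -1 the predicated sequence still increments p, because LT is then evaluated on the flags left by the first comparison.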
Creative Programming
The position of assembly language within the curriculum is sometimes a contentious issue between academics. Some argue that prosaic programming techniques are the order of the day to ensure readable, maintainable, and reliable code. Others like to exploit a machine’s architecture to the full to extract the highest performance.
My own approach is to discuss the pros and cons of the situation, including ethical considerations; for example, I ask students whether they would use a short cut to speed up a PC (they all say ‘Yes’). Then I ask them, ‘Would you use the same techniques when designing a nuclear reactor control system?’ I came across a rather unusual application of an ARM instruction that demonstrates both the power of the ARM instruction set and the ability to write code whose immediate meaning is rather less than clear. Consider the single operation BIC r0,r0,r0, ASR #31.
The BIC forms the logical AND between the first operand and the logical complement
of the second operand. Both operands are specified as register r0. The ASR #31 shifts
the second operand right 31 places using an arithmetic shift that propagates the
sign-
Since BIC complements the second operand, if r0 initially contains a positive number, the AND is carried out between the value in r0 and the complement of 000…00 (that is, 111…11), which leaves r0 unchanged. If r0 contains a negative number, the shifted operand is 111…11, so the AND is between r0 and 000…00, which gives zero. That is, this operation implements: if (x < 0) x = 0;
This example demonstrates an insight into the nature of binary strings and binary arithmetic and demonstrates the power of assembly language.
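The same bit manipulation can be written in C so that students can check it against the obvious conditional. This is a sketch under stated assumptions: asr31 and clamp_negative_to_zero are hypothetical names, and the arithmetic shift is built from unsigned operations because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* Arithmetic shift right by 31 places: every bit becomes a copy of
   the sign bit, giving 0xFFFFFFFF for a negative value and 0 otherwise. */
uint32_t asr31(uint32_t r) {
    return 0u - (r >> 31);
}

/* Model of  BIC r0,r0,r0, ASR #31 : r0 = r0 AND NOT (r0 ASR 31). */
int32_t clamp_negative_to_zero(int32_t x) {
    uint32_t r0 = (uint32_t)x;
    r0 = r0 & ~asr31(r0);        /* mask is all ones if x >= 0, else 0 */
    return (int32_t)r0;          /* result is either x (>= 0) or 0     */
}
```

Here clamp_negative_to_zero(5) returns 5 and clamp_negative_to_zero(-7) returns 0, matching if (x < 0) x = 0;.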
Addressing Modes
I have observed that addressing modes and pointers give my students more difficulty than any other concept. Some students have great difficulty in distinguishing between a pointer and the value pointed at. A good machine architecture should help students overcome their conceptual difficulties with pointers and yet allow the introduction of more advanced topics.
The ARM has a simple pointer-
Figure 1 The ARM’s Indexed addressing modes
Traditional RISC processors, like the MIPS or PowerPC, do not provide an automatic stack mechanism. The ARM is closer to a CISC processor and provides a stack mechanism. Indeed, you can structure the ARM’s stack to grow upward or downward, and the stack pointer can point either to the item at the top of the stack or to the next free location beyond it. Figure 2 illustrates typical ARM stack accesses.
Figure 2 The ARM and the Stack frame
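Two of the possible stack arrangements can be contrasted in a short C model. The sketch is illustrative and makes several assumptions: the Stack type, the STACK_WORDS size, and the fd_ and ea_ function names are hypothetical, the stack pointer is modelled as a word index rather than a byte address, and no overflow or underflow checking is performed.

```c
#include <stdint.h>

#define STACK_WORDS 16           /* hypothetical size for illustration */

typedef struct {
    uint32_t mem[STACK_WORDS];
    int sp;                      /* word index standing in for the SP */
} Stack;

/* Full descending: sp points at the item on top; stack grows downward. */
void fd_init(Stack *s) { s->sp = STACK_WORDS; }
void fd_push(Stack *s, uint32_t v) {
    s->sp -= 1;                  /* pre-decrement, then store */
    s->mem[s->sp] = v;
}
uint32_t fd_pop(Stack *s) {
    uint32_t v = s->mem[s->sp];  /* load, then post-increment */
    s->sp += 1;
    return v;
}

/* Empty ascending: sp points at the next free word; stack grows upward. */
void ea_init(Stack *s) { s->sp = 0; }
void ea_push(Stack *s, uint32_t v) {
    s->mem[s->sp] = v;           /* store, then post-increment */
    s->sp += 1;
}
uint32_t ea_pop(Stack *s) {
    s->sp -= 1;                  /* pre-decrement, then load */
    return s->mem[s->sp];
}
```

Pushing 1, 2, and 3 and then popping three times yields 3, 2, 1 in both arrangements; only the direction of growth and the meaning of the pointer differ, which is exactly the distinction students need to grasp.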
ARM Extensions
One of the most unusual aspects of the ARM family is its extensions or personalizations that add considerable value to the ARM as a teaching machine. The Pentium processor is a Pentium processor. Always. The ARM can behave exactly like an ARM, but it can also mimic a very different processor; that is, the ARM’s instruction set can be changed at run time to provide a very different model of computation.
In order to understand the ARM’s versatility, you have to appreciate the market.
The ARM is intended primarily for end-
The ARM is designed so that it can dynamically adopt a 16-
The ingenuity of the ARM mode is that it is possible to map the Thumb mode onto the
ARM’s normal instruction set architecture. That is, 16-
Figure 3 illustrates the Thumb mode architecture. Of the ARM’s 16 general purpose
registers, only eight, r0 to r7, are directly visible to the programmer. Because
the Thumb architecture uses a two address format, with instructions of the form ADD
r0,r2 which implements [r0] = [r0] + [r2], only 6 bits/instruction are required to
specify register addresses as opposed to the ARM’s 12 (i.e., three 16-
Figure 3 The Thumb-
The Thumb mode instruction set is a simplified version of the underlying ARM instruction
set. Some of the ARM’s features, such as its predicated execution mode, have been
abandoned. Code compression yields a 30% improvement in code density over native
ARM code [4]. The ARM cannot handle exceptions in the Thumb mode. Thumb and ARM code
cannot be mingled. Special instructions are used to switch between the two modes.
In practice, a real ARM system will locate some code (including exception handling)
in a 3-
ARM has implemented a second version of Thumb that improves Thumb’s performance by
permitting some 32-
Classwork
It’s virtually impossible to teach the architecture of a machine effectively without a working example that students can play with – preferably at home on their PCs. Fortunately several ARM simulators are available to the student, either in a textbook or via the web [3], [7]. Figure 4 demonstrates the output of an ARM simulator where a student can execute a program line by line, observe the way in which registers change, view both memory and the stack and implement a simple input/output port.
As well as helping students understand the processor, class-based simulators have an additional benefit: they encourage students to work in groups and learn from each other. More importantly, simulators free the teacher to help those who are having particular difficulties.
Figure 4 Using the ARM Simulator
The following code, a simple loop, demonstrates that ARM assembly language is easy to follow (the MLA instruction is the multiply and accumulate instruction).
MOV r0,#0 ;clear total in r0
MOV r1,#1 ;FOR i = 1 to 10
Next MUL r2,r1,r1 ; square number
MLA r0,r2,r1,r0 ; cube and add to total
ADD r1,r1,#1 ; increment number
CMP r1,#11 ; test for end
BNE Next ;END FOR
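A C rendering of the loop lets students confirm what the assembly code computes. This is a sketch, assuming the counter starts at 1 as the FOR comment indicates; the function name sum_of_cubes and its parameter last are hypothetical, and the comments map each statement to the corresponding instruction.

```c
#include <stdint.h>

/* Sum of i^3 for i = 1..last, mirroring the assembly loop.
   Assumption: last >= 1, since the loop body runs before the test. */
uint32_t sum_of_cubes(uint32_t last) {
    uint32_t total = 0;                  /* MOV r0,#0                  */
    uint32_t i = 1;                      /* MOV r1,#1 (assumed start)  */
    do {
        uint32_t square = i * i;         /* MUL r2,r1,r1               */
        total = square * i + total;      /* MLA r0,r2,r1,r0            */
        i = i + 1;                       /* ADD r1,r1,#1               */
    } while (i != last + 1);             /* CMP r1,#11 / BNE Next      */
    return total;
}
```

Calling sum_of_cubes(10) returns 3025, the sum of the first ten cubes.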
Summary
It is impossible to teach a course on computer architecture without introducing a
computer, which means inventing one or using one off the shelf. This paper has suggested
that the ARM is an ideal choice for both the professor and the student because it
is easy to understand initially, teaching material is widely available, and high-
The purpose of this paper has not been to suggest that the ARM is yet another processor indistinguishable from all the rest. I have presented the case that the ARM has a wealth of interesting, indeed exciting, features such as predicated computing and compressed instruction encoding facilities that can inspire the student. Moreover, some of the concepts that the ARM introduces cut across the computer science curriculum, increasing its value as an educational tool.
References
[1] ACM, IEEE Computer Society, "Computing Curriculum 2001".
[2] Nelson, V,P, Theys, M,D, Clements, A, “Computer Architecture and Organization in the Model Computer Engineering Curriculum”, Frontiers in Education, Boulder 2003.
[3] Clements, A, “The undergraduate curriculum in computer architecture”, IEEE Micro,
Volume 20, No. 3, pp13-
[4] Patterson, D,A, Hennessy, J,L, “Computer Organization and Design”, Morgan Kaufmann 4th edition 2009
[5] Clements, A, “Principles of Computer Hardware”, Oxford University Press 3rd edition, 2004.
[6] Brewer, E,W, “Professor’s Role in Motivating Students to Attend Class”, Journal of Industrial Teacher Education, Vol. 42, No. 3, 2005.
[7] Hohl, W, “ARM Assembly Language – Fundamentals and Techniques”, CRC Press, 2009
[8] http://en.wikipedia.org/wiki/ARM_Holdings
[9] Goudge, L, Segars, S, “Thumb: Reducing the cost of 32-
This is the text of a paper I wrote for a Frontiers in Engineering Education Conference.