Introduction to Computer Systems and Software

Brian Bramer, DeMontfort University, UK (bb@dmu.ac.uk)

 

Contents


1 Introduction to Computer Systems

1.1 Users of computer systems
1.2 A microcomputer system
1.3 Instruction and data storage
1.4 Low and high-level languages
1.5 Review of information representation


2 Computer Hardware

 
2.1 Memory
2.2 The CPU (Central Processing Unit) and co-processors
2.3 Input and Output
2.4 Development system configurations


3 Computer Software
 

3.1 Monitors and operating systems
3.2 Software development facilities
3.3 Cross assemblers and compilers
3.4 Absolute and relocatable code
3.5 Run-time system facilities


Answers to Exercises

References
 
 
 

1 Introduction to Computer Systems

Until the early 1980s computer systems were large and expensive, and required many expert staff to maintain, operate and program them. The users of such systems were generally restricted to staff within large organisations that could afford the high purchase and operating costs. The advent of high-powered microcomputers has changed this, providing systems which are not only sufficiently powerful to be used throughout commerce and industry but cheap enough to be used in many everyday appliances, e.g. cars, video recorders, etc.
 

1.1 Users of computer systems

1.1.1 An end-user's viewpoint of a computer system

The vast majority of users of computer systems are end-users, in that they use computer systems as a tool to aid their everyday work. The computer system can be considered as a 'black box' into which information is fed for processing and then results are produced. The information input can range from text (a wordprocessor) or numerical data (a spreadsheet), entered via a keyboard, to diagrams (a CAD design system) entered via a mouse or a digitising tablet. Similarly, information output may be text, numbers or diagrams and pictures presented on a display screen or printer. To communicate with the system, end-users employ terms which are common to their everyday working environment, e.g. accountants use columns of numbers and artists use pictures.

In general, an end-user would purchase a complete computer system consisting of hardware (the physical components) and software (the programs that tell the hardware what to do) to suit their application environment. They do not need to have knowledge of how a computer system works or skills in the design, implementation and testing of programs. In fact, attempts by end-users to learn these skills would divert them from gaining more important knowledge, e.g.:
 

  1. The applications of computer systems to their specialty, with consideration of limitations due to accuracy, the size of problem which can be solved, etc.
  2. Purchase and installation of a computer system (Bramer 1989):
    1. carrying out a feasibility study to determine the advantages and disadvantages of installing a computer system
    2. drawing up a specification of a computer system to suit their requirements
    3. drawing up, from the specification, an ITT (Invitation to Tender) which will be sent to prospective vendors
    4. processing tenders and shortlisting vendors' systems using various criteria, e.g. cost, functionality, usability
    5. system installation and acceptance tests
    6. training, system management, maintenance, etc.

1.1.2 A computer programmer's viewpoint

1.1.2.1 Application programmers

In many application areas there is still a need for specialists with a knowledge of computer programming. These specialists may be professional computer scientists implementing software packages in particular application areas, or scientists or engineers writing programs for their own applications. Such programs would generally be written in a high-level problem solving language that, apart from some appreciation of information storage and accuracy limitations, requires little knowledge of the internal workings of computer systems.
 

1.1.2.2 Low-level programmers and hardware designers

Computer scientists and electronic engineers working in the areas of real-time systems, hardware design, process control, etc., are expected to implement low-level programs, e.g. to control input and output devices. As such they require a knowledge of computer architecture and skills in low-level programming techniques. In the past, for reasons of efficiency and speed, such programs would have been written in an assembly language. However, programming in assembly language is difficult, requiring highly skilled programmers, and large programs implemented in it are very difficult to update and maintain. Today, the majority of low-level programs would be written in a systems implementation language (a language which has many of the facilities of an assembly language, e.g. bit-manipulation and the ability to access I/O registers) such as C, C++, Modula 2 or Ada. Assembly language would be used for time critical modules and modules which cannot be coded in the high-level language. In educational establishments the teaching of assembly languages is still regarded as important for a number of reasons.
 

1.2 A microcomputer system

At a superficial level a computer system can be considered as consisting of three components, namely hardware, software and data.
 

1.2.1 Hardware

The term hardware embraces the physical components of the system:
  1.  The box which contains the printed circuit boards, power supply, etc.
  2.  The display screen and keyboard for user interaction.
  3.  The peripheral devices such as disks and printers.
The internal electronic circuits of modern computers are made up from a number of integrated circuit chips and other components. An integrated circuit chip is a small packaged device a few centimetres square which contains complex electronic circuits. The heart of the modern microcomputer is the microprocessor which is an integrated circuit chip containing the central processing unit (the basic control and processing circuits) of a small computer system. A complete microcomputer system contains a microprocessor plus memory, input/output devices, power supplies, etc.

However, before the computer hardware can perform a task (for example add numbers or read a character from a keyboard), it requires a program to tell it what to do.
 

1.2.2 Software

Software comprises the programs that tell the hardware what to do. A program is a sequence of instructions stored in the memory of the computer system. The central processing unit fetches an instruction, decodes it and then executes the required operation (e.g. to add two numbers). When an instruction has been executed the next instruction is fetched, decoded and executed, etc. A program may be very simple, for example, to calculate the average of ten numbers, or very complex, as would be required to draw a television quality picture on a display screen.
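The fetch, decode and execute cycle described above can be illustrated with a minimal sketch in Python. The three-instruction set used here (LOAD, ADD, HALT) and the single accumulator register are hypothetical, invented purely for illustration; a real processor such as the MC68000 is far more elaborate.

```python
# Toy fetch-decode-execute loop. The instruction set (LOAD, ADD, HALT) is
# hypothetical and far simpler than that of any real processor.
def run(program):
    accumulator = 0   # single working register
    pc = 0            # program counter: index of the next instruction
    while True:
        opcode, operand = program[pc]   # fetch the next instruction
        pc += 1
        if opcode == "LOAD":            # decode, then execute
            accumulator = operand
        elif opcode == "ADD":
            accumulator += operand
        elif opcode == "HALT":
            return accumulator
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


# A program that adds two numbers, stored as a sequence of instructions.
program = [("LOAD", 2), ("ADD", 3), ("HALT", None)]
result = run(program)
print(result)  # 5
```

The program is held as a list in memory and the loop simply repeats fetch, decode and execute until it meets HALT.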
 

1.2.3 Data

The data is the information to be processed by the computer system. Data may be simple numbers for mathematical calculations, text such as names and addresses or more complex structures such as pictures or drawings. The instructions that make up the program define what data is to be processed, in what form and at what time.
 

1.3 Instruction and data storage

Within the computer hardware there must be memory to store the instructions of the program to be executed and the data to be processed.

1.3.1 Representation of integer numbers

Within modern computer systems the basic element of storage is the binary digit (or bit) which can represent a 0 or a 1. The reason for this is that it is very easy to build electronic switches where an off/on condition is used to represent a 0/1 binary value. Although a single bit can only have two states, 0 or 1, a sequence of bits can be used to represent a larger range of values. Such a sequence is called a word of storage and is usually 8, 16, 32, 64 or 128 bits in length. An 8-bit word, for example, can represent an unsigned positive number in the range 0 to 11111111 binary (0 to 255 decimal) thus:
 
 bit          7     6     5     4     3     2     1     0
 bit value   2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0

In the diagram above the least significant or rightmost bit, bit 0, represents 2^0 or 1 and the most significant or leftmost bit, bit 7, represents 2^7 or 128 decimal (the convention for identifying the bits within a word is that the rightmost or least significant bit is numbered 0). The combinations of 1s and 0s of the 8-bit word thus represent an unsigned value in the range 0 to 11111111 binary (0 to 255 decimal). The general term given to an 8-bit storage word is a byte which is used by the majority of modern computer systems as their fundamental unit of storage. To represent values that are too large to store in 8 bits a number of bytes may be used. For example, a 16-bit number (made up from two bytes) can represent an unsigned value in the range 0 to 65535 decimal.
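The weighting of the bits can be checked with a short Python sketch. The function name unsigned_value is an assumption made for this illustration; it sums the power of two for each bit that is set, numbering the rightmost bit as 0.

```python
def unsigned_value(bits):
    """Value of a string of '0'/'1' characters, most significant bit first."""
    total = 0
    # reversed() makes position 0 the rightmost (least significant) bit
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


print(unsigned_value("00000001"))  # 1   (bit 0 only)
print(unsigned_value("10000000"))  # 128 (bit 7 only)
print(unsigned_value("11111111"))  # 255 (largest unsigned 8-bit value)
```

Passing a 16-character string of 1s gives 65535, the largest unsigned 16-bit value.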

Many commercial and scientific calculations require the use of signed numbers and the majority of modern computer systems use twos complement binary arithmetic in which the most significant bit is used to store the sign (1 for a negative number and 0 for a positive number). Using twos complement binary arithmetic an 8-bit number can represent values in the range -128 to +127 and a 16-bit number values in the range -32768 to +32767. The Motorola MC68000 allows arithmetic operations to be carried out on 8-, 16- and 32-bit signed and unsigned numbers (the programmer would use the most appropriate for the application) which can represent numeric values shown in Table 1.1.

In practice it would be both difficult and error prone to enter data directly in binary form, so hexadecimal (base 16) or decimal are more commonly used. It is a relatively easy task to convert between binary and hexadecimal.
 
 
    Data size   unsigned range      signed range
    8-bit       0 to 255            -128 to +127
    16-bit      0 to 65535          -32768 to +32767
    32-bit      0 to 4294967295     -2147483648 to +2147483647
Table 1.1 Numeric range of 8, 16 and 32-bit unsigned and signed numbers
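The ranges in Table 1.1, and the conversion between binary and hexadecimal, can be reproduced with a few lines of Python. This is only a sketch: the function names are assumptions for illustration, and the value -3 is an arbitrary example.

```python
def unsigned_range(bits):
    """Smallest and largest unsigned values which fit in 'bits' bits."""
    return 0, 2 ** bits - 1


def signed_range(bits):
    """Smallest and largest twos complement values which fit in 'bits' bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def to_twos_complement(value, bits):
    """Bit pattern (as a non-negative integer) which stores 'value'."""
    low, high = signed_range(bits)
    if not low <= value <= high:
        raise OverflowError(f"{value} does not fit in {bits} bits")
    return value & (2 ** bits - 1)


def from_twos_complement(pattern, bits):
    """Signed value represented by a bit pattern."""
    if pattern & (1 << (bits - 1)):   # most significant bit set: negative
        return pattern - 2 ** bits
    return pattern


for bits in (8, 16, 32):
    print(bits, unsigned_range(bits), signed_range(bits))

pattern = to_twos_complement(-3, 8)
# Each group of four binary digits corresponds to one hexadecimal digit.
print(format(pattern, "08b"), format(pattern, "02X"))  # 11111101 FD
```

Note that the most significant bit of the pattern for -3 is 1, marking the number as negative.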
 

1.3.2 Representation of real numbers

Integer numbers are suitable for whole number calculations (i.e. no fractional component) and where a limited number range is acceptable. The majority of scientific and engineering applications use numbers with fractional components and which can vary in size from very small to very large values, e.g. from the size of atomic particles to intergalactic distances. For such applications programming languages provide a data type called real numbers which is represented internally in floating point format.
 

Exercise 1.1 (see Appendix for sample answer)

Convert the following numbers to binary and hexadecimal; do the calculation in each case and convert the result back into decimal (use signed 8-bit twos complement binary numbers):

     16     45    110    110
    +32    +60    -45    +45

The last calculation gives a condition called overflow; explain what has happened.
 

1.3.3 Character data

Character data is used within a computer to represent text (such as names and addresses) and consists of the usual printable characters, e.g. the alphabet A-Z and a-z, digits 0-9 and other characters such as +, -, *, /, !, $, %, and &.

Each character is stored in a byte of memory and represented by a particular binary pattern or character code. To enable different computers, terminals and printers to be connected together there are a number of standard character codes. The most commonly used character code is ASCII (American Standard Code for Information Interchange). The character A, for example, is represented by the binary pattern 01000001 (41 hexadecimal), and B by 01000010 (42 hexadecimal). The majority of computer users do not need to know or even be aware of these codes as the keyboard and display equipment converts between the characters and the internal codes automatically, i.e. if the user hits the key A on the keyboard the binary value 01000001 is sent to the computer.
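The codes quoted above for A and B can be confirmed with a short Python sketch. The function name ascii_forms is an assumption for illustration; Python's ord returns a Unicode code point, which coincides with the ASCII code for these characters.

```python
def ascii_forms(character):
    """Return the decimal, hexadecimal and binary forms of a character code."""
    code = ord(character)   # same as the ASCII code for characters 0 to 127
    return code, format(code, "02X"), format(code, "08b")


for character in "AB":
    print(character, *ascii_forms(character))
# A 65 41 01000001
# B 66 42 01000010
```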
 

Exercise 1.2 (see Appendix for sample answer)

Determine the decimal, hexadecimal and binary values of the ASCII character code for the following characters:

        A B C Z a b c 0 1 2 9 - ? =

Can you see anything significant about the order of the ASCII character codes for letters and digits and why the latter is useful when writing a program to read a sequence of digit characters to be converted into a numeric value?
 

1.3.4 Instruction representation

A computer program is made up of a sequence of instructions which are represented by binary patterns. For example, the binary pattern 0100001001000011 (4243 hexadecimal), when executed by the Motorola MC68000 microprocessor, would set the lower 16 bits of the data register D3 to 0. Each instruction that the computer hardware can execute has a particular binary pattern, with sequences of such binary patterns in the memory of the computer forming a program. Programs in this form are in a language called machine code, i.e. the language the hardware of the computer understands. It is clear that if humans had to write programs in machine code, programming would be a very error prone and time consuming task. In practice, professional programmers use either an assembly language or a high-level language.
 

1.4 Low and high-level languages

1.4.1 Assembly languages

In assembly languages each machine instruction is represented by a meaningful mnemonic (ADD, SUB, DIV) and data specified in binary, hexadecimal, decimal and character form. For example, the MC68000 instruction which clears the lower 16 bits of data register D3 (0100001001000011 in machine code) would be written in 68000 assembly language:

    CLR.W     D3

where CLR.W is the instruction or operation-code mnemonic and D3 is the position of the data being operated upon (called the operand). The computer hardware can only understand machine code, so before it can be executed an assembly language program has to be converted into machine code. This is done by a program called an assembler which takes each assembly language statement and converts it on a one-to-one basis into the equivalent machine code instruction which can then be executed.

Assembly language programming is difficult because it is only one level above machine code and hence orientated to a particular computer (each type or model of central processing unit has its own machine code language). For example, a program which had been implemented in assembly language on an MC68000 microcomputer would have to be totally rewritten if transferred to an Intel 8086 based system. In addition, the programmers who had written the original software would have to learn a new assembly language for the Intel 8086 processor.

Even with the above disadvantages assembly language programming is still required in many industrial applications, in particular, in real-time control systems for the implementation of time critical modules (the majority of the system would be implemented in a high-level language).

Machine code and assembly languages are described as low-level languages in that they are orientated towards the computer hardware. High-level languages on the other hand are problem orientated and computer independent.
 
 

1.4.2 High-level problem solving languages

High-level languages are written in an English or mathematical notation which is orientated towards solving practical problems. Some examples of high-level languages are:

BASIC      Beginner's All-purpose Symbolic Instruction Code: a simple language available on many home microcomputers;
FORTRAN    FORmula TRANslation: a language widely used for mathematical, scientific and engineering applications;
COBOL      COmmon Business Orientated Language: a language designed for commercial business applications;
PASCAL     a general purpose problem solving language;
C          a systems implementation language;
C++        C with object-oriented enhancements;
JAVA       an object-oriented language for internet and general applications using basic C syntax;
Ada        a modern systems implementation language designed for real-time applications.

After the program source code has been entered into the computer it has to be converted into machine code by a program called a compiler. Each statement in a high-level language can be converted into a number of machine code instructions. For example, consider the following Pascal statement:

    D0:=7+2*(267-23);

The equivalent in 68000 assembly language is:

     MOVE.L        #267,D0
     SUB.W         #23,D0
     ADD.W         D0,D0
     ADD.W         #7,D0

and in 68000 machine code (hexadecimal byte values) is:

   20   3C    00   00    01   0B    04   40    00   17   D0   40   06   40    00   07
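As a check on the example above, the following Python sketch steps through the four instructions and pulls the constants out of the machine code bytes. It is an illustration only: the helper set_low_word is an assumption, it models just the effect of each instruction on D0 (not the condition codes), and the byte offsets were read off the listing by hand.

```python
def set_low_word(register, value):
    """Replace the lower 16 bits of a 32-bit register, as a .W operation does."""
    return (register & 0xFFFF0000) | (value & 0xFFFF)


d0 = 267 & 0xFFFFFFFF                        # MOVE.L #267,D0
d0 = set_low_word(d0, (d0 & 0xFFFF) - 23)    # SUB.W  #23,D0
d0 = set_low_word(d0, (d0 & 0xFFFF) * 2)     # ADD.W  D0,D0 (doubles D0)
d0 = set_low_word(d0, (d0 & 0xFFFF) + 7)     # ADD.W  #7,D0
print(d0)  # 495, the value of 7+2*(267-23)

# The same constants appear inside the machine code (most significant byte first).
code = bytes.fromhex("203C0000010B04400017D04006400007")
immediates = (
    int.from_bytes(code[2:6], "big"),     # 32-bit constant after 20 3C
    int.from_bytes(code[8:10], "big"),    # 16-bit constant after 04 40
    int.from_bytes(code[14:16], "big"),   # 16-bit constant after 06 40
)
print(immediates)  # (267, 23, 7)
```

One Pascal statement has become four instructions and sixteen bytes of machine code.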

In general the compilation process is not 100% efficient so a program written in a high-level language will take more memory and run more slowly than an equivalent assembly language program written by a good programmer. However, the advantages of working in a language which is orientated towards solving problems rather than towards the computer hardware means that the majority of application programs are written in high-level languages.

An additional advantage of using high-level languages is that such languages are less computer dependent than assembly languages (depending upon the quality of the international standard of the language and the particular implementation being used).
 
 

1.5 Review of information representation

ALL information within the computer, either instructions or data, is represented in binary form. To a programmer working in a high-level language this is not a problem as the compiler and run time system assign the appropriate data storage and convert information entered to and from binary. Consider the following simple Pascal program which writes a number and a character:

    PROGRAM TEST;
    CONST I=20; X='D';
    BEGIN
      WRITELN(' number is ',I,' letter is ',X);
    END.

When the above program is compiled (i.e. converted into machine code), the compiler assigns the storage for any variables and sets up the values defined by CONSTant expressions, i.e. the values of I and X are converted to the equivalent 16-bit (or 32-bit) and byte binary values 0000000000010100 and 01000100 respectively (0014 and 44 hexadecimal). When the program is executed the WRITELN statement converts the internal binary representation of the integer number I (value 20 decimal) into a string of ASCII character codes (32 and 30 hexadecimal) to be transmitted to the display screen. The display hardware then converts these into characters to be viewed, i.e. 20.
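The conversion performed by WRITELN can be sketched in Python as repeated division by ten. This is an assumed illustration of the general technique, not the code of any particular Pascal run-time system, and the function name is hypothetical.

```python
def int_to_ascii_codes(value):
    """Convert a non-negative integer to the ASCII codes of its decimal digits."""
    if value == 0:
        return [ord("0")]
    codes = []
    while value > 0:
        value, digit = divmod(value, 10)
        # The digit characters 0 to 9 have consecutive ASCII codes.
        codes.append(ord("0") + digit)
    codes.reverse()   # digits were produced least significant first
    return codes


codes = int_to_ascii_codes(20)
print([format(code, "02X") for code in codes])  # ['32', '30']
print(format(20, "016b"))                       # 0000000000010100
```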

When working in a low-level language such as machine code or assembly code, the binary pattern 01000010 01000011 (42 43 hexadecimal) could represent to the computer hardware:
 

  1. two 8-bit integer numbers: 66 and 67 decimal;
  2. a 16-bit number: 16963 decimal;
  3. two characters in the ASCII character code: B and C respectively;
  4.  the instruction to the Motorola MC68000 microprocessor to set the lower 16 bits of data register D3 to 0,  i.e. CLR.W D3 in 68000 assembly language.
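The first three interpretations in the list above can be reproduced in Python from the same two bytes. This is a sketch only; the 16-bit value assumes the most significant byte is stored first, as on the MC68000.

```python
pattern = bytes([0b01000010, 0b01000011])   # 42 43 hexadecimal

as_two_bytes = list(pattern)                # two 8-bit unsigned integers
as_word = int.from_bytes(pattern, "big")    # one 16-bit unsigned integer
as_text = pattern.decode("ascii")           # two ASCII characters

print(as_two_bytes)  # [66, 67]
print(as_word)       # 16963
print(as_text)       # BC
```

Nothing in the bit pattern itself says which interpretation is intended; that is decided entirely by the instructions which process it.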


When working in a high-level language the compiler and run time system look after the organisation of data storage and conversion between external characters and the internal binary form. When working in machine code or assembly language the programmer is responsible for ensuring that instructions and data are separate and that the correct code is executed and data processed. It is very easy to get mixed up and try to add character data or even execute data. Careful program design and coding will avoid this problem.
 

Exercise 1.3 (see Appendix for sample answer)

What particular problems could face an assembly language programmer when looking for a new job?
 
 

Problems


1    Why is the binary system used for information storage within modern computer systems?
2    Convert the following numbers to binary and hexadecimal; do the calculation in each case and then convert
     the result back into decimal (use signed 16-bit twos complement binary numbers):
          15     67    189    456   1027
         +86    -86   +345   -345  +2056
3    Describe, in each case, the advantages/disadvantages and areas of application of:
     (a)   machine code programming;
     (b)   assembly language programming;
     (c)   high-level language programming.
 


2 Computer Hardware

For further information check the following links
The WWW Virtual Library on computing - http://src.doc.ic.ac.uk/bySubject/Computing/Overview.html
 CPU Information centre - http://bwrc.eecs.berkeley.edu/CIC/
PC reference information - http://www.pcguide.com/index.htm
IBM PC compatible FAQ -   http://www.undcom.com/compfaq.html
History of CPUs - http://bwrc.eecs.berkeley.edu/CIC/archive/cpu_history.html
CPU Information & System Performance Summary -  http://bwrc.eecs.berkeley.edu/CIC/summary/
Chronology of Events in the History of Microcomputers - http://www.islandnet.com/~kpolsson/comphist/

Fig. 2.1 Typical microcomputer configuration using a common bus system (information highways: address, data and control buses)

Fig 2.1 is a representation of the hardware (physical components) of a simple single processor computer system comprising:
 

  1.  CPU and associated circuits, e.g. microprocessor integrated circuit chip.
  2.  Co-processor (if fitted), e.g. for real number calculations or graphics.
  3.  Primary Memory (RAM and ROM).
  4.  Disk interface which controls a floppy disk and/or hard disk as secondary memory for saving programs and  data.
  5.  Terminal interface which controls the display screen and the keyboard.
  6.  Input/output interface devices (for connecting external devices such as printers), e.g.:
    1.     Serial I/O interface, e.g. MC6850 ACIA or MC68681 DUART.
    2.     Parallel I/O Interface, e.g. MC6821 PIA or MC68230 PIT.
    3.     Timer controller, e.g. MC6840 PTM or MC68230 PIT.


It can be seen from Fig. 2.1 that an information highway or bus system connects the various components of the system together:

Address Bus   which carries the address of the memory location or I/O device being accessed.

Data Bus      which carries the data signals.

Control Bus   which carries the control signals between the CPU and the other components of the system, e.g. signals to indicate when a valid address is on the address bus and if data is to be read or written.

The physical connections and signal timing of the bus are of little interest to the majority of programmers; they matter mainly to users, usually electronics engineers, who are building components to connect directly to the bus. Even users writing assembly language programs to control external devices (e.g. motors, heaters) can do this via a parallel I/O interface such as a MC6821 PIA or MC68230 PIT.

 See http://www.intel.com/network/performance_brief/pc_bus.htm and  http://www.pcguide.com/ref/mbsys/buses/func.htm for a discussion of PC busses and http://agpforum.org/ and http://www.pcguide.com/ref/mbsys/buses/types/agp.htm for a discussion on the AGP (Accelerated Graphics Port).
 
 

2.1 Memory (also see http://www.cms.dmu.ac.uk/~cph/Teaching/CSYS1001/lec15/c1001l15.html )

There is a general rule that the faster the memory the more it costs. Although it would be desirable to have a large amount of very high speed memory for fast program execution it is not always economically possible. Within a computer system there is therefore a hierarchy of memory, ranging from small, fast CPU registers through primary memory to large, slow secondary memory.

Table 2.1 shows typical sizes and access times of memory types used in modern computer systems. To simplify memory size notation the basic memory size is stated in K, M or G, where K is a unit of 1024 (not 1000 as in kilometres), M a unit of 1048576 and G a unit of 1073741824 (because primary memory is built up as a square matrix of storage elements, memory sizes are a power of 2).
 
    Memory type          Typical size                                      Access time
    CPU registers        10 to 1000 byte                                   less than 10nSec
    Primary memory       512K to 64Mbyte                                   less than 100nSec
    Secondary memory:
       floppy disks      320K to 2Mbyte (to 120Mbyte LS-120 floptical)     50 to 500mSec
       hard disks        up to 70Gbyte                                     less than 20mSec
       magnetic tape     10/20Gbyte                                        seconds to minutes

Table 2.1 Typical microcomputer memory sizes and access times

The access time is the time between the request for information and its availability for use. This is normally stated in nSec (nanosecond = 10^-9 or 0.000000001 of a second), uSec (microsecond = 10^-6 or 0.000001 of a second) or mSec (millisecond = 10^-3 or 0.001 of a second). The difference of several orders of magnitude between the access times of primary and secondary memory is mainly because the former is purely electronic and the latter has mechanical moving components. In addition the technique used to access information is different in that primary memory is random access and secondary memory (disk and magnetic tape) is sequential access, i.e.:

Sequential access. To access a particular piece of data all information between the current position and the target has to be accessed, e.g. as in a magnetic tape storage system.

Random access. Any memory location may be accessed directly.
 
 

2.1.1 The organisation of the primary memory

Primary memory is used to store the machine code and data during program execution. The majority of modern computer systems use a memory store built up of bytes of storage with each byte being assigned a location address. Fig. 2.2 shows such a memory organisation with the first byte of memory having address 0, the next 1, the next 2, etc.

Fig. 2.2 The organisation of computer primary memory

The maximum amount of memory is limited by the number of bits used by the address bus to access memory locations. Table 2.2 lists some microprocessors with their address and data bus sizes and the maximum amount of primary memory which can be addressed. For example:
 

  1. The early microcomputers (e.g. Intel 8080, Zilog Z80, and Motorola 6800 series) have a 16-bit address bus which can address a maximum memory size of 65536 bytes or 64 Kbytes, i.e. addresses 0 to 1111111111111111 in binary.
  2. The Intel 8086 and 8088 (the latter used in the original IBM PC microcomputer) and Motorola MC68008 have a 20-bit address bus which can address a maximum memory size of 1048576 bytes or 1 Mbyte.
  3. The Intel 80286 and Motorola MC68000/10 have a 24-bit address bus which can address a maximum memory size of 16777216 bytes or 16 Mbytes.
  4. The Intel 80386/486 and Motorola MC68020/30/40 have a 32-bit address bus which can address a maximum memory size of 4294967296 bytes or 4 Gbytes.
Microprocessor manufacturer & type       address bus    maximum memory   data bus       clock
                                         size (bits)    size (bytes)     size (bits)    multiplier
Intel 8080                               16             64K              8
Zilog Z80                                16             64K              8
Motorola 6800                            16             64K              8
Intel 8086                               20             1M               16
Intel 8088 (IBM PC and PC/XT)            20             1M               8
Motorola 68000 and 68010                 24             16M              16
Motorola 68008                           20             1M               8
Intel 80186                              20             1M               16
Intel 80286                              24             16M              16
Motorola 68020, 68030 and 68040          32             4G               32
Intel 80386SX                            24             16M              16
Intel 80386DX                            32             4G               32
Intel 80486DX                            32             4G               32
Intel 80486SX (no floating point unit)   32             4G               32
Intel 80486DX2                           32             4G               32             *2
Intel 80486DX4                           32             4G               32             *3
Pentium 400                              32             4G               32/64 PCI      *4

Table 2.2 Common microprocessors with address and data bus sizes

    Note: K = 1024 (2^10), M = 1048576 (2^20), G = 1073741824 (2^30)
    The 80486SX is identical to the DX except that it has no floating point coprocessor
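The relationship between address bus size and maximum memory size is simply a power of two, as the following Python sketch shows. The function names are assumptions made for this illustration.

```python
def max_memory_bytes(address_bits):
    """Number of distinct byte addresses on an address bus of the given width."""
    return 2 ** address_bits


def in_k_m_g(size):
    """Express a size in G, M or K (powers of 1024) when it divides exactly."""
    for unit, factor in (("G", 2 ** 30), ("M", 2 ** 20), ("K", 2 ** 10)):
        if size >= factor and size % factor == 0:
            return f"{size // factor}{unit}"
    return str(size)


for bits in (16, 20, 24, 32):
    size = max_memory_bytes(bits)
    print(bits, size, in_k_m_g(size))
# 16 65536 64K
# 20 1048576 1M
# 24 16777216 16M
# 32 4294967296 4G
```

Each extra address line doubles the amount of memory which can be addressed.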

Table 2.2 shows the maximum amount of primary memory which can be addressed. In practice a computer system may be fitted with less, e.g. typically a MC68030 system has 16, 32 or 64 Mbytes. Although the primary memory is organised in bytes an instruction or data item may use several consecutive bytes of storage, e.g. using 2, 4 or 8 bytes to store 16-bit, 32-bit or 64-bit values respectively.

The size of the data bus determines the number of bits which can be transferred between system components in a single read or write operation. This has a major impact on overall system performance, i.e. a 32-bit value can be accessed with a single memory read operation on a 32-bit bus but requires two memory reads with a 16-bit bus. In practice the more powerful the processor the larger the data and address busses.
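The effect of data bus size on the number of memory operations can be sketched as follows. This is a simplification which assumes the value is suitably aligned in memory and ignores caches; the function name is hypothetical.

```python
import math


def reads_needed(value_bits, bus_bits):
    """Memory read operations needed to fetch a value over a bus of given width."""
    return math.ceil(value_bits / bus_bits)


print(reads_needed(32, 32))  # 1
print(reads_needed(32, 16))  # 2
print(reads_needed(32, 8))   # 4
```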

The size of the address and data busses has a major impact on the overall cost of a system, i.e. the larger the bus the more complex the interface circuits and the more 'wires' interconnecting system components. Table 2.2 shows that there are versions of some processors with smaller data and address busses, e.g. the Intel 80386SX is (from a programmer's viewpoint) internally identical to the 80386DX but has a 24-bit address bus and a 16-bit data bus. These are used to build low cost systems which are able to run application programs written for the full processors (but with reduced performance).

The 80486DX2 and 80486DX4 have on-chip clock multipliers which multiply the clock by 2 and 3 respectively, i.e. on-chip operations are performed at two or three times the external clock speed making a particular improvement in processor bound jobs. In addition, the DX4 has a large cache (hence DX4 rather than DX3). Clock multiplication has little effect on I/O bound jobs (e.g. a database server or a file server) where a Pentium with a 64-bit bus would be used.
 
 

2.1.1.1 RAM and ROM primary memory

The majority of computer systems contain two types of primary memory, RAM and ROM.

In addition to the primary memory there is other memory storage which is used for temporary information. These memory stores are called registers and may be found within the Control Unit, ALU and the I/O (Input/Output) device interfaces.
 
 

2.1.2 Cache memory

There has always been a problem of keeping memory speed matched to processor speed. Increasing processor speed is relatively cheap in comparison to corresponding increases in the speed of the bus and main memory configuration.

A cache memory (see http://www.infc.ulst.ac.uk/~desi/b94mn/cache.htm) makes use of the locality of reference phenomenon, i.e. over short periods of time references to both instructions and data tend to cluster. The cache is a fast memory (matched to CPU speed), typically between 4 Kbytes and 256 Kbytes in size, which is logically positioned between the processor and bus/main memory. When the CPU requires a word (instruction or data) a check is made to see if it is in the cache and if so it is delivered to the CPU. If it is not in the cache a block of main memory is fetched into the cache and it is likely that future memory references will be to other words in the block (typically a hit ratio of 75% or better can be achieved).
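The effect of locality of reference on the hit ratio can be seen in a small Python simulation. This is a sketch under stated assumptions: the block size, the number of cache blocks and the least recently used replacement policy are all choices made for illustration, and real caches are organised in a variety of ways.

```python
from collections import OrderedDict


def hit_ratio(addresses, block_size=16, cache_blocks=4):
    """Fraction of memory references found in a small block cache."""
    cache = OrderedDict()   # block numbers held, least recently used first
    hits = 0
    for address in addresses:
        block = address // block_size    # block containing this address
        if block in cache:
            hits += 1
            cache.move_to_end(block)     # mark as most recently used
        else:
            if len(cache) >= cache_blocks:
                cache.popitem(last=False)   # evict least recently used block
            cache[block] = None             # fetch the whole block
    return hits / len(addresses)


# References which cluster: 64 consecutive byte addresses.
print(hit_ratio(list(range(64))))                        # 0.9375
# References with no locality: every access is to a different block.
print(hit_ratio([index * 16 for index in range(10)]))    # 0.0
```

With consecutive addresses only the first reference to each block misses, so 60 of the 64 references are hits.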

Fig. 2.3 Showing CPU (with ALU, Control unit and internal cache), external cache
 

2.1.3 Secondary memory ( see also http://www.cse.dmu.ac.uk/~cph/Teaching/CSYS1001/lec17/c1001l17.html)

The number of programs developed by a single user may be many hundreds and there must be some means for the long term storage of information. The secondary memory of a computer system (disks and magnetic tape) is used for this purpose. A floppy disk can vary in storage capacity from 320 Kbytes to 2 Mbytes and hard disks can store up to 2000 Mbytes or more. The typical access time can range from a few milliseconds for a fast hard disk, to as long as half a second for a floppy disk. This variation is due to the mechanical nature of the storage medium and the partially sequential access method (to get to a byte of information it may be necessary to read over intermediate information). The information on the disk is organised into named files and the system software provides functions for accessing these, e.g. open a file, read/write a file, close a file. Therefore even assembly language programmers rarely need to control disk I/O interfaces directly.

The concept of a cache has been extended to disk I/O. When a program requests a block or blocks, several more are read into the cache, where they are immediately available for future disk access requests. Disk caches may take two forms:

Software disk cache in which the operating system or disk driver maintains the cache in main memory, i.e. using the main CPU of the system to carry out the caching operations.

Hardware disk cache in which the disk interface contains its own cache RAM memory (typically 4 to 16 Mbytes) and control circuits, i.e. the disk cache is independent of the main CPU.

Hardware disk caches are more effective but require a more complex (and expensive) disk controller and tend to be used with fast disks in I/O bound applications, e.g. databases.
    See http://www.pcguide.com/ref/hdd/index.htm for more information on hard disks.
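The read-ahead behaviour of a software disk cache can be sketched as follows. This is an illustration only: the class DiskCache, the READ_AHEAD value and the read_block function (standing in for a slow disk access) are all assumptions, and a real cache would also limit its size and handle writes.

```python
# Read-ahead disk cache sketch. read_block stands in for a slow disk access;
# READ_AHEAD is an illustrative value.
READ_AHEAD = 3


class DiskCache:
    def __init__(self, read_block):
        self._read_block = read_block   # function: block number -> data
        self._cache = {}                # unbounded here for simplicity
        self.disk_accesses = 0

    def get(self, block):
        if block not in self._cache:
            # Miss: one disk access fetches the requested block and the
            # next few blocks as well.
            self.disk_accesses += 1
            for b in range(block, block + 1 + READ_AHEAD):
                self._cache[b] = self._read_block(b)
        return self._cache[block]


cache = DiskCache(lambda b: f"data{b}")
for n in range(8):
    cache.get(n)
print(cache.disk_accesses)              # prints 2
```

Eight consecutive block requests are satisfied by two disk accesses, because each miss brings in the three following blocks.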
 

Exercise 2.1 (see Appendix for sample answer)

  1.  What limits the maximum size of the primary memory of a computer system?
  2.  What are RAM and ROM and what are they typically used for?
  3.  Why is secondary memory required in computer systems ?
  4.  Explain the terms sequential and random access and give examples.
  5.  What is the role of the bus system within a computer system ?

2.2 The CPU (Central Processing Unit) and co-processors

The CPU (see http://www.mkdata.dk/click/module3a.htm) contains the control unit, ALU (Arithmetic Logic Unit) and associated high speed registers used for storing information during instruction processing, see Fig. 2.3. The processor of a microcomputer is an integrated circuit chip which contains the CPU and, in the case of a microcontroller chip, some primary memory, I/O device interfaces and other specialist facilities (see http://www.industrialtechnology.co.uk/micro2.htm).
 

2.2.1 The control unit

This component of the computer hardware has overall control of the computer system. During program execution the Control Unit fetches instructions from the primary memory, decodes them to determine the operation required, and then sets up instruction execution, e.g. to add two numbers or read a character from a keyboard. A number of registers are associated with the control unit, including the PC (Program Counter), which points at the next instruction to be fetched, and the IR (Instruction Register), which holds the instruction being decoded.
 

2.2.1.1 The instruction cycle (fetch/execute cycle)

A program consists of a sequence of instructions in primary memory. Under the control of the Control Unit each instruction is processed in turn in a cyclic sequence called the fetch/execute or instruction cycle:

Fetch Cycle. A machine code instruction is fetched from primary memory (the PC points at each instruction in turn) and moved into the Instruction Register, where it is decoded (after the fetch the PC is incremented to point at the next instruction).

Execute Cycle. The instruction is executed, e.g. data is transferred from primary memory and processed by the ALU.
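The fetch/execute cycle can be sketched as a simple loop. The toy machine below is an assumption made for illustration: its three instructions (LOAD, ADD, HALT) and its single accumulator are invented here and do not correspond to any real instruction set.

```python
# Toy machine illustrating the fetch/execute cycle. The three-instruction
# set (LOAD, ADD, HALT) and the single accumulator are invented for this sketch.
def run(memory):
    pc = 0                      # Program Counter: address of next instruction
    acc = 0                     # accumulator holding the ALU result
    while True:
        ir = memory[pc]         # fetch: copy the instruction into the IR
        pc += 1                 # the PC now points at the next instruction
        opcode, operand = ir    # decode
        if opcode == "LOAD":    # execute
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            return acc, pc
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


program = [("LOAD", 20), ("ADD", 22), ("HALT", 0)]
print(run(program))             # prints (42, 3)
```

The function returns the final accumulator value together with the PC, which shows that the PC was incremented once for each of the three instructions fetched.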
 
 

2.2.1.2 Instruction prefetch and pipelining

To speed up the overall operation of the CPU some microprocessors employ instruction prefetch or pipelining techniques which were first used in mainframe computers (Foster 1976). The MC68000, for example, uses a two-word (each 16-bits) prefetch mechanism comprising the IR (Instruction Register) and a one word prefetch queue. When execution of an instruction begins, the machine code operation word and the word following are fetched into the instruction register and one word prefetch queue respectively. In the case of a multi-word instruction, as each additional word of the instruction is used, a fetch is made to replace it. Thus while execution of an instruction is in progress the next instruction is in the prefetch queue and is immediately available for decoding (see http://www.cs.herts.ac.uk/~comrrdp/pipeline/pipetop.html and
http://www.cs.umass.edu/~weems/CmpSci535/535lecture8.html).
 

2.2.2 The Arithmetic/Logic Unit (ALU)

The ALU is the component of the computer system which, under the direction of the Control Unit, performs operations upon numeric and other data, e.g:
  1. The integer arithmetic instructions add, subtract, multiply and divide (usually denoted by + - * / in high-level languages). Early 8-bit microprocessors could only carry out integer addition and subtraction and then only on 8-bit numbers directly. Multiplication and division and 16-bit operations had to be carried out using sequences of the available 8-bit instructions. The Motorola MC68000 can do integer addition and subtraction of 8-, 16- and 32-bit numbers, multiplication of 16-bit numbers, and division of a 32-bit number by a 16-bit number.
  2. Logical instructions such as NOT, AND, OR and EOR (exclusive OR) and shift instructions.
Associated with the ALU and Control Unit of the MC68000 there are the following registers (in addition to the PC and IR):

The SR (Status Register) contains a number of condition code bits which indicate the result of the last instruction (for example, if the result was zero or negative). These bits can be tested using branch instructions to control program flow, e.g. to implement a high-level language statement such as IF X=0 THEN Y:=20 ELSE Y:=30;
 
 

2.2.2.1 8-bit, 16-bit and 32-bit microprocessors

The terms 8-bit, 16-bit and 32-bit, when used to refer to a microprocessor, give an indication of the power and facilities of the processor. In general, the number (8, 16 or 32) indicates the size of the data which can be processed directly by the ALU. For example, an 8-bit microprocessor can directly process 8-bit numbers, with larger data types requiring a sequence of 8-bit instructions, e.g. using two 8-bit add instructions to add 16-bit numbers. Some extended microprocessors have instructions to process larger data (e.g. the Zilog Z80, an extended version of the Intel 8080 8-bit microprocessor, has instructions to add and subtract 16-bit data). For details of microprocessor history see http://bwrc.eecs.berkeley.edu/CIC/archive/cpu_history.html and for information on embedded microcontrollers see http://bwrc.eecs.berkeley.edu/CIC/embed/
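The way an 8-bit processor adds 16-bit numbers can be sketched as two 8-bit additions linked by the carry. The function names add8 and add16_with_8bit_alu are hypothetical, and the sketch models only the arithmetic, not any particular processor's instructions.

```python
def add8(a, b, carry_in=0):
    """8-bit add: return (8-bit result, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add16_with_8bit_alu(x, y):
    """Add two 16-bit values using two 8-bit additions."""
    low, carry = add8(x & 0xFF, y & 0xFF)        # low bytes first
    high, carry = add8(x >> 8, y >> 8, carry)    # high bytes plus the carry
    return (high << 8) | low, carry


result, carry = add16_with_8bit_alu(0x12FF, 0x0001)
print(hex(result), carry)                        # prints 0x1300 0
```

The carry out of the low byte addition is fed into the high byte addition, which is the role played by an add-with-carry instruction on an 8-bit processor.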
 

2.2.3 Co-processors

Mathematical and scientific applications generally require mathematical calculations using real numbers (held in floating point form). The ALU of the majority of microprocessors can only carry out calculations on integer data. Where real numbers are used there are two ways to carry out floating point calculations:
  1. by program subroutines which use the normal integer ALU, or
  2. by using a floating point co-processor chip which can be up to 100 times faster, e.g. the Motorola MC68881.
More recent microprocessor chips, e.g. Intel 80486 and Motorola MC68040, have a floating point co-processor on the same chip. In addition to floating point co-processors there may be other special purpose co-processors for graphics, signal processing, etc.
 
 

Exercise 2.2 (see Appendix for sample answer)

  1. Why is it necessary to have high speed registers within the CPU of a  computer ?
  2. What information does the status register of a CPU contain and what is it used for ?
  3. What applications would require a system with a floating point co-processor and why ?

2.3 Input and output

Input and output devices provide the computer user with the means to transfer information in and out of a computer system (for example, to enter a program and data, and then display the results of program execution). A typical microcomputer would have a keyboard (similar to that of a typewriter) for entry of information, and a display screen (similar to a TV set) for output information display. In addition to the display of character information it is often possible to draw diagrams on the display screen.

Many applications require information be fed directly into a computer system from the external world, e.g. readings of temperature and pressure in a washing machine controller. Parallel input/output devices such as the MC6821 PIA and MC68230 PIT facilitate this.
 

2.3.1 I/O (input/output) interface registers

The interface circuit of an I/O Device (http://www.cs.umass.edu/~weems/CmpSci535/535lecture10.html) contains the circuits to control the peripheral device and status and control registers which, respectively, enable a program running in the CPU to:
  1. determine the state of the device, e.g. check if the keyboard has been hit; and
  2. control the device, e.g. to move the disk head.
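The use of a status register can be sketched with a simulated device, since real interface registers are only reachable at hardware addresses. The class KeyboardInterface and its register layout (bit 0 of the status register meaning "character ready") are invented for this illustration.

```python
# Simulated keyboard interface with a status register and a data register.
# The register layout (bit 0 = "character ready") is invented for this sketch.
READY = 0x01


class KeyboardInterface:
    def __init__(self, typed):
        self._pending = list(typed)     # characters waiting to be read

    def read_status(self):
        """Status register: READY bit set while a character is waiting."""
        return READY if self._pending else 0

    def read_data(self):
        """Data register: reading it takes the next waiting character."""
        return self._pending.pop(0)


def read_available(device):
    """Test the status register and read characters while READY is set."""
    chars = []
    while device.read_status() & READY:
        chars.append(device.read_data())
    return "".join(chars)


print(read_available(KeyboardInterface("hi")))   # prints hi
```

The program tests the status register before each read of the data register, which is the pattern used when checking whether a key has been hit.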

2.4 Development system configurations

2.4.1 Self contained development systems

The majority of computer systems used for program development appear to the user as a self contained environment equipped with processor, primary and secondary memory, user keyboard and display. In practice, this may range from a stand alone IBM/PC compatible personal computer, through networked professional workstations to intelligent terminals attached to a mainframe computer. The user interacts with the system via an operating system which provides program development facilities, e.g. MS-DOS or UNIX (see 3.2).

The problem with using such systems for low-level program development is that the operating system environment often imposes restrictions on what user programs can do. Consider, for example, a multi-user environment where, if user programs were allowed to write data anywhere in memory, or access I/O device control registers, they could crash the whole system.

Even low-level self contained program development systems are designed with the intention that the majority of programming will be carried out using a high-level systems implementation language such as C, Modula 2 or Ada (a systems implementation language has many of the facilities of an assembly language, e.g. bit-manipulation and the ability to access I/O registers). Software systems are implemented mainly in the high-level language with the use of assembly language being restricted to specialist functions, e.g. time critical modules. The assembly language modules have to conform to restrictions imposed by the high-level language and operating system, making assembly language programming in its own right very difficult.
 

2.4.2 Single board computer development systems

Single board development systems are essentially a printed circuit board with processor, primary memory and some I/O device interfaces, i.e. no secondary memory (Coats 1985/86). The on-board software is generally a simple monitor program (see 3.1) and no restrictions are placed on user programs. They have to be attached to a terminal and/or host computer which provides I/O facilities and secondary memory.

Fig 2.4 shows a host computer, which can vary in power from an IBM/PC compatible up to a professional workstation, attached to a single-board target system via a serial communications line. A program running on the host enables a user to communicate with the target system, entering commands using the host's keyboard and displaying the results on the host's display screen. Suitable cross assemblers and compilers (see 3.3) enable programs to be developed on the host and downloaded onto the target for execution (Bramer 1990).

Fig. 2.4 also shows external experiments or devices (e.g. the Bytronic multi-application board) connected to the target single board computer via parallel communications lines.

Fig. 2.4 A microcomputer acting as a host to a single board system.
 

Problem

Examine the manuals for any microcomputer systems you have access to and in each case determine (if possible):
  1. What type of microprocessor is used ?
  2. Is it an 8-, 16- or 32-bit microprocessor ?
  3. Does it have a co-processor or can one be fitted and if so what type ?
  4. What is the size of ROM and RAM memory ?
  5. What are the addresses of the ROM and RAM memory ?
  6. What is the maximum amount of memory the microprocessor can address and why ?
  7. What peripheral input/output devices is it equipped with ?
  8. What is the secondary memory and how many bytes can it store ?
  9. Does it use a proprietary or standard bus system, e.g. IBM PC/AT, VME, etc. ?



3 Computer Software

Before a microcomputer can process information (i.e. carry out calculations or read a character from a keyboard) it requires a program. A program is a series of instructions stored in the primary memory that are executed sequentially by the processor. The programs of a computer system are called its software and include:

System software which provides aids to program development and operation of the computer system.

Application programs for solving end-user problems, e.g. word processors, spreadsheets, accounting
        programs, CAD design tools, etc.
 

3.1 Monitors and operating systems

3.1.1 System start-up or bootstrapping

When the computer is switched on it requires some instructions to initialise the hardware and start up the system software (see http://www.pcguide.com/ref/mbsys/bios/boot.htm). In the past, these initial instructions had to be loaded as binary machine code into the primary memory using switches on a control panel. Today computer systems contain these initial instructions in ROM (Read Only Memory) and they are executed automatically when the computer is switched on or the reset button hit (reset is used to restart or reload the system software). This initial program in ROM may be quite complex, carrying out initial hardware tests and then going on to provide resident facilities such as a monitor or a bootstrap loader (see 3.1.2 and 3.1.3).
  The general name for ROM based resident software is firmware, i.e. software which is permanently fixed in ROM memory.
 

3.1.2 A system monitor

Programmers implementing low-level software, either in assembly language or a systems implementation language, require facilities to access CPU registers, physical memory, I/O device registers, etc. There are a number of ways that such facilities may be provided depending upon the development system being used:
  When a single board development system is switched on or reset, the resident monitor program in ROM carries out hardware tests and then prompts the user for command input. The user can enter a command to be executed by the monitor; typical facilities include:
 
  1. Display and set the contents of the CPU registers.
  2. Display and set the contents of RAM memory (values are usually displayed as hexadecimal numbers or ASCII characters and may be entered using decimal or hexadecimal numbers or ASCII characters).
  3. Load instructions into memory using hexadecimal machine code.
  4. Load instructions into memory using a line by line assembler (i.e. as each program statement is entered it is assembled into machine code).
  5. Load instructions into memory from a host computer.
  6. Start program execution, the user enters the start address of the program.
  7. Set breakpoints within programs. The user defines breakpoints as memory addresses. If during program execution a breakpoint is reached (a) program execution is suspended, (b) the microprocessor register contents are displayed, and (c) the user is prompted for a command. The user can then continue program execution or enter other commands.
  8. Program Trace. After execution of each instruction (in the user's program) the microprocessor register contents are displayed.
The monitor provides a program development environment in which the user can load a program, set up initial values in the CPU registers and RAM memory, and then execute the program. After execution, the register contents can be displayed and the memory examined to check for correct results. Breakpoints and trace provide debugging aids to find program errors. A debugger provides similar facilities under a disk based operating system (high-level language programs may be traced line by line, variables displayed, etc).
 

3.1.3 Bootstrapping an operating system

Due to limitations on the size of physical memory which may be fitted to a computer system, ROM based software is restricted to providing facilities such as a monitor, a Bootstrap Loader or other permanent programs, e.g. a washing machine controller.

On disk based computer systems the Bootstrap Loader is a program contained in ROM which is executed when the system is switched on or reset. The Bootstrap Loader checks out the hardware and then loads the operating system from a known position on disk into primary memory. After loading the program the Bootstrap Loader transfers control to it and, after initialisation, the operating system prompts the user for command input.

A computer equipped with disks can provide a large range of system software with several languages. The operating system looks after the overall operation of the computer, the programs running in it and interaction with the user(s). The operating system (see http://www.rsc.co.uk/kincorth/5-14IT/ychs.htm) will be provided on disk (secondary memory) and it must be loaded into the primary memory before it can be used. Operating systems are not normally in ROM because:

The facilities provided by an operating system include:
  1.  Control of the disk file system, e.g. opening/closing/reading/writing, etc.
  2.  Editors for the creation and modification of programs and data.
  3.  Assemblers and compilers for programming languages.
  4.  A linker which links various program modules into a complete executable program.
  5.  Execution and debugging of systems, application and user programs.
In general, the more complex operating systems available on larger computers are used for development of programs in high-level languages. In particular, when executing programs on a multi-user machine, an assembly language program will be restricted in what it can do, e.g. not allowed to access input/output devices directly.
 

Exercise 3.1 (see Appendix for sample answer)

  1. Power up the target microcomputer displaying any systems tests.
  2. After the monitor bootstraps use the help facility to display the commands available.


3.2 Software development facilities

3.2.1 Editors

An editor allows the user to create and modify program source files and data files. Early editors tended to be line editors in which a single line of text was displayed and modified at a time. The majority of modern editors are of the full screen type which display a screen full of text, typically 25 lines, which is a window into the program source file. The user selects the window which shows the section of the file where the correction is to be made. A screen cursor is positioned, using keyboard keys or a mouse, at the exact place of the correction and the user can then add, change or delete characters as required. Some editors can invoke a compiler (or assembler) to compile the program source code and display an error message for each source line where an error was found.
 
 

3.2.2 Assemblers and compilers

Assemblers convert assembly language programs (and compilers convert high-level language programs) into machine code (sometimes called object code). A program listing or error file is generated which shows any errors detected. If errors are found, these will have to be corrected and the process repeated.

Assemblers running on disk based machines can produce either absolute or relocatable code (compilers always generate relocatable code). In the case of absolute code the memory addresses that will be used to store the machine code and/or data are specified at assembly time. Relocatable code is generated from a base address of 0 and the linker then sets up the absolute addresses (see Linkers next section).

Microcomputers with ROM based assemblers generate absolute machine code directly into memory where it can be executed immediately (assuming that all modules are present - see Linkers next section).
 
 

3.2.3 Linkers

A complete program may be built up from a number of parts called modules which are formed using subroutines and functions. Each module may be in a separate file and assembled/compiled individually. When all modules are complete and error free, they are combined together using a linker to form the complete program. The linker goes through the program, assigning modules to memory, setting up links between modules, and checking for any missing or multiply defined modules. The output of the linker is a complete machine code program which can then be executed.
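The work of the linker described above can be sketched in a few lines. The module format used here (a size, the symbols a module defines as offsets from its own start, and the symbols it refers to) and the load address 0x1000 are assumptions made for the illustration, not the format of any real object file.

```python
# Minimal linker sketch. Each module is a dict with a size, the symbols it
# defines (as offsets from its own start) and the symbols it refers to.
# This module format is invented for illustration.
def link(modules, base=0x1000):
    symbols = {}
    address = base
    for mod in modules:                       # assign modules to memory
        for name, offset in mod["defines"].items():
            if name in symbols:
                raise ValueError(f"multiply defined symbol: {name}")
            symbols[name] = address + offset  # relocatable -> absolute
        address += mod["size"]
    for mod in modules:                       # check links between modules
        for name in mod["refers"]:
            if name not in symbols:
                raise ValueError(f"undefined symbol: {name}")
    return symbols


main = {"size": 0x40, "defines": {"start": 0}, "refers": ["sqrt"]}
maths = {"size": 0x80, "defines": {"sqrt": 0x10}, "refers": []}
table = link([main, maths])
print({name: hex(addr) for name, addr in table.items()})
```

Each module is written as if it started at address 0; the linker adds the address at which the module is placed, so sqrt (offset 0x10 in the second module) ends up at 0x1050. Linking main on its own raises an error because sqrt is missing.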

ROM based assemblers generate absolute machine code directly into memory (no link stage). By editing and assembling a sequence of modules a large program can be built up. In such a case care must be taken to ensure that absolute modules are separate in memory.
 
 

3.3 Cross assemblers and compilers

A native compiler or assembler produces code suitable for execution on the host computer or systems with a compatible processor. A cross assembler or cross compiler executing on one computer (the host) generates code suitable for another computer (the target) usually with a different processor. The resultant output object code is then linked (on the host) to other object code files such as:
  1. Other program modules, e.g. C, Pascal or assembly language routines.
  2. Libraries containing language support routines for the target, e.g. software floating point, mathematical functions, character manipulation, input/output, etc.
  3. Libraries containing routines required by the target operating system or monitor, e.g. process switching and communication, interrupt handling, memory allocation, etc.
The host and target computers are usually connected via a simple asynchronous serial line with communication limited to the printable ASCII character set (e.g. as shown in Fig. 2.4). The output of the linker, which is in some binary format, must be converted into one of the standard formats used for the transfer of binary information over character oriented communications systems, e.g. Motorola S-record, Intel hex and Tektronix hex formats. A simple communications program on the host computer transmits the resultant character file to the target computer where it can then be executed (see 2.4).
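As an illustration of such a conversion, the sketch below builds a single Motorola S1 record (the S-record type with a 16-bit load address) from a few bytes of machine code. The function name s1_record is hypothetical; a complete file would normally also contain header and termination records, which are omitted here.

```python
def s1_record(address, data):
    """Build a Motorola S1 record (16-bit address) for the given bytes."""
    count = len(data) + 3           # 2 address bytes + data + 1 checksum byte
    body = bytes([count, address >> 8, address & 0xFF]) + bytes(data)
    checksum = ~sum(body) & 0xFF    # ones' complement of the low byte of the sum
    return "S1" + body.hex().upper() + f"{checksum:02X}"


# Two bytes (4E 71, the MC68000 NOP instruction) to be loaded at address 1000 hex.
print(s1_record(0x1000, b"\x4e\x71"))   # prints S10510004E712B
```

Every byte is sent as two printable hexadecimal characters, so the record can pass over a character oriented serial line, and the checksum lets the receiving monitor detect transmission errors.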

Using cross software has the advantage that a single, possibly expensive, host system can be used to produce code for a range of target systems. The target only needs sufficient power and facilities to run the final application (normally much less than a full program development system) plus debugging aids (see reference Bramer 1990 for a full discussion on the importance of using modern computer tools and the advantages of using a host computer to develop common software for a variety of target systems).
 

3.4 Absolute and relocatable code

Absolute machine code programs contain instructions which refer to fixed addresses in memory where instructions and data are stored. In general, this means that the program must always be executed in the same place in memory.

Relocatable machine code contains no absolute references to particular memory locations. Instructions and data are referenced using addresses relative to the program instructions or to a base data address and thus the program can be executed anywhere in memory.

The advantage of relocatable code is that a complete program can be built up from modules which can be placed anywhere in memory, i.e. there is no need for a module to start at a particular address. When writing high-level language programs the programmer generally has no method of specifying absolute memory addresses, so the object code generated by the compiler is relocatable (there are exceptions, e.g. using pointers in C to access I/O device registers at absolute addresses in memory). Absolute addresses can usually be specified in assembly language programs thus allowing the generation of absolute code when required. Some microcomputer systems designed for program development in high-level languages do not allow absolute addresses even in assembly language programs and reference should be made to the microcomputer manuals to see if this is the case. Students attending formal courses of instruction will be given guidance on this point by the tutor.
 

3.5 Run-time system facilities

A modern monitor or operating system provides facilities, which can be accessed by programs, to carry out common functions, including:
 
  1. Read a character or text string from the keyboard.
  2.  Write a character or text string to the display screen.
  3. Open, close, read and write disk files on disk based machines.
  4. Write a character or string to a printer.
In addition, a high-level language will have libraries of routines which may be accessed by application programs.
 

Problem

For any microcomputers you have access to, determine the following:
 
  1.  What is the name of the monitor or operating system ?
  2.  What program debugging facilities does it provide ?
  3.  What other system software is available and what is its function ?
  4.  What applications software is available and what is its function ?

Appendix: Answers to Exercises


Exercise 1.1 Calculations in decimal, binary and hexadecimal.

  16  00010000  10      45  00101101  2D     110  01101110  6E     110  01101110  6E
 +32  00100000  20     +60  00111100  3C     -45  11010011  D3     +45  00101101  2D
  48  00110000  30     105  01101001  69      65  01000001  41    -101  10011011  9B
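The calculations above can be checked with a short program. This is a minimal sketch assuming 8-bit two's complement arithmetic; the function names are chosen for illustration. In the last calculation the true sum (155) is too large for an 8-bit two's complement number, so the result overflows and appears as -101.

```python
def to_signed8(value):
    """Interpret the low 8 bits of value as a two's complement number."""
    value &= 0xFF
    return value - 256 if value & 0x80 else value


def add_signed8(a, b):
    """Add two numbers in 8 bits; return (decimal, binary, hexadecimal)."""
    result = (a + b) & 0xFF
    return to_signed8(result), f"{result:08b}", f"{result:02X}"


print(add_signed8(110, -45))    # prints (65, '01000001', '41')
print(add_signed8(110, 45))     # prints (-101, '10011011', '9B')
```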
 


Exercise 1.2 Characters with ASCII codes in decimal, hex and binary.

Character      A         B         C         Z         a         b         c
Decimal        65        66        67        90        97        98        99
Hexadecimal    41        42        43        5A        61        62        63
Binary         01000001  01000010  01000011  01011010  01100001  01100010  01100011

Character      0         1         2         9         -         ?         =
Decimal        48        49        50        57        45        63        61
Hexadecimal    30        31        32        39        2D        3F        3D
Binary         00110000  00110001  00110010  00111001  00101101  00111111  00111101

From Table A.2 it can be seen that the letter and digit character codes are in ascending numeric order, i.e. the characters '0' to '9' have the ASCII codes 48 to 57. This simplifies:
 

  1. testing to see if a character is within a given range, e.g. a character read from a keyboard is a digit if its code is in the range 48 to 57 decimal;
  2. the conversion between characters and numeric values, e.g. subtract the ASCII code for '0' (30 hexadecimal) from the character to give its equivalent numeric value.
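Both points can be sketched directly from the character codes. The function names is_digit and digit_value are chosen for this illustration.

```python
def is_digit(ch):
    """True if the character code lies in the range '0' (48) to '9' (57)."""
    return 48 <= ord(ch) <= 57


def digit_value(ch):
    """Numeric value of a digit character: subtract the code for '0' (30 hex)."""
    return ord(ch) - 0x30


print(is_digit("7"), digit_value("7"))   # prints True 7
```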

Exercise 1.3

Each model of processor has a different assembly language. Thus a programmer who is an expert in one assembly language can be limited in choice of job opportunities and/or have to learn a new language when changing employment.

Exercise 2.1
 

  1. The maximum size of primary memory is limited by the number of bits that the processor uses to address the memory, e.g. 24 bits can address 16 Mbytes.
  2. (a) RAM (Random Access Memory) is random access memory that can be read and written; used for storage of programs and data during execution.
     (b) ROM is Read Only (random access) Memory: information can only be read; used for storage of permanent programs and data, e.g. the bootstrap loader.
  3. Secondary memory (disks and tape) is used for long term program and data storage.
  4. (a) Random Access: any data word can be accessed directly, e.g. primary memory.
     (b) Sequential Access: data is accessed in sequence, e.g. a magnetic tape where to access data down the tape it is necessary to read over intermediate data.
  5. A computer bus is the information or data highway which carries information between the various components of a computer system.

Exercise 2.2
 


Exercise 3.1 The following listings are from a FORCE single board MC68000 microcomputer running the ROM based M68 Monitor (commands are invoked by single key hits and are self explanatory).

M68 monitor power up checks and command help.

-------------------------------------------------------------------------------------

MC68000 microcomputer monitor
Copyright Brian Bramer, January 1990, Leicester Polytechnic
System hardware checks, hit RESET button to abort
FORCE microcomputer, CPU MC68000, 127K bytes RAM
MC6840 PTM @ 4CF41 interrupt test OK
MC6850 ACIA @ 50040 write check OK, ACIA @ 50041 write check              OK
MC6821 PIA @ 5CEF1 data direction registers write/read check                      OK
Monitor checksum C406 OK, Editor checksum 28E7                                           OK
Test system RAM address 00000008 to 00000FFF                                             OK
Test user RAM address 00001000 to 0001FFFF..                                                OK
Test finished in 0 hours 0 minutes 3.2 seconds, 00000020 interrupts occurred
00000001 test sequence(s) executed, no errors found                                      all OK
 

MC68000 monitor V1.04b, please enter command (<ESC> to abort, ? for help)

M68> Help

Valid commands and subcommands are:

E - Edit/assemble a program
D - Dissemble a program
R - Register: initialise, set, display
M - Memory: display, set, modify, block
X - Convert decimal/hexadecimal/text values
L - Load S-record program from: Terminal
S - System tests
G - Go a program
T or I - Trace program execution
B - Breakpoint: display, set, clear
C - Continue program execution from breakpoint or trace

Data may be hexadecimal (default), decimal (prefix with .) or text in '...'
Hit <ESC> to abort a command sequence

M68>

-------------------------------------------------------------------------------------

On power up the M68 monitor performs a sequence of memory and I/O device checks and then prompts the user for command input with the prompt M68> . Commands are generally single keystrokes, e.g. in response to the key H the monitor displays the help screen which lists the commands available.
 

References

Bramer, B, 1989, 'Selection of computer systems to meet end-user requirements', IEE Computer-Aided Engineering Journal, Vol. 6 No. 2, April, pp. 52-58.

Bramer, B, 1990, 'Using a common host system to develop software products for a variety of target computer environments', IEE Computer-Aided Engineering Journal, Vol. 7 No. 5, October, pp. 129-134.

Coats, R F, 1985/86, '68000 Board', Electronics and Wireless World, October pp. 51-54, November pp. 51-54, December pp. 36-38, January pp. 67-70, February pp. 72-74, May pp. 24-27.
 

Foster, C C, 1976, 'Computer Architecture', Van Nostrand Reinhold. This book describes prefetch and other techniques used to enhance the processor performance of mainframes in the 1970s.