Representation of Data Within the Computer

Brian Bramer, DeMontfort University, UK (bb@dmu.ac.uk)


Contents


1 Decimal and binary integer numbers

2 Binary Addition

3 Signed Binary Numbers

4 Overflow

5 Hexadecimal Numbers

6 Conversion between the binary and hexadecimal number systems

7 Conversion of Decimal Numbers to Binary

8 Conversion of binary numbers to decimal

9 Conversion of decimal numbers to hexadecimal

10 Conversion of hexadecimal numbers to decimal

Example of decimal/binary/hex/Ternary/Quintal/octal calculator

11 Fixed Point Real Numbers

12 Floating Point Real Numbers

13 ASCII Character Code
 
 

When humans use numeric data they usually represent the numbers using the decimal system, i.e. base 10. Working in decimal numbers requires the ability to differentiate between ten different states, the digits 0 through 9. For the human brain this is straightforward, and may even be extended to take account of alphabetic information (i.e. the characters a to z and A to Z). Computer systems are built from large numbers of similar electronic circuits. Although it is possible to build electronic circuits which can store and manipulate ten states, it is easier and cheaper to build electronic switches that may be in one of two states, either ON or OFF. Such circuits can therefore be used to represent binary data (base 2) with, for example, binary 1 and 0 being represented by the ON and OFF states respectively.

Computer systems internally represent all information, data and instructions, in binary form, with conversion between binary and human readable forms for input and output. When working in machine code or assembly language it is sometimes necessary to use binary or some similar number system. Binary numbers tend to be very long and hence it is easy to make mistakes when dealing with such data. In such situations the hexadecimal (base 16) number system is commonly used (it is very easy to convert between binary and hexadecimal).
 
 

1 Decimal and binary integer numbers

                     Decimal digit                   Binary bit
  number base        10                              2
  possible states    0, 1, 2, 3, 4, 5, 6, 7, 8, 9    0 or 1

The above table shows that a decimal digit can represent one of ten states, 0 through 9, and a binary bit (a single binary digit is called a bit) can represent two states, 0 or 1. It is possible, however, to represent more states by joining a sequence of digits or bits together, and in such a case it is assumed that the least significant digit or bit is the rightmost and the most significant is the leftmost. The bits or digits are generally numbered starting with the least significant from 0.
 

1.1 An eight digit decimal number

 digit          7      6      5      4      3      2      1      0
 digit value    10^7   10^6   10^5   10^4   10^3   10^2   10^1   10^0

In the decimal system the least significant (rightmost) digit represents units (10^0), the next tens (10^1), the next hundreds (10^2), etc. The above eight digit number can therefore represent values in the range 0 (all digits 0) to 99999999 (all digits 9).
 

1.2 An eight-bit binary number

 bit          7     6     5     4     3     2     1     0
 bit value    2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0

In the binary system the least significant (rightmost) bit represents units (2^0), the next twos (2^1), the next fours (2^2), etc. The above 8-bit binary number can therefore represent values in the range 0 (all bits 0) to 11111111 binary (all bits 1). It is possible to convert between number bases, and 11111111 binary is equivalent to 255 decimal. Larger values can be represented by more bits, for example a 16-bit binary number can represent 0 to 65535 decimal, and a 32-bit number 0 to 4294967295.
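
The ranges quoted above follow from the fact that n bits can take 2^n distinct patterns, so the largest unsigned value is 2^n - 1. A minimal Python sketch of that arithmetic (the function name max_unsigned is chosen here for illustration and is not part of the original notes):

```python
def max_unsigned(bits: int) -> int:
    """Largest value an unsigned binary number of the given width can hold."""
    return 2 ** bits - 1


for width in (8, 16, 32):
    print(width, max_unsigned(width))
```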

Within a computer system a memory word is built up from a number of bits. Typical word sizes are eight bits (usually called a byte), 16 bits, 32 bits or 64 bits. In practice the majority of modern computer systems use a memory based on bytes of storage, with sequences of bytes being used to store 16-, 32- or 64-bit numeric data.
 
 

2 Binary Addition

The following truth tables show all the possible combinations of the addition of:
  1. two bits A and B, and
  2. two bits A and B plus a carry in from a previous addition.
In both cases the addition results in a SUM and a carry out.
 
     A  +  B        SUM     carry
     0     0         0        0
     0     1         1        0
     1     0         1        0
     1     1         0        1

     A  +  B  +  carry in        SUM     carry out
     0     0        0             0          0
     0     0        1             1          0
     0     1        0             1          0
     0     1        1             0          1
     1     0        0             1          0
     1     0        1             0          1
     1     1        0             0          1
     1     1        1             1          1

The following are examples of decimal and binary addition:
 
  decimal    binary      decimal    binary      decimal    binary
      5        101          10       1010          27       11011
    + 2      +  10        +  9     + 1001        + 15     +  1111
      7        111          19      10011          42      101010

The rightmost bits are added using the first table above (two bits A and B). This results in a SUM and a carry bit, which is carried out to be added into the addition of the next two bits (using the second table, which includes a carry in). This addition then results in a sum and a carry out, and so on for each remaining bit position.
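
The bit-by-bit procedure just described can be sketched in Python. This is an illustrative model only, not a description of any particular adder circuit; the names full_adder and add_binary are assumptions made for this example. Numbers are held as strings of 0s and 1s so that each bit position is visible.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three bits and return (sum bit, carry out), as in the carry-in truth table."""
    total = a + b + carry_in
    return total % 2, total // 2


def add_binary(x: str, y: str) -> str:
    """Add two unsigned binary strings, working from the rightmost bit."""
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)    # pad the shorter number with leading 0s
    carry = 0                                # no carry into the rightmost position
    result = []
    for a, b in zip(reversed(x), reversed(y)):
        s, carry = full_adder(int(a), int(b), carry)
        result.append(str(s))
    if carry:                                # final carry out becomes an extra bit
        result.append("1")
    return "".join(reversed(result))


print(add_binary("11011", "1111"))   # 27 + 15, prints 101010
```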

The majority of modern computer systems store numeric values in sequences of bytes, i.e. 8-bit words of storage. A single byte is limited to representing a number in the range 0 to 255 decimal. If the addition of two bytes results in a carry out of bit 7, the result is greater than 255, and an error has occurred. When carrying out integer arithmetic on a computer system care must be taken to ensure that the results will fit the word size being used (generally 16 or 32 bits are used for integer number calculations).
 
 

3 Signed Binary Numbers

Mathematical and scientific calculations require the storage of negative as well as positive integer numbers. To represent a positive or negative number using the binary system one bit, usually the leftmost bit, is reserved for the sign. A negative number can then be represented in a number of forms, e.g. to represent -10 decimal as an eight bit signed binary number:

(a) sign-true magnitude      (b) ones complement      (c) twos complement
         10001010                   11110101                 11110110

Sign-True Magnitude Form. The leftmost bit holds the sign of the number, 0 for positive and 1 for negative, and the other seven bits represent the magnitude. In the example (a) 0001010 is the magnitude equivalent to 10 decimal, and the leftmost bit is 1 to indicate that the value is negative. This system is not commonly used in computer systems because it requires separate addition and subtraction circuits.

Ones Complement Form. To obtain the negative of a number each bit of the positive binary value is complemented, i.e. 0s are replaced with 1s and 1s with 0s. In example (b) +10 decimal, 00001010 binary, is complemented to form -10 decimal, i.e. 11110101 binary. This form is used in some computer systems, e.g. CDC 7600 series, but it has the problem that 0 can take two forms +0 (00000000) or -0 (11111111).

Twos Complement Form. To obtain the negative value of a number the ones complement is obtained, and then 1 added, i.e. in (c) above the value of +10 decimal, 00001010, is ones complemented to obtain 11110101, and then 1 added to obtain 11110110 (-10 decimal).

The advantage of complemented numbers is that separate addition and subtraction circuits are not required. To subtract a number, its complement is formed (a very easy operation), and the result added (using the normal adder circuits) to the other number. The majority of modern computer systems use twos complement form to represent signed binary numbers. In practice signed numbers are used for normal arithmetic calculations, and unsigned numbers for addresses, e.g. in assembly language programs. The range that can be represented by signed and unsigned 8-bit, 16-bit and 32-bit binary numbers is shown in Chapter 1 Table 1.1.
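
As a hedged illustration of the twos complement rule (complement every bit, then add 1) and of subtraction carried out by addition, the following Python sketch works on fixed-width bit patterns. The function names negate and subtract are invented for this example, and carries out of the top bit are simply discarded.

```python
def negate(pattern: str) -> str:
    """Return the twos complement of a bit pattern: ones complement plus 1."""
    bits = len(pattern)
    mask = (1 << bits) - 1                      # e.g. 11111111 for 8 bits
    ones_complement = int(pattern, 2) ^ mask    # flip every bit
    return format((ones_complement + 1) & mask, f"0{bits}b")


def subtract(x: str, y: str) -> str:
    """Compute x - y by adding the twos complement of y (same width as x)."""
    bits = len(x)
    mask = (1 << bits) - 1
    total = int(x, 2) + int(negate(y.zfill(bits)), 2)
    return format(total & mask, f"0{bits}b")    # discard any carry out of the top bit


print(negate("00001010"))                # -10 decimal, prints 11110110
print(subtract("00011001", "00001010"))  # 25 - 10, prints 00001111
```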
 
 

4 Overflow

Overflow occurs if the number of bits is too small to store the result of an arithmetic operation. For example, when using 8-bit signed numbers the binary addition 01101110 + 00101101 (decimal: 110 + 45) would result in the value 10011011 binary. The addition of the two positive numbers has produced the incorrect negative value -101 decimal, because the true result (155) is larger than the greatest positive value an 8-bit signed number can hold (127).

After the computer hardware has carried out an arithmetic operation it sets condition code bits that describe the result, including whether an overflow occurred. The condition code bits can be used in program control structures and for checking for error conditions. Many high-level language run-time systems automatically check for overflow errors, and special instructions can be used by assembly language programs to test the condition code bits.
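
The example can be reproduced with a short Python sketch. It assumes 8-bit twos complement numbers and uses one common overflow test (both operands have the same sign but the result has the opposite sign); the function name add_signed_8bit is hypothetical, and real hardware derives its overflow flag in its own way.

```python
def add_signed_8bit(a: int, b: int) -> tuple[int, bool]:
    """Add two signed 8-bit values; return (wrapped result, overflow flag)."""
    raw = (a + b) & 0xFF                         # keep only the low 8 bits
    result = raw - 256 if raw & 0x80 else raw    # bit 7 set means a negative value
    # Overflow: the operands share a sign but the result has the other sign.
    overflow = (a < 0) == (b < 0) and (result < 0) != (a < 0)
    return result, overflow


print(add_signed_8bit(110, 45))    # prints (-101, True)
print(add_signed_8bit(10, -10))    # prints (0, False)
```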
 
 
 

5 Hexadecimal Numbers

When working in assembly languages it is often necessary to specify memory addresses and bit patterns. To do this using binary numbers would be cumbersome and error prone, e.g. to represent a 16-bit binary number sixteen 0s and 1s would have to be entered. In practice, the hexadecimal (base 16) number system is commonly used:
    decimal  hexadecimal   binary      decimal  hexadecimal    binary
       0          0         0000          16         10       00010000
       1          1         0001          17         11       00010001
       2          2         0010          18         12       00010010
       3          3         0011          19         13       00010011
       4          4         0100          20         14       00010100
       5          5         0101          21         15       00010101
       6          6         0110          22         16       00010110
       7          7         0111          23         17       00010111
       8          8         1000          24         18       00011000
       9          9         1001          25         19       00011001
      10          A         1010          26         1A       00011010
      11          B         1011          27         1B       00011011
      12          C         1100          28         1C       00011100
      13          D         1101          29         1D       00011101
      14          E         1110          30         1E       00011110
      15          F         1111          31         1F       00011111

Table 1: Decimal, hexadecimal and binary numbers
 
 

6 Conversion between the binary and hexadecimal number systems

To convert a binary number to hexadecimal:
  1. working from the least significant (rightmost) bit split the binary number up into groups of four bits;
  2. using Table 1 convert each group of four bits into the equivalent hexadecimal digit.
For example:

0001001110011110 to 0001 0011 1001 1110 = 139E hexadecimal

To convert from hexadecimal to binary, replace each hexadecimal digit with the equivalent four bit binary value.
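
Both directions can be sketched in Python as a direct transcription of the grouping rule. The helper names binary_to_hex and hex_to_binary are made up for this illustration, and the lookup string HEX_DIGITS plays the role of Table 1.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    """Convert a binary string to hexadecimal, four bits at a time."""
    # Pad on the left so that grouping from the rightmost bit gives whole groups.
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


def hex_to_binary(digits: str) -> str:
    """Replace each hexadecimal digit with its four-bit binary value."""
    return " ".join(format(HEX_DIGITS.index(d), "04b") for d in digits.upper())


print(binary_to_hex("0001001110011110"))   # prints 139E
print(hex_to_binary("139E"))               # prints 0001 0011 1001 1110
```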
 
 

7 Conversion of Decimal Numbers to Binary

To convert a positive decimal integer the following algorithm starts by generating the least significant binary bit, then the next, etc.:

LOOP
    next binary bit = remainder of DECIMAL_VALUE/2
    DECIMAL_VALUE = DECIMAL_VALUE/2 (ignoring remainder)
UNTIL DECIMAL_VALUE=0

  e.g. convert decimal 38 to binary               result
    38/2 = 19 remainder 0 gives binary 0               0
    19/2 = 9  remainder 1 gives binary 1              10
     9/2 = 4  remainder 1 gives binary 1             110
     4/2 = 2  remainder 0 gives binary 0            0110
     2/2 = 1  remainder 0 gives binary 0           00110
     1/2 = 0  remainder 1 gives binary 1          100110

To obtain the binary equivalent of a negative decimal number, convert the absolute value to binary then take the twos complement.
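
The repeated-division loop for positive values translates almost line for line into Python. This is a sketch only: the function name to_binary is an assumption, and negative input is rejected rather than converted.

```python
def to_binary(decimal_value: int) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if decimal_value < 0:
        raise ValueError("convert the absolute value, then take the twos complement")
    bits = ""
    while True:
        bits = str(decimal_value % 2) + bits   # remainder is the next bit, placed on the left
        decimal_value //= 2                    # integer division ignores the remainder
        if decimal_value == 0:
            return bits


print(to_binary(38))   # prints 100110
```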
 
 

8 Conversion of binary numbers to decimal

The following algorithm converts a binary number into decimal:

DECIMAL_VALUE=0
LOOP starting with the most significant binary bit
     BIT_VALUE = value of current binary bit
     DECIMAL_VALUE = DECIMAL_VALUE*2 + BIT_VALUE
UNTIL current bit is the least significant

For example, convert 100110 binary (remember the least significant bit is bit 0):
bit processed       5     4      3      2      1      0
DECIMAL_VALUE = ((((1*2 + 0)*2 + 0)*2 + 1)*2 + 1)*2 + 0 = 38
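
A minimal Python sketch of the same loop, taking the bits as a string with the most significant bit first (the function name binary_to_decimal is chosen for this example):

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a binary string to an integer, most significant bit first."""
    decimal_value = 0
    for bit in bits:
        bit_value = int(bit)                          # 0 or 1
        decimal_value = decimal_value * 2 + bit_value
    return decimal_value


print(binary_to_decimal("100110"))   # prints 38
```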
 

9 Conversion of decimal numbers to hexadecimal

The following algorithm generates the least significant (rightmost) hexadecimal digit, then the next digit, etc.:

LOOP
    REMAINDER = remainder of DECIMAL_VALUE/16
    next hexadecimal digit =
        hexadecimal equivalent of REMAINDER
    DECIMAL_VALUE = DECIMAL_VALUE/16 (ignoring remainder)
UNTIL DECIMAL_VALUE=0

e.g. convert 1567 to hexadecimal                       result
    1567/16 = 97 remainder 15 gives hexadecimal F           F
      97/16 = 6  remainder 1  gives hexadecimal 1          1F
       6/16 = 0  remainder 6  gives hexadecimal 6         61F
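
The same loop works for any base if 16 is replaced by the base required. The Python sketch below makes that generalisation explicit; it is an illustration rather than part of the original algorithm, and the name to_base and the restriction to bases 2 to 16 are assumptions of this example.

```python
DIGITS = "0123456789ABCDEF"


def to_base(decimal_value: int, base: int = 16) -> str:
    """Convert a non-negative integer to the given base (2 to 16)."""
    if decimal_value < 0 or not 2 <= base <= 16:
        raise ValueError("expects a non-negative value and a base from 2 to 16")
    digits = ""
    while True:
        decimal_value, remainder = divmod(decimal_value, base)
        digits = DIGITS[remainder] + digits    # least significant digit is produced first
        if decimal_value == 0:
            return digits


print(to_base(1567))     # prints 61F
print(to_base(38, 2))    # prints 100110
```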
 
 
 

10 Conversion of hexadecimal numbers to decimal

DECIMAL_VALUE=0
LOOP starting with the most significant hexadecimal digit
    DIGIT_VALUE = decimal value of current hexadecimal digit
    DECIMAL_VALUE = DECIMAL_VALUE*16 + DIGIT_VALUE
UNTIL current hexadecimal digit is the least significant

For example, convert 61F hexadecimal to decimal:
hex digit processed         2            1             0
DECIMAL_VALUE        ((6*16) + 1)*16 + 15 = 1567
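
In Python the loop might look as follows. This is a sketch; the function name hex_to_decimal is hypothetical, and the digit lookup uses a string in place of Table 1.

```python
def hex_to_decimal(digits: str) -> int:
    """Convert a hexadecimal string to an integer, most significant digit first."""
    decimal_value = 0
    for digit in digits.upper():
        digit_value = "0123456789ABCDEF".index(digit)   # raises ValueError for a non-hex character
        decimal_value = decimal_value * 16 + digit_value
    return decimal_value


print(hex_to_decimal("61F"))   # prints 1567
```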
 
 

11 Fixed Point Real Numbers

So far only integer numbers have been considered. Such numbers are useful when calculations on whole number values are required, e.g. for loop control in programs. In practice, however, it is necessary to be able to represent fractional components of numbers as well. These are called real numbers and one means by which these may be represented is in fixed point form. The following shows a 16-bit binary value in which the whole number part (with sign) is stored in eight bits (bits 8 to 15) and the fractional component in eight bits (bits 0 to 7):
 
 bit      15     14    13    12    11    10    9     8     7      6      5      4      3      2      1      0
 value    sign   2^6   2^5   2^4   2^3   2^2   2^1   2^0   2^-1   2^-2   2^-3   2^-4   2^-5   2^-6   2^-7   2^-8

For example decimal 10.75 would be 1010.11 binary (8 + 2 + 0.5 + 0.25). The major limitations of this system of real number representation are that:

  1. The maximum absolute size of numbers is limited by the number of bits assigned to hold the whole number part (as with a normal binary integer number).
  2. When dealing with small fractional components accuracy is lost, and very small values cannot be represented at all, e.g. the smallest non-zero value that can be represented by the above fixed point number is 2^-8 = 0.00390625.
In practice these restrictions mean that it is not worthwhile providing the extra software or hardware within the computer system to process fixed point numbers.
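
The layout and its limits can be illustrated with a small Python sketch in which a fixed point value is held as an integer scaled by 2^8. Sign handling is left out to keep the example short, and the names to_fixed and from_fixed are invented for this illustration.

```python
FRACTION_BITS = 8               # bits 0 to 7 hold the fractional component
SCALE = 1 << FRACTION_BITS      # 256, i.e. one unit of bit 8


def to_fixed(value: float) -> int:
    """Encode a non-negative real number as a scaled integer (sign handling omitted)."""
    return round(value * SCALE)


def from_fixed(raw: int) -> float:
    """Decode a scaled integer back into a real number."""
    return raw / SCALE


raw = to_fixed(10.75)
print(format(raw, "016b"))              # prints 0000101011000000
print(from_fixed(1))                    # smallest non-zero value, prints 0.00390625
print(from_fixed(to_fixed(0.001)))      # too small to represent, prints 0.0
```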
 
 

12 Floating Point Real Numbers

In many scientific and engineering applications very small or very large numbers have to be represented, e.g. from the sizes of atomic particles to intergalactic distances. In the floating point number system the real value is represented by a signed fractional component called the mantissa and a signed exponent. For example, decimal floating point numbers (using base 10) can be represented:

    mantissa * 10^exponent where 0.1 <= |mantissa| < 1.0

To maintain accuracy the absolute value of the mantissa is maintained within the range shown (this process is called normalisation), e.g. 6520000.0 would be 0.652*10^7 and -0.00000000652 would be -0.652*10^-8. In practice many printers cannot print superscripts so the above examples would be printed as follows: 6520000.0 as 0.652E7 and -0.00000000652 as -0.652E-8 where the E indicates an exponent of 10.

Within computer systems the fractional component is held as a binary fraction and the exponent is a power of 2 (or possibly 16). A typical system may store each floating point number in 32 bits with 24 bits to hold the signed mantissa and 8 bits for the signed exponent. In this case the accuracy of the mantissa is 23 binary bits (which is equivalent to 6 or 7 decimal figures of accuracy), and the range of the exponent would be -128 to +127. Greater accuracy can be obtained by using 64-bit storage in which 53 bits may be used to store the signed mantissa (giving 15 to 17 decimal figures of accuracy) and 11 bits for the exponent.
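
Normalisation can be tried out in Python. The sketch below is illustrative only: normalise_decimal is a hypothetical helper for the base 10 form, and math.frexp (from the Python standard library) returns a base 2 mantissa in the range 0.5 <= |mantissa| < 1.0, which is one normalisation convention and not necessarily the one used by a given floating point format.

```python
import math


def normalise_decimal(value: float) -> tuple[float, int]:
    """Return (mantissa, exponent) with 0.1 <= |mantissa| < 1.0 and base 10."""
    if value == 0:
        return 0.0, 0
    exponent = math.floor(math.log10(abs(value))) + 1
    return value / 10 ** exponent, exponent


print(normalise_decimal(6520000.0))        # mantissa close to 0.652, exponent 7
print(normalise_decimal(-0.00000000652))   # mantissa close to -0.652, exponent -8
print(math.frexp(10.75))                   # base 2: prints (0.671875, 4)
```

Because the decimal mantissa is itself stored as a binary fraction, it may differ from 0.652 in the last few digits printed.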

Floating point calculations can be carried out using floating point co-processor chips, or emulated in software that uses the integer arithmetic operations of a computer. The advantage of floating point hardware is that it can be several orders of magnitude faster than software emulation, but it requires more complex and expensive hardware.
 
 

13 ASCII Character Code

Table 2 lists the ASCII character codes (American Standard Code for Information Interchange), with the columns being the decimal value, the hexadecimal value, then the corresponding character. ASCII is the most widely used character code for data transmission between computers, terminals and printers. As with all information within the computer system, characters are represented by binary patterns. In the ASCII code each character is represented by a seven bit code that is stored one character per byte (with bit 7 set to 0 or used as a parity check).

The characters below 32 decimal (20 hexadecimal) are non-printing control characters. These are used to control the action of printers, display screens, communications systems, etc. Important control characters are:

NUL     null: no action (used as a fill or delay character)
BEL     bell: rings the keyboard bell or buzzer
BS      backspace: move back one character width
HT      horizontal tabulate: move horizontally to next tabulate position
LF      line feed: move page vertically one character height
FF      form feed: new page on printer, clear display screen
CR      carriage return: move to start of current line
ESC     escape: used in many systems as a program control character
SP      space: move horizontally by one character width

For example to move a printer or a display screen to a new line position the characters CR (carriage return) then LF (line feed) will be output. In addition some of the printable characters will depend upon the printer font being used.

It is worth noting that the ASCII codes for the numeric characters 0 to 9, and alphabetic characters A to Z and a to z, are arranged in ascending numerical order. This property can be used for:

  1.  Testing if a character is within a range, e.g. in the range A to Z.
  2.  The conversion of numeric decimal data, entered at a keyboard, into internal binary form.
Do not confuse the code for a numeric character with the equivalent numeric binary value, e.g. the code for the character 1 is 31 hexadecimal (49 decimal), not 1. When a number composed of several digits is read from a keyboard, each character code is read and turned into the equivalent binary numeric value, which is then added to ten times any previous total. The following algorithm reads a decimal number from a keyboard (until a non-digit is entered):

NUMBER=0
READ(character)
LOOP WHILE character is in the range '0' to '9'
    DIGIT_VALUE = character - '0'
    NUMBER = NUMBER*10 + DIGIT_VALUE
    READ(character)
END LOOP

In the majority of programming languages a character code value is specified by enclosing it in quote marks. In the above algorithm characters are read from the keyboard until a non-digit character is hit. If the character is a digit, say 7 was hit, the ASCII code for 0 is subtracted from it to get the equivalent numeric value DIGIT_VALUE, i.e. in this case 30 hexadecimal (the code for '0'), will be subtracted from 37 hexadecimal (the code for '7'), to give DIGIT_VALUE=7. The NUMBER entered so far is then multiplied by ten and the current DIGIT_VALUE added.
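
A Python sketch of the same idea follows. It reads characters from a string rather than a keyboard, which is an assumption made to keep the example self-contained, and the function names read_number and is_upper_case are invented here. The built-in ord returns the numeric code of a character.

```python
def read_number(text: str) -> int:
    """Accumulate the leading decimal digits of text, stopping at the first non-digit."""
    number = 0
    for character in text:
        if not ("0" <= character <= "9"):    # range test relies on ascending codes
            break
        digit_value = ord(character) - ord("0")
        number = number * 10 + digit_value
    return number


def is_upper_case(character: str) -> bool:
    """Test whether a single character lies in the range A to Z."""
    return ord("A") <= ord(character) <= ord("Z")


print(hex(ord("7")), ord("7") - ord("0"))   # prints 0x37 7
print(read_number("1567 apples"))           # prints 1567
print(is_upper_case("Q"), is_upper_case("q"))   # prints True False
```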
 
 
  dec  hex  char      dec  hex  char      dec  hex  char      dec  hex  char
    0   00  NUL        32   20  SP         64   40  @          96   60  `
    1   01  SOH        33   21  !          65   41  A          97   61  a
    2   02  STX        34   22  "          66   42  B          98   62  b
    3   03  ETX        35   23  #          67   43  C          99   63  c
    4   04  EOT        36   24  $          68   44  D         100   64  d
    5   05  ENQ        37   25  %          69   45  E         101   65  e
    6   06  ACK        38   26  &          70   46  F         102   66  f
    7   07  BEL        39   27  '          71   47  G         103   67  g
    8   08  BS         40   28  (          72   48  H         104   68  h
    9   09  HT         41   29  )          73   49  I         105   69  i
   10   0A  LF         42   2A  *          74   4A  J         106   6A  j
   11   0B  VT         43   2B  +          75   4B  K         107   6B  k
   12   0C  FF         44   2C  ,          76   4C  L         108   6C  l
   13   0D  CR         45   2D  -          77   4D  M         109   6D  m
   14   0E  SO         46   2E  .          78   4E  N         110   6E  n
   15   0F  SI         47   2F  /          79   4F  O         111   6F  o
   16   10  DLE        48   30  0          80   50  P         112   70  p
   17   11  DC1        49   31  1          81   51  Q         113   71  q
   18   12  DC2        50   32  2          82   52  R         114   72  r
   19   13  DC3        51   33  3          83   53  S         115   73  s
   20   14  DC4        52   34  4          84   54  T         116   74  t
   21   15  NAK        53   35  5          85   55  U         117   75  u
   22   16  SYN        54   36  6          86   56  V         118   76  v
   23   17  ETB        55   37  7          87   57  W         119   77  w
   24   18  CAN        56   38  8          88   58  X         120   78  x
   25   19  EM         57   39  9          89   59  Y         121   79  y
   26   1A  SUB        58   3A  :          90   5A  Z         122   7A  z
   27   1B  ESC        59   3B  ;          91   5B  [         123   7B  {
   28   1C  FS         60   3C  <          92   5C  \         124   7C  |
   29   1D  GS         61   3D  =          93   5D  ]         125   7D  }
   30   1E  RS         62   3E  >          94   5E  ^         126   7E  ~
   31   1F  US         63   3F  ?          95   5F  _         127   7F  DEL

Table 2: The ASCII Character Codes: columns are decimal and hexadecimal numeric character code value followed by the character

When character information is transmitted over a noisy communications channel a parity bit can replace bit 7 (which is not used in the ASCII code), or be added as an extra bit to make the total character length 9 bits (for more details of parity checking see the Problem for Chapter 12).
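
As a hedged illustration of the first option, the Python sketch below places a parity bit in bit 7 of a seven bit ASCII code. Even parity (the total number of 1 bits is made even) is assumed here; the text above does not say which convention is used, and the function name with_even_parity is hypothetical.

```python
def with_even_parity(code: int) -> int:
    """Return the 7-bit code with an even parity bit placed in bit 7."""
    code &= 0x7F                          # keep only the seven ASCII bits
    ones = bin(code).count("1")           # number of 1 bits in the code
    parity = ones % 2                     # 1 only when the count is odd
    return code | (parity << 7)


print(format(with_even_parity(ord("A")), "08b"))   # prints 01000001
print(format(with_even_parity(ord("C")), "08b"))   # prints 11000011
```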