Algorithms related to number systems and data representation


In most cases, the description of the discussed algorithms is accompanied by short JavaScript programs. An online JavaScript interpreter that can be used to try the examples can be accessed as follows:

Online JavaScript Interpreter by Peter Jipsen, Chapman University (January 2013).
http://math.chapman.edu/~jipsen/js/ (2020-11-19)

Important note: since the link above does not work, the live code editor linked below can be used instead to run the examples.

The flowchart of the JavaScript programs can be displayed using the following web application:

Live code editor. Created by Bogdan Lyashenko.
https://bogdan-lyashenko.github.io/js-code-to-svg-flowchart/docs/live-editor/index.html (2022-10-02)

The generated flowchart does not represent 'for' loops correctly, so it is worth rewriting each 'for' loop as an equivalent 'while' loop before displaying the program's flowchart, as shown below.
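For example, a short program that sums the numbers from 1 to 10 can be rewritten from a 'for' loop into an equivalent 'while' loop as follows (a minimal sketch; console.log is assumed as the output function and may need to be replaced depending on the interpreter used):

    // original version using a 'for' loop
    var sum1 = 0;
    for (var i = 1; i <= 10; i++) {
        sum1 = sum1 + i;
    }
    console.log(sum1); // 55

    // equivalent version using a 'while' loop:
    // the initialization, the condition and the increment are written out separately
    var sum2 = 0;
    var j = 1;
    while (j <= 10) {
        sum2 = sum2 + j;
        j = j + 1;
    }
    console.log(sum2); // 55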



Various representations of numbers in digital computers

In digital computers, data is represented in binary notation, as sequences of bits. In the following, we assume that the data is stored in an 'n'-bit storage unit called a register. In a register, 'n' different bits can be stored independently, which allows a total of m=2^n different bit variations. So an 'n'-bit register can store 'm' different (binary coded) data items. The number 'm' is also called the modulus of the register.
For example,
– the modulus of an 8-bit register is m=256;
– the modulus of a 16-bit register is m=65,536;
– the modulus of a 24-bit register is m=16,777,216;
– the modulus of a 32-bit register is m=4,294,967,296.

Exactly m=2^n different variations of binary coded data can be stored in an 'n'-bit register.
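The moduli listed above can be checked with a short JavaScript program (a minimal sketch; Math.pow is used instead of the exponentiation operator so that the program also runs in older interpreters, and console.log is assumed as the output function):

    // print the modulus m = 2^n for some typical register sizes
    var sizes = [8, 16, 24, 32];
    for (var i = 0; i < sizes.length; i++) {
        var n = sizes[i];
        var m = Math.pow(2, n); // modulus of an 'n'-bit register
        console.log(n + " bits: m = " + m);
    }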

Since we store individual bits in a register, an 'n'-bit register can also be thought of as a series of one-bit storage units denoted by b_0, b_1, ... respectively (where each unit has a value of 0 or 1). Using such notation, a register can be described in the form of
    b_{n−1} b_{n−2} ... b_2 b_1 b_0
where b_{n−1} stores (or represents) the most significant bit (MSB), and b_0 the least significant bit (LSB).

The meaning of MSB and LSB depends on the actual representation of the data, e.g. when we deal with integers represented in the binary number system (using direct coding), the MSB corresponds to the bit with the highest positional value, and the LSB corresponds to the bit with the lowest positional value (i.e. the ones place, with positional value 2^0=1).

In the case of the representation of signed numbers (more precisely, when the sequence of bits stored in the register is interpreted as a signed number, e.g. when using the two's complement representation of binary integers or the floating-point representation of real numbers), the leftmost bit usually specifies the sign of the number:
    for b_{n−1}=0, the number stored in the register is non-negative (positive or zero);
    for b_{n−1}=1, the number stored in the register is negative.
In this case, the leftmost bit is called the sign bit.

As for placing the bits of an unsigned binary number in a register, it is a general rule that
– the digit having the highest positional value of the number is stored in the leftmost or MSB position of the register (i.e. the bit position with the highest index, b_{n−1}), and accordingly,
– the digit having the lowest positional value of the number is stored in the rightmost or LSB position of the register (i.e. the bit position with the lowest index, b_0), as illustrated below.
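The placement of the bits can be illustrated by printing the individual bits b_{n−1} ... b_1 b_0 of a value stored in a register (a minimal sketch, assuming an 8-bit register and the value 23 as an example):

    // print the bits b7 b6 ... b1 b0 of an 8-bit register storing the value 23
    var n = 8;
    var x = 23;               // 23 = 0001|0111 in binary
    var bits = "";
    for (var i = n - 1; i >= 0; i--) {
        var b = (x >> i) & 1; // b_i: shift x right by i positions and mask the lowest bit
        bits = bits + b;
    }
    console.log(bits);        // "00010111": b7 (MSB) is the leftmost, b0 (LSB) is the rightmost bit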

If we store a given number in a register, the digits of the number can be assigned to the bits of the register in different ways (depending on the type and size of the represented number, the encoding used, etc.). Accordingly, we can talk about the following coding or number representation methods:

  1. direct coding of binary integers
  2. two's complement representation of binary integers
  3. binary coded decimal (BCD) representation of signed or unsigned integers
  4. floating point representation of real numbers


(1) direct coding

Direct coding is used especially for representing non-negative (or unsigned) integers. In this case, the bits of the unsigned binary integer are stored in the usual order of their positional value in the one-bit storage units of the register (i.e. the highest positional value belongs to the MSB of the register, and the lowest positional value to the LSB, respectively). Since we only represent non-negative numbers in the case of direct coding, we do not need a sign bit.

In the case of direct coding, the smallest number that can be stored in an 8-bit register is 0000|0000_2=0_10, and the largest number is 1111|1111_2=255_10.
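Direct coding can be tried out in JavaScript with the built-in number/string conversions (a minimal sketch; toString(2) produces the binary digits of a non-negative integer, and parseInt with radix 2 converts the binary code back to a decimal number):

    // direct (unsigned binary) coding in an n-bit register
    function directCode(x, n) {
        // convert the non-negative integer x to its n-bit binary code
        var s = x.toString(2); // binary digits of x
        while (s.length < n) {
            s = "0" + s;       // pad with leading zeros up to n bits
        }
        return s;
    }

    console.log(directCode(0, 8));        // "00000000" (smallest 8-bit value)
    console.log(directCode(255, 8));      // "11111111" (largest 8-bit value)
    console.log(parseInt("11111111", 2)); // 255 (decoding the direct code)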



(2) two's complement representation of binary integers

Let's suppose we have an 'n'-bit register having the modulus m=2^n and an integer 'x' in the range −m/2 ≤ x ≤ m/2−1. We can code the integer as follows:
– if 0 ≤ x ≤ m/2−1, the code of 'x' is simply the direct (unsigned binary) code of 'x';
– if −m/2 ≤ x < 0, the code of 'x' is the direct code of m+x (that is, of m−|x|).

This is called two's complement coding.

For example, let's represent some numbers in an 8-bit register using two's complement coding (the subscript '2c' after the bit sequences denotes a number represented in two's complement code):

decimal value   direct code       decimal value   two's complement code
    0_10        0000|0000_2
    1_10        0000|0001_2          −1_10        1111|1111_2c
    2_10        0000|0010_2          −2_10        1111|1110_2c
    3_10        0000|0011_2          −3_10        1111|1101_2c
    ...                               ...
   23_10        0001|0111_2         −23_10        1110|1001_2c
    ...                               ...
  126_10        0111|1110_2        −126_10        1000|0010_2c
  127_10        0111|1111_2        −127_10        1000|0001_2c
                                   −128_10        1000|0000_2c

For example, −23_10=1110|1001_2c, because the one's complement of 0001|0111_2 (the direct code of 23_10) is 1110|1000_2, and adding 1 in binary gives 1110|1000_2+1_2=1110|1001_2.

Note that in two's complement code 127_10 is the largest positive number that can be represented, and −128_10 is the smallest negative number that can be represented.
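The coding rule described above can be written as a short JavaScript function (a minimal sketch; the code is padded with leading zeros to n bits, and console.log is assumed as the output function):

    // two's complement coding of an integer x in an n-bit register
    function twosComplementCode(x, n) {
        var m = Math.pow(2, n);          // modulus of the register
        var code = (x >= 0) ? x : m + x; // negative numbers are coded as m + x = m - |x|
        var s = code.toString(2);
        while (s.length < n) {
            s = "0" + s;                 // pad with leading zeros up to n bits
        }
        return s;
    }

    console.log(twosComplementCode(23, 8));   // "00010111"
    console.log(twosComplementCode(-23, 8));  // "11101001"
    console.log(twosComplementCode(127, 8));  // "01111111"
    console.log(twosComplementCode(-128, 8)); // "10000000"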

Two important rules:
– The two's complement code of −x can be obtained from the code of x by inverting every bit (one's complement) and adding 1 to the result (see the example above).
– Addition of two's complement codes is performed modulo m, i.e. a carry out of the most significant bit is simply discarded.

By representing negative integers in two's complement code, subtraction can be performed by addition.
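The following short program checks this on an example (a minimal sketch, assuming an 8-bit register, i.e. m=256; the difference 100−23 is computed as the sum of the codes of 100 and −23, reduced modulo m):

    // compute 100 - 23 in an 8-bit register as 100 + (-23) using two's complement codes
    var m = 256;           // modulus of an 8-bit register
    var a = 100;           // code of +100
    var b = m - 23;        // code of -23 in two's complement (233)
    var sum = (a + b) % m; // addition modulo m (the carry out of the MSB is discarded)
    console.log(sum);      // 77, i.e. the code of 100 - 23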


(3) binary coded decimal (BCD) representation of signed or unsigned integers

BCD number representation encodes decimal integers by specifying the decimal digits as 4-bit binary numbers. For example, the decimal integer 123_10 can be represented by the tetrads 0001 0010 0011 (the decimal digits 1, 2 and 3 are each encoded as a 4-bit binary number).

There are two types of BCD number representation:
– packed BCD representation and
– unpacked BCD representation.

In the case of packed BCD number representation, usually two digits are represented in one byte (the digit with a higher positional value is stored in the "upper" 4 bits or tetrad, while the digit with a lower positional value is stored in the "lower" 4 bits or tetrad of the byte). In the case of a signed number representation, the lower 4 bits of the last byte are used to represent the sign of the number (e.g. 1100 for the plus sign or 1011 for the minus sign, see later). In digital computers, data is usually stored in bytes or multiples thereof (e.g. because the smallest accessible unit of the computer memory is a byte). Therefore, if the number of tetrads is odd, the packed BCD code (or string) is completed from the left with a "leading" tetrad of four 0 bits.

In the case of unpacked BCD number representation, one decimal digit is represented in one byte:
– On the "upper" 4 bits (the so-called zone bits) a predetermined bit sequence is stored (this can be e.g. 0000 or FFFF); an exception may be the zone bit of the last (or rightmost) byte, which contains the code of the sign in the case of a signed number representation (see later).
– A decimal digit is stored in binary form in the "lower" 4 bits (tetrad) of the byte.

In the case of signed unpacked BCD representation, the sign of the number is stored in the "upper" 4 bits of the byte containing the last digit, in place of the zone bits of that byte. The plus or minus sign is encoded as a 4-bit hexadecimal digit, e.g. 1100 (hexadecimal C) for the plus sign and 1011 (hexadecimal B) for the minus sign.
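The packed and unpacked BCD codes can be produced with a short JavaScript program (a minimal sketch; the zone bits are assumed to be 1111, and the sign tetrads 1100 for plus and 1011 for minus are used as described above):

    // packed and unpacked (signed) BCD codes of a decimal number, given by its digits and sign

    function digitTetrad(d) {
        // 4-bit binary code of a decimal digit
        var s = d.toString(2);
        while (s.length < 4) { s = "0" + s; }
        return s;
    }

    function packedBCD(digits, sign) {
        // the sign tetrad (1100 = plus, 1011 = minus) follows the last digit;
        // if the number of tetrads is odd, a leading 0000 tetrad is added
        var tetrads = [];
        for (var i = 0; i < digits.length; i++) { tetrads.push(digitTetrad(digits[i])); }
        tetrads.push(sign >= 0 ? "1100" : "1011");
        if (tetrads.length % 2 === 1) { tetrads.unshift("0000"); }
        return tetrads.join(" ");
    }

    function unpackedBCD(digits, sign) {
        // one byte per digit: zone bits (1111) + digit tetrad;
        // in the last byte the zone bits are replaced by the sign tetrad
        var bytes = [];
        for (var i = 0; i < digits.length; i++) {
            var zone = (i === digits.length - 1) ? (sign >= 0 ? "1100" : "1011") : "1111";
            bytes.push(zone + digitTetrad(digits[i]));
        }
        return bytes.join(" ");
    }

    console.log(packedBCD([1, 2, 3], +1));   // "0001 0010 0011 1100" (+123, packed)
    console.log(unpackedBCD([1, 2, 3], +1)); // "11110001 11110010 11000011" (+123, unpacked)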

If the number is stored in memory, the digit(s) with the highest positional value will be at the lowest memory address ("big-endian order"). (This rule is followed when storing the characters of a string or the elements of an array in memory.)



(4) floating-point number representation

In the case of floating-point number representation, real numbers (either integers or fractions) are given in normalized form, and they are encoded with a given precision. The real number 'x' to be encoded has the general form
x = s*m*2^k
where 's' indicates the plus or minus sign of the number (s=±1), 'm' is the so-called mantissa (m∈ℝ), and 'k' is the exponent (k∈ℤ).

The number of the digits of the mantissa determines the accuracy of the representation, and the exponent determines the order of magnitude of the number that can be represented (cf. Nyakóné Juhász 2011: 21).

The floating-point representation of real numbers usually uses single precision (32 bits) or double precision (64 bits).

In the following, we will deal with single-precision, 32-bit floating-point number representation. The specification is based on the IEEE 754 standard published in 1985 by the Institute of Electrical and Electronics Engineers (IEEE) (cf. Nyakóné Juhász 2011: 21-22).

If we represent a floating-point number with single precision in 32 bits, then
– the sign of the number is given in the leftmost bit in the usual way (i.e. as a sign bit, see before),
– the exponent is given in the next 8 bits in excess-127 notation,
– and the mantissa is represented in fixed-point notation on the remaining 23 bits.

The format of the 32-bit single-precision floating-point representation of real numbers is as follows:

    b | b b b b b b b b | b b b ... b b b
where, from left to right,
– the leftmost bit (bit 31) is the sign bit;
– the next 8 bits (bits 30 to 23) contain the encoded exponent (exp, 8 bits);
– the remaining 23 bits (bits 22 to 0) contain the encoded mantissa (frac, 23 bits).

We shall denote
– the real number to be represented by 'x';
– the exponent of the normal form by 'k', and the encoded value of the exponent by exp (which is stored in the 8 bits reserved for it in the single-precision format shown above);
– the mantissa of the normal form by 'm', and the encoded value of the mantissa by frac (which is stored in the 23 bits reserved for it in the single-precision format shown above).

We shall distinguish three different cases:

(1) "normal" case:

In the normal case, the floating-point number 'x' to be represented satisfies
    |x| ≥ 2^−126 ≈ 1.17549*10^−38 and
    |x| < 2^128 ≈ 3.40282*10^38.
In this case the exponent is stored in the form exp = k+127 (so 1 ≤ exp ≤ 254), and the mantissa is normalized to the form m = 1.frac, where only the fractional part frac is stored (the leading 1 before the binary point is implicit).

(2) the case of "very small" numbers (close to or equal to zero):

In this case |x| < 2^−126; the stored exponent is exp = 0, the exponent of the normal form is interpreted as k = −126, and the mantissa has the denormalized form m = 0.frac (i.e. it begins with 0. instead of 1.). The value x = 0 is represented by frac = 0.

(3) the case of overflow: the absolute value of the number is "too" large (and therefore it cannot be represented as a floating-point number)

In the case of overflow, the stored exponent is exp = 255. If frac = 0, the bit sequence is interpreted as plus or minus infinity (±∞) depending on the sign bit; if frac ≠ 0, the bit sequence is not interpreted as a number at all, which is denoted by NaN ("not a number").
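The three cases can be distinguished from the stored exp and frac fields, e.g. as in the following sketch (the 32-bit pattern is given as a string of '0' and '1' characters, and console.log is assumed as the output function):

    // classify a 32-bit single-precision bit pattern according to the three cases above
    function classify(bits32) {
        // bits32: a string of 32 characters '0'/'1' (sign, exp, frac from left to right)
        var exp  = parseInt(bits32.substring(1, 9), 2);  // 8-bit encoded exponent
        var frac = parseInt(bits32.substring(9, 32), 2); // 23-bit encoded mantissa
        if (exp === 255) {
            return (frac === 0) ? "overflow: plus or minus infinity" : "NaN (not a number)";
        }
        if (exp === 0) {
            return "very small (denormalized) number or zero";
        }
        return "normal case";
    }

    console.log(classify("11000011110100000000000000000000")); // normal case (Example 1 below)
    console.log(classify("00000000010100000000000000000000")); // very small number (Example 2 below)
    console.log(classify("01111111100000000000000000000000")); // overflow: plus or minus infinity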

Example 1 (cf. Nyakóné Juhász 2011: 22): Consider the following 32-bit sequence of bits that defines a single-precision floating-point number:
    x=1 10000111 10100000000000000000000_2
Let's determine the corresponding real number in the decimal number system!

Based on the normal case (1):
– since the sign bit is 1, 'x' is negative;
– exp=1000|0111_2=135_10 ⇒ k=exp−127=8;
– m=1.101000..._2, therefore
    m=1+1/2+1/8=8/8+5/8=13/8.

Since 2^8=256, the real number we are looking for is
x=−13/8*2^8=−13/8*256=−416
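The result can be checked in JavaScript either by decoding the fields "by hand" or by using the built-in DataView object, which stores the 32-bit pattern and reads it back as a single-precision floating-point number (a minimal sketch; DataView may not be available in older interpreters, in which case only part (a) works):

    // decode the bit pattern of Example 1: 1 10000111 10100000000000000000000
    var bits = "11000011110100000000000000000000";

    // (a) decoding "by hand" according to the normal case
    var sign = (bits.charAt(0) === "1") ? -1 : 1;
    var exp  = parseInt(bits.substring(1, 9), 2);  // 135
    var k    = exp - 127;                          // 8
    var frac = parseInt(bits.substring(9, 32), 2); // the 23 mantissa bits as an integer
    var m    = 1 + frac / Math.pow(2, 23);         // 1.101...2 = 13/8 = 1.625
    console.log(sign * m * Math.pow(2, k));        // -416

    // (b) decoding with DataView: write the 32 bits as an unsigned integer,
    //     then read the same 4 bytes back as a single-precision float
    var view = new DataView(new ArrayBuffer(4));
    view.setUint32(0, parseInt(bits, 2));
    console.log(view.getFloat32(0));               // -416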


Example 2: Consider the following 32-bit sequence of bits that defines a single-precision floating-point number:
    x=0 00000000 10100000000000000000000_2
Let's determine the (approximate) real number that corresponds to the above sequence of bits!

Based on case (2):
– since the sign bit is 0, 'x' is positive;
– exp=0000|0000_2=0_10 ⇒ k=−126 (see case (2));
– m=0.101000..._2 (the mantissa in case (2) begins with 0.), therefore
    m=1/2+1/8=5/8.

So the real number we are looking for is
x=5/8*2^−126 ≈ 7.3468...*10^−39
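The same check as in Example 1 can be performed for this denormalized value (a minimal sketch; the DataView part again assumes a modern JavaScript environment):

    // decode the bit pattern of Example 2: 0 00000000 10100000000000000000000
    var bits2 = "00000000010100000000000000000000";

    // decoding "by hand" according to case (2): k = -126 and m = 0.frac
    var frac2 = parseInt(bits2.substring(9, 32), 2);
    var m2    = frac2 / Math.pow(2, 23);   // 0.101...2 = 5/8 = 0.625
    console.log(m2 * Math.pow(2, -126));   // approximately 7.3468e-39

    // decoding with DataView
    var view2 = new DataView(new ArrayBuffer(4));
    view2.setUint32(0, parseInt(bits2, 2));
    console.log(view2.getFloat32(0));      // approximately 7.3468e-39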


Example 3: Consider the following real number:
    x=7.5_10
Let's determine the bits that the floating-point representation of the number consists of.

The floating-point representation of the number clearly corresponds to the normal case because x≫2^−126 holds.

Since 7.5_10 = 111.1_2 = 1.111_2*2^2, the normal form of the number is x = +1.111_2*2^2, so
– the sign bit is 0 (the number is positive);
– k=2 ⇒ exp=k+127=129_10=1000|0001_2;
– frac=11100000000000000000000 (the leading 1 of the normalized mantissa is not stored).

Based on the above calculations, the floating-point representation of x=7.5 is
    x= 0 10000001 11100000000000000000000
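The bit pattern above can be verified by storing the number 7.5 as a single-precision float and reading back its 32 bits (a minimal sketch; DataView is assumed to be available):

    // encode x = 7.5 as a 32-bit single-precision floating-point number
    var dv = new DataView(new ArrayBuffer(4));
    dv.setFloat32(0, 7.5);                 // store 7.5 with single precision
    var u = dv.getUint32(0);               // read back the same 4 bytes as an unsigned integer

    var s = u.toString(2);
    while (s.length < 32) { s = "0" + s; } // pad to 32 bits

    // print the three fields: sign bit, exponent (8 bits), mantissa (23 bits)
    console.log(s.substring(0, 1) + " " + s.substring(1, 9) + " " + s.substring(9, 32));
    // expected output: "0 10000001 11100000000000000000000"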



Boda István, 2022-2025.