Construction of an ALU.

The picocomputer has a very primitive ALU. It consisted of just an Adder with the ability to subtract using 2’s compliment arithmetic. There was no Logic in it at all!

The symbol for an ALU is normally written:

Where the Function input selects the desired combination of the two inputs.

If the Function input consists of 2 control lines, 4 functions are possible:

eg:

00 Add C = A + B

01 Subtract C = A - B

10 And C = A and B

11 Or C = A or B

If we increase the number to 3 control lines then further functions (8 in total) are possible - eg Xor

ALU’s are constructed by creating each of the functions separately, and then using a multiplexor to select the desired output. The Function input is used for the input address lines of the mux.

This
ALU can be constructed using a 74153 4-input mux, a 7483 4-bit adder,
**and** and **or** gates. The subtract function needs a little
work. It has to generate the inverse of B, and the carry-in signal to
the adder, as well as being used by the address lines of the mux.

Adders

There
is one small problem with designing an n-bit adder. It is normal to
design a 1-bit adder and then use n of them to handle n-bits - a**
ripple-carry adder**

The
problem is to do with the **carry**. Every time a signal has to go
through a gate, it incurs a small delay. This gate delay is ~5 ns.
The adder itself can be made from 2 levels of gates. The first level
is an **and** array, this is then fed into the second level which
is an **or** array.

To add 2 bits together takes 2 gate delays.

If
we chain several 1-bit adders together to make an n-bit adder, we
have to wait a considerable time for the **carry out** to
propagate through. The carry out takes 2-gate delays for each single
adder, and so will take 2*n gate delays to appear at the final bit.

For a 32-bit adder this will increase the time from 10ns to 320ns - 32 times slower!

It is possible to speed up this process. Indeed, it is always possible to design a circuit that has only two levels of logic. But for a 32-bit adder, this is extremely hardware intensive.

Cin1 (the carry in of bit 1 ) =

Cout0 (the carry out of bit 0)

Cout0 = (B0.Cin0) + (A0.Cin0) + (A0.B0)

=>

Cin1 = (B0.Cin0) + (A0.Cin0) + (A0.B0)

Cin2 (the carry in of bit 2) =

Cout1 (the carry out of bit 1)

Cout1 = (B1.Cin1) + (A1.Cin1) + (A1.B1)

= (B1.B0.Cin0) + (B1.A0.Cin0) + (B1.A0.B0)

+ (A1.B0.Cin0) + (A1.A0.Cin0) + (A1.A0.B0)

+ (A1.B1)

This complexity can be simplified somewhat, by substituting

gi = ai.bi

which
is true if bit i of the adder **generate**s a *carry out*
independent of *carry in*.

and

pi = ai+bi

which
is true if bit i of the adder **propagate**s a *carry in*
to *carry out*.

c1 = g0 + (p0.c0)

c2 = g1 + (p1.g0) + (p1.p0.c0)

c3 = g2 + (p2.g1) + (p2.p1.g0) + (p2.p1.p0.c0)

c4 = g3 + (p3.g2) + (p3.p2.g1) + (p3.p2.p1.g0)

+ (p3.p2.p1.p0.c0)

A
carry is generated if *carry in* for
some earlier bit is a one, and all intermediate bits propagate a
carry. This is called a** carry-lookahead adder**

Now we have groups of 4-bits with the carry being handled as quickly as the addition.

To extend this scheme to more than 4-bits we can either:

block them together as before and let the carry ripple through

a
**partial carry-lookahead adder**

or

use a higher level carry lookahead on each block of 4-bits

P0 = p3.p2.p1.p0

G0 = g3 + (p3.g2) + (p3.p2.g1) + (p3.p2.p1.g0)

C1 = G0 + (P0.c0)

and similarly for C2, C3 and C4

Multipliers - Booth’s Algorithm

Booth observed that if we allow the ’adder’ to subtract as well as add, we can multiply numbers differently. When he invented this approach, shifting right was quicker than adding. If the multiplier contains sequences of 1’s or 0’s then we can reduce the number of additions.

Instead of just looking at the LOBIT of the shift register, we look at the LOBIT and the 'previous LOBIT'

The new algorithm is now:

Depending on the LOBIT and previous LOBIT do one of the following:

00: Middle of a string of 0’s - do nothing

01: End of a string of 1’s - add the multiplicand to the high order

result

10: Beginning of a string of 1’s - sub multiplicand

11: Middle of a string of 1’s - do nothing

After this perform a one bit arithmetic right shift, and repeat n times.

An ASR shifts 1 bit to the right, but the high order bit inserted is the same as the previous high order bit

eg 0111 => 0011

1100 => 1110

The very first previous LOBIT is 0.

Example of 2 * 6

Multiplicand Step Shift Regs

0010 Initial
values 0000 011**0 0**

1 0010 00: do nothing 0000 0110 0

0010 ASR 0000
001**1 0**

2 0010 10: sub mcand 1110 0011 0

0010 ASR 1111
000**1 1**

3 0010 11: do nothing 1111 0001 1

0010 ASR 1111
100**0 1**

4 0010 01: add mcand 0001 1000 1

0010 ASR 0000 1100 0

One side-effect of Booth’s algorithm is that it handles negative as well as positive numbers.

Using the previous algorithm, it is necessary to remember the initial signs, convert both numbers to be positive, and negate the final result if the two initial signs were different.

159.233
- Lecture 14 -