Advanced Programming Techniques (2011, 2013)

5. Counting Bits


A boolean variable, named after the British mathematician George Boole, is a variable that holds either false or true and nothing else. Sometimes within a program we have many boolean variables, perhaps hundreds or thousands. An efficient way to store many boolean variables is in a group called a bitset, where each boolean variable is stored in a single bit, with 0 meaning false and 1 meaning true. If a bit within a bitset holds 0, we say it is off or clear, and if it holds 1, we say it is on or set.
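
In Java, the standard library already provides such a group: java.util.BitSet stores one boolean value per bit and even includes a population count method, cardinality. A brief sketch (the variable name flags is illustrative only):

```java
import java.util.BitSet;

public class BitSetDemo {
    public static void main(String[] args) {
        // One bit per boolean value instead of one variable each.
        BitSet flags = new BitSet(64);
        flags.set(2);     // bit #2 is now on (true)
        flags.set(5);
        flags.clear(5);   // bit #5 is back off (false)
        System.out.println(flags.get(2));         // true
        System.out.println(flags.cardinality());  // 1 bit is set
    }
}
```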

One important operation that must be performed on a bitset is to count how many bits are set, which is sometimes called the population count. This chapter explores six different algorithms for computing the population count. Although counting bits is an important operation, it is unlikely you will ever be required to write code to do it. However, these six algorithms are highly instructive because they show very different approaches to the same problem, including straightforward (naive), combinatorial, and look-up approaches.

Bitwise Operators

In order to understand the six population count algorithms, you must first understand the standard bitwise operations that a computer can perform. A bitwise operation is a manipulation of one or two binary numbers to produce another binary number. A bitwise operation processes one bit at a time, thus the name bitwise. There are seven standard bitwise operations that a computer can perform:

·        left shift

·        unsigned right shift

·        signed right shift

·        not

·        and

·        or

·        exclusive or

The following examples are written in Java, but most programming languages have similar bitwise operators.

Left Shift

The left shift operator is two less-than symbols (<<) and shifts all the bits in an integer to the left by a specified number of locations, filling the right-most bits with 0 and losing the values in the left-most bits. Arithmetically this is the same as multiplying an integer by a power of two. In other words, if the variable x holds an integer, x << n is equivalent to x * 2^n.

Example 1

int x = 58;

int pos = x << 2;  // 58 * 4 = 232

int z = -42;

int neg = z << 2;  // -42 * 4 = -168

Desk Check

 

       decimal   hexadecimal   binary
x      58        0000003a      0000 0000 0000 0000 0000 0000 0011 1010
pos    232       000000e8      0000 0000 0000 0000 0000 0000 1110 1000
z      −42       ffffffd6      1111 1111 1111 1111 1111 1111 1101 0110
neg    −168      ffffff58      1111 1111 1111 1111 1111 1111 0101 1000


Unsigned Right Shift

The unsigned right shift operator (also called the logical right shift operator) is three greater-than symbols (>>>) and shifts all the bits in an integer to the right by a specified number of locations, filling the left-most bits with 0 and losing the values in the right-most bits. Arithmetically this is the same as dividing any non-negative integer by a power of two. If the variable x holds a non-negative integer, then x >>> n is equivalent to using integer division to compute x / 2^n.

Recall that integer division always truncates the quotient to an integer. In other words, integer division simply ignores the remainder. This is what right shift does. It shifts all the bits to the right, losing the right-most bits which is the same as dividing a non-negative integer by a power of two and ignoring the remainder.

Example 2

int x = 58;

int pos = x >>> 2;  //  58 / 4 = 14

int z = -42;

int neg = z >>> 2;  // 1073741813, which is not -42 / 4

Desk Check

 

       decimal      hexadecimal   binary
x      58           0000003a      0000 0000 0000 0000 0000 0000 0011 1010
pos    14           0000000e      0000 0000 0000 0000 0000 0000 0000 1110
z      −42          ffffffd6      1111 1111 1111 1111 1111 1111 1101 0110
neg    1073741813   3ffffff5      0011 1111 1111 1111 1111 1111 1111 0101

Signed Right Shift

In a two’s complement integer, the left-most bit is the sign bit and contains a 0 if the integer is non-negative and a 1 if the integer is negative. The signed right shift operator (also called the arithmetic right shift operator) is two greater-than symbols (>>) and shifts all the bits in an integer to the right by a specified number of locations, filling the left-most bits with the value in the sign bit and losing the values in the right-most bits. Arithmetically this is the same as dividing any non-negative integer by a power of two. If the variable x holds a non-negative integer, then x >> n is equivalent to using integer division to compute x / 2^n.

Example 3

int x = 58;

int pos = x >> 2;  //  58 / 4 = 14

Desk Check

 

       decimal   hexadecimal   binary
x      58        0000003a      0000 0000 0000 0000 0000 0000 0011 1010
pos    14        0000000e      0000 0000 0000 0000 0000 0000 0000 1110

Because signed right shift preserves the value of the sign bit, it is also the same as dividing a negative integer by a power of two, provided no 1 bits are lost when shifting right. In other words, if you use signed right shift to divide negative integers by 2 (x >> 1), the answer will match integer division only half the time. If you use signed right shift to divide negative integers by 4 (x >> 2), the answer will match integer division only one fourth of the time, and so on.


Example 4

int w = -40;

int ssrw = w >> 2;  // -40 / 4 == -10

int x = -41;

int ssrx = x >> 2;  // -11, but -41 / 4 == -10

int y = -42;

int ssry = y >> 2;  // -11, but -42 / 4 == -10

int z = -43;

int ssrz = z >> 2;  // -11, but -43 / 4 == -10

Desk Check

 

       decimal   hexadecimal   binary
w      −40       ffffffd8      1111 1111 1111 1111 1111 1111 1101 1000
ssrw   −10       fffffff6      1111 1111 1111 1111 1111 1111 1111 0110
x      −41       ffffffd7      1111 1111 1111 1111 1111 1111 1101 0111
ssrx   −11       fffffff5      1111 1111 1111 1111 1111 1111 1111 0101
y      −42       ffffffd6      1111 1111 1111 1111 1111 1111 1101 0110
ssry   −11       fffffff5      1111 1111 1111 1111 1111 1111 1111 0101
z      −43       ffffffd5      1111 1111 1111 1111 1111 1111 1101 0101
ssrz   −11       fffffff5      1111 1111 1111 1111 1111 1111 1111 0101

The best way to use right shift with all integers (positive and negative) and to get the same result as integer division by a power of 2 is to add 2^k − 1 to negative integers before using signed right shift, where k is the number of bits to shift right. The next example demonstrates signed right shift to divide a negative integer by 4.

Example 5

int x = -43;

if (x < 0) {

    x += (1 << 2) - 1;  // can be simplified to x += 3;

}

int div = x >> 2;  // -43 / 4 == -10

Desk Check

 

         decimal   hexadecimal   binary
x        −43       ffffffd5      1111 1111 1111 1111 1111 1111 1101 0101
x += 3   −40       ffffffd8      1111 1111 1111 1111 1111 1111 1101 1000
div      −10       fffffff6      1111 1111 1111 1111 1111 1111 1111 0110


Not

The bitwise not operator is the tilde (~) and takes an integer as input and clears every set bit and sets every clear bit, or in other words, switches every 1 bit to 0 and every 0 bit to 1.

Example 6

int x = 58;

int result = ~x;

Desk Check

 

         decimal   hexadecimal   binary
x        58        0000003a      0000 0000 0000 0000 0000 0000 0011 1010
result   −59       ffffffc5      1111 1111 1111 1111 1111 1111 1100 0101

And

The bitwise and operator is the ampersand (&) and takes two integers as input and produces an integer with bits set where the bits are set in both inputs and clear everywhere else. This is the same operation as logical and but performed on each bit. Within a program, bitwise and is often used to test if a bit is set and to clear bits in a variable.

Example 7

int x =  58;

int y = 107;

int result = x & y;

Desk Check

 

         decimal   hexadecimal   binary
x        58        0000003a      0000 0000 0000 0000 0000 0000 0011 1010
y        107       0000006b      0000 0000 0000 0000 0000 0000 0110 1011
result   42        0000002a      0000 0000 0000 0000 0000 0000 0010 1010

Or

The bitwise or operator is the vertical bar (|) and takes two integers as input and produces an integer with bits set where the bits are set in either or both inputs and clear everywhere else. This is the same operation as logical or but performed on each bit. Within a program, bitwise or is often used to set bits in a variable.

Example 8

int x =  58;

int y = 107;

int result = x | y;

Desk Check

 

         decimal   hexadecimal   binary
x        58        0000003a      0000 0000 0000 0000 0000 0000 0011 1010
y        107       0000006b      0000 0000 0000 0000 0000 0000 0110 1011
result   123       0000007b      0000 0000 0000 0000 0000 0000 0111 1011


Exclusive Or

The bitwise exclusive or operator is the caret (^) and takes two integers as input and produces an integer with bits set where the bits in the two inputs are different and clear everywhere else. This is the same operation as logical exclusive or but performed on each bit. Within a program, bitwise exclusive or is often used in data encryption and can even be used to swap the values of two variables.

Example 9

int x =  58;

int y = 107;

int result = x ^ y;

Desk Check

 

         decimal   hexadecimal   binary
x        58        0000003a      0000 0000 0000 0000 0000 0000 0011 1010
y        107       0000006b      0000 0000 0000 0000 0000 0000 0110 1011
result   81        00000051      0000 0000 0000 0000 0000 0000 0101 0001

Shortcut Operators

All the bitwise operators, except not (~), can be combined with the assignment operator (=) to make shortcut operators as shown in the following examples.

Example 10

int x = 58;

x &= 0x0f;

x |= 0x70;

x ^= 0xb7;

x <<= 2;

x >>>= 1;

x >>= 1;

Desk Check

 

            decimal   hexadecimal   binary
x           58        0000003a      0000 0000 0000 0000 0000 0000 0011 1010
x &= 0x0f   10        0000000a      0000 0000 0000 0000 0000 0000 0000 1010
x |= 0x70   122       0000007a      0000 0000 0000 0000 0000 0000 0111 1010
x ^= 0xb7   205       000000cd      0000 0000 0000 0000 0000 0000 1100 1101
x <<= 2     820       00000334      0000 0000 0000 0000 0000 0011 0011 0100
x >>>= 1    410       0000019a      0000 0000 0000 0000 0000 0001 1001 1010
x >>= 1     205       000000cd      0000 0000 0000 0000 0000 0000 1100 1101


Example 11: Set and Clear Bits

int x = 58;

x |= (1 << 2);  // Turn on bit #2

// Test if bit #2 is turned on

if ((x & (1 << 2)) != 0) {

    System.out.println("bit 2 is set");

}

else {

    System.out.println("bit 2 is clear");

}

x &= ~(1 << 3);  // Turn off bit #3

Desk Check

 

                 decimal   hexadecimal   binary
x                58        0000003a      0000 0000 0000 0000 0000 0000 0011 1010
1 << 2           4         00000004      0000 0000 0000 0000 0000 0000 0000 0100
x |= (1 << 2)    62        0000003e      0000 0000 0000 0000 0000 0000 0011 1110
x & (1 << 2)     4         00000004      0000 0000 0000 0000 0000 0000 0000 0100
1 << 3           8         00000008      0000 0000 0000 0000 0000 0000 0000 1000
~(1 << 3)        −9        fffffff7      1111 1111 1111 1111 1111 1111 1111 0111
x &= ~(1 << 3)   54        00000036      0000 0000 0000 0000 0000 0000 0011 0110

Example 12: Switch the Values of Two Variables

int x = 58;

int y = -42;

// Exchange the values of x and y.

x ^= y;

y ^= x;

x ^= y;

Desk Check

 

         decimal   hexadecimal   binary
x        58        0000003a      0000 0000 0000 0000 0000 0000 0011 1010
y        −42       ffffffd6      1111 1111 1111 1111 1111 1111 1101 0110
x ^= y   −20       ffffffec      1111 1111 1111 1111 1111 1111 1110 1100
y ^= x   58        0000003a      0000 0000 0000 0000 0000 0000 0011 1010
x ^= y   −42       ffffffd6      1111 1111 1111 1111 1111 1111 1101 0110

Example 13: Encrypt and Decrypt a Short Message

// Encrypt the word "superb".

long plain = 0x0000737570657262L;  // ASCII values for "superb"

long key = 0x36a1804be2f359e1L;

long cipher = plain ^ key;    // encrypt

// The cipher text would be sent to a second party who

// would receive it and decrypt it using the same key.

long message = cipher ^ key;  // decrypt

Desk Check

 

          hexadecimal         binary
plain     00007375 70657262   0000 0000 0000 0000 0111 0011 0111 0101 0111 0000 0110 0101 0111 0010 0110 0010
key       36a1804b e2f359e1   0011 0110 1010 0001 1000 0000 0100 1011 1110 0010 1111 0011 0101 1001 1110 0001
cipher    36a1f33e 92962b83   0011 0110 1010 0001 1111 0011 0011 1110 1001 0010 1001 0110 0010 1011 1000 0011
message   00007375 70657262   0000 0000 0000 0000 0111 0011 0111 0101 0111 0000 0110 0101 0111 0010 0110 0010

Now that you understand the bitwise operators, you are ready to learn the six population count algorithms.


Naive Bit Counting

A few years ago employees from a prominent software company would ask candidates during a job interview to write code to count the number of bits set to 1 in a 64-bit number. This is sometimes called the population count. Many candidates would write code similar to this naive Java code.

Example 14

/** Returns the number of bits set in a 64-bit word. */

public static int nbits(long word) {

    int n = 0;

    for (int i = 0;  i != 64;  ++i) {

        if ((word & 1) != 0) {

            ++n;

        }

        word >>>= 1;

    }

    return n;

}

Desk Check

word        word & 1   n   i
0011 0101

Improved Loop Termination

The interviewer would then show the candidate a faster way to solve the problem which was similar to this example.

Example 15

/** Returns the number of bits set in a 64-bit word. */

public static int nbits(long word) {

    int n = 0;

    while (word != 0) {

        if ((word & 1) != 0) {

            ++n;

        }

        word >>>= 1;

    }

    return n;

}

Desk Check

word        word & 1   n
0011 0101

Example 15 should be faster than example 14 because it eliminates incrementing the variable i, and it stops executing the loop when word reaches zero, meaning example 15 may not have to check all 64 bits. For example, consider this 64-bit number: 0x00 00 00 04 a0 3f 6e 8c. Example 15 will check only 35 bits and then stop, but example 14 will check all 64 bits before it finishes. Of course if a number has the highest order bit set, as is the case half of the time with random data, then example 15 isn’t much faster. So, example 15 is slightly faster than example 14 but only very slightly. There are several other algorithms for computing the population count which are substantially faster.
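The iteration counts quoted above can be verified with a small sketch that mimics example 15's loop condition and counts how many times the loop body would execute (the class and method names are illustrative only):

```java
public class ShiftIterations {
    /**
     * Counts how many times example 15's loop body runs for a given word:
     * the loop stops as soon as every remaining (higher-order) bit is zero.
     */
    public static int iterations(long word) {
        int count = 0;
        while (word != 0) {
            ++count;
            word >>>= 1;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(iterations(0x00000004a03f6e8cL));  // 35
        System.out.println(iterations(-1L));                  // 64
    }
}
```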


Addition Instead of If

Computers often execute if statements slowly because modern CPUs have a pipeline of instructions that are in the process of being decoded and executed. When the CPU begins executing an if-else statement, before it knows the value of the condition (true or false), it predicts that value (usually as true) and begins speculatively executing the statements in the corresponding part of the if-else statement. When the CPU finishes calculating the value of the condition, if it has predicted incorrectly, it must flush the instruction pipeline and begin executing the statements in the other part of the if-else statement, which takes time. When an if statement does not have a matching else, the CPU must still predict whether the condition is true or false and speculatively execute statements according to its prediction. So to avoid the performance penalty of a mispredicted if statement, example 16 replaces the if statement with a statement that adds 0 or 1. On my computer example 16 executes in about one third the time of example 15.

Example 16

/** Returns the number of bits set in a 64-bit word. */

public static int nbits(long word) {

    int n = 0;

    while (word != 0) {

        n += (int)word & 1;

        word >>>= 1;

    }

    return n;

}

Desk Check

word        word & 1   n
0011 0101

Skip Zero Bits

This next example takes advantage of a feature of two’s complement arithmetic, namely: for any integer x, x &= (x - 1) deletes the right-most set bit in x. This means example 17 simply skips the 0 bits. For random values of word, about half the bits will be 0, so example 17 is about twice as fast as example 16 for random data.

Example 17

/** Returns the number of bits set in a 64-bit word. */

public static int nbits(long word) {

    int n = 0;

    while (word != 0) {

        ++n;

        word &= word - 1;

    }

    return n;

}

Desk Check

word        n   word - 1
0011 0101


Combinatorial Algorithm

Example 18 is based on algorithm 1.3, page 3 of “Combinatorial Algorithms, Theory and Practice”, by Edward M. Reingold, Jurg Nievergelt, and Narsingh Deo. This code is an example of a combinatorial algorithm or one that repeatedly combines intermediate results to arrive at the answer. Notice how this code repeatedly

1.  shifts an integer to the right and uses a mask to extract every other bit (or every two bits or every four bits, etc),

2.  uses the same mask to extract the other bits in the number, and

3.  adds the bits, thus repeatedly combining intermediate results.

In essence, the code in example 18 uses a 64-bit processor as 32 2-bit processors, then 16 4-bit processors, then 8 8-bit processors, then 4 16-bit processors, then 2 32-bit processors, and finally 1 64-bit processor to arrive at the answer. It runs in about one fifth the time of example 17.

Example 18

/** Returns the number of bits set in a 64-bit word. */

public static int nbits(long word) {

    final long

        ones  = 0x5555555555555555L,

        twos  = 0x3333333333333333L,

        fours = 0x0f0f0f0f0f0f0f0fL;

    word =  (word >>>  1 & ones) + (word & ones);

    word =  (word >>>  2 & twos) + (word & twos);

    word = ((word >>>  4) + word) & fours;

    word =  (word >>>  8) + word;

    word =  (word >>> 16) + word;

    word =  (word >>> 32) + word;

    return (int)word & 0xff;

}

Desk Check

word        word >>> 1   ones   word >>> 1 & ones   word & ones
0011 0101

word >>> 2   twos   word >>> 2 & twos   word & twos

word >>> 4   (word >>> 4) + word   fours


Look Up Table

Finally example 19 separates a 64-bit number into 8 bytes and uses a look-up table to determine how many bits are set in each byte and adds those 8 quantities together. This example runs approximately as fast as example 18 but requires 256 bytes of memory for storing the look-up table.

Example 19

/** The number of bits set in a byte. */

private static final byte onbits[] = {

    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,

    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,

    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,

    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,

    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,

    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,

    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,

    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,

    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,

    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,

    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,

    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,

    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,

    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,

    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,

    4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8

};

/** Returns the number of bits set in a 64-bit word. */

public static int nbits(long word) {

    return onbits[(int)(word >>> 56)] +

        onbits[(int)(word >>> 48) & 0xff] +

        onbits[(int)(word >>> 40) & 0xff] +

        onbits[(int)(word >>> 32) & 0xff] +

        onbits[(int)(word >>> 24) & 0xff] +

        onbits[(int)(word >>> 16) & 0xff] +

        onbits[(int)(word >>>  8) & 0xff] +

        onbits[(int) word & 0xff];

}

Desk Check

word        (int)word & 0xff   onbits[(int)word & 0xff]
0011 0101


Algorithm Comparison

The following table compares all six of the algorithms presented in this chapter. Notice that the combinatorial and look-up algorithms are 21 times faster than the naive algorithm. From the table, it is clear that example 18, the combinatorial algorithm, is the best overall choice for counting how many bits are set to 1 in an integer. Within the Java 1.6 libraries, the method Long.bitCount uses a slightly modified version of example 18.

Comparison of Bit Counting Algorithms on 64-bit Random Data

Example   Name                        Relative Speed   Speed Depends   Data Storage
                                      (1 is best)      on the Input    (bytes)
14        Naive                       21               no              0
15        Improved Loop Termination   21               yes             0
16        Addition Instead of If      9                yes             0
17        Skip Zero Bits              5                yes             0
18        Combinatorial               1                no              24
19        Look up                     1                no              256
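
Since Long.bitCount uses essentially the same combinatorial algorithm, a quick sketch can confirm that example 18's nbits agrees with the library method (the class name and sample values here are arbitrary):

```java
public class BitCountCheck {
    /** The combinatorial algorithm from example 18. */
    public static int nbits(long word) {
        final long
            ones  = 0x5555555555555555L,
            twos  = 0x3333333333333333L,
            fours = 0x0f0f0f0f0f0f0f0fL;
        word =  (word >>>  1 & ones) + (word & ones);
        word =  (word >>>  2 & twos) + (word & twos);
        word = ((word >>>  4) + word) & fours;
        word =  (word >>>  8) + word;
        word =  (word >>> 16) + word;
        word =  (word >>> 32) + word;
        return (int) word & 0xff;
    }

    public static void main(String[] args) {
        // Arbitrary sample values, including the all-zeros and all-ones cases.
        long[] samples = { 0L, -1L, 58L, -42L, 0x00000004a03f6e8cL };
        for (long w : samples) {
            System.out.println(nbits(w) == Long.bitCount(w));  // true for every sample
        }
    }
}
```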
