Foundations of Coding: Compression, Encryption, Error Correction (2015)

Chapter 1. Foundations of Coding

1.3 Block Ciphers, Algebra, and Arithmetic

In this section, we introduce most of the mathematical background used in coding theory and applications. It contains the bases of algebra, and efficient ways to compute in algebraic structures. Some arithmetics is also useful to construct large finite structures. All these notions are widely used in the book and become necessary as soon as block coding methods are envisaged.

Today, the Vernam cipher is the only symmetric cryptographic algorithm that has been proved unconditionally secure. Thus, all other known systems are theoretically breakable.

For these systems, we use almost secure encryption schemes: the knowledge of the ciphertext (or the knowledge of some couples plaintext/ciphertext) does not enable to recover the secret key or the plaintext in humanly reasonable time (see the orders of magnitude and computing limits in Section 1.1.7).

For instance, we can decide to choose a unique key and to reuse it in order to avoid too frequent key exchange protocols. This implies that we have to split the source messages into blocks of some size, depending on the size of the key. Block cipher is also a standard, which is widely used in error detection and correction.

This is also the principle of one of the most famous codes – the ASCII code (“ American Standard Code for Information Interchange”) – which is a numerical representation of the letters and signs of the Latin alphabet. In ASCII code, the source alphabet is the Latin alphabet, and the code alphabet is c1-math-0243. The set of codewords is the set of all the words of size 8 over V:

equation

Each one of the c1-math-0245 characters (uppercases, lowercases, special characters, and control characters) is represented with a word of size 8 over V according to an encoding function. The following Table 1.3 gives an extract of this function.

Table 1.3 Extract of the ASCII Code

A

01000001

J

01001010

S

01010011

B

01000010

K

01001011

T

01010100

C

01000011

L

01001100

U

01010101

D

01000100

M

01001101

V

01010110

E

01000101

N

01001110

W

01010111

F

01000110

O

01001111

X

01011000

G

01000111

P

01010000

Y

01011001

H

01001000

Q

01010001

Z

01011010

I

01001001

R

01010010

Space

00100000

For example, the ASCII code of the message: A KEY, is the string: 0100000100100000010010110100010101011001.

1.3.1 Blocks and Chaining Modes from CBC to CTR

It is possible to encode independently each block of a message with the same algorithm. This is called Electronic Code Book (ECB) cipher mode. More generally, the independence of encryption between the blocks is not required and the several ways of combining the blocks are called encryption modes.

ECB mode: Electronic Code Book In this mode, the message M is split into blocks c1-math-0246 of constant size. Each block is encrypted separately with a function c1-math-0247 (the key k being a parameter of the function) as illustrated in Figure 1.3.

1.3equation

nfg003

Figure 1.3 Block ciphers: ECB mode

Thus, a given block of the message c1-math-0249 will always be encrypted in the same way. This encryption mode is the easiest one but does not provide any security and is normally never used in cryptography.

CBC mode: Cipher Block Chaining The CBC mode was introduced to avoid encrypting a block in the same way in two different messages. We add some initial value c1-math-0250, possibly generated randomly. Each block c1-math-0251 is encoded by an XOR operation with the previous cipherblock c1-math-0252 before being encrypted. Figure 1.4 illustrates this mode.

nfg004

Figure 1.4 Block ciphers: CBC mode

1.4equation

This is the most widely used encryption mode. Decryption uses the inverse of the encoding function c1-math-0254.

CFB mode: Cipher FeedBack To avoid using the inverse function for decryption, it is possible to perform an XOR after the encryption. This is the principle of the CFB mode, which is illustrated in Figure 1.5.

nfg005

Figure 1.5 Block ciphers: CFB mode

1.5equation

The benefit of this mode is to avoid implementing the function c1-math-0256 for decryption: c1-math-0257. Thus, this mode is less secure than the CBC mode and is used in network encryption for example.

OFB mode: Output FeedBack (OFB) is a variant of the previous mode and it provides symmetric encryption and decryption. Figure 1.6 illustrates this scheme.

nfg006

Figure 1.6 Block ciphers: OFB mode

1.6equation

Decryption is performed by c1-math-0259. This mode is useful when one needs to minimize the number of embedded circuits, especially for communications in spacecrafts.

CTR mode: Counter Mode Encryption The CTR mode is also completely symmetric, but encryption can be perfomed in parallel. It uses the encryption of a counter of initial value T (Figure 1.7):

nfg007

Figure 1.7 Block ciphers: CTR mode

1.7equation

Decryption is performed in the same way: c1-math-0261. The advantage of such a mode is that the several computations are independent, as in the ECB mode, but a block is also never encrypted twice in the same way a priori.

Exercise 1.12

A message M is split into n blocks c1-math-0262 that are encrypted into c1-math-0263 using an encryption mode. Bob receives the blocks c1-math-0264, but unfortunately, he does not know that one (and only one) of the blocks was incorrectly transmitted (for example, some 0s became 1s and some 1s became 0s during the transmission of block c1-math-0265). Show that the number of miss-decrypted blocks is equal to 1 if ECB, OFB or CTR modes were used and equal to 2 if CBC, or CFB modes were used.

Solution on page 283.

1.3.2 Algebraic Structure of Codewords

Developing block ciphers implies that we have to be able to perform operations and computations over blocks. For example, the c1-math-0266 operation over a block of bits is a bitwise addition modulo 2 of two vectors. Furthermore, as encoding functions have to be reversible, we need structures for which we can easily compute the inverse of a block. In order to perform these computations using solid algebraic bases, let us recall some fundamental structures. All along the book, c1-math-0267 denotes the set of integers, and c1-math-0268 denotes the set of nonnegative integers.

1.3.2.1 Groups

group c1-math-0269 is a set equipped with an internal binary operator satisfying the following properties:

1.  c1-math-0270 is associative: for all c1-math-0271.

2.  There exists a neutral element c1-math-0272, such that for all c1-math-0273, one has c1-math-0274.

3.  Each element has an inverse: for all c1-math-0275, there exists c1-math-0276 such that c1-math-0277.

Moreover, if the law is commutative (c1-math-0278 for all c1-math-0279), G is called abelian.

A subset H of G is called a subgroup of G if the operations of G restricted to H give a group structure to H. For an element a of a group G, we denote by c1-math-0280 the repetition of the law c1-math-0281, performed on n terms equal to a for all c1-math-0282. Moreover, one has c1-math-0283 and c1-math-0284 for all c1-math-0285.

If an element c1-math-0286 is such that for all c1-math-0287, there exists c1-math-0288, such that c1-math-0289, then g is a generator of the group c1-math-0290 or a primitive root. A group is called cyclic if it has generator.

For example, for a given n fixed, the set of integers c1-math-0291, equipped with the law of addition modulo n is a cyclic group generated by 1; if c1-math-0292 and if we choose the multiplication modulo 7 as a law of composition, the set c1-math-0293 is a cyclic group generated by 3. Indeed c1-math-0294.

Let c1-math-0295 be a group and c1-math-0296. The set c1-math-0297 is a subgroup of G, denoted by c1-math-0298 or c1-math-0299. If this subgroup is finite, its cardinal number is the order ofa.

Property 5 (Lagrange)

If G is a finite group, the cardinal number of any subgroup of G divides the cardinal number of G. In particular, the order of any elements divides the cardinal number of G.

Proof

Let H be a subgroup of a finite group Gc1-math-0300 is the cardinal number of H and consider the sets c1-math-0301 for any c1-math-0302. First of all, all the sets aH have the same cardinal number: if c1-math-0303, then since a is invertible c1-math-0304, so that c1-math-0305. Then these sets form a partition of G: indeed take aH and bH with c1-math-0306 and suppose that there exist an element x in their intersection, that is, c1-math-0307. Then for any element c1-math-0308. But as H is a subgroup, c1-math-0309 and thus c1-math-0310. This proves that aH is included in bH. With the reverse argument, one can prove also that bH is included in aH. Therefore, two sets aH and bH are either equal or disjoint. Finally, any element x in G is in xH. Now, as the sets aH form a partition of G and they are all of cardinal number c1-math-0311, the cardinal order of G is a multiple of c1-math-0312.

Theorem 1 (Lagrange)

In a finite abelian group c1-math-0313 of cardinal number n, for all c1-math-0314.

Proof

Let a be any element in G. The set of the multiples of ac1-math-0315 is equal to G. Indeed, as a is invertible, for all c1-math-0316, we can define c1-math-0317. Reciprocally, if a and x are elements of G a group, then so is their product (c1-math-0318). Hence, the two sets being equal, the products of all their respective elements are also equal:

equation

Yet, as multiplication is commutative in an abelian group, we can then extract a from the product. Moreover, there are n elements in G and we thus obtain the following formula:

equation

In order to conclude, we use the fact that – all elements in G being invertible – so is their product and we can simplify the previous formula: c1-math-0321.

1.3.2.2 Rings

ring c1-math-0322 is a set equipped with two internal binary operators satisfying the following properties:

1.  c1-math-0323 is an abelian group.

2.  c1-math-0324 is associative: for all c1-math-0325.

3.  c1-math-0326 is distributive over c1-math-0327: for all c1-math-0328 and c1-math-0329.

Moreover, if there exists a neutral element for c1-math-0330 in AA is called unitary. This neutral element is noted c1-math-0331, or simply 1 if it is not ambiguous from the context. If c1-math-0332 is commutative, A is called commutative. All elements in A have an opposite, namely their inverse for the law +. However, they do not necessarily have an inverse for the law c1-math-0333. The set of invertible elements for the law c1-math-0334 is denoted by c1-math-0335.

For an element a in a ring A, one denotes by c1-math-0336 (or more simply na) the sum c1-math-0337 of n terms equal to a for all c1-math-0338.

If the set c1-math-0339 is not empty, the smallest element of this set is called the characteristic of the ring. Otherwise, the ring is said to be of characteristic 0. For example, c1-math-0340 is a unitary, commutative ring of characteristic 0.

Two rings c1-math-0341 and c1-math-0342 are isomorphic if there exists a bijection c1-math-0343 satisfying for all x and y in A:

1.8equation

If E is any set and c1-math-0345 is a ring such that there exists a bijection f from E to A, then E can be equipped with a ring structure:

1.9equation

The ring c1-math-0347 defined as such is obviously isomorphic to A. If two rings are isomorphic, one ring can be identified with the other.

An ideal I is a subgroup of a ring A for the law + and “ absorbing” for the law c1-math-0348: for c1-math-0349, the product c1-math-0350 remains in I for any element a of the ring A. For all c1-math-0351, the set c1-math-0352 is an ideal of A, which isgenerated by x. An ideal I of A is calledprincipal if there exists a generator x (such that c1-math-0353). A ring A is principal if an only if any ideal of A is principal.

Euclidean function c1-math-0354 is a mapping of all nonzero elements of a unitary ring to c1-math-0355. A unitary ring with a Euclidean function is Euclidean if it has the property that for every couple of elements a and b of the ring, there exist q and r such that c1-math-0356 and c1-math-0357. This operation is the Euclidean division, and the numbers q and r are, respectively, called the Euclidean quotient and the remainder

and are denoted by c1-math-0358 and c1-math-0359 (for a modulo b). Any Euclidean ring is principal. This implies that there exists a Greatest Common Divisor (GCD) for all couples of elements c1-math-0360. Any generator of the ideal c1-math-0361 is a gcd.

For example, the ring c1-math-0362 with the absolute value as Euclidean function, it is a Euclidean ring. The following famous theorem of Bézout extends the properties of Euclidean rings. It is true for any Euclidean ring, and its proof in c1-math-0363 will follow from the Euclidean algorithm page 37.

Theorem 2 (Bézout)

Let a and b be two nonzero elements of a Euclidean ring A, and let d be their GCD. There exist two elements x and y, called the Bézout's numbers, such that c1-math-0364 and c1-math-0365 satisfying

equation

The modulo operation allows to define a ring on c1-math-0367, the set of nonnegative integers strictly inferior to n, for c1-math-0368. The set c1-math-0369 equipped with the addition and the multiplication modulo n [that is, c1-math-0370 and c1-math-0371] is a (finite) ring. It is widely used in coding.

Exercise 1.13

Bézout's theorem is very useful to prove properties in number theory. In particular, use it to prove the famous Gauss's lemma stated as follows:

Lemma 2 (Gauss)

If an integer number a divides the product of two integers b and c, and if a and b are coprime, then a divides c. .

Solution on page 284

1.3.2.3 Fields

field c1-math-0372 is a set equipped with two internal binary operators satisfying the following properties:

1.  c1-math-0373 is an unitary ring.

2.  c1-math-0374 is a group.

If c1-math-0375 is commutative, the field is commutative; the inverse (or opposite) of x with regard to the law + is denoted by c1-math-0376; the inverse of x with regard to the law c1-math-0377 is denoted by c1-math-0378. The characteristic of a field is its characteristic when considered as a ring.

For instance, c1-math-0379 is a commutative field of characteristic 0.

As all the rings and fields we are dealing with are commutative, thereafter the word ring (respectively field) will refer to a unitary commutative ring (respectively a commutative field).

Two fields are isomorphic if they are isomorphic when considered as rings.

A subset W of a field V is called a subfield of V if the operations of V restricted to W give a field structure to W.

The standard notation is used for classical fields in this book: c1-math-0380 denotes the field of rational numbers, and c1-math-0381 denotes the field of real numbers.

If p is a prime number, the ring c1-math-0382 is a field of characteristic p.

Indeed, with Bézout's theorem (see page 32), we have for all couples of integers a and b, there exists two integers x and y such that

equation

If p is prime and a is a nonzero element of c1-math-0384, this identity applied to a and p gives c1-math-0385, hence c1-math-0386. Thus, a is invertible and x is its inverse. This field is denoted by c1-math-0387.

The field of rational numbers c1-math-0388 and the fields c1-math-0389 are called prime fields.

1.3.2.4 Vector Spaces and Linear Algebra

Rudiments of linear algebra are necessary for the reading of the major part of this book. Without any explicative pretention, here we define useful concepts and we introduce the main notation. A set c1-math-0390 is a vector space over a field V if it has one internal law of composition + and an external law of composition “.,” such that

1.  c1-math-0391 is a commutative group.

2.  For all c1-math-0392.

3.  For all c1-math-0393, and c1-math-0394.

4.  For all c1-math-0395, and c1-math-0396.

5.  For all c1-math-0397 and c1-math-0398.

An element of a vector space is called a vector, and the elements of the field V are called scalars. The set c1-math-0399 equipped with the addition c1-math-0400 and the multiplication c1-math-0401 is a commutative field denoted by c1-math-0402. Thus, the set of bit tables of size n can be equipped with a vector space structure. The set of codewords is then c1-math-0403.

A set of vectors c1-math-0404 is an independent set if for all scalars c1-math-0405 implies c1-math-0406.

The dimension of a vector space V, denoted by c1-math-0407, is the cardinal number of the greatest set of independent vectors of V.

For example, if V is a field, c1-math-0408 is a space of dimension n because the vectors c1-math-0409 are independent.

linear mapping is an application from one vector space to another, which preserves the laws of composition. Namely, if A and B are two vectors spaces, an application f is said to be linear if for all c1-math-0410 in Ac1-math-0411 and for all c1-math-0412 and c1-math-0413.

The image of a linear mapping f from a vector space c1-math-0414 to a vector space c1-math-0415, denoted by c1-math-0416, is the set of vectors c1-math-0417 such that there exists c1-math-0418 with c1-math-0419.

The kernel of a linear application f from a vector space c1-math-0420 to a vector space c1-math-0421, denoted by c1-math-0422, is the set of vectors c1-math-0423 such that c1-math-0424.

It is easy to verify that c1-math-0425 and c1-math-0426 are vector spaces.

If the dimension of c1-math-0427 is finite, this quantity is called the rank of the linear mapping and is denoted by c1-math-0428. Moreover, if the dimension of c1-math-0429 is also finite, then we have the following result: c1-math-0430.

matrix M of size c1-math-0431 is an element of the vector space c1-math-0432, represented by a table of m horizontal lines (rows) of size n or n vertical lines (columns) of size m. The element that lies in the ith row and the jth column of the matrix is written as c1-math-0433. Multiplication of a vector x of size n with such as matrix gives a vector y of size m satisfying c1-math-0434 for all i from 1 to m; multiplication is written as c1-math-0435.

Each matrix M is associated to a linear application f from c1-math-0436 to c1-math-0437, defined by c1-math-0438. Reciprocally, any linear application can be written using a matrix. The coding processes studied throughout this book – mainly in Chapter 4 – often use linear applications, which enable us to illustrate the properties of these functions.

Depending on the chosen structure, codewords can be manipulated with additions and multiplications over integers or vectors. All these structures are very general and common in algebra. Codes are particular as they are finite sets. Finite groups and finite fields have some additional properties, which will be widely used throughout this work.

1.3.3 Bijective Encoding of a Block

Now that we have some structures, we are able to perform additions, multiplications, and Euclidean divisions on blocks. We can also compute their inverses. Here, we give some fundamental examples of computations that can be performed on sets with good algebraic structure. As blocks are of finite size, we willmanipulate finite sets in this section.

1.3.3.1 Modular Inverse: Euclidean Algorithm

Bézout's theorem (see page 32) guarantees the existence of Bézout numbers and thus the existence of the inverse of a number modulo a prime number in c1-math-0439. The Euclidean algorithm makes it possible to compute these coefficients efficiently.

In its fundamental version, the Euclidean algorithm computes the Greatest Common Divisor (GCD) of two integers according to the following principle: assuming that c1-math-0440,

equation

where c1-math-0442 are the remainder of the Euclidean division of a by b. Indeed, if a and b have a common divisor d, then c1-math-0443 are also divisible by d.

The recursive principle appears:

ngr004

Example 1.3 (The c1-math-0444 of 522 and 453)

We compute

equation

The c1-math-0446 of 522 and 453 is equal to 3.

Extended Euclidean algorithm The “ extended” version of the Euclidean algorithm – the one we will use most of the time in this book – enables us to compute the c1-math-0447 of two numbers and a pair of Bézout numbers.

This algorithm is also extended because it is meant to be more general: It can be not only applied to number sets but also applied to any Euclidean ring. This is the case for polynomials, as we see in the following sections.

The principle of the algorithm is to iterate with the following function G:

equation

Example 1.4

We wish to find x and y such that c1-math-0449. We write the matrices corresponding to the iterations with the function G starting from

equation

One first computes c1-math-0451 with c1-math-0452 and c1-math-0453 and one gets the matrix

equation

Then, one iterates the algorithm with c1-math-0455 Thus, at the end, one gets

equation

Hence, we have c1-math-0457. Thus, c1-math-0458, and the Bézout's numbers are c1-math-0459 and c1-math-0460.

Here is a version of the extended Euclidean algorithm that performs this computation while storing only the first line of G. It modifies the variables c1-math-0461, and d in order to verify at the end of each iteration that c1-math-0462 and c1-math-0463.

In order to resolve it “ by hand,” we can calculate recursively the following equations c1-math-0464 (we stop when c1-math-0465):

equation

The following exercise gives examples of resolution using this method.

Exercise 1.14 (Extended Euclidean Algorithm)

Find pairs of Bézout numbers for the following integers:

·     c1-math-0467

·     c1-math-0468

·     c1-math-0469

Solution on page 284.

Now let us prove that this algorithm is correct if the Euclidean ring is the set c1-math-0470 of integers. This will also provide a constructive proof of Bézout's theorem (see page 32) in c1-math-0471.

Theorem 3

The extended Euclidean algorithm is correct in c1-math-0472.

Proof

First of all, let us show that the sequence of remainders is always divisible by c1-math-0473: recursively, if c1-math-0474 and c1-math-0475, then c1-math-0476 and thus c1-math-0477. Moreover, the sequence of positive remainders c1-math-0478 is monotonically decreasing and is bounded below by 0. Hence, the algorithm ends.

Besides, after a finite number of steps, one has c1-math-0479. Thus, there exists an index i such that c1-math-0480. In that case, c1-math-0481 and the previous remark indicates that c1-math-0482 is the gcd we are looking for.

Finally, we need to prove that the Bézout numbers are correct. Let us do it recursively. Obviously, the initial case c1-math-0483 is correct and so is the algorithm. Then, let us denote c1-math-0484 and c1-math-0485. Hence c1-math-0486. Recursively, and using the notation introduced in Algorithm 1.4, we have c1-math-0487 with c1-math-0488 and c1-math-0489. This relation implies that c1-math-0490 with c1-math-0491 and c1-math-0492. Thus, the algorithm is correct.

ngr005

Exercise 1.15 (Modular Computation)

The extended Euclidean algorithm also enables one to solve linear modular equations in c1-math-0493. Give a method to solve the following equations:

1.  c1-math-0494

2.  c1-math-0495

3.  c1-math-0496

Solution on page 284.

Complexity of the Euclidean algorithm At each step, the greatest number is, at least, divided by two, hence its size decreases by 1 bit, at least. Thus, the number of steps is bounded by c1-math-0497. At each step, we also compute the remainder of the Euclidean division. The algorithms for Euclidean division we learned in elementary school have a cost c1-math-0498. Finally, the overall cost is bounded by c1-math-0499 if n is the size of the data. However, a closer study of the algorithm can make this complexity more precise. Indeed, the cost of the Euclidean algorithm is rather of the order c1-math-0500 !

The proof is technical, but the idea is rather simple: either there are actually about c1-math-0501 steps, thus each quotient is very small and then each division and multiplication can be performed with c1-math-0502 operations, or the quotients are large and thus each division and multiplication has to be performed with c1-math-0503 operations (but then the number of steps is small).

Exercise 1.16

Implement the extended Euclidean algorithm on the set of integer numbers with your favorite programming language (solution coded in C++).

Solution on page 285.

1.3.3.2 Euler's Totient Function and Fermat's theorem

Let c1-math-0504 be an integer. We denote by c1-math-0505 the set of positive integers lower than n and coprime with n:

equation

The cardinal number of c1-math-0507 is denoted by c1-math-0508. The function c1-math-0509 is called Euler's totient function. For example, c1-math-0510. Moreover, if p is prime, c1-math-0511. Exercise 1.19 focuses on a more general formula.

Each element in c1-math-0512 has an inverse: indeed, as x is coprime with n, Bézout's identity guarantees the existence of two integers u and v of opposite sign (c1-math-0513 and c1-math-0514), such that

equation

Thus, c1-math-0516, that is, c1-math-0517 mod n. The integer u is nonzero and not equal to n because of the relation c1-math-0518 and c1-math-0519. So it is an element of c1-math-0520 and is called the inverse of x modulo n. Computation of u is performed with the extended Euclidean algorithm.

Theorem 4 (Euler)

Let a be any element in c1-math-0521. One has c1-math-0522.

Proof

c1-math-0523 is a finite multiplicative and abelian group, with neutral element 1 and of cardinal number c1-math-0524. Therefore, Lagrange Theorem 1, page 31, applies directly to give c1-math-0525.

Fermat's theorem can be immediately deduced from Euler's theorem when n is prime.

Theorem 5 (Fermat)

If p is prime, then any c1-math-0526 satisfies the following result: c1-math-0527.

Proof

If a is invertible, then Euler's theorem gives c1-math-0528. We multiply the equation by a in order to obtain the relation. The only noninvertible element in c1-math-0529 (if p is prime) is 0. In that case, we have obviously c1-math-0530.

The Chinese Remainder Theorem – which was first formulated by Chinese mathematician Qin Jiu-Shao during the c1-math-0531 century – enables one to combine several congruences modulo pairwise coprime numbers in order to obtain a congruence modulo the product of these numbers.

Theorem 6 (Chinese Remainder Theorem)

Let c1-math-0532 be positive pairwise coprime numbers and c1-math-0533. Then, for all set of integers c1-math-0534, there exists a unique solution c1-math-0535 to the modular equation system c1-math-0536, for c1-math-0537. If we call c1-math-0538, this unique solution is given by

equation

where c1-math-0540 is the inverse of c1-math-0541 modulo c1-math-0542.

Proof

Let us proceed in two steps: first, we prove the existence of x, and then we prove the uniqueness of x. As c1-math-0543 are pairwise coprime, c1-math-0544 and c1-math-0545 are coprime. Bézout's theorem gives us the existence of the inverse of c1-math-0546 modulo c1-math-0547, which is written c1-math-0548. Let c1-math-0549. It is easy to check that x is a solution of the system of congruences ! Indeed, for all i, as c1-math-0550 divides all c1-math-0551 (with c1-math-0552), c1-math-0553. According to the definition of c1-math-0554, we have c1-math-0555.

Now let us prove the uniqueness of x. Let us suppose that there exist two solutions c1-math-0556 and c1-math-0557. Then c1-math-0558 and c1-math-0559. Thus, c1-math-0560 for some c1-math-0561 and c1-math-0562. Hence, c1-math-0563 divides c1-math-0564. Yet, c1-math-0565 and c1-math-0566 are coprime, thus c1-math-0567 also divides c1-math-0568; hence c1-math-0569. Processing recursively, as c1-math-0570 is coprime with the product c1-math-0571, we can deduce that c1-math-0572, or c1-math-0573, which gives us the uniqueness of the solution.

Exercise 1.17

Find all integers x such that c1-math-0574 and c1-math-0575. Deduce the inverse of 49 modulo 55.

Solution on page 285.

Exercise 1.18

Find all integers x whose remainders after division by 2, 3, 4, 5, 6 are, Respectively, 1, 2, 3, 4, 5.

Solution on page 285.

Exercise 1.19 (A formula for Euler's Totient Function)

1.  Let p be a prime number, c1-math-0576. Compute c1-math-0577 with c1-math-0578 and c1-math-0579.

2.  Show that c1-math-0580 is multiplicative, that is, if m and n are coprime, then c1-math-0581.

3.  Deduce a general formula for Euler's previous theorem, using the prime factor decomposition.

Solution on page 286.

Exercise 1.20 (The Chinese Remainder Theorem)

Let c1-math-0582 be pairwise coprime integers and c1-math-0583. We consider the following application:

equation

1.  Prove that c1-math-0585 is a ring isomorphism.

2.           Characterize the function c1-math-0586

Hint: Use c1-math-0587 and notice that c1-math-0588.

3.  Give the unique solution modulo N of the system:

4.  equation

Solution on page 286.

Exercise 1.21 (Application: The Chinese Remainder Problem)

This exercise is a typical application of the Chinese Remainder Theorem. A group of 17 pirates has laid hands on a booty composed of golden coins of equal value. They decide to share it equally and to give the rest to the Chinese cook. The latter would receive three coins. However, the pirates quarrel and six of them are killed. The cook would receive four coins. Unfortunately, the boat sinks and only six pirates, the treasure and the cook are saved. The sharing would give five golden coins to the cook. What is the minimal fortune the latter can expect if he decides to poison the rest of the pirates ?

Note: one may use the following equalities:

·     c1-math-0590 and c1-math-0591

·     c1-math-0592 and c1-math-0593

·     c1-math-0594

Solution on page 287.

1.3.3.3 Modular Exponentiation and Discrete Logarithm

Modular exponentiation is a form of coding widely used in modern encryption methods. Let a be a generator of c1-math-0595. Consider the function

equation

It is associated with a decoding function. As a is a generator, every element c in c1-math-0597 may be written as a power of a. The lowest positive integer b such that c1-math-0598 is called the discrete logarithm (or the index) in base a of b modulo n. We denote c1-math-0599.

equation

The coding function is easy to compute. The method is called the exponentiation by squaring (or binary exponentiation, or even square-and-multiply). It consists in writing b as successive squares.

For example,

equation

With this principle, the computation of c1-math-0602 only requires five multiplications: three exponentiations by squaring and two multiplications.

More generally, the complexity bound of Algorithm 1.5 is c1-math-0603 multiplications modulo n.

ngr006

Indeed, at each call, the exponent b is divided by 2. Hence, there are at most c1-math-0604 recursive calls. At each call, we perform at most two multiplications: a squaring and possibly a multiplication by a. These operations are performed modulo n, that is to say on numbers of c1-math-0605 bits. Even using the naive multiplication algorithms (those we have seen in elementary school), the cost of such multiplication is c1-math-0606.

Thus, the overall cost of the algorithm is c1-math-0607. This cost is reasonable with regard to c1-math-0608, which is the time required to read a or write the result.

Exercise 1.22 (Computations of the Inverse)

1.           Propose an algorithm for the computation of the inverse in c1-math-0609 whenever it exists, based on Euler's theorem.

Application: compute (quickly) c1-math-0610 and c1-math-0611. One can use the following results: c1-math-0612.

2.  Give three different algorithms for the computation of the inverse of y modulo c1-math-0613, with c1-math-0614 distinct prime integers.

Solution on page 287.

The Discrete Logarithm Problem (DLP) deals with the computation of the inverse of the modular power. We have seen that modular exponentiation can be computed in reasonable time. However, this is not the case for discrete logarithms. This skewness is a fundamental principle in cryptography.

The following result is called the discrete logarithm theorem. Recall that a generator of the set c1-math-0615 is a number g such that c1-math-0616.

Theorem 7 (Discrete Logarithm)

If g is a generator of c1-math-0617, then for all c1-math-0618c1-math-0619

Proof

(c1-math-0620) If c1-math-0621, one has c1-math-0622. Yet, c1-math-0623, hence c1-math-0624.

(c1-math-0625) As the sequence of powers of g is periodic of period c1-math-0626, then c1-math-0627.

However, this does not enable one to compute the discrete logarithm with reasonable complexity. Given y, it is difficult to compute x such that c1-math-0628. The only simple method consists in enumerating exhaustively all possible x and it takes a time c1-math-0629. No polynomial time algorithm in c1-math-0630 (size of the input) is known for this problem.

Thereby, if c1-math-0631, modular exponentiation requires less than c1-math-0632 operations, and it takes less than a second for a PC. On the contrary, exhaustive enumeration for computing the discrete logarithm requires c1-math-0633 operations, which is unimaginable in reasonable time according to what we have seen before !!!

In practice, one can apply principles similar to those used in factorization algorithms in order to attempt to solve the discrete logarithm problem.

This kind of function – for which one way can be easily computed but not the other one – is crucial in coding, especially for public key cryptography.

1.3.3.4 One-Way Functions

In cryptosystems called public key cryptosystems, the “ encoding system” has to be known while the decoding has to remain unknown. In this example of encoding with modular exponentiation and decoding with the discrete logarithm, the point of having the encoding function E known and the decoding function D unknown seems contradictory: if one knows E, one inevitably knows D as c1-math-0634.

Actually, replacing “ unknown” by “ extremely difficult to compute on a computer” (i.e., several years for instance), the functions E and D of a public key cryptosystem must satisfy:

·     c1-math-0635 in order to insure c1-math-0636;

·     it is easy (i.e., it can be done quickly) to compute c1-math-0637 from M; and

·     it is difficult (i.e., it takes a very long time) to recover M from c1-math-0638.

In other words, the problem is to find an encryption function E, which is fast to compute but is long to invert. Such a function is called a one-way function (also known as OWF). This notion is absolutely crucial in cryptography and all modern codes are based on it. The principle is illustrated in Figure 1.8.

nfg008

Figure 1.8 Principle of a one-way function

Adding a key to the functions will make decoding easier if one has the key and will make it more difficult if one does not have the key.

Good OWFs are functions such that the research of x given c1-math-0639 is a mathematical problem that is putatively difficult.

There are two interests in calculating in modular arithmetic. First of all, computations “ modulo n” are quite fast: their cost is c1-math-0640 using the most naive algorithms. Moreover, iterations of a function F – even a simple one – computed using arithmetic modulo ntend to have some random behavior. We see in Section 1.3.7 that this kind of computation is used in the major part of pseudo-random generators. Knowing F and n large, it seems difficult to solve the equation: find x such that c1-math-0641, hence to invert the function F.

1.3.4 Construction of Prime Fields and Finite Fields

We have mentioned that we try to give field structures to our codes when possible in order to make operations easier. Now we have a first method of generating good codes: prime fields. It is sufficient to choose a prime number p, and to equip the set c1-math-0642with the addition and the multiplication modulo p. However, “ finding a prime number” is not easy. It is a full-fledged field of research of which we will give a survey in order to leave no algorithmic void in our coding methods.

1.3.4.1 Primality Tests and Prime Number Generation

Even if one does not know a polynomial time algorithm for the factoring of an integer n (polynomial with respect to the size c1-math-0643 of n), it is still possible to quickly generate a prime number p. In coding, it is very useful to be able to generate prime numbers, both for building structured codes such as fields – which can easily be manipulated for error correction – and for setting up secured cryptosystems. For this, one uses primality tests, namely algorithms that determine whether a given number is prime or not. Taking a large odd number n, and applying the test on it, if n is “ composite” one can restart the algorithm with c1-math-0644 until one finds a prime number. The number of prime numbers less than n is asymptotically c1-math-0645. One deduces that – starting from n odd – on average c1-math-0646iterations are sufficient to find a prime number by adding 2 to n at each step.

The most used primality test was proposed by Miller and was made efficient in practice by Rabin. The Miller–Rabin test is an algorithm that determines whether a given number is probably prime or not. Therefore, the response given by the computation is only a probabilistic one, and it might be erroneous. Nevertheless, if one repeats the test a sufficient number of times and if the latter constantly gives thesame response, the error probability will become smaller and smaller and eventually negligible.

Miller-Rabin test Let n be odd and let s and t such that c1-math-0647 with t odd. For any integer c1-math-0648, one has

equation

If n is prime, according to Fermat's theorem, c1-math-0650; therefore

·     Either c1-math-0651;

·     Or c1-math-0652 for some ic1-math-0653.

The Miller–Rabin composition test is based on this property.

One says that a has succeeded in the Miller–Rabin composition test for n if c1-math-0654 and c1-math-0655 for all c1-math-0656.

If n is odd and composite, there are less than c1-math-0657 integers a, which fail in the Miller–Rabin composition test. Therefore, by choosing a randomly in c1-math-0658, the error probability is lower than c1-math-0659.

This test can be efficiently used in order to build a probable prime number with an error probability lower than c1-math-0660. One proceeds as follows:

1.           Choose randomly an odd integer n.

ngr007

2.  Choose randomly k distinct numbers c1-math-0661. Apply the Miller–Rabin composition test for each integer c1-math-0662.

3.  If no c1-math-0663 succeeds in the composition test, we deduce that n is prime; the error probability is lower than c1-math-0664;

4.  Otherwise, repeat the loop using c1-math-0665 instead of n.

The complexity bound of the Miller–Rabin test is similar to that of modular exponentiation, namely c1-math-0666; and c1-math-0667 if one wants an error probability of around c1-math-0668.

Thus, the average arithmetic cost of the generation of prime numbers is bounded by c1-math-0669. Indeed, as there are about c1-math-0670 prime numbers less than n, it would take an average of c1-math-0671 tries to find a prime number less than n.

In practice, using this test, it is easy to generate a 1000 digit prime number with an error probability arbitrarily low.

Besides, it is possible to make the Miller–Rabin algorithm deterministic by testing a sufficient number of integers a. For example, Burgess proved that testing all integers a lower than c1-math-0672 was enough to obtain a prime number with certainty. However, the test would then become exponential in the size of n.

Finally, in 1990, a theorem proved that, assuming the generalized Riemann hypothesis, one of the most famous conjectures in mathematics, it is enough to test the c1-math-0673 first integers. Thus, theoretical studies show that this test is efficient and reliable.

Agrawal–Kayal–Saxena (AKS) primality test In order to have an overall survey of this topic, let us mention a new primality test – the AKS test – which was built by Agrawal, Kayal, and Saxena. In 2002, they proved the existence of a polynomial time deterministic algorithm that determines whether a number is prime or not without using the Riemann hypothesis. So far, despite this important theoretical result, in practice, one prefers probabilistic algorithms because of their efficiency.

The idea is close to Mille-r-Rabin's and uses the formalism of polynomials: if n is prime, then for all a

1.10equation

The AKS algorithm checks this equality for some values of a, by developing explicitly c1-math-0675. For this test to have a polynomial algorithmic complexity bound, one needs to reduce the size of the polynomials (this is done by performing the test modulo c1-math-0676 withr satisfying some properties3) and to use a sufficient number of witnesses a, but only of the order of c1-math-0679, that is, a polynomial of the size of n.

We only give a rough idea of the justification of the AKS test, all the more as we have not introduced polynomials yet. This is what we are going to do now, because they constitute a necessary formalism to construct finite fields and codes.

We know how to build large prime fields by just computing large prime numbers. However, these are not the only existing finite fields. In order to build finite fields of any size (even if this size is not a prime number) – and provided that there exist such fields – we now have to introduce the ring of polynomials over any field.

1.3.4.2 Arithmetic of Polynomials

Let V be a field. We call a sequence a mapping of c1-math-0680 onto V. The image of c1-math-0681 in a sequence a is denoted by c1-math-0682. The support of a sequence a is the number of nonzero elements in the image of a. A polynomial P on V is a sequence with a finite support. The numbers c1-math-0683 are called the coefficients of P. The degree of a polynomial P is the highest i, such that c1-math-0684 is nonzero and is denoted by c1-math-0685. The coefficient c1-math-0686 is then called the leading coefficient. A polynomial is monic if its leading coefficient is equal to the neutral element for the multiplication in V.

The addition of two polynomials P and Q with coefficients c1-math-0687and c1-math-0688 results in the polynomial c1-math-0689 with coefficients c1-math-0690 for all c1-math-0691. The multiplication of the two polynomials P and Q results in the polynomial c1-math-0692, with coefficients c1-math-0693 for all c1-math-0694.

Let X be the polynomial such that c1-math-0695, and c1-math-0696 for all c1-math-0697. Any polynomial P of degree d may be written as

equation

The utility of such a notation is among others to define a function P for any polynomial P: to each element x of Vc1-math-0699. Now to efficiently evaluate the latter expression for an element x, one would use the Horner scheme of Algorithm 1.7.

ngr008

The set of all polynomials with these operations is a ring, denoted by c1-math-0700. The null element is the all-zero sequence and the neutral element is the polynomial with coefficients c1-math-0701 and c1-math-0702 for all c1-math-0703. It is a principal and Euclidean ring, with the degree as an Euclidean function. Indeed, one can define a Euclidean division: for all nonnull polynomials A and B, there exist two unique polynomials Q and R with c1-math-0704 such that

equation

The polynomial c1-math-0706 is the quotient in the Euclidean division of A by B; also the remainder R is denoted c1-math-0707.

The notion of Greatest Common Divisor (GCD) is then defined; the extended Euclidean algorithm can be applied to two nonzero polynomials A and B and provides a polynomial of maximum degree (it is unique if monic) that divides both A and B. Besides, Bézout's identity is valid. In other words, if A and B are two polynomials in c1-math-0708 and c1-math-0709 is their c1-math-0710, there exist two polynomials S and T in c1-math-0711 such that c1-math-0712 and c1-math-0713 and

equation

If A and B are different from the polynomial 1 (the neutral element), the extended Euclidean algorithm enables one to compute two polynomials S and T whose respective degrees are strictly lower than those of c1-math-0715 and c1-math-0716.

Two polynomials A and B are said to be coprime if their GCD is equal to the polynomial 1; in other words, A and B have no common factor of nonzero degree. One says that the nonconstant polynomial P of c1-math-0717 is an irreducible polynomial over V if P is coprime with all nonconstant polynomials of degree lower than c1-math-0718.

As for the prime factor decomposition of any integer, any monic polynomial of nonzero degree has a unique factorization in powers of monic irreducible factors over V (up to a constant); one says that c1-math-0719 is aunique factorization domain. In other words, it is possible to decompose any polynomial of nonzero degree A of c1-math-0720 in the form

equation

where the c1-math-0722 are nonzero integers and the polynomials c1-math-0723 irreducible over V. If A is monic, the c1-math-0724 factors can be chosen monic: the decomposition is then unique, up to a permutation of the indices.

An element c1-math-0725 of V is a root of c1-math-0726 if c1-math-0727, where c1-math-0728 is the value of the function associated to the polynomial A evaluated in c1-math-0729.

If c1-math-0730 is a root of A, then c1-math-0731 divides A. Let B be the polynomial such that c1-math-0732. One says that c1-math-0733 is a simple root of A if c1-math-0734 is not a root of B, that is, c1-math-0735. Otherwise, if c1-math-0736, one says that c1-math-0737 is a multiple root of A.

Example 1.5

In c1-math-0738, the polynomial c1-math-0739 can be factorized into c1-math-0740. One can easily check that c1-math-0741 is irreducible (the only irreducible polynomials in c1-math-0742 of nonzero degree lower than 2 are X and c1-math-0743; and neither 0 nor 1 are roots of c1-math-0744).

1.3.4.3 The Ring c1-math-0745 and Finite Fields

Let c1-math-0746 be a field and let P be a polynomial of degree c1-math-0747. One denotes by c1-math-0748 the set of polynomials of degree strictly lower than d equipped with the addition and multiplication modulo P. Namely, for all polynomials c1-math-0749 in c1-math-0750, with c1-math-0751 and c1-math-0752:

equation

This is a commutative monic ring of neutral elements 0 and 1 with regard to the laws + and c1-math-0754. This ring is called the quotient ring of c1-math-0755 modulo P.

If P is an irreducible polynomial, then c1-math-0756 is a field. Indeed, if Q is a nonzero polynomial of degree lower than c1-math-0757, then Q and P are coprime and with Bézout's identity c1-math-0758, one obtains c1-math-0759. In other words, Q is invertible in the quotient ring c1-math-0760.

Example 1.6 (Over the Field c1-math-0761)

If c1-math-0762 (a nonirreducible polynomial), the ring c1-math-0763 is such that:

equation

This ring is not a field because c1-math-0765 proves that c1-math-0766 is not invertible. On the other hand, if one considers c1-math-0767 (an irreducible polynomial), the ring c1-math-0768 is such that c1-math-0769. This ring is a field as c1-math-0770.

Therefore, we now have finite fields that are more general than prime fields. Indeed, our last example provided us with a field of four elements, which is not a prime field.

Finite fields are called Galois fields. They are denoted by c1-math-0771, where q is the cardinal number of the field. The next property enables us to handle all finite fields by this construction principle and to explain the notation c1-math-0772 for “ the” finite field of cardinal number q.

Property 6

Two finite fields of same cardinal number are isomorphic.

One can use an irreducible polynomial in order to build a finite field. As for prime number generation, looking for irreducible polynomials is a fully fledged domain of which we will give a survey.

1.3.4.4 Irreducible Polynomials

In order to build finite fields, we need some irreducible polynomials, as we needed prime numbers in order to build prime fields.

In the same way as we have seen primality tests for numbers in Section 1.3.4.1, we begin by giving a test that enables to recognize irreducible polynomials.

The first test which is easy to perform is to make sure that the polynomial is square-free, namely that it does not have for divisor the square of another polynomial. This can be done over a finite field as for any other field by considering the derivative c1-math-0773 of P. For c1-math-0774, we note c1-math-0775 its derivative polynomial.

Property 7

A polynomial P is square free if and only if c1-math-0776.

Proof

If P is divisible by a square, then c1-math-0777 for some polynomials h and g. It follows that c1-math-0778 and thus g divides the GCD of P and c1-math-0779. Reciprocally, if c1-math-0780, let us consider an irreducible factor c1-math-0781 of g of degree at least 1. Then c1-math-0782 and c1-math-0783 with f and c1-math-0784 polynomials. By differentiating P, one obtains c1-math-0785, or c1-math-0786. The polynomial c1-math-0787 being irreducible and c1-math-0788 being of degree strictly lower than the degree of c1-math-0789, and c1-math-0790 are coprime. By Gauss's lemma (see page 33), c1-math-0791 necessarily divides c1-math-0792 and c1-math-0793 divides P.

The principle of the irreducibility test is given by the following property.

Proposition 1

For p prime and c1-math-0794, in c1-math-0795, the polynomial c1-math-0796 is the product of all irreducible, monic polynomials whose degree divides d.

In order to prove this proposition, we will need the following lemma:

Lemma 3

c1-math-0797 divides c1-math-0798 if and only if c1-math-0799 divides c1-math-0800.

Proof

If r divides d, then c1-math-0801. Reciprocally, one has c1-math-0802. Let us suppose that c1-math-0803, with c1-math-0804. Then, one has c1-math-0805. Yet, c1-math-0806, thus one obtains c1-math-0807 over the integers, which implies that c1-math-0808. Thus, r divides d.

Proof. [of Proposition 1]

Let P be irreducible of degree r, such that r divides d. Then c1-math-0809 is a field. In V, the order of any nonzero element divides the cardinal number of the group of its invertible elements, namely c1-math-0810 (Section 1.3.2.1). One applies this property to c1-math-0811, so that c1-math-0812. According to the lemma, c1-math-0813 divides c1-math-0814, hence c1-math-0815, and thus P divides c1-math-0816.

Reciprocally, let P be an irreducible divisor of c1-math-0817 of degree r. Then, one has c1-math-0818. Now, set c1-math-0819 of maximum order c1-math-0820 in the group of invertible elements of the field c1-math-0821 (there always exists at least one such element, see page 56). One then applies Equation (1.10 ) d times in order to obtain c1-math-0822. Now, c1-math-0823 and thus c1-math-0824 or c1-math-0825. Hence, c1-math-0826 is necessarily a multiple of c1-math-0827, the order of G. The lemma enables to conclude that r actually divides d.

One then just needs show that no square divides c1-math-0828. Indeed, its derivative polynomial is c1-math-0829 and the polynomial c1-math-0830 is coprime with any other polynomial.

Thus, the factors of c1-math-0831 are all the irreducible, monic polynomials whose degree divides d. If a polynomial of degree d has no common factor with c1-math-0832 for c1-math-0833, it is irreducible. From this property, we can build a test called Ben-Or's irreducibility test (Algorithm 1.8).

ngr009

Therefore, we may generate random polynomials and, using this test, see if they are irreducible. One denotes by c1-math-0834 the number of irreducible, monic polynomials of degree r in c1-math-0835. As c1-math-0836 is the product of all irreducible, monic polynomials whose degree divides r, one obtains

1.11equation

Indeed, c1-math-0838 is the degree of c1-math-0839, thus c1-math-0840. Hence c1-math-0841. On the other hand, c1-math-0842 implies that c1-math-0843. The latter is a geometric series whose value is c1-math-0844. Finally, c1-math-0845.

This statement shows that among all polynomials of degree r, about one over r is irreducible. One wishes to build an irreducible polynomial. At first sight, it is possible to choose a polynomial randomly, test its irreducibility, and restart the process until one chances on an irreducible polynomial. On average, one needs r draws to find an appropriate polynomial. However, in order to make computations with polynomials easier, it is interesting to obtain sparse polynomials, namely polynomials with very few nonzero coefficients. In this case, exhaustive research might turn out to be more efficient in practice.

We propose the following hybrid Algorithm 1.9 that produces an irreducible polynomial – preferably a sparse one. It is based on the idea of taking polynomials in the form c1-math-0846 with c1-math-0847 chosen randomly of degree significantly lower than r.

ngr010

1.3.4.5 Construction of Finite Fields

Now, we have all the elements necessary to build a finite field of size c1-math-0848 with p a prime number. The method of building finite fields is contained in the proof of the following result:

Theorem 8

For all prime number p and all integers c1-math-0849, there exists a field K of c1-math-0850 elements. This field is unique up to isomorphism.

Proof

Let p be a prime number and let c1-math-0851 be the field of polynomials with coefficients in c1-math-0852. For c1-math-0853 is a field with p elements. For c1-math-0854, Proposition 1 guarantees the existence of at least one irreducible polynomial as there are p irreducible polynomials of degree strictly less than 2 and c1-math-0855, the degree of c1-math-0856, satisfies c1-math-0857. For larger d, Equation (1.11 ) shows that c1-math-0858 and thus there exists at least one irreducible polynomial P of degree d in c1-math-0859. Then, the quotient ring c1-math-0860 is a field.

As P is of degree d and c1-math-0861, there are c1-math-0862 possible remainders. Thus c1-math-0863.

According to Property 6 on page 48, any field of cardinal number c1-math-0864 is isomorphic to V.

Remark 1

The isomorphism between c1-math-0865 and c1-math-0866 equips c1-math-0867 with a field structure.

Indeed, to any vector c1-math-0868 in the vector space c1-math-0869, one can associate in a bijective way the polynomial c1-math-0870. Moreover, one has the following property:

equation

Hence, c1-math-0872 is an isomorphism between c1-math-0873 and c1-math-0874. This equips c1-math-0875 with a field structure in which multiplication is defined by

equation

Hence, one can use a field structure with the vectors in c1-math-0877.

Exercise 1.23

Let K be a finite field of cardinal number c1-math-0878. Using the map c1-math-0879, defined by

equation

prove that there exists a unique prime number p (called the characteristic of K), such that for all c1-math-0881

Solution on page 288.

Exercise 1.24

(Sequel of the previous exercise)

Deduce from the previous exercise that the cardinal number of K is a power of p using the fact that K is a vector space over its subfields. Hint: One may obtain a subfield of K isomorphic to c1-math-0882.

Solution on page 289.

Exercise 1.25 (Construction of the Field c1-math-0883)

1.  Give a necessary and sufficient condition for a polynomial in c1-math-0884 of degree c1-math-0885 to be irreducible. From this condition, deduce all irreducible polynomials of degrees 2 and 3.

2.  Deduce all irreducible polynomials of degree 4.

3.  Set c1-math-0886 with c1-math-0887 the neutral element for the addition and c1-math-0888 the neutral element for the multiplication. Using the first question, write the operation tables (c1-math-0889 inverse) in c1-math-0890.

Solution on page 289.

1.3.5 Implementation of Finite Fields

1.3.5.1 Operations on Polynomials

A typical construction of arithmetic in a finite field c1-math-0891 is – for a given prime number p – to look for some irreducible polynomial P in c1-math-0892 of degree d, then to write the elements of c1-math-0893 as polynomials, or as vectors, and finally to implement the arithmetic operations in c1-math-0894.

Example 1.7 (Construction of the Field c1-math-0895)

There exists a field with 16 elements as c1-math-0896. In order to build the field c1-math-0897, one first looks for some monic irreducible polynomial P of degree 4 in c1-math-0898. Then one establishes the rules of calculation in c1-math-0899.

·              Finding P.

The irreducible polynomial P is written as c1-math-0900 with ab, and c in c1-math-0901. In order to determine P, let us examine all possible values for the triplet c1-math-0902. One cannot have c1-math-0903 as for all these cases, 1 is a root of P. Thus, the triplet c1-math-0904 is to be searched for in the set c1-math-0905. The only irreducible polynomials over c1-math-0906 of degree at most 2 are Xc1-math-0907, and c1-math-0908. To see whether P is irreducible, it is sufficient to compute the GCD of P and c1-math-0909. The calculation (with the Euclidean algorithm for example) of these GCDs! (GDSs!)shows that the only values of c1-math-0910 for which P is irreducible are c1-math-0911, and c1-math-0912. Thus, c1-math-0913 is a possible choice for P. Let us make this choice.

·              Operations on polynomials.

Thus, the elements of the field are c1-math-0914. Therefore, the operations are performed modulo P. For example, c1-math-0915.

1.3.5.2 Use of Generators

There exist other ways to implement finite fields in which the multiplication will be performed much more quickly.

The idea is to use the property of finite fields according to which the multiplicative group of invertible elements of a finite field is cyclic. Namely, there exists at least one generator and the nonzero elements of the field are generatedby this element. Hence, if g is a generator of the multiplicative group of a finite field c1-math-0916, all invertible elements can be written as c1-math-0917.

One can choose to represent each invertible element c1-math-0918 simply by using its index i and represent zero by a special index. This construction – in which one represents the elements using their logarithm – is called exponential representation or cyclic representation, or Zech's construction. Then, typical arithmetic operations are greatly simplified using the following proposition:

Proposition 2

Let c1-math-0919 be a finite field and let g be a generator of c1-math-0920. Then c1-math-0921. In addition, if the characteristic of c1-math-0922 is odd, then c1-math-0923. Otherwise, c1-math-0924.

Proof

Clearly, one has c1-math-0925. If the field is of characteristic 2, then, as in c1-math-0926, one has c1-math-0927. Otherwise c1-math-0928 thus c1-math-0929. Yet, as we consider a field, the equation c1-math-0930 has at most two roots, 1 and c1-math-0931g is a generator, thus the order of g is c1-math-0932 rather than c1-math-0933. The only remaining possibility is c1-math-0934.

This statement gives the following encoding for an element c1-math-0935, if c1-math-0936 is generated by g:

equation

In particular, in our encoding scheme, let us denote by c1-math-0938 the codeword associated with c1-math-0939. We will also denote by c1-math-0940 the index of c1-math-0941; it is equal to c1-math-0942 if the characteristic of c1-math-0943 is odd and equal to c1-math-0944 otherwise. This enables one to write in a simple way all arithmetic operations.

·     Multiplication and division of invertible elements are, respectively, implemented as an addition and a subtraction of indexes modulo c1-math-0945.

·     Therefore, negation (taking the opposite) is simply the identity in characteristic 2 or an addition with c1-math-0946 modulo c1-math-0947 if the characteristic is odd.

·     Addition is the most complex operation. One must implement it using other operations. For example, it is possible to do so the following way: if c1-math-0948 and c1-math-0949 (with c1-math-0950) are two nonzero elements in a finite field, c1-math-0951. This requires to store the index of c1-math-0952 for all k. This is done by precomputing a table, c1-math-0953, of size q, containing the index of the successor of each element in the field. Eventually, addition is implemented with one subtraction of indexes (c1-math-0954), one access to a table (c1-math-0955) and one addition of indices (c1-math-0956).

Table 1.4 Zech's Construction on Invertible Elements in Odd Characteristic

     

Average Cost

Operation

Elements

Indexes

+/-

Tests

Access

           

Multiplication

c1-math-0957

c1-math-0958

1.5

1

0

Division

c1-math-0959

c1-math-0960

1.5

1

0

Negation

c1-math-0961

c1-math-0962

1.5

1

0

Addition

c1-math-0963

c1-math-0964

     
   

c1-math-0965

3

2

1

Subtraction

c1-math-0966

c1-math-0967

     
   

c1-math-0968

3.75

2.75

1

In Table 1.4, we study the calculation of these operations over the indices, assuming the existence of a single “table of successors” of size q. Here, we focus on the complexity of the calculation using the least amount of memory possible, considering random elements. We indicate the cost of the computations taking the mean number of additions and subtraction (+/-), the number of tests, and the number of times we use the table.

Exercise 1.26

Check that the polynomial X is a generator of the field c1-math-0969, constructed with the irreducible polynomial c1-math-0970. Then for the two polynomials c1-math-0971 and c1-math-0972, perform c1-math-0973 and c1-math-0974 using the operations described in Table 1.4.

Solution on page 289.

1.3.5.3 Primitive Roots

In order to put this implementation into practice, we need to find a way of producing generators of finite fields in the same way as we needed a way of producing prime numbers in order to build prime fields or irreducible polynomials to build nonprime fields.

Generators of prime fields A generator of the group of invertible elements in c1-math-0975 is called a primitive root of n. The least primitive root of m is denoted by c1-math-0976.

If p is a prime number, then c1-math-0977 always has exactly c1-math-0978 primitive roots. Indeed, by Lagrange's theorem (Proposition 5 page 5), the order of an element of a group divides the number of elements in the group. This means that the order of any nonzero element of c1-math-0979 divides c1-math-0980. Now suppose that there exists one primitive root g, that is, g generates the group of invertible elements of c1-math-0981. Then for any nonzero element x, there exists an index j such that c1-math-0982. Then, the order of x is c1-math-0983, that is, x is also a generator primitive root, if and only if its index is coprime with c1-math-0984. Now, one has to compute at least one of these c1-math-0985 primitive roots. For this, one uses the following test, which checks whether the order of an element taken at random is c1-math-0986 or not. The main difficulty is to factorize c1-math-0987, at least partially, and we see how to do this in Section 1.4.3.5.

ngr011

Theorem 9

Algorithm Test Primitive Root is correct.

Proof

One uses the result of Section 1.3.2.1: if a is an integer, of order k modulo p, then c1-math-0988 if and only if c1-math-0989.

One deduces that if the order of a is lower than c1-math-0990, as it divides c1-math-0991, then necessarily one of the values c1-math-0992 will be a multiple of the order of a. Otherwise, the only possible value for the order of a is c1-math-0993.

Therefore, a first method of finding a primitive root is to test all integers lower than p one after the other, which are not equal to 1, nor to c1-math-0994, nor any power of an integer; in this way, one is able to find the least primitive root of p. Numerous theoretical results exist, proving that it does not take a great number of attempts to find this first primitive root. It is generally of the order of

equation

with r the number of distinct prime factors of c1-math-0996. Another method is to draw random integers lower than p and to test whether they are primitive roots or not. As that there are c1-math-0997 primitive roots, the probability of success is c1-math-0998; thus, the expected value for the number of draws before finding a primitive root is c1-math-0999. This gives us a better chance than the brute-force method (trying all possibilities).

Generators of finite fields Now we know how to find a generator for a prime field. Let us consider the finite fields c1-math-1000. In order to build them, let us recall that, one has first to build c1-math-1001 and then to find an irreducible polynomial over this field whose degree is k. The question is how to find a generator polynomial of this field in order to encode elements with their index rather than using polynomials. Encoding and arithmetic operations are then the same as those of prime fields.

Once again, we use a probabilistic approach. First of all, let us consider an algorithm testing whether a polynomial is a generator in c1-math-1002. This algorithm is similar to the one we have seen for primitive roots in c1-math-1003.

ngr012

Therefore, once the field is built, an algorithm looking randomly for a generator is easy to implement. Besides, one can start the search into the set of polynomials of small degree (c1-math-1004). However, in order to manipulate sparse polynomials, it is useful to find an irreducible polynomial for which X is a primitive root. Such a polynomial is called X-irreducible, or primitive and in general can be quickly found. In practice, for finite fields of size between 4 and c1-math-1005, it is possible to show that more than one irreducible polynomial in c1-math-1006 is X-irreducible! Therefore, an algorithm looking randomly for an X-irreducible polynomial requires less than c1-math-1007 attempts on average. Thus, an algorithm for finding an irreducible polynomial having X as generator is a simple modification of Algorithm 1.9. If Algorithm 1.11 returns that X is not a generator. one does not select the irreducible polynomial found in Algorithm 1.9.

Example 1.8

Let us return to the example of the field c1-math-1008, which we built with the irreducible polynomial c1-math-1009.

Algorithm 1.11 performed on X returns that X is a generator. Therefore, one can perform computations using the powers of X (Exercise 1.26).

Recall the identification of the elements in c1-math-1010 and operation rules: c1-math-1011c1-math-1012c1-math-1013c1-math-1014c1-math-1015c1-math-1016c1-math-1017c1-math-1018c1-math-1019c1-math-1020c1-math-1021c1-math-1022c1-math-1023c1-math-1024; and c1-math-1025.

With c1-math-1026 written in form c1-math-1027, 1, Xc1-math-1028, multiplication and inverse calculation in c1-math-1029 are performed more easily. Addition is also much easier considering that c1-math-1030 for all k and t in c1-math-1031 such that c1-math-1032.

1.3.6 Curves Over Finite Fields

The exponentiation over finite fields is a good example of a one-way function, and we now have almost all tools to construct and efficiently compute in those fields. Onone hand, the field structure provides many tools for the construction of codes. On the other hand, this structure itself allows more possibilities for code breakers in cryptography. It is possible to define this type of one-way function in a more general structure, a group, so that cryptanalysis is even more difficult. An example of such a group structure is the set of points of a curve defined over a finite field.

In a generic group, we denote by + the group law (which is the multiplication in the group of invertible elements of a finite field c1-math-1033 for example). Then, the multiplication by an integer (i.e., that integer number of calls to the group law, which is the exponentiation by an integer in c1-math-1034) can be used as a one-way function. The discrete logarithm problem, in this general formulation, is to find the number of times one has to add a given generator of the group in order to obtain a given element of the group. We denote this as c1-math-1035 for some scalar (integer) n and an element P of a group (Table 1.5).

Table 1.5 Discrete Logarithm and Exponentiation in Finite Fields and Generic Groups

Group

Exponentiation

DLP

c1-math-1036

c1-math-1037 with c1-math-1038 and c1-math-1039

Find c1-math-1040 s.t. c1-math-1041

G

c1-math-1042 times, that is, c1-math-1043

Find c1-math-1044 s.t. c1-math-1045

1.3.6.1 Weierstrass Model

Let c1-math-1046 be a prime number, c1-math-1047, and let c1-math-1048 such that the polynomial c1-math-1049 does not have multiple roots and consider the equation

1.12equation

If c1-math-1051 is a solution of (1.12 ), then any multiple c1-math-1052 is also a solution. Two solutions are called equivalent if they are equal up to a constant multiplicative factor. This defines an equivalence relation. The elliptic curve c1-math-1053 is the set of equivalence classes of solutions of (1.12 ), which are called points of the curve. For one equivalence class, noted c1-math-1054, if c1-math-1055, there exists a representative of the class of the form c1-math-1056. Indeed, just set c1-math-1057 and c1-math-1058. On the other hand, If c1-math-1059 then xmust also be zero, and there is exactly one equivalence class of solutions with this form. It is denoted by c1-math-1060 and its chosen representative is usually c1-math-1061. In summary the set of points is entirely defined by the cases c1-math-1062 and c1-math-1063; therefore, the definition of an elliptic curve can be simplified to

1.13equation

In fact, the general form of the equation of an ellipse is c1-math-1065. Now if the characteristic of the field is neither 2 nor 3, then the change of variable c1-math-1066 yields an isomorphic curve c1-math-1067 and then a second change of variable c1-math-1068 enables to simplify the equation to (1.13 ). This can be generalized for fields of characteristics 2 and 3:

1.  If c1-math-1069, use c1-math-1070 to get c1-math-1071, followed by c1-math-1072, to obtain c1-math-1073.

2.  Else, c1-math-1074 and c1-math-1075 gives c1-math-1076.

3.  If c1-math-1077, use c1-math-1078 to get c1-math-1079, followed by c1-math-1080 to obtain c1-math-1081.

4.  Else, c1-math-1082 and c1-math-1083 gives c1-math-1084.

Exercise 1.27

Verify that the given variable changes are correct.

Solution on page 289.

To make an exhaustive search for a discrete logarithm impossible in practice, the group of points in an elliptic curve has to be large enough. The following theorem states that the number of points is of the order of the size of the involved finite field.

Theorem 10 (Hasse)

For any prime power c1-math-1085, let c1-math-1086 be the number of points of c1-math-1087, then

equation

1.3.6.2 The Group of Points of an Elliptic Curve

Now that we have defined the points in an elliptic curve, we need to provide a group law. We first give the abstract definition.

Theorem 11

Let c1-math-1089 be a field of characteristic greater than 5 and c1-math-1090 an elliptic curve. Then c1-math-1091 with the following rules for addition is an abelian group:

·     c1-math-1092 is the neutral element for c1-math-1093.

·     For c1-math-1094 is the opposite of P for c1-math-1095.

·     For c1-math-1096 and c1-math-1097 then:

·     if c1-math-1098, then c1-math-1099 and

equation

·     else, c1-math-1101 and:

§  if also c1-math-1102, then c1-math-1103,

equation

§  else c1-math-1105 and c1-math-1106.

The rules of addition derives from the representation of elliptic curves over the real field: if P and Q are two different points on the curve, then c1-math-1107 is the symmetric (with respect to the x-axis) of the third intersection of the curve with the line PQ. In the same manner, c1-math-1108 is the symmetric (with respect to the x-axis) intersection of the tangent in P with the curve, as shown on Figure 1.9. In both cases, c1-math-1109 is the slope of the line and the y-intercept can naturally be recovered as, for example, c1-math-1110.

Exercise 1.28

Let c1-math-1111 and c1-math-1112.

1.  Check that c1-math-1113.

2.  Check that c1-math-1114.

3.  Compute c1-math-1116 and check that it belongs to the curve.

4.  Compute c1-math-1117, the doubling of P, and check that it belongs to the curve.

5.  Compute c1-math-1118, the doubling of Q, and check that it belongs to the curve.

Solution on page 290.

nfg009

Figure 1.9 (a) Elliptic curve addition and (b) doubling on c1-math-1115

Once again, this law of addition can be generalized in characteristics 2 and 3 as given in Table 1.6.

Table 1.6 Group Laws in Characteristics 2 and 3 with c1-math-1119 and c1-math-1120

p

Curve

Addition

Doubling

2

c1-math-1121

c1-math-1122

c1-math-1123

   

c1-math-1124

c1-math-1125

 

opposite: c1-math-1126

c1-math-1127

c1-math-1128

 

c1-math-1129

c1-math-1130

c1-math-1131

   

c1-math-1132

c1-math-1133

 

opposite: c1-math-1134

c1-math-1135

c1-math-1136

3

c1-math-1137

c1-math-1138

c1-math-1139

   

c1-math-1140

c1-math-1141

 

opposite: c1-math-1142

c1-math-1143

c1-math-1144

 

c1-math-1145

c1-math-1146

c1-math-1147

   

c1-math-1148

c1-math-1149

 

opposite: c1-math-1150

c1-math-1151

c1-math-1152

Moreover, note that using any of these addition laws, the algorithm for multiplication by an integer remains almost exactly Algorithm 1.5, page 41, where multiplications are replaced by c1-math-1153 and squarings are replaced by doublings. In this setting, exponentiation by squaring (or square-and-multiply) is often called double-and-add.

Exercise 1.29

Let c1-math-1154 and c1-math-1155, compute c1-math-1156 with only three operations.

Solution on page 290.

Note that there exists many other coordinate systems, such as projective and Jacobian, which differ in the number of multiplications, squarings, additions, or inversions in the base field, for the same group law. The choice of system depends on the respective speed of these operations and the target architecture. Note also that for certain subset of curves, such as Edwards curves, the coordinate system can be simplified, often leading to practical enhancements.

The US National Institute of Standards and Technology (NIST) recommends some elliptic curves4, which contains a large number of points, with sizes ranging from c1-math-1158 to c1-math-1159. Many other databases exist, let us mention Bernstein and Lange's Explicit-Formula database5 and the Acrypta database6, which contains some Edwards curves.

1.3.7 Pseudo-Random Number Generators (PRNG)

Generation of random numbers is widely used in all the methods we have just seen and will be often used in the sequel. In particular, generating numbers randomly is a condition for the perfection of Vernam's OTP scheme (see page 15). Now, it is time to look more deeply at this problem which needs some development.

The definition of randomness is crucial in coding. Indeed, any message presenting some kind of organization (organization is supposed to be the opposite of randomness) is an angle of attack for compression and code breaking. Therefore, one should rely on a solid theory concerning randomness in order to build secure and efficient codes.

Producing a truly random event is unsolvable by computers – which by definition only respond to determined and predictable processes. In order to obtain values produced by “ true” randomness (even if this notion is not completely absolute and refers to what one can observe and predict), one has to call upon assumed unpredictable physical phenomena, such as thermal noise or the description of the Brownian motion of electrons in a resistor.

However, this production of numbers may be called randomness only because we are not able – given our current knowledge in these areas – to explain their mechanisms and because only probabilistic theories enable us to grasp them. With computers, we wish to proceed in the same way in order to generate random numbers. We apply some procedures that make the result unpredictable in practice. This is what we call pseudo-random generation.

Production of random numbers is a very complicated task that has attracted the attention of both machine designers (“ hardware” components, such as thermal noise) and software designers (“ software” products, see examples in the next section) for some time. One must pay attention to this main issue because there exist some efficient methods (we will study some of them) that enable one to detect nonrandom features in a sequence, which is supposed to be random, to recover the method that produced it, and then to break a randomly generated key. If the machine that generates the winning lotto combination were not based on some good random generation process, one would be able to predict the next winning combination.

One often generates a sequence of pseudo-random numbers by computing each number from the previous one (which obviously makes the process completely deterministic and eliminates all randomness), in such a significantly complicated way that – examining the sequence without knowing the method – one could believe that it was truly generated randomly.

A generator must satisfy certain properties to be called a pseudo-random generator. All generated numbers have to be independent from each other. Moreover, they must have a great entropy, and hopefully no rule can be recovered from the sequence of generated numbers. There are several ways to determine whether a generator is acceptable. First of all, one has to make it pass some statistical tests in order to check that the distribution it produces does not significantly differ from the one expected from a theoretical model of randomness. Besides, one can also use algorithmic complexity principles: that is show that in reasonable time no algorithm will be able to predict the mechanisms of the generator.

For example, one can build a generator based on the model of Fibonacci's sequence, by producing the numbers c1-math-1161m being the maximum integer that can be produced. The main advantages of this method are the following: it is very easy to implement, very fast in execution, and “modulo” operation enables one to obtain some hard-to-predict behavior for the generator. However, this generator – like most typical and simple generators – has drawbacks, and it is possible to recover its behavior based on statistical analysis.

The requirements for a pseudo-random generator are very similar to the properties one expects from a ciphertext. Indeed, it must be impossible, when receiving a message or a number, to find out the way it was produced without knowing the key. That is why some methods for random number generation look like cryptographic methods or use them.

1.3.7.1 Congruential Generators

One says that a generator is a Linear Congruential Generator (LCG) if it satisfies the following principle: if c1-math-1162 is the sequence of generated random numbers, one calculates c1-math-1163 from its predecessor: c1-math-1164, with m a large number, and c1-math-1165. The generator is called multiplicative if c1-math-1166. Such a sequence is always periodic; thus, one will have to choose c1-math-1167 such that the period is significantly high. For example, for c1-math-1168, the period is 4. Hence, this generator is not satisfactory at all.

The maximum period is obviously m. There exists a result providing a description of all generators of period m with c1-math-1169:

Theorem 12

The Linear Congruential Generator defined by c1-math-1170 is of period m if and only if c1-math-1171 is coprime with m and a is a primitive root of m.

One usually chooses m as the greatest prime number which can be given by a machine (we have seen how to generate such a number on page 44). Obviously, a large period is not a sufficient criterion for the production of random generators (consider the choice c1-math-1172). Exercise 1.42, on page 85, is an approach to methods of attacking LCGs.

1.3.7.2 Linear Feedback Shift Register (LFSR)

One can generalize Linear Congruential Generators using not only the previous value to build the next element in the sequence but also several previous values, namely c1-math-1173 is computed from linear combinations of c1-math-1174. In other words:

equation

These generators are particularly interesting if m is a prime number because their maximum period is then c1-math-1176, and this maximum is reached, see Theorem 13. Hence, it is possible, even with a small modulo, to have very large periods.

For example, in order to generate random sequences of bits, one chooses c1-math-1177. In this case, the operations can be performed very quickly on a machine with “ eXclusive ORs” (xor) for the addition modulo 2 and with shifts on the bits c1-math-1178 to generate the next bits. There even exist specialized chips performing the necessary operations. Then, one talks about Linear Feedback Shift Register (LFSR). Figure 1.10 summarizes their mechanisms.

nfg010

Figure 1.10 Functional diagram of an LFSR

For some computations, it is interesting to write an LFSR in a polynomial form: set c1-math-1179.

Hence, c1-math-1180 refers to the infinite sequence of bits c1-math-1181 linearly generated by the polynomial c1-math-1182, having the first k initial values set to c1-math-1183.

Exercise 1.30

Write the first eight values generated by the following shift register:

equation

Solution on page 290.

Finally, we have the equivalent of Theorem 12 for LCG with the primitive root replaced by a primitive polynomial.

Theorem 13

For some polynomial c1-math-1185 of degree k, the c1-math-1186 modulo a prime number p is of maximum period c1-math-1187 if and only if c1-math-1188 is a primitive polynomial in c1-math-1189.

These generators are quite fast. Besides, they have also a very large period. However, we will see in Section 1.4.3.2 that the Berlekamp–Massey algorithm enables one to predict the following bits without knowing the generator polynomial, provided that c1-math-1190 successive values have been intercepted.

These generators are used in practice to generate quickly some bits with good statistical properties but they have to be combined with other generators to be cryptographically secure.

Example 1.9 (Securing the Bluetooth Protocol)

Bluetooth is a short-range wireless technology whose aim is to simplify the connections between electronic equipment. It was designed to replace the wires between computers and their devices such as printers, scanners, mice, cell-phones, pocket-PCs, or even numerical cameras.

In order to make this protocol safe, one uses some kind of Vernam encryption scheme (Section 1.2.1) but with a pseudo-random generator based on LFSR: the encryption algorithm uses four LFSRs of respective length c1-math-1191, and c1-math-1192 for an overall c1-math-1193 bits. The 128 bits of the initial value represent the secret key of Bluetooth encryption. Figure 1.11 shows the functional diagram of this encryption scheme.

We notice that the four polynomials that are used are as follows:

·     c1-math-1194;

·     c1-math-1195;

·     c1-math-1196;

·     c1-math-1197.

These four polynomials are primitive polynomials in c1-math-1198 for an overall period of c1-math-1199.

The 4 bits c1-math-1200 produced by these four successive LFSRs are then combined using a nonlinear discrete function f which produces the next bit c1-math-1201 on the output, from its initial state (c1-math-1202) and the successive values of the LFSR, according to the following algorithm:

1.  c1-math-1203 (operations over c1-math-1204);

2.  c1-math-1205 (operations over c1-math-1206);

3.  c1-math-1207 and c1-math-1208 are the 2 bits encoding c1-math-1209;

4.  c1-math-1210 (operations over c1-math-1211); and

5.  c1-math-1212 (operations over c1-math-1213).

nfg011

Figure 1.11 Bluetooth encryption

1.3.7.3 Cryptographically Secure Generators

One can use the principle of a one-way function, namely a function easy to compute but difficult to invert (computation in unreasonable time), to determine the quality of a generator.

The formal definition of a good quality generator is as follows. Given a generator and a finite sequence of bits it has generated, if it is possible, without knowing the method, to predict with good probability and in reasonable time the next bit of the sequence, then the generator cannot be considered as a random generator. Here, a good probability means significantly greater than a random guess, that is, c1-math-1214 for a sequence of bits.

If one is able to prove that there exists no efficient algorithm performing this prediction, then the generator is called cryptographic or cryptographically secure.

For example, the Blum–Micali generator proceeds in the following way:

·     Generate a large prime number p.

·     Let c1-math-1215 be a primitive root of p (a generator of the group of invertible elements in c1-math-1216).

·     Let f be the modular exponentiation function c1-math-1217.

·     Let B be the function with values in c1-math-1218 defined by:

·     c1-math-1219 if c1-math-1220;

·     c1-math-1221 if c1-math-1222.

The pseudo-random sequence of bits c1-math-1223 is then computed from the sequence c1-math-1224, with c1-math-1225 any nonzero element in c1-math-1226 and c1-math-1227 for c1-math-1228. One sets c1-math-1229.

The function B is easy to compute: we have seen in the previous subsections how to generate a prime number, how to find a primitive root, and how to carry out modular exponentiation. Finally, when computing B, one has the value of c1-math-1230 using c1-math-1231that has just been calculated. However, without the sequence c1-math-1232, one can prove that finding c1-math-1233 is as difficult as computing the discrete logarithm. As we do not know any efficient algorithm solving the discrete logarithm problem, it is a difficult problem. Therefore, the generator is cryptographically secure.

1.3.7.4 Several Statistical Tests

The previous methods are based on the well-known algorithmic difficulty of some problems, which makes it impossible to predict the behavior of generators. In order to measure the quality of a generator, one can also examine the sequences generated, and test whether they diverge from what we expect from a truly random generator. This is a difficult task as the criteria are numerous and are not necessarily trivial.

Statistics provide us with an adequate tool for these tests. For example, the c1-math-1234 test enables one to measure the deviance with respect to an expected uniform discrete law.

For all characters c1-math-1235 in the alphabet V, one has the expected probability c1-math-1236 and the number c1-math-1237 of occurrences in the generated sequence of size n. The expected frequencies are never exactly equal to the empiric frequencies. Therefore, one has to set the threshold of divergence from which the generator is no longer considered as random.

One has to keep in mind that, when considering a random generator, all sequences are possible a priori, even those whose distribution is completely different from the expected one because the generator is actually random. These sequences are only very unlikely to appear. If one observes such sequences in the output of a generator, then the generator is probably not so good (even if these sequences can theoretically appear in the output of a good generator). Here is how the c1-math-1238 test works.

One measures the gap between the expected distribution and the observed distribution using the parameter:

equation

Now, it remains to determine the acceptable values for the parameter K. They are given by the tables of the c1-math-1240, of which we give an extract in Table 1.7.

Table 1.7 c1-math-1241 Table (Extract)

Degrees of liberty

c1-math-1242

c1-math-1243

c1-math-1244

9

11.39

16.92

21.67

10

12.55

18.31

23.21

11

13.70

19.68

24.72

12

14.85

21.03

26.22

15

18.25

25.00

30.58

20

23.83

31.41

37.57

30

34.80

43.77

50.89

In this table, the first column gives the number of “ degrees of liberty.” One sets this number to the value c1-math-1245. Namely, for an alphabet of size 21, the line number 20. The first line gives the probability of having the value K lower than the value of the table. For example, the probability of having K greater than 24.72 for an alphabet of size 12 is 0.01.

Exercise 1.31 (c1-math-1246 test)

A pseudo-random generator which is supposed to generate numbers between 0 and 10 according to a uniform law gives the sequence: 0 0 5 2 3 6 4 2 0 2 3 8 9 5 1 2 2 3 4 1 2. Perform the c1-math-1247 test on this sequence. What do you think of this generator? What do you think of the test?

Solution on page 290.

Obviously, such a test – although it can be useful and sufficient to reject a generator – is not sufficient to accept a generator. For example, it will not be able to reject the sequence c1-math-1248, whereas a not-so-drilled eye will notice the regularity in it (although one should distrust the impression of regularity one can have looking at a sequence, as it can be biased by false intuitions on what is actually true randomness).

One can, for example, strengthen this test by applying it to the extensions of the source induced by the message (Section 1.2.3.4). There exist numerous statistical tests reinforcing the trust one could have concerning a generator. It is important to notice that each test enables one to reject a generator, but only the set of all tests will enable one to accept it (besides, without mathematical rigor). There is no guarantee that – after succeeding in x tests – the generator will not reveal itself to be weak under the c1-math-1249 test.