Foundations of Coding: Compression, Encryption, Error Correction (2015)

Chapter 1. Foundations of Coding

1.4 Decoding, Decryption, Attacks

To conclude this introductory chapter, we develop encoding methods adopting the point of view of the inverse operation, decoding. We have already seen that, for many reasons, decoding – which consists in inverting the encoding functions – is not a trivial task:

·     as in the fax code we detailed at the beginning of this chapter, if the ciphertext is a sequence of bits, recovering the source message without ambiguity by parsing the sequence into blocks requires a particular form of the code;

·     an exact decoding is sometimes not even completely attainable, if the encodingmethod does not include all the initial information, when performing a compression for example; we have seen that the fax image loses in quality; there exist many encoding methods “ with loss,” which makes encoding and decoding asymmetrics;

·     we have seen that the principle of one-way functions makes the computation of the encoding function and the decoding function completely different; it may happen that there is an efficient algorithm for one of them but not for the other.

We are going to develop all these aspects, using the word decoding as a general term allowing one to recover the source from a codeword, the word decryption for cryptographic decoding, and the words breaking or attack for an unauthorized decryption, namely when the recipient is the only one possessing the information.

1.4.1 Decoding without Ambiguity

The first virtue of a code is its ability to be decoded. This is obvious, but not necessarily a trivial issue.

Let us suppose that the code is a bijective function, which transforms the message written by the sender into a message transmitted through the channel. For a source message c1-math-1250, a string over any source alphabet, and for a code alphabet V, let us denote by f the encoding function. Then, one has the codeword c1-math-1251, with c1-math-1252 for all i. The code, seen as the set of all codewords, is then the image of the encoding function f. However, f being bijective is not enough for the message to be decoded without ambiguity by the recipient.

As an example, let us consider the encoding of the c1-math-1253 alphabet letters c1-math-1254 using integers c1-math-1255 written in base 10:

equation

Then, the codeword 1209 may correspond to several messages: for example, BUJ, MAJ, or BCAJ.

Thus, in order to avoid such a problem, one has to add some constraints on the code for any message to be decoded without ambiguity. That is to say, when receiving a codeword, the recipient has to be able to recover a unique message from it. A code C over an alphabet V is called nonambiguous (one sometimes says uniquely decodable) if, for all c1-math-1257, there exists at most one sequence c1-math-1258 such that

equation

The following property is just a simple reformulation:

Property 8

A code C over an alphabet V is nonambiguous if and only if for all sequences c1-math-1260 and c1-math-1261 in c1-math-1262:

equation

Example 1.10 (Over the Alphabet c1-math-1264)

·     the code c1-math-1265 is not uniquely decodable.

·     the code c1-math-1266 is uniquely decodable.

·     the code c1-math-1267 is uniquely decodable.

The decoding constraint implies that all codewords should have a minimum length. Kraft's theorem gives a necessary and sufficient condition on the length of codewords to insure the existence of a uniquely decodable code.

Theorem 14 (Kraft)

There exists a uniquely decodable code over some alphabet V with n codewords of length c1-math-1268 if and only if

equation

Proof

(c1-math-1270) Let C be a uniquely decodable code, of arity q (the vocabulary of the code contains q characters). Let m be the length of the longest word in C. For c1-math-1271, let c1-math-1272 be the number of words of length k. One develops the following expression, for any integer u, with c1-math-1273:

equation

Once developed, each term of this sum is of the form c1-math-1275 Then, by regrouping for each value c1-math-1276, one obtains the terms c1-math-1277 Set c1-math-1278. The initial expression can be written as follows:

equation

Notice that c1-math-1280 is the number of combinations of words in C whose overall length is equal to s. As C is uniquely decodable, two combinations of words in C cannot be equal to the same word over the alphabet of C. As C is of arity q, and c1-math-1281 is lower than the overall number of messages of length s on this alphabet, one has c1-math-1282. This implies that

equation

Thus, c1-math-1284, and c1-math-1285 when u tends toward infinity. (c1-math-1286) The reciprocal proposition is a consequence of McMillan's theorem, which is studied later on in this chapter.

1.4.1.1 Prefix Property

One says that a code C over an alphabet V has the prefix property (one sometimes says that it is instantaneous, or irreducible) if and only if for all pairs of distinct codewords c1-math-1287 is not a prefix of c1-math-1288.

Example 1.11

c1-math-1289b is not a prefix of a. However, c is a prefix of a.

If the prefix property applies, one is able to decode the words of such a code as soon as one has received the whole word (instantaneousness), which is not always the case with uniquely decodable codes: for instance, if c1-math-1290, and one receives message c1-math-1291. Then one will have to wait for the next occurrence of a 0 to be able to decode the second word (0 or 01?).

Property 9

Any code having the prefix property is uniquely decodable.

Proof

Let C be a code over V that is not uniquely decodable and has the prefix property. Then there exists a string c1-math-1292 such that c1-math-1293, with c1-math-1294 and c1-math-1295 codewords of C and c1-math-1296 for at least one index i. Let us choose the minimum index isuch that c1-math-1297 (for all c1-math-1298). Then c1-math-1299, otherwise, given the choice of i, one would have c1-math-1300, which contradicts the definition of i. If c1-math-1301, then c1-math-1302 is a prefix of c1-math-1303. Otherwise, c1-math-1304 is a prefix of c1-math-1305. Thus, Cdoes not have the prefix property.

The reciprocal proposition is false: indeed, the code c1-math-1306 is uniquely decodable but it does not have the prefix property. The following property is obvious, but it insures the decoding ability for some widely used kinds of codes.

Property 10

If all the words of some code are of the same length, then it has the prefix property.

Exercise 1.32

Let S be the source of alphabet c1-math-1307 with probabilities:ngr020One encodes S using the following codes: ngr021

1.  Encode adbccab. Decode 1001101010.

2.  Is it an instantaneous code?

3.  Compute the entropy of the source.

Solution on page 291.

Exercise 1.33

We wish to build a binary compression code over a source c1-math-1310 (supposed to be infinite) where c1-math-1311 is the source alphabet and c1-math-1312 is the probability law of c1-math-1313.

One proposes the following code: one enumerates the number of occurrences of “0” before the appearance of “1.” The two encoding rules are as follows:

·     A string of four consecutive “0”s (without “1”) is encoded with 0.

·     If less than four “0s” appear before a symbol “1,” one encodes the string with the codeword “c1-math-1314,” c1-math-1315 being the binary representation of the number of “0s” before the symbol “1.”

For instance, the appearance of four consecutive zeros “0000” is encoded with “0,” whereas the string “001” is encoded with “110” because two “0”s appear before the symbol “1” (and “10” is the binary representation of 2).

1.  Write explicitly the five codewords of this compression code. Does this code have the prefix property?

2.  2. Knowing that the probability of appearance of two successive symbols c1-math-1316 and c1-math-1317 is – when supposing that the source is without memory – c1-math-1318, compute the probability of occurrence in c1-math-1319 of a string composed of k “0”s followed by a “1.”

3.  For each codeword, compute the number of bits of code required per bit of the source. Deduce the compression rate of this code, namely the mean length per bit of the source.

Solution on page 291.

1.4.1.2 Huffman Trees

A Huffman tree is an object that enables one to easily represent all codes having the prefix property, and this representation makes their manipulation a lot easier. Here, we give the definitions in the binary case. However, these can be extended to codes of any arity.

One calls a Huffman tree a binary tree such that any subtree has either 0 or 2 sons (the tree is locally complete). One assigns the symbol “1” to the edge connecting the local root to the left subtree and “0” to the edge connecting the local root to the right subtree.

nfg012

Figure 1.12 Example of a Huffman tree

To each leaf of a Huffman tree, one can associate a word in c1-math-1320: it is a string composed of the symbols marking the edges of the path from the root to the leaf.

The maximum length of the words in a Huffman tree is called the depth of the tree. One calls a Huffman code the set of words corresponding to the paths in a Huffman tree; the depth of this tree is also called depth of the code C.

Example 1.12 (Code corresponding to the tree of Figure 1.12)

equation

1.4.1.3 Representation of Instantaneous Codes

The introduction of Huffman trees is justified by the two following properties, which enable one to manipulate instantaneous codes with such trees.

Property 11

A Huffman code has the prefix property.

Proof

If a codeword c1-math-1322 is a prefix of c1-math-1323, then the path representing c1-math-1324 in the Huffman tree is included in the path representing c1-math-1325. As c1-math-1326 and c1-math-1327 are, by definition, associated with the leaves of the tree, c1-math-1328. Thus, there do not exist two different codewords such that one of them is a prefix of the other. Hence, the Huffman code has the prefix property.

Property 12

Any code having the prefix property is included in a Huffman code.

Proof

Let us consider a complete Huffman tree (all leaves are at the same distance from the root) of depth l (the length of the longest word in C). Each codeword c1-math-1329 in C is associated to a path from the root to a node. Then, one can prune the subtree having this node as a root (all the words that could be represented in the nodes of this subtree would have c1-math-1330 as a prefix). All other codewords in C remain in the nodes of the resulting tree. It is possible to perform the same operation for all the other words. One eventually obtain a Huffman tree containing all codewords in C.

1.4.1.4 McMillan's Theorem

We have seen that Huffman trees enable one to represent all instantaneous codes. However, they do not enable one to represent all uniquely decodable codes. McMillan's theorem insures that one can avoid the use of uniquely decodable codes (nonambiguous) not having the prefix property. Indeed, there always exists another code that has the prefix property with the same lengths of codewords. Therefore, nothing can be gained by using ambiguous codes.

Theorem 15 (McMillan)

Over an alphabet V, there exists a code having the prefix property whose codewords c1-math-1331 are of length c1-math-1332 if and only if

equation

Exercise 1.34

Using the representation of instantaneous codes with Huffman trees, give a proof of McMillan's theorem.

Solution on page 291.

Corollaire 1.81

If there exists a uniquely decodable code whose words are of length c1-math-1334, then there exists an instantaneous code whose words are of the same length.

This is a consequence of Kraft's and McMillan's theorems. All decodable codes not having the prefix property do not produce codes with shorter words than instantaneous codes; therefore, one can limit oneself to the latter codes for information compression (their properties make them easier to use).

1.4.2 Noninjective Codes

All codes do not insure a nonambiguous decoding and are not even bijective. It might happen, for several reasons, that encoding functions only process a digest (or a fingerprint) of a message, or only the information which is considered to be sufficient for the needs of transmission. For instance, fingerprints are used for error detection: when receiving a message and its fingerprint, the recipient is able to check that the overall message has not been modified during the transmission by recomputing the fingerprint from the message he has received and comparing it to the fingerprint that was transmitted.

Lossy compression is used, for example, in processing images or sounds. The information is encoded in a way that will enable one to retrieve maybe only a variation of the original data. The differences should be slight enough so that they are not perceptible (for human ear or eye) or so that the new data is still useful.

1.4.2.1 Fingerprint Integrity Check

The most simple principle of error detection is an example of fingerprint computation. We have seen an example of this kind of encoding with the fax code, even if the code we added to each line did not depend on the content of the line. Besides, this only had a limited detection capacity.

The first principle of fingerprints for error detection is the addition of a simple parity bit to the cipherblocks. For a word c1-math-1335, the parity bit is equal to c1-math-1336. Obviously, this equality is false when an odd number of bits change their value in the set “ message+fingerprint.” Hence, the addition of a parity bit enables one to detect errors on a odd number of bits. We will see this mechanism in more detail in Chapter 4, in particular in Figure 4.2 on page 212.

1.4.2.2 Hash Functions

The hash function follows the same principle, but it encodes a more evolved fingerprint, as it is meant to identify the message. This is the definition of a summary of the message, which will enable one not to recover it, but to identify it using a correspondence table. This works as a human fingerprint, which does not enable one to reconstitute the other characteristics of an individual but which enables one to identify him.

Hash functions are particularly useful in cryptography. They notably enable one to decrease the amount of information to be encrypted. If the image of x by the hash function is called the fingerprint of x, one can – for example – encrypt only the fingerprint. Moreover, they enable one to set up electronic signature and message authentication protocols (see Section 3.5.3) and also to check the integrity of a document, in the same way as the parity bit (which is a particular hash function). Formally, a hash function c1-math-1337 is an application that transforms a string of any size into a string of fixed size n, as illustrated in Figure 1.13.

nfg013

Figure 1.13 Principle of a hash function

One talks about a collision between x and c1-math-1338 when

equation

Considering that the input of a hash function can be of any size (in particular c1-math-1340), collisions are inevitable. If y is such that c1-math-1341, then x is called the preimage of y (one recalls that y is the fingerprint of x).

One of the basic constraints in setting up a hash function is efficiency: a fingerprint must be easy to compute. Besides, hash functions have a natural compression property.

Other properties can be expected:

·     Preimage resistant: given y, one cannot find – in reasonable time – some x such that c1-math-1342.

·     Second preimage resistant: given x, one cannot find – in reasonable time – c1-math-1343 such that c1-math-1344;

·     Collision resistant: one can not find in reasonable time – x and c1-math-1345 such that c1-math-1346;

one-way hash function is a hash function satisfying the properties of preimage resistance and second preimage resistance.

Exercise 1.35 (Security of Hash Functions)

Prove, using the contrapositive proposition, that collision resistance implies second preimage resistance, which implies preimage resistance.

Solution on page 292.

Exercise 1.36 (A Bad Hash Functionxs)

Let c1-math-1347 be any function. One proposes an iterative hash function c1-math-1348, such that, for some x of size c1-math-1349 bits, divided into two blocks c1-math-1350 and c1-math-1351, one has c1-math-1352 where c1-math-1353 is the concatenation of c1-math-1354 and c1-math-1355. Prove that g is not second preimage resistant.

Solution on page 292.

Hash functions can be used for

·     Manipulation Detection Code (MDC) that enable one to check the integrity of a message (in the manner of parity bits);

·     MAC that manage both the integrity and the authentication of the source of data.

We will see several examples of such applications in Chapter 3.

Construction of a Merkle–Damgård hash function One of the most famous constructions of hash functions relies on a compression function

equation

Such a function is illustrated in Figure 1.14.

nfg014

Figure 1.14 Compression function of a hash function

Message M is split into blocks of b bits c1-math-1357 (one will possibly have to add some padding bits for the size of M to be divisible by b).

One iterates the compression function h according to the scheme presented in Figure 1.15.

nfg015

Figure 1.15 Merkle–Damgård construction

IV (Initial Value) is a string (of size n) fixed by the algorithm or the implementation. A theorem – which was proved independently by Ralph Merkle and Ivan Damgård – enables one to describe the theoretical properties of such a construction:

Theorem 16 (Merkle–Damgård)

If h is collision resistant, then so is H (Figure 1.15).

It is this result that actually makes Merkle–Damgård construction the most used construction in fingerprint computation.

Exercise 1.37

Prove Merkle–Damgård's theorem using the contrapositive proposition.

Solution on page 292.

Exercise 1.38 (Construction by Composition)

Let c1-math-1358 be a hash function and let c1-math-1359 another hash function such that, if c1-math-1360, then c1-math-1361c1-math-1362 standing for the concatenation operation.

1.  Prove that if f is collision resistant, then so is h.

2.  What is the drawback of this construction ?

Solution on page 293.

Hence, one only needs to make explicit the construction of compression functions h, which are collision resistant. For instance, the Davies–Meyer construction (Figure 1.16) defines c1-math-1363, where c1-math-1364 is a symmetric block encryption function.

nfg016

Figure 1.16 Davies–Meyer construction

But an attack on preimage resistance was set up by Drew Dean in 1999, who exploited the existence of fixed points in this construction. Therefore, compression functions using this construction are less robust.

The Miyaguchi–Preneel construction (Figure 1.17) is an improvement on the previous construction and is particularly robust from a cryptographic point of view.

nfg017

Figure 1.17 Miyaguchi–Preneel construction

Function g adapts the construction to the size of the key of the encryption function E. Hence, one has c1-math-1365.

Galois hashing Another popular hash function is GHASH, for Galois hashing, which uses multiplication in the field with c1-math-1366 elements and Horner scheme. The idea is to choose an element h of c1-math-1367 where the field is usually build as polynomials modulo 2 and modulo the primitive polynomial c1-math-1368. Then, a message m is cut into c1-math-1369 blocks c1-math-1370 of c1-math-1371 bits and each block is considered as a coefficient of the reverse polynomial c1-math-1372. The hash value is obtained as the evaluation of this polynomial in hvia Algorithm 1.7, where each block c1-math-1373 is considered as the element c1-math-1374:

equation

Exercise 1.39 (Security of GHASH)

1.  Suppose that there exist i and j with c1-math-1376 such that for the chosen element of the c1-math-1377 multiplication, we have c1-math-1378. Deduce a way to build a collision in the hash function.

2.  We know that c1-math-1379, with c1-math-1380. How many possible distinct orders are there for elements of c1-math-1381?

3.  For a given h, with a given order o, what size of message would be required to obtain a collision using the first question?

4.  If you were to choose an element h for the hashing, which elements would be best?

5.  For a randomly chosen non zero element h, what is the probability that c1-math-1382 does not divide its order?

6.  If this is not the case what size of message would be required to obtain a collision using the first question?

Solution on page 293.

1.4.2.3 Lossy Transformations

Other transformation processes end with a message encoded with loss and make full decoding impossible. For example, a fax is just a bad copy of the original image, but one sees that it fulfills its duty of transmitting the information contained in the document. One can also encode an image (respectively a sound) without keeping all the information, provided that the loss is not noticeable to the naked-eye (respectively to the ear).

Numerical information is, in essence, discrete information. We have seen that continuous or analogical data, like sounds and images, can be easily digitized. As for sounds, one can encode at each instant a frequency and an amplitude. For images, one decomposes them into pixels and encodes a color for each pixel.

Yet, this natural digitization might not be the best model for many codes. In images, for example, the colors of two contiguous pixels are often not independent, and it will be more judicious to encode a set of pixels as a function rather than a single pixel as a value. Therefore, one encodes blocks of pixels with a periodic (or almost periodic) function.

Hence, encoding happens to be performed on functions rather than on discrete or numerical entities, and this is the principle of the following section. Therefore, encoding will be a particular function, which will be applied to functions, and that we will call rather atransform, or a transformation.

1.4.2.4 Fourier Transform and Discrete Fourier Transform (DFT)

Let us suppose that a message, or part of a message, can be formulated as a periodic and integrable function h (more precisely h should be in c1-math-1383), varying with respect to time t, and of period c1-math-1384. This happens with sounds. As we assume that h is periodic, the same message can be formulated as an amplitude H, varying with respect to a frequency f. The Fourier transform is an encoding process that enables one to switch from one representation another. Like any encoding process, it is associated to its inverse, decoding, and is formulated with the following formulas (Figure 1.18).

nfg018

Figure 1.18 Fourier transform

For a sound, even if the natural and immediate encoding is rather c1-math-1385, one often uses c1-math-1386 that encodes exactly the same information – as encoding is reversible – and is a lot cheaper, because it makes good use of the periodicity of h. Therefore, the Fourier transform is very efficient for compression.

The DFT follows the same principle but with discrete functions. This will obviously be very useful as – by essence – our numerical messages have to be encoded with discrete information. Now let us suppose that our functions c1-math-1387 and c1-math-1388 are discrete functions, that is to say some vectors c1-math-1389 and c1-math-1390 with discrete variables. One formulates the transformation by denoting c1-math-1391 an c1-math-1392 root of unity. c1-math-1393 satisfies the equalities: c1-math-1394.

nfg019

Figure 1.19 Discrete Fourier Transform (DFT)

In other words following Figure 1.19, if c1-math-1395, then

1.14equation

Decoding is correct as

equation

The Discrete Cosine Transform (DCT) is a direct consequence of the DFT for some discrete function h. But instead of being time-varying (which is a good model for a sound), it is space varying (which enables one to encode an image); hence, h is a two-variable discrete function c1-math-1398. For instance, it is the color of a pixel whose coordinates are x and y. In the same way, it is possible to represent differently the same information with a two-variable discrete function c1-math-1399 standing for a spectral analysis of the image. The DCT and its inverse are shown in Figure 1.20, where c1-math-1400 whenever c1-math-1401 and c1-math-1402.

nfg020

Figure 1.20 Discrete Cosine Transform (DCT)

DCT is also a good compression principle for images, as for any periodic (or almost periodic, i.e., periodic up to a small error) message.

These transformations are reversible. Moreover, not only do they prove themselves to be good compression processes but their interest also lies in the easiness of choosing and keeping only important information. Indeed, during such encoding, it is possible to keep only some coefficients of the DFT or the DCT – to reduce the size of information one has to encode – while not necessarily changing the audio/visual result. We will handle these principles in more detail in Chapter 2.

1.4.2.5 DFT Algorithm

One can write the DFT transformation as a matrix c1-math-1403 with c1-math-1404. Thus, the inverse transformation can be written as c1-math-1405.

Remark 2

In some fields, c1-math-1406 simply does not exist. It is therefore sometimes useful to define the transform with another constant factor: c1-math-1407 and c1-math-1408. For the sake of simplicity, in the following, we will avoid c1-math-1409 and use c1-math-1410 so that c1-math-1411and c1-math-1412.

An immediate algorithm for the calculation of this transform uses a matrix vector product and thus has a complexity of c1-math-1413.

A “ divide and conquer” algorithm decreases this complexity, which is extremely important for encoding. The “ divide and conquer” principle is to split the problem into equivalent subproblems of lower size. Here, one divides the expression of the transform into two parts. Assuming that n is even, and setting c1-math-1414, one has

equation

However, as c1-math-1416 is an c1-math-1417 root of unity, c1-math-1418 is equal to 1 or c1-math-1419 according to the parity of k.

If k is even, then one defines the vector

equation

and the even coefficients of H (the transform of h) are the coefficients of the transform c1-math-1421 of c1-math-1422, which is half the size of h:

equation

Now, if k is odd, one defines the vector

equation

and the odd coefficients of H (the transform of h) are the coefficients of the transform c1-math-1425 of c1-math-1426, which is half the size of h:

equation

One obtains Algorithm 1.12.

ngr013

The complexity of this algorithm is c1-math-1428, in consequence c1-math-1429. It is almost a linear complexity, thus an important improvement with respect to the initial algorithm. Moreover, the algorithm for the inverse Fourier transform is then straightforward, as

equation

1.4.2.6 DFT and c1-math-1431 Roots of Unity in a Finite Field

One considers the polynomial c1-math-1432 in a field c1-math-1433 with c1-math-1434. An c1-math-1435 root of unity in c1-math-1436 is a simple root, if there exists such root, of the polynomial c1-math-1437. The order of a root of unity c1-math-1438 is the least integer o such that c1-math-1439. As c1-math-1440 is a root of c1-math-1441, one has obviously c1-math-1442. Besides, o divides n. Indeed, if one sets c1-math-1443, then one has c1-math-1444. Thus c1-math-1445.

c1-math-1446 primitive root of unity is an c1-math-1447 root of unity of order n.

This notion is crucial for the application of the DFT: in order to compute the DFT in the field c1-math-1448, we used a particular c1-math-1449 root – c1-math-1450 – which is primitive in c1-math-1451.

Now, c1-math-1452 primitive roots are available for any n in c1-math-1453, whereas one is not even sure of their existence in a given finite field. Indeed, in c1-math-1454, a c1-math-1455 primitive root of unity is what we simply called a primitive root in Section 1.3.5.3. In the same way as we did for these roots, the following theorem enables one to determine the fields in which it is possible to make fast calculations on vectors of a given size n:

Theorem 17

Let q be a power of some prime number and let n be coprime with q. The finite field c1-math-1456 contains an c1-math-1457 primitive root of unity if and only if n divides c1-math-1458.

Proof

If a is an c1-math-1459 primitive root, then c1-math-1460 and n is the order of a. As a is also an element of the field with q elements, its order necessarily divides c1-math-1461. Reciprocally, one uses a generator g of the field (a c1-math-1462 primitive root) whose existence is ensured by the algorithm in Section 1.3.5.3. Hence, if c1-math-1463 then c1-math-1464 is an c1-math-1465 primitive root of unity.

One says that a field supports the DFT at order n if there exist c1-math-1466 primitive roots of unity in this field. All fields supporting DFT for n equal to a power of 2 are obviously very interesting for applying the fast divide and conquer algorithm above. For instance, as we will see in Algorithm 1.13, the field with c1-math-1467 elements enables one to multiply polynomials of degree up to c1-math-1468 with the fast algorithm as c1-math-1469.

Therefore, one has to compute such an c1-math-1470 root of unity. It is possible to use a generator, but one would rather use a variant of the algorithm presented in Section 1.3.5.3: draw randomly an c1-math-1471 root (a root of the polynomial c1-math-1472 in c1-math-1473) and test whether its order is actually n. The following corollary gives us the probability of success: c1-math-1474 chances over n.

Corollaire 1.90

If a finite field has at least one c1-math-1475 primitive root of unity, then it has exactly c1-math-1476 primitive roots.

Proof

Let c1-math-1477. Let g be a generator of the field. Thus, c1-math-1478 is an c1-math-1479 primitive root, as well as all c1-math-1480 for t between 1 and c1-math-1481 coprime with n. All these c1-math-1482 are distinct; otherwise g would not be a generator; and these are the only ones as c1-math-1483 with t not coprime to n is of order strictly lower than n. The c1-math-1484primitive roots are necessarily written as c1-math-1485: if c1-math-1486 is an c1-math-1487 primitive root, then c1-math-1488. As g is a generator, one has c1-math-1489. This proves that u is in the form tk.

Exercise 1.40

Find a c1-math-1490 primitive root of unity modulo 31.

Solution on page 293.

However, if the field c1-math-1491 does not contain an c1-math-1492 primitive root of unity, one may extend the field. In the same way as c1-math-1493 with respect to c1-math-1494, one can consider a field containing c1-math-1495 in which the polynomial c1-math-1496 can be completely factorized into polynomials of degree 1. This field is an extension of the field c1-math-1497 and it is called a splitting field of c1-math-1498.

Exercise 1.41

Find a c1-math-1499 primitive root of unity in a field of characteristic c1-math-1500.

Solution on page 293.

1.4.2.7 Fast Product of Polynomials Using DFT

As for the discrete Fourier transform (DFT), the product of two polynomials – which is often used in coding theory (see all constructions based on polynomials in this chapter) – has a naive algorithm of complexity bound c1-math-1501.

The DFT and the calculation algorithm we have just seen enables one to perform this computation in time c1-math-1502.

Given two polynomials c1-math-1503 and c1-math-1504, one denotes by c1-math-1505 and c1-math-1506 the respective discrete Fourier transform of vectors c1-math-1507 and c1-math-1508, where the coefficients of the polynomials are extended with zeros up to the degree c1-math-1509 (degree of the product). Then, from the definition of the transform, one can immediately write the coefficients as c1-math-1510 and c1-math-1511. By simply multiplying these two scalars and using the arithmetic of polynomials, one obtains the value of theproduct c1-math-1512 evaluated at c1-math-1513c1-math-1514; in other words c1-math-1515.

This property enables one to build Algorithm 1.13 that computes the product of two polynomials in time c1-math-1516.

ngr014

The complexity is truly c1-math-1517, as termwise multiplication is linear and the Fourier transform – as well as its inverse - has a complexity c1-math-1518.

1.4.3 Cryptanalysis

We have studied some skewness properties between encoding and decoding. One of them, probably the most important one in cryptography, is that which differentiates decryption (by the recipient) from breaking (by a third party). We dedicate a small part of this book to attack techniques based on the weaknesses of codes, which are developed too quickly.

Cryptographic codes use pseudo-random generators for secret key generation, hash functions for authentication, and one-way functions for public key techniques. We will present separately known attacks for each of these steps. Knowing these attack techniques is essential to generate codes that can resist them.

1.4.3.1 Attacks on Linear Congruential Generator

Linear congruential generators have been looked at in Section 1.3.7 A random number c1-math-1519 is generated as a function of the previously generated number c1-math-1520, using the formula c1-math-1521.

Exercise 1.42 (Attack on LCGs)

·     If m is prime, what is the maximum period of an LCG? In particular, Fishman and Moore studied generators modulo c1-math-1522 in 1986. They determined that if c1-math-1523 then the period is maximum and the generator has good statistical properties. What can you say about c1-math-1524?

·              For c1-math-1525 with p an odd prime, what is the maximum period of an LCG?

One can prove that if c1-math-1526 is the maximum period, then c1-math-1527 for c1-math-1528 and that c1-math-1529 if c1-math-1530 with c1-math-1531 distinct primes.

·     Assuming that m is known, how can one recover a and b?

·     Now, suppose that m is unknown. How can one find the generator if c1-math-1532Hint: one may study c1-math-1533. What happens if c1-math-1534?

·     What is the next integer in this list: 577,114,910,666,107?

Solution on page 294.

1.4.3.2 Berlekamp–Massey Algorithm for the Synthesis of LFSRs

The Berlekamp–Massey algorithm enables one to detect, for an infinite sequence c1-math-1535c1-math-1536, of elements in a field c1-math-1537, whether – beyond some rank – its elements are linear combinations of the previous elements. This is what we have called linear recurrent sequences.

This algorithm is very useful in coding theory, notably in order to perform the cryptanalysis of random generators and cryptographic keys and even to correct the errors of cyclic codes (Section 4.4.6.1). In particular, it enables one to recover the generator polynomial of an LFSR (Section 1.3.7.2) only knowing the first terms of the sequence generated by this LFSR.

The question is, for the sequence c1-math-1538, how to find coefficients c1-math-1539, if they exist, such that

equation

If one uses these constants in order to define the polynomial c1-math-1541, this polynomial is called an annihilator of the sequence.

The set of annihilators is an ideal in the ring c1-math-1542 of polynomials over c1-math-1543. As c1-math-1544 is a principal ideal ring, there exists a unitary annihilator polynomial of minimum degree, called the minimal polynomial of the sequence.

How does one find this polynomial only from the coefficients of the sequence? If one knows the degree d of this polynomial, one has to write d linear equations corresponding to the property of linear recurrence for c1-math-1545 coefficients. Then one has to solve the following linear system:

1.15equation

The only thing that remains to do is to determine the degree of the minimal polynomial. At first sight, one may try iteratively all possible degrees (starting from a constant polynomial of degree 0) and match each polynomial produced with the sequence in order to see whether it is an annihilator or not. If the sequence is truly infinite, this might never stop.

Otherwise, if the sequence is finite, one notices, when considering the system, that the maximum degree of the minimal polynomial is half the number of elements in the sequence. This algorithm implies that one should solve successively several linear systems. In practice, it is possible to take advantage of the symmetric structure of the system in order to solve it on-the-fly, while adding the elements of the sequence progressively. This gives the following Berlekamp–Massey algorithm that has a complexity bound of only c1-math-1547.

The main idea of this algorithm is to explicitly compute the coefficients of the polynomial. Thus, the update of c1-math-1548 is performed in two steps. The trick of the test c1-math-1549 is to enable one to perform each of these two steps alternately. Let us explain the algorithm by looking at the three first terms of the sequence. The degree of the minimal polynomial increases by one (at most) every time one adds two elements of the sequence. The c1-math-1550 are called discrepancies.

The first discrepancy is the first term of the sequence, c1-math-1551 and c1-math-1552, becomes c1-math-1553. Discrepancies correspond to the values taken by the polynomial in the sequel of the sequence. If the discrepancy is null, then the polynomial one considers is an annihilator of an additional part of the sequence. Hence, the second discrepancy is c1-math-1554 applied to the sequence c1-math-1555, namely c1-math-1556. Therefore, the update of c1-math-1557 is c1-math-1558, namely c1-math-1559, which is an annihilator of the sequence c1-math-1560. Then, the third discrepancy is equal to c1-math-1561 and the two polynomials c1-math-1562 and c1-math-1563 are, respectively, equal to c1-math-1564 and c1-math-1565. Hence, c1-math-1566 annihilates c1-math-1567 and c1-math-1568 annihilates c1-math-1569.

ngr015

Hence, as multiplication by X of these annihilator polynomials comes to shifting by one position their application in the initial sequence, one obtains c1-math-1570 and c1-math-1571. Then, it is possible to combine these two polynomials in order to also annihilate the sequence c1-math-1572, by c1-math-1573, which is exactly what the algorithm does. In the sequel, if all next discrepancies are null, this means that the polynomial we have obtained is an annihilator of the sequel of the sequence. One can still continue until one is sure of having the minimal polynomial of the c1-math-1574 terms of the sequence or one can stop the algorithm earlier without checking the last discrepancies (using the control variable c1-math-1575).

As for complexity, the loop ends after c1-math-1576 and at most c1-math-1577 iterations. In each iteration, computing the discrepancy operations and updating the polynomial require both c1-math-1578 operations, for an overall number of operation of c1-math-1579.

It is even possible to use a fast algorithm to reduce this complexity, at least asymptotically. The idea is to see the sequence as a polynomial too. Then, the minimal polynomial of the sequence is such that the product of c1-math-1580 and c1-math-1581 has only a finite number of nonzero terms, the terms of degree at most d. It is possible to rewrite this statement in the following way:

1.16equation

Hence, one notices that computing c1-math-1583Q, and R can be performed by the Euclidean algorithm interrupted in the middle of the computation, as the degree of R is lower than n. Thus, the complexity bound of the computation is the same as for Euclidean algorithm, namely c1-math-1584. However, in practice, the Berlekamp–Massey algorithm remains more efficient for values of d up to dozens of thousands.

1.4.3.3 The Birthday Paradox

The Birthday Paradox is a probability result, and it is called a paradox because it seems to go against the first intuition one could have. It is used in several attack methods in cryptanalysis. It also shows that one should distrust intuitions when talking about probabilities.

There are 365 days in a year and still, in a group of more than 23 people, there is more than one chance in two of having at least two of them with the same birth date !

Indeed, let us take a population of k people. Knowing that the number of days in a year is n, the number of combinations of k distinct birth dates is c1-math-1585. Therefore, the probability of having all people with distinct birth dates is c1-math-1586. Thus, the probability of having at least two people with the same birthday is

equation

Hence, when considering 365 days, this probability is around one chance in c1-math-1588 in a group of 9 people, more than one chance in 2 in a group of 23 people, and 99.4% in a group of 60 people. More generally, one has the following theorem:

Theorem 18

In a set of c1-math-1589 elements chosen randomly among n possibilities, the probability of collision is higher than 50%.

Proof

We have seen that the number of collisions, in a space of size c1-math-1590 with k draws, is c1-math-1591. One has to give an estimation of this probability: c1-math-1592. Yet, c1-math-1593, for x positive, thus

equation

Then, for this probability to be greater than c1-math-1595, it is sufficient to have c1-math-1596, namely, as k is positive, c1-math-1597. Hence, for c1-math-1598 (again one has, for c1-math-1599).

This kind of collision probability – that one's intuition would tend to weaken – enables one to build attacks against systems for which intuition would give a limited chance of success.

1.4.3.4 Yuval's Attack on Hash Functions

Resistance to collision of hash functions can be measured: one has to determine the probability of finding collisions, which is close to the probability of collision in birth dates (birth dates play the role of fingerprints for individuals).

That is why Yuval's attack on hash functions is also called a birthday attack. It is a question of transmitting some corrupted message c1-math-1600 instead of a legitimate message M, in such way that the corruption is unnoticeable for a hash function h. Then, one looks for c1-math-1601and c1-math-1602, such that c1-math-1603. After that, one can, for example, fraudulently change M into c1-math-1604, or send M and pretend to have sent c1-math-1605, which is precisely what h should prevent!

ngr016

As a consequence of Theorem 18, the expected number of draws of c1-math-1606 in Yuval's attack is c1-math-1607.

If one uses Yuval's attack to send c1-math-1608 and then to repudiate it later arguing that c1-math-1609 was actually sent, one has more than one chance in two of succeeding in c1-math-1610 attempts. This shows that brute force can be efficient if a hash function is not collision resistant.

But is this attack really feasible? A simple calculation is enough to be convinced: for a numerical fingerprint on 128 bits, one should perform around c1-math-1611 attempts, which is feasible on general public machines of today: a computer running at 3 GHz performs c1-math-1612 operations per day, thus it would take a little more than two months on the 1000 PCs of a company to find a collision.

But if one uses hash functions on 160 bits, the cost is multiplied by a factor c1-math-1613, which is unreachable so far.

1.4.3.5 Factoring Composite Numbers

It is quite easy to distinguish a composite number from a prime number. But knowing the numbers composing it seems to be a much more difficult problem. It is the factorization problem. Although it can be formulated in a quite simple way, so far there does not exist an efficient solution to it (for instance, the famous Sieve of Eratosthenes is useless for numbers of more than 10 digits).

The difficulty of this problem and the efficiency of attack methods are very important, as a lot of one-way functions rely on the difficulty of factorization or on the difficulty of equivalent problems, such as the discrete logarithm problem. Thus, looking for good factorization algorithms is almost a cryptanalysis method.

Many different algorithms do exist. The goal of this section is not to enumerate them all but rather to give an idea of the most efficient ones for numbers of different sizes.

Pollard's Rho algorithm (Numbers of few digits) The first class of target numbers is composed of “ everyday composite numbers,” namely numbers of less than 20 digits. Pollard's algorithm is very efficient for such numbers.

The algorithm only requires a few lines of code (around forty) and is very easy to implement. Let m be the composite number one wishes to factorize. First of all, one has to compute a sequence of the form c1-math-1614 of large period (the longer the c1-math-1615 are distinct the better).

Table 1.8 Distribution of the Multiples of p Modulo m

0

1

2

c1-math-1616

p

p+1

p+2

c1-math-1617

kp

kp+1

kp+2

c1-math-1618

m-1

   

c1-math-1619

c1-math-1620

 

c1-math-1621

   

c1-math-1622

 

c1-math-1623

   

Then, the idea is to notice that, if p is a factor of m, the distinct c1-math-1624 modulo m are less often distinct modulo p (Table 1.8). In this case, if c1-math-1625 then the GCD of m and c1-math-1626 is equal to kp and it is a nontrivial factor of m. If the c1-math-1627 are actually pairwise distinct, the computation ends in at most p steps. A first version of the algorithm consists in producing some c1-math-1628 and, when adding a new element, computing the GCD with all previous c1-math-1629's. This version has two major drawbacks: first of all, one has to store around pelements; also, it takes c1-math-1630 GCD computations if i and c1-math-1631 are the smallest indexes such that c1-math-1632. The second trick of Pollard is to use Floyd's cycle detection. It consists in storing only the c1-math-1633 such that k is a power of 2. Indeed, as the c1-math-1634 are generated by a function, if c1-math-1635, then for all c1-math-1636 and a cycle is created modulo p, even if it is not directly noticeable.

When only storing powers of 2, the cycle will only be detected for c1-math-1637 with c1-math-1638, as illustrated in Figure 1.21

nfg021

Figure 1.21 Floyd's cycle detection, in the form of a rho

Yet, c1-math-1639. Hence, one performs at most twice the number of requested operations. One single GCD is computed at each step and one single additional element is stored throughout the algorithm. This gives the very fast algorithm presented in Algorithm 1.16.

If one takes c1-math-1640, so that the c1-math-1641 is actually pairwise distinct modulo m, for example, it will take at worst c1-math-1642 iterations if p is the smallest factor of m, and even less in general:

Theorem 19

Pollard's Rho algorithm has more than one chance in two of succeeding in c1-math-1643 steps.

Proof

Once again, the proof is similar to that of the birthday paradox. If k distinct values c1-math-1644 are drawn randomly, then there are c1-math-1645 combinations without collisions between the c1-math-1646 for an overall c1-math-1647. For the probability of finding a nontrivial factor to be greater than c1-math-1648, one must have – according to Theorem 18 - c1-math-1649.

ngr017

In practice, this algorithm factorizes numbers of 1 up to around 25 digits within seconds (with factors of 12 or 13 digits, this gives approximately 10 million operations!) but very quickly, it becomes useless when considering factors of more than 15 digits.

Elliptic curves (Numbers with a few dozen of digits) To go further, one can use a method based on elliptic curves, designed by Pollard and Lenstra.

This method uses elliptic curves of the form c1-math-1650, defined in Section 1.3.6. The idea is to consider the set of solutions of the latter equation modulo the number m to be factorized and to try to add them as if this set was a group. As the addition of points (see Theorem 11) requires to invert coordinates modulo m, if the inversion of say c1-math-1651 fails then it means that c1-math-1652 is not invertible. In other words c1-math-1653 contains a proper factor of m, which can be revealed by computing c1-math-1654. To simplify, suppose c1-math-1655 is the product of only two distinct primes. Then, the curve equation defines two proper elliptic curves, one modulo p and one modulo q. In consequence for any point P, Lagrange's theorem (Theorem 1, page 31), ensures that c1-math-1656, say modulo q, only if k divides c1-math-1657, the number of points of the curve modulo q. If the curve is chosen randomly, then c1-math-1658 and c1-math-1659 are close to p and q, and not necessarily primes. Hence, some of their prime factors will differ with high probability. Therefore, if we compute c1-math-1660, for a small e, it is likely thate will divide one of c1-math-1661 or c1-math-1662, but less likely that it divides both numbers at the same time. When this is the case, itmeans that c1-math-1663 is not on the curve modulo m and therefore that its computation will crash. The overall procedure is thus to try many small prime factors e. A way to perform this efficiently is to compute c1-math-1664 with a not too large B. The algorithm is detailed as Algorithm 1.17.

ngr018

The computational procedure remains simple (around 150 lines of code) and we can give a few ideas concerning the properties of this algorithm: it is conjectured that, in order to factorize an integer m of smallest factor p, this algorithm requires an average number of operations of the order of

equation

In practice, this algorithm factorizes numbers of 25 to 40 digits (with two factors of similar sizes) within seconds. Besides, if one is lucky and chooses some particular elliptic curve, the latter might enable one to factorize very quickly: the project ECMNET (Elliptic Curve Method on the Net) provides an implementation of this algorithm, available on the Internet. This project has allowed the factorization of numbers with factors up to c1-math-1666 digits.

The main issue is that good elliptic curves vary for each number one wishes to factorize and so far there does not exist a method of finding the appropriate curve for a given number. Notwithstanding, the rapidity of computation when one has found a “ good” elliptic curve is at the origin of the factorization program on the Internet: indeed, many Internet users can retrieve the ECM program and launch it in order to try several elliptic curves. Hence, numerous elliptic curves can be studied at the same time and possibly speed up the search of prime factors.

Number Field Sieve (World champion) Finally, the current champion of RSA key factorization (product of two large prime numbers, see Section 3.4.2) is the Number Field Sieve algorithm, which seems to require – in order to factorize some number m, product of two factors of similar sizes – an average number of operations of the order of

equation

A number field is an extension of the field of rational numbers (one considers the infinite field c1-math-1668).

The number field sieve is a generalization of the quadratic sieve when considering the field of all integers modulo m. For the sake of simplicity, we only present the main idea of the latter.

The aim is to find couples of numbers whose squares are congruent modulo c1-math-1669. Then c1-math-1670 is a multiple of m. Now, let c1-math-1671, if one is lucky, c1-math-1672 so that c1-math-1673 and thus d is a nontrivial factor of m.

Example 1.13

Let us try to factorize c1-math-1674. We compute randomly some squares: c1-math-1675 and c1-math-1676. Yet, c1-math-1677 and c1-math-1678 are small with respect to c1-math-1679 and thus easier to factorize, for example using Pollard's method. One obtains c1-math-1680and c1-math-1681. Therefore, one can write c1-math-1682. We have found a relation of the form c1-math-1683, which gives us a factor of c1-math-1684c1-math-1685, and c1-math-1686.

The whole difficulty of the algorithm is to find such integers x and y. For this, one has to compute several squares and store those whose remainders modulo m are small enough to be factorized using another method. Then, one has to find a linear combination of these squares that would give another square: in a matrix, let us store the exponents in the columns and the squares in the lines (Table 1.9).

Table 1.9 Quadratic Sieve for c1-math-1687

 

Exponent of 2

Exponent of 3

of 5

of 7

of c1-math-1688

c1-math-1689

2

3

1

0

c1-math-1690

c1-math-1691

2

0

1

1

c1-math-1692

c1-math-1693

0

2

1

1

c1-math-1694

c1-math-1695

         

According to this table, c1-math-1696 is a square if and only if the line of c1-math-1697 added to the line of c1-math-1698 only gives even exponents. In other words, if M is the matrix of exponents (rows and columns of Table 1.9), one has to find some binary vector x such that c1-math-1699 is even or to find a solution modulo 2 to the associated linear system: x such that c1-math-1700.

Although the basic idea is quite simple, the calculation is a little more delicate than for the previous algorithms. Still, this algorithm holds the current records. In particular, it enabled one to factorize a 200 digit (665 bits) RSA key in 2005. The computing time for this last factorization was gigantic: more than a year and a half of computation on more than 80 machines !

1.4.3.6 Strong Prime Numbers

RSA cryptography (see Chapter 3) is based on the use of two prime numbers p and q and on the difficulty of factorizing their product m. In order to resist various factorization methods (some of them are presented in the form of exercises in Chapter 3, on page 173), the prime numbers one uses must satisfy several properties:

·     In order to resist factorization based on elliptic curves, p and q must have similar sizes and must be large enough. For instance, in order to work with numbers on 1024 bits, they should both have a size of around 512 bits.

·     c1-math-1701 must be large enough: otherwise, it is sufficient to try as a value for p or q all integers close to c1-math-1702 (Fermat's square root attack).

·     In order to resist Pollard's algorithms c1-math-1703 and c1-math-1704 (which take advantage of the factorization of c1-math-1705 and c1-math-1706 when possible), p and q have to be strong prime numbers, that is, each of them must satisfy the conditions:

·     c1-math-1707 has a large factor, denoted by r.

·     c1-math-1708 has a large factor.

·     c1-math-1709 has a large factor.

Gordon's Algorithm 1.18 enables one to generate strong prime numbers:

Exercise 1.43

Prove that the output of Gordon's algorithm is a strong prime number.

Solution on page 294.

ngr019

1.4.3.7 Solving the Discrete Logarithm Problem

Alongside modular exponentiation, the other major class of one-way functions relies on the discrete logarithm problem.

Let G be a group of size n admitting a generator (i.e., G is cyclic). For instance, one could consider the group of invertible elements modulo some prime number pc1-math-1710.

Given a primitive root g and an element b in G, the problem is to find the discrete logarithm of b in base g, namely to find x such that c1-math-1711.

The naive algorithm for solving this problem is to try all possible x until one finds the right one. The worst and the average complexity bounds are c1-math-1712, thus an exponential complexity with respect to c1-math-1713, the size of n.

So far, the best known algorithms for solving this problem have a complexity bound c1-math-1714 in the general case. Most of the time, these algorithms are based on factorization algorithms: we will see that variants of Pollard's Rho – c1-math-1715 – and index calculus – c1-math-1716 – algorithms can be applied to groups. However, complexities are raised to the square with respect to factorization. Indeed, if one considers numbers modulo some composite number – a product of two prime numbers – c1-math-1717, factorization algorithms have a complexity bound that depends on the smallest prime factor, roughly: c1-math-1718. That is why the discrete logarithm method enables one to consider numbers half the size as those used for factorization-based methods with the same level of security. These sizes are even further reduced if one considers more generic groups, such as the group of points of an elliptic curve, for which the best discrete logarithm methods do not apply.

Baby step, Giant step This method was developed by Shanks, and it is divided into two phases: the baby step, tests from c1-math-1719 to c1-math-1720 for all x in some interval, and the giant step, jumps from c1-math-1721 to c1-math-1722.

The idea is to decompose x into two pieces c1-math-1723, with i and j between 1 and c1-math-1724. Hence, one can write c1-math-1725, or c1-math-1726. Thus, one has to compute all possible c1-math-1727 (baby step), and all possible c1-math-1728 in order to checks if one of these values has been computed in the baby step (giant step).

Although computing all these values only takes c1-math-1729 multiplications, looking for the correspondences with a naive method implies that one has to try c1-math-1730 possibilities in the worst case.

The trick that decreases this complexity is to sort the c1-math-1731 in increasing order (complexity c1-math-1732) in order to be able to perform comparisons with a dichotomous research with only c1-math-1733 tests!

Therefore, the time complexity is improved; unfortunately, the space complexity is such that this algorithm is not practical: one has to store all c1-math-1734 integers. Even for a reasonable number of operations (around c1-math-1735 nowadays), the required memory space is then of the order of several billion gigabytes.

Pollard's Rho returns Pollard's Rho algorithm enables one to modify the baby step, giant step method and to introduce Floyd's cycle detection. Hence, one can preserve the time complexity bound c1-math-1736, while reducing significantly the memory complexity bound down to only c1-math-1737 bytes. In comparison with the factorization algorithm, one has to modify the generator function of the sequence in the following way:

Build three subsets c1-math-1738, and c1-math-1739 of G of similar sizes forming a partition of G (for example, in c1-math-1740, with c1-math-1741 one can always take c1-math-1742, and c1-math-1743).

Then, one defines the generator function f such that

equation

Hence, each element of the sequence can be written in the form c1-math-1745 for some c1-math-1746 and some c1-math-1747. Yet, in the same way as the Rho algorithm for factorization, the c1-math-1748 are more or less equally spread modulo p. Therefore, a collision c1-math-1749 occurs on the average after c1-math-1750 draws, even if c1-math-1751 and c1-math-1752. The function f insures, as for factorization, that this collision will be constantly reproduced after k steps; thus, it is possible to find y such that c1-math-1753 thanks to Floyd's algorithm with only a memory complexity bound of several integers of size c1-math-1754.

Then, one has c1-math-1755 and – at the same time – c1-math-1756. In this case, one obtains c1-math-1757, which means that, in the space of indexes, one directly finds x (one recalls that c1-math-1758):

equation

Be careful: we are solving logarithms in c1-math-1760 but the latter equation lies in c1-math-1761 where c1-math-1762; thus although c1-math-1763 is nonzero, it is not necessarily invertible modulo n. If this is not the case, then restart the algorithm.

Coppersmith's Index Calculus The same way as we have just modified Pollard's algorithm and adapted it to the computation of discrete logarithms, it is possible to modify sieve algorithms. Then, the obtained method works for the discrete logarithm over the group of invertible of a finite field, but not for any group. In particular, cryptology over curves are still resisting this kind of attack. Let us show anyway how this attack works over finite fields with an example.

Example 1.14

How to find x, such that c1-math-1764? One recalls that c1-math-1765 is prime and that c1-math-1766 is a primitive root of c1-math-1767 for c1-math-1768. The idea is to draw randomly some values of i such that c1-math-1769 can be easily factorized (i.e., it only has small prime factors, or smooth). In practice, one is given a basis of prime factors, for example c1-math-1770.

Then, for each prime number c1-math-1771, one divides c1-math-1772 with the greatest possible power of c1-math-1773. At the end of the process, if one obtains the value 1, then c1-math-1774 can be factorized in the basis B.

After several random draws, we keep the values c1-math-1775, and c1-math-1776:

equation

When considering the logarithms, Theorem 7 guaranties that one obtains a linear system in the space of exponents whose unknown values are the discrete logarithms of 2, 3, and 5, now modulo c1-math-1778:

equation

This gives c1-math-1780 (or in other words c1-math-1781), then c1-math-1782 and

equation

One has to find a number in the form c1-math-1784 whose remainder can be factorized in the basis c1-math-1785. Still with the same method, one finds – for example – after several random attempts: c1-math-1786. Thus, c1-math-1787 satisfies c1-math-1788.

Such index calculus algorithms are more efficient in some particular fields. The record is held by a French team which managed – in November 2005 – to compute a discrete logarithm in the field c1-math-1789 in only 17 days on the 64 processors of the Bull supercalculator Teranova.

Now we have at our disposal enough tools to start specializing. We are able to build the objects we will use throughout this book, and we have described the common algorithms that are the foundations of coding. Each of the following chapters will refer to the work in this chapter, depending on needs of the specific objectives: compression, encryption, or correction.

1 The word code in natural languages can have several meanings. In our context, we will see that it applies to transformations and their results, but in computer science, it can also mean a computer program. This apparent confusion actually enables one to figure out the link between various mathematical and computer processes. That is why we will keep the word with its multiple meanings.2 Analogous definition can be made for space complexity to measure the amount of space, or memory required by the algorithm.3 r must be coprime with n, the greatest prime factor q of r must satisfy c1-math-0677, and n must also satisfy c1-math-0678.4 http://csrc.nist.gov/publications/fips/fips186-3/fips_186-3.pdf5 http://hyperelliptic.org/EFD/index.html6 http://galg.acrypta.com/telechargements/ARCANA_ECDB.tgz