CSCI 4116 Dalhousie Linear Block Cipher With Block Length N and Alphabet Ques

Description

I will provide the notes file for my cryptography assignment. Please refer to the formulas in it and give clear, step-by-step explanations for each question. Every answer should be supported by formulas and calculations, or by a written explanation.

Unformatted Attachment Preview

MATH/CSCI 4116 Cryptography
Assignment 6

1. Consider the linear block cipher with block length $n$ and alphabet $\{0, 1\}$. On the key space of matrices $A \in \{0, 1\}^{(n,n)}$ with $\det(A) \equiv 1 \pmod{2}$, choose the uniform distribution. Show that this cryptosystem does not have perfect secrecy. (Hint: Use the definition of perfect secrecy, and find a counterexample with $n = 2$, for instance.)

2. Let $n$ be a positive integer. A Latin square of order $n$ is an $n \times n$ array $L$ of the integers $1, 2, \ldots, n$ such that every one of the $n$ integers occurs exactly once in each row and each column of $L$. An example of a Latin square of order 3 is as follows:

$$\begin{array}{ccc} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 2 & 3 & 1 \end{array}$$

Given any Latin square $L$ of order $n$, we can define the following cryptosystem: Take $P = C = K = \{1, 2, \ldots, n\}$. For $1 \le i \le n$, the encryption function $E_i$ is defined to be $E_i(j) = L(i, j)$. (Hence each row of $L$ gives rise to one encryption function.) Give a complete proof that this Latin square cryptosystem achieves perfect secrecy. (Hint: Use Shannon’s theorem.)

3. (a) Show that the only irreducible polynomials in $(\mathbb{Z}/2\mathbb{Z})[x]$ of degree at most 2 are $x$, $x + 1$, and $x^2 + x + 1$.
(b) Show that $x^4 + x + 1$ is irreducible in $(\mathbb{Z}/2\mathbb{Z})[x]$. (Hint: If it factors, it must have at least one factor of degree at most 2.)

4. (a) Show that, in $(\mathbb{Z}/2\mathbb{Z})[x]$,
$$x^4 \equiv x + 1 \pmod{x^4 + x + 1}, \quad x^8 \equiv x^2 + 1 \pmod{x^4 + x + 1}, \quad x^{16} \equiv x \pmod{x^4 + x + 1}.$$
(b) Show that $x^{15} \equiv 1 \pmod{x^4 + x + 1}$ in $(\mathbb{Z}/2\mathbb{Z})[x]$.

5. Which of the following polynomials are irreducible in $(\mathbb{Z}/2\mathbb{Z})[x]$?
$$x^5 + x^2 + 1, \quad x^5 + x^4 + 1, \quad x^5 + x^3 + x^2 + 1.$$

Due: Monday, March 8, 2021, 11:30 pm

Cryptography
Course Notes for CSCI/MATH 4116
Author: Karl Dilcher
Institution: Dalhousie University

Preface

The course MATH 4116, Cryptography, cross-listed as CSCI 4116, has been taught at Dalhousie on an annual basis since 2001. These are the notes for this course, first prepared when I taught the course for the first time, but regularly updated since then.
These notes represent an almost word-by-word transcript of what I plan to cover in class. For this reason they are quite “skeletal", and are not meant to replace class attendance. Instead, these notes are intended to serve as a record of the material covered and expected of the students to be mastered. It may also alleviate the chore of note-taking, and free up some energy for following the development of the subject matter in class. I would like to point out again, as I usually do in class, that studying these notes alone is not sufficient to master the material. Doing the weekly assignments is essential. Depending on how quickly I will be able to move through the material in class, some topics may be dropped, or may be downplayed and not tested; if this should be the case, it will be announced in class. Various different sources have been used in the preparation of these notes, but a great deal of material is based on J. Buchmann’s book (2). Section 6 is mainly based on the book by Trappe and Washington (3). Since this is only a one-semester course, not all topics that constitute modern cryptography could be included. Some interesting topics that had to be left out are: Modern factorization methods, security protocols, digital currency, secret sharing schemes, error correcting codes, quantum techniques, among others. The interested student is encouraged to read up on these topics in the references given at the end. The bibliography is very selective and was not meant to be complete. For serious further study or research, the bibliographies in recent books should be consulted. For instance, (6) has an annotated section on “recommended reading" at the end of each chapter, and the handbook (10) has a very extensive bibliography of research papers. Finally, I thank the students of past versions of this course for their enthusiasm and positive feedback. 
Halifax, December 2020
Karl Dilcher
dilcher@mathstat.dal.ca

Picture credits: The illustrations in these notes were taken from the following publications: The figure on p. 3 from (1), the ones on pp. 19, 21, 23, 24, and 25 from (2), the ones on pp. 45, 46, and 48 from a now defunct webpage of the NIST. The figure on p. 60 is from https://www.eng.tau.ac.il/~yash/crypto-netsec/rijndael.htm, and the figure on p. 86 is from the website http://crypto.stackexchange.com/. The cover picture: Old digraphic substitution by Giovanni Battista Porta, 1563. (From Reference (11), p. 56.)

Contents

1 Introduction
   1.1 Generalities
   1.2 More Terminology, Classification
   1.3 Basic Concepts of Cryptanalysis
   1.4 Some History
2 Classical Cryptography
   2.1 Cryptosystems
   2.2 Mathematical Background I
   2.3 The Affine Cipher
   2.4 Block Ciphers
   2.5 Modes of Block Cipher Operation
   2.6 Stream Ciphers
   2.7 Mathematical Background II: Linear Algebra
   2.8 Affine Linear Block Cipher
3 Probability and Perfect Secrecy
   3.1 Basics of Probability Theory
   3.2 Perfect Secrecy
   3.3 Random and Pseudo-random Numbers
4 Modern Classical Cryptosystems
   4.1 The Data Encryption Standard (DES)
   4.2 Mathematical Background III: Finite Fields
   4.3 The Advanced Encryption Standard (AES): Rijndael
5 Public-Key Cryptography
   5.1 Introduction
   5.2 Mathematical Background IV: Some Number Theory
   5.3 The RSA Cryptosystem
   5.4 Prime Number Generation
   5.5 The Diffie-Hellman Key Exchange
   5.6 The ElGamal Cryptosystem
   5.7 Elliptic Curve Cryptography
6 Some Additional Topics
   6.1 Cryptographic Hash Functions
   6.2 Digital Signatures

List of Symbols

It is assumed that the reader is familiar with all the basic mathematical notations, such as $\mathbb{N}$ (the set of natural numbers 1, 2, 3, . . .), $\mathbb{Z}$ (the set of all integers), etc., and notation that is usually found in beginning calculus and linear algebra courses. Here is a list of more specialized notations.

Symbol                          Description
$P$                             Plaintext space
$C$                             Ciphertext space
$K$                             Key space
$E_k$                           Encryption function
$D_k$                           Decryption function
$\mathbb{Z}_m$                  The set $\{0, 1, \ldots, m-1\}$
$a \mid b$                      $a$ divides $b$
$\equiv$                        Congruence
$\mathbb{Z}/m\mathbb{Z}$        Residue classes modulo $m$
$F_n$                           Fermat number
$\circ$                         Operation on a set
$a^{-1}$                        Multiplicative inverse of $a$
$R^*$                           Group of units of a ring $R$
$\gcd(a, b)$                    Greatest common divisor of $a$ and $b$
$a \nmid b$                     $a$ does not divide $b$
$\varphi(m)$                    Euler’s phi-function
$\Sigma$                        Denotes an alphabet (‘Sigma’)
$\Sigma^*$                      Set of all words over $\Sigma$
$\varepsilon$                   The empty string (‘epsilon’)
$S(X)$                          Group of permutations of set $X$
$S_n$                           Group of permutations of $\{1, 2, \ldots, n\}$
$\oplus$                        Exclusive or (XOR)
$R^{(k,n)}$                     Set of $k \times n$ matrices over $R$
$R^n$                           $n$-dimensional vectors over $R$
$I_n$                           Identity matrix
$\det A$                        Determinant of matrix $A$
$\operatorname{adj} A$          Adjoint of matrix $A$
$P(S)$                          Power set of a set $S$
$p(A)$                          Probability of an event $A$
$p(A \mid B)$                   Conditional probability
$R[x]$                          Set of polynomials over $R$
$\deg f$                        Degree of the polynomial $f$
$F$                             Denotes a field
$GF(p^n)$                       Galois field of degree $n$ over $\mathbb{Z}/p\mathbb{Z}$
$\theta$                        The golden mean (‘theta’), $(1 + \sqrt{5})/2$
$\operatorname{ord}_G g$        The order of $g$ in $G$
$\langle g \rangle$             The subgroup generated by $g$
$\lfloor x \rfloor$             The greatest integer $\le x$
$\pi(x)$                        Number of primes $\le x$ (‘pi of x’)
$\operatorname{dlog}_g A$       Discrete logarithm of $A$ to the base $g$

Chapter 1 Introduction

1.1 Generalities

1.1.1 What is Cryptography?

“The study of principles and techniques by which information can be concealed in ciphers and later revealed by legitimate users employing a secret key (i.e., a piece of information known only to them), but in which it is either impossible or computationally infeasible for an unauthorized person to do so”. (Encyclopedia Britannica)

• The word comes from Greek kryptós – “hidden”, and gráphein – “to write”.
• A more inclusive term is cryptology, from Greek kryptós and lógos – “word”. “The science of secure (generally secret) communications”. (Encyclopedia Britannica)
• Apart from Cryptography, the term Cryptology includes Cryptanalysis: From Greek kryptós and analýein – “to loosen, to untie”.
“The science (and art) of recovering information from ciphers without knowledge of the key”. (Encyclopedia Britannica)

1.1.2 The difference between ciphers and codes

Cipher: From French chiffre and Arabic sifr (“nothing” – also related to zero). “1. Any system of secret writing that uses a prearranged scheme or key. 2. A message in cipher, also, its key.” (Funk & Wagnall’s Standard Desk Dictionary)

Code: From Latin codex – “writing tablet”. “A set of signals, characters, or symbols used in communication”. (Funk & Wagnall’s)

The main topic of this course will be ciphers rather than codes. Some examples of codes:
• Morse code
• ASCII (American Standard Code for Information Interchange)
• The “pigpen cipher” (really a code)
• Sherlock Holmes’ “Dancing Men”

1.1.3 About this course

The main purpose of this course:
• To give the mathematical foundations of modern cryptography and of the most important cryptosystems.
• To give means to judge the security of a cryptosystem.

What this course does not cover (or only in passing):
• Implementations, protocols, etc.
• Related to this: Network security, e-commerce, etc.
• Political and social issues (“escrow systems”, the CLIPPER chip, etc.)
• Ethical and human-rights issues
• Steganography (covert secret writing; from Greek steganos – “covered”)

1.2 More Terminology, Classification

1.2.1 Terminology

• Cryptosystem (or cryptographic system): A more “technical” term for cipher.
• Plaintext: An English (French, German, . . . ) text, numerical data, other information; may already have been encoded (for instance, in ASCII).
• Encryption: Performed by the sender; intended to render the message unintelligible to any eavesdropper.
• Ciphertext: The result of an encryption; usually a string of symbols, digits, bits, . . .
• Decryption: Performed by the intended receiver; the recovering of plaintext from ciphertext.
1.2.2 Kerckhoffs’ principle

“It is assumed that the cryptosystem in use, and the workings of this system, are public knowledge, or at least known to the potential eavesdropper”. (Auguste Kerckhoffs, 1883)

Reasons:
• It is unreasonable to depend on the secrecy of the cryptosystem.
• Adherence to this principle makes standardization of algorithms and large-scale communication easier.

Thus, the only secret is the key. Therefore, key distribution and key management are fundamental issues in cryptography.

1.2.3 Classification

Cryptosystems can be (roughly) classified according to three different criteria:

1. The types of operations used for encryption:
• Substitution: Each element in the plaintext (bit, letter, group of bits or letters) is mapped to another element.
• Transposition: Elements in the plaintext are rearranged.
Most cryptosystems employ multiple stages of substitutions and transpositions (“product systems”).

2. The number of keys used:
• Symmetric (or single-key, secret-key, conventional) encryption: Both sender and receiver use the same key.
• Asymmetric (or two-key, or public-key) encryption: Sender and receiver each use a different key.

3. The way in which the plaintext is processed:
• Block cipher: Input is processed one block of elements at a time, producing an output block for each input block.
• Stream cipher: Input elements are processed continuously, producing an output of one element at a time, as it goes along.

1.2.4 Model of conventional cryptosystems

The basic communications scenario can be depicted as follows: There are two parties, usually called Alice and Bob (for A and B), who wish to communicate securely over an insecure channel. We assume that a third party has access to the insecure channel and thus to the ciphertext and, in certain scenarios, will also have access to the corresponding plaintext and might even be able to alter the ciphertext and/or impersonate Alice or Bob.
In accordance with Kerckhoffs’ principle, we also assume that this party will know which cryptosystem is used and will know its inner workings. This party is often called Oscar (for “opponent” or “observer”), or Eve (for “eavesdropper”).

1.3 Basic Concepts of Cryptanalysis

Cryptanalysis is as much an art as it is a science. Although some ciphers may be computationally intractable, human ingenuity and intuition may help to crack them. (There are many examples of this throughout history.)

1.3.1 Means of cryptographic attacks

1. Mathematical:
• Statistics (frequency distributions, etc.).
• Number theory.
• Group theory.
• Combinatorics and graph theory, etc.

2. Side information:
• Linguistic (human language is full of redundancy).
• Formatting (such as “Dear Dr. Dilcher”).
• Subject.
• Known words or phrases (such as military ranks).

3. “Cloak & Dagger” methods:
• Theft, bribery, blackmail, sex, violence.

1.3.2 Types of cryptographic attacks

1. Ciphertext only:
• The cryptanalyst has only the ciphertext to work with.
• Easiest type to defend against.
• Example of this type of attack: “exhaustive search”.

2. Known plaintext:
• Often the cryptanalyst has one or more plaintext messages along with corresponding encryptions.
• Closely related: “Probable word attack”.

3. Chosen ciphertext:
• The cryptanalyst can cause a ciphertext of his choice to be decoded.
• Both ciphertext and corresponding plaintext segments are then known.
• Similar: “Chosen plaintext attack”.

1.4 Some History

This is a selective list of some key dates (pun intended) and events in the history of cryptography.
• Ancient Egyptians, Hebrews, Babylonians, Assyrians used protocryptographic systems.
• ca. 400 BC: Spartans used the scytale for communications between military commanders. – First transposition cipher.
• 4th century BC: Chapter on cryptography in a book by Aeneas. – Earliest treatise on the subject.
• Same time: Polybios encoded letters into pairs of symbols. – First biliteral substitution.
• ca. 50 BC: “Caesar cipher”: simple shift cipher.
• Arab culture: First clear understanding of the principles of cryptology. 1412: Treatment of several cryptosystems in an encyclopedia by al-Kalkashandi. First use of frequency analysis.
• 1379: First European manual on cryptography, by Gabriele de Lavinde of Parma.
• 1470: First cipher disk, by Leon Battista Alberti.
• 1586: Blaise de Vigenère – The “Vigenère table”.
• 17th – 19th centuries: Rapid development of techniques.
• 1920s: Electromechanical devices used by most powers.
• Both world wars: Cryptanalytical successes of greatest importance to (mostly) the Allied forces.
• 1950s on: Use of electronic computers.
• 1976: Public key cryptography described by Diffie and Hellman.
• 1977: Data Encryption Standard (DES) adopted.
• 1978: RSA (Rivest, Shamir, Adleman).
• 1980s: Elliptic curve cryptosystems.
• 2000/2001: Replacement of DES by the Rijndael algorithm as new “Advanced Encryption Standard” (AES).

Chapter 2 Classical Cryptography

In this chapter we are going to study some basic principles of cryptography, along with several historical and insecure cryptosystems as examples. We will also consider modern block ciphers and stream ciphers. Two sections contain mathematical background that will be required in this chapter and throughout the course.

2.1 Cryptosystems

2.1.1 Basics

Definition 2.1. A cryptosystem is a 5-tuple $(P, C, K, E, D)$ with the following properties:
1. $P$ is a set, called the plaintext space. Its elements are called plaintexts.
2. $C$ is a set, called the ciphertext space. Its elements are called ciphertexts.
3. $K$ is a set, called the key space. Its elements are called keys.
4. $E = \{E_k : k \in K\}$ is a family of functions $E_k : P \to C$. Its elements are called encryption functions.
5. $D = \{D_k : k \in K\}$ is a family of functions $D_k : C \to P$. Its elements are called decryption functions.
6. For each $e \in K$ there is a $d \in K$ such that $D_d(E_e(p)) = p$ for all $p \in P$.

Example 1. Shift ciphers. Let $P = C = K = \{A, B, C, \ldots, Z\}$. We identify this set with $\Sigma = \{0, 1, 2, \ldots, 25\}$ according to the table

A   B   C   ...   Z
0   1   2   ...   25

Notation: $\mathbb{Z}_m$ is the set $\{0, 1, 2, \ldots, m-1\}$ along with the two operations $+$ and $\times$, which are carried out as in $\mathbb{N}$, except that we “reduce modulo $m$” (here $m = 26$). (More about this in the next section.)

For $e \in \mathbb{Z}_{26}$ we define the encryption function $E_e$ by
$$E_e : \Sigma \to \Sigma, \quad x \mapsto (x + e) \pmod{26}.$$
For $d \in \mathbb{Z}_{26}$ the decryption function $D_d$ is defined by
$$D_d : \Sigma \to \Sigma, \quad x \mapsto (x - d) \pmod{26}.$$
The decryption key belonging to the encryption key $e$ is $d = e$.

Example: Apply the shift cipher with key $e = 5$ to the word cryptography. We get HWDUYTLWFUMD.

Remarks:
1. When the key is $e = 3$, this is called the Caesar cipher.
2. The key space has only 26 elements. To cryptanalyze, we need only try all keys on a segment of the ciphertext, until it “makes sense”.
3. Shift ciphers can be modified as follows: Let $P$ and $C$ be the set of all sequences $w = (w_1, w_2, \ldots, w_n)$, with $w_i \in \Sigma$, $1 \le i \le n$. Again $K = \mathbb{Z}_{26}$. Now the function $E_e$ replaces each $w_i$ with $w_i + e \pmod{26}$, $1 \le i \le n$.
4. The symbol $\Sigma$ is the capital Greek letter “Sigma”, and is pronounced this way.
5. Obviously, all this can be done with $\mathbb{Z}_m$, for any $m \ge 2$.
6. Recall that $d = e$. We can use this as an alternative definition of a symmetric, resp. an asymmetric cryptosystem: A cryptosystem is called
– symmetric if $d = e$,
– asymmetric if $d \ne e$, and the computation of $d$ from $e$ is not feasible.

2.1.2 Requirements for a “good” cryptosystem

A. “Classical” requirements (Claude Shannon, 1940): A good cryptosystem should
1. be highly secure against its designer, that is, the designer of the system should not have any advantage in decryption;
2. use a short, easily changed key;
3. require only simple encryption/decryption;
4. not introduce error propagation;
5. not entail message expansion.
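The shift cipher of Example 1 in Section 2.1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `shift_encrypt` and `shift_decrypt` are hypothetical, the input is assumed to consist of the letters A to Z only, and the convention of printing ciphertext in upper case and plaintext in lower case is taken from the example in the notes.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # identified with 0, 1, ..., 25


def shift_encrypt(plaintext, e):
    """Apply E_e : x -> (x + e) mod 26 to every letter (hypothetical helper)."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + e) % 26] for ch in plaintext.upper()
    )


def shift_decrypt(ciphertext, d):
    """Apply D_d : x -> (x - d) mod 26 to every letter (hypothetical helper)."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - d) % 26] for ch in ciphertext.upper()
    ).lower()


ciphertext = shift_encrypt("cryptography", 5)
print(ciphertext)                    # HWDUYTLWFUMD
print(shift_decrypt(ciphertext, 5))  # cryptography

# Exhaustive search over the 26 keys: list every candidate decryption
# and pick the one that "makes sense".
candidates = {d: shift_decrypt(ciphertext, d) for d in range(26)}
print(candidates[5])                 # cryptography
```

The last three lines correspond to Remark 2: with only 26 keys, trying all of them is trivial.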
Remark: Requirements 2 – 5 reflect the fact that encryption/decryption used to be done manually by error-prone cipher clerks. They are less important with the use of computers.

B. Modern requirements:
1. As before.
2. As before – keys must be small (but “small” is a relative term; keys can be many digits long).
3. As before – operations must be relatively simple (what computers do best), but we can do many of them.
4. Some error propagation may even be desirable.
5. Moderate message expansion is now acceptable.

2.2 Mathematical Background I

2.2.1 Congruences

Modular arithmetic and the basics of (abstract) algebra are of fundamental importance for cryptographic algorithms.

Definition 2.2. Given $a, b \in \mathbb{Z}$ and $m \in \mathbb{N}$, we say that $a$ is congruent to $b$ modulo $m$ (written $a \equiv b \pmod{m}$) if $m$ divides $b - a$ (written $m \mid b - a$).

Examples: $-2 \equiv 19 \pmod{21}$, $10 \equiv 0 \pmod{2}$.

Remark: Congruence modulo $m$ is an equivalence relation, that is, it satisfies the following relations:
1. $a \equiv a \pmod{m}$ (“reflexivity”);
2. $a \equiv b \pmod{m}$ implies $b \equiv a \pmod{m}$ (“symmetry”);
3. $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$ implies $a \equiv c \pmod{m}$ (“transitivity”).

The following characterization is also useful.

Lemma 2.1. The following statements are equivalent:
1. $a \equiv b \pmod{m}$.
2. There is a $k \in \mathbb{Z}$ with $a = b + km$.
3. When divided by $m$, both $a$ and $b$ leave the same remainder.

We are now ready to define a basic concept in modular arithmetic:

Definition 2.3. The equivalence class of $a \in \mathbb{Z}$ is the set of all integers obtained from $a$ by adding integer multiples of $m$, that is,
$$\{b \in \mathbb{Z} \mid b \equiv a \pmod{m}\} = a + m\mathbb{Z}.$$
This is called the residue class of $a \pmod{m}$.

Examples:
1. The residue class of $1 \pmod{4}$ is $\{1, 1 \pm 4, 1 \pm 2 \cdot 4, \ldots\} = \{1, -3, 5, -7, 9, \ldots\}$.
2. The residue class of $0 \pmod{2}$ is the set of all even integers; the residue class of $1 \pmod{2}$ is the set of all odd integers.
3. The residue classes (mod 4) are $0 + 4\mathbb{Z}$, $1 + 4\mathbb{Z}$, $2 + 4\mathbb{Z}$, $3 + 4\mathbb{Z}$.
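Definition 2.2, Lemma 2.1 and Definition 2.3 can be checked numerically. The following Python sketch is an illustration only; the helper names `is_congruent` and `residue_class_window` are hypothetical, and a residue class is infinite, so only a finite window of it is listed.

```python
def is_congruent(a, b, m):
    """a ≡ b (mod m) if and only if m divides b - a (Definition 2.2)."""
    return (b - a) % m == 0


def residue_class_window(a, m, k_max=2):
    """Finite window of a + mZ: the integers a + k*m for k = -k_max, ..., k_max."""
    return [a + k * m for k in range(-k_max, k_max + 1)]


print(is_congruent(-2, 19, 21))  # True
print(is_congruent(10, 0, 2))    # True
print(is_congruent(8, 4, 12))    # False

window = residue_class_window(1, 4)
print(window)                    # [-7, -3, 1, 5, 9]

# Lemma 2.1, statement 3: all members leave the same remainder on division by m.
# Python's % operator returns a remainder in {0, ..., m-1} for positive m.
print({x % 4 for x in window})   # {1}
```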
Notation: The set of residue classes (mod $m$) is denoted by $\mathbb{Z}/m\mathbb{Z}$. It has $m$ elements since $0, 1, 2, \ldots, m-1$ are all the possible remainders upon division by $m$. A set of representatives is a set of integers that contains exactly one element of each residue class (mod $m$).

Examples:
1. $\{0, 1, 2\}$, $\{3, -2, 5\}$, and $\{9, 16, 14\}$ are different sets of representatives of $\mathbb{Z}/3\mathbb{Z}$.
2. For any $m \in \mathbb{N}$, the set $\{0, 1, 2, \ldots, m-1\}$ is a set of representatives (mod $m$), called the least nonnegative residues (mod $m$), denoted by $\mathbb{Z}_m$.
3. $\{1, 2, \ldots, m\}$ is a set of representatives, the least positive residues (mod $m$).

Proposition 2.1. Suppose that $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$. Then
(a) $-a \equiv -b \pmod{m}$,
(b) $a + c \equiv b + d \pmod{m}$,
(c) $ac \equiv bd \pmod{m}$.

Proof. (a) Since $m$ divides $a - b$, it also divides $-a + b$, and therefore $-a \equiv -b \pmod{m}$. (b), (c): exercise.

Example: The integer $F_n = 2^{2^n} + 1$ is called a Fermat number. Pierre de Fermat (1601–1665) observed that $F_0 = 3$, $F_1 = 5$, $F_2 = 17$, $F_3 = 257$, and $F_4 = 65\,537$ are all prime. He conjectured that all $F_n$ are prime. However, Leonhard Euler (1707–1786) showed that $641 \mid F_5$.

Proof. It is clear that $641 = 640 + 1 = 5 \cdot 2^7 + 1$, so $5 \cdot 2^7 \equiv -1 \pmod{641}$. By Proposition 2.1(c), raising both sides of this congruence to the fourth power gives
$$5^4 \cdot 2^{28} \equiv 1 \pmod{641}. \tag{2.1}$$
On the other hand, $641 = 625 + 16 = 5^4 + 2^4$, so $5^4 \equiv -2^4 \pmod{641}$. Substituting this into (2.1), we get $-2^4 \cdot 2^{28} \equiv 1 \pmod{641}$, or $2^{32} \equiv -1 \pmod{641}$, i.e., $2^{2^5} + 1 \equiv 0 \pmod{641}$.

2.2.2 Some abstract algebra

Definition 2.4. If $X$ is a set, a map $\circ : X \times X \to X$ which sends a pair $(x_1, x_2)$ of elements from $X$ to the element $x_1 \circ x_2$ is called an operation on $X$.

Example: Addition and multiplication are operations on $X = \mathbb{R}$.

Definition 2.5. Given the residue classes $a + m\mathbb{Z}$ and $b + m\mathbb{Z}$,
(i) their sum is $(a + m\mathbb{Z}) + (b + m\mathbb{Z}) = (a + b) + m\mathbb{Z}$;
(ii) their product is $(a + m\mathbb{Z}) \cdot (b + m\mathbb{Z}) = (a \cdot b) + m\mathbb{Z}$.
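Two of the statements above lend themselves to a quick numerical sanity check: Euler's result $641 \mid F_5$, and the fact (a consequence of Proposition 2.1) that the sum and product in Definition 2.5 give the same class whichever representatives are used. The Python sketch below is an illustration only; the modulus 5 and the particular representatives are arbitrary choices, not taken from the notes at this point.

```python
# Euler's observation: 641 divides the Fermat number F_5 = 2^(2^5) + 1.
F5 = 2 ** (2 ** 5) + 1
print(F5)                # 4294967297
print(F5 % 641)          # 0
print(pow(2, 32, 641))   # 640, i.e. 2^32 ≡ -1 (mod 641)

# Definition 2.5 with different representatives of the same two classes.
m = 5
reps_a = [3, 8, -2]      # all lie in 3 + 5Z
reps_b = [2, 7, -3]      # all lie in 2 + 5Z

sums = {(a + b) % m for a in reps_a for b in reps_b}
products = {(a * b) % m for a in reps_a for b in reps_b}

print(sums)              # {0}: the sum is always the class 0 + 5Z
print(products)          # {1}: the product is always the class 1 + 5Z
```

The three-argument form of the built-in `pow` performs modular exponentiation, so the intermediate power never has to be written out in full.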
Remark: This definition uses representatives; however, by Proposition 2.1 sum and product are independent of representatives, that is, they are well defined.

Example: Let $m = 5$. Then
$$(3 + 5\mathbb{Z}) + (2 + 5\mathbb{Z}) = 5 + 5\mathbb{Z} = 5\mathbb{Z}; \qquad (3 + 5\mathbb{Z}) \cdot (2 + 5\mathbb{Z}) = 6 + 5\mathbb{Z} = 1 + 5\mathbb{Z}.$$
Or: $3 + 2 \equiv 0 \pmod{5}$; $3 \cdot 2 \equiv 1 \pmod{5}$.

Definition 2.6. A pair $(H, \circ)$ consisting of a set $H$ and an operation $\circ$ that satisfies
$$(a \circ b) \circ c = a \circ (b \circ c) \quad \text{for all } a, b, c \in H$$
(i.e., the operation is associative) is called a semigroup. If, in addition, $a \circ b = b \circ a$ for all $a, b \in H$, the semigroup is called commutative or abelian. A neutral element of the semigroup $(H, \circ)$ is an element $e \in H$ which satisfies $e \circ a = a \circ e = a$ for all $a \in H$. A semigroup with a neutral element is called a monoid.

Example: $(\mathbb{Z}, +)$, $(\mathbb{Z}, \cdot)$, $(\mathbb{Z}/m\mathbb{Z}, +)$, and $(\mathbb{Z}/m\mathbb{Z}, \cdot)$ are commutative semigroups. In fact, they are monoids.

Some properties: Let $(H, \circ)$ be a semigroup, and set $a^1 = a$ and $a^{n+1} = a \circ a^n$ for $a \in H$ and $n \in \mathbb{N}$. Then
$$a^n \circ a^m = a^{n+m}, \qquad (a^n)^m = a^{nm} \quad \text{for } a \in H \text{ and } n, m \in \mathbb{N}.$$
If $a, b \in H$ and $a \circ b = b \circ a$, then $(a \circ b)^n = a^n \circ b^n$. (This is true in general if the semigroup is commutative.)

Definition 2.7. Let $(H, \circ)$ be a semigroup and $e$ its neutral element. Given an $a \in H$, an element $b \in H$ is called an inverse of $a$ if $a \circ b = e = b \circ a$. An element $a$ is called invertible in $H$ if it has an inverse.

Remarks:
(a) A semigroup has at most one neutral element.
(b) In a monoid each element has at most one inverse.

Examples:
$(\mathbb{Z}, +)$: Neutral element is 0. Inverse of $a$ is $-a$.
$(\mathbb{Z}, \cdot)$: Neutral element is 1. The only invertible elements are $1$ and $-1$.
$(\mathbb{Z}/m\mathbb{Z}, +)$: Neutral element is the residue class $m\mathbb{Z}$. Inverse of $a + m\mathbb{Z}$ is $-a + m\mathbb{Z}$.
$(\mathbb{Z}/m\mathbb{Z}, \cdot)$: Neutral element is the residue class $1 + m\mathbb{Z}$. For the invertible elements, see the next subsection.

Definition 2.8. A group is a monoid in which every element is invertible. A group is called commutative or abelian if the monoid is commutative.
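For a small modulus, Definition 2.8 can be tested by brute force: list the invertible elements of a monoid on $\mathbb{Z}/m\mathbb{Z}$ and see whether every element appears. The Python sketch below is an illustration only; the helper name `invertible_elements` and the choice $m = 6$ are assumptions made for the example, and residue classes are represented by their least nonnegative residues.

```python
def invertible_elements(m, op, neutral):
    """Residues a in {0, ..., m-1} with some b such that op(a, b) ≡ neutral (mod m).

    Both operations used below are commutative, so a one-sided check suffices.
    """
    residues = range(m)
    return [
        a for a in residues
        if any(op(a, b) % m == neutral % m for b in residues)
    ]


m = 6
additive = invertible_elements(m, lambda a, b: a + b, 0)
multiplicative = invertible_elements(m, lambda a, b: a * b, 1)

print(additive)                  # [0, 1, 2, 3, 4, 5]
print(multiplicative)            # [1, 5]

# A monoid is a group exactly when every element is invertible.
print(len(additive) == m)        # True:  (Z/6Z, +) is a group
print(len(multiplicative) == m)  # False: (Z/6Z, ·) is only a monoid
```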
Examples: $(\mathbb{Z}, +)$ and $(\mathbb{Z}/m\mathbb{Z}, +)$ are abelian groups. $(\mathbb{Z}, \cdot)$ is not a group because not every element is invertible.

Some properties: Let $(G, \cdot)$ be a group. Denote by $a^{-1}$ the inverse of $a \in G$, and set $a^{-n} = (a^{-1})^n$ for all $n \in \mathbb{N}$. Then
$$a^n \cdot a^m = a^{n+m}, \qquad (a^n)^m = a^{nm}$$
holds for all $a \in G$ and $n, m \in \mathbb{Z}$. If the group is abelian, then $(a \cdot b)^n = a^n \cdot b^n$ holds for all $n \in \mathbb{Z}$.

An important property of a group is that we can “cancel”:

Proposition 2.2. Let $(G, \cdot)$ be a group and $a, b, c \in G$. Then $ca = cb$ implies $a = b$, and $ac = bc$ implies $a = b$.

Definition 2.9. The order of a group is the number of its elements.

Examples:
(a) The additive group $\mathbb{Z}$ has infinite order.
(b) The additive group $\mathbb{Z}/m\mathbb{Z}$ has order $m$.

Definition 2.10. A ring is a triplet $(R, +, \cdot)$ such that
(a) $(R, +)$ is an abelian group,
(b) $(R, \cdot)$ is a semigroup,
(c) for all $x, y, z \in R$ we have $x \cdot (y + z) = (x \cdot y) + (x \cdot z)$ and $(x + y) \cdot z = (x \cdot z) + (y \cdot z)$.
The ring is called commutative if the semigroup $(R, \cdot)$ is commutative. An identity (or unit element) of the ring is a neutral element of the semigroup $(R, \cdot)$. The property in (c) is called distributivity.

Examples:
(1) $(\mathbb{Z}, +, \cdot)$ is a commutative ring with identity 1. This implies:
(2) $(\mathbb{Z}/m\mathbb{Z}, +, \cdot)$ is a commutative ring with identity $1 + m\mathbb{Z}$; it is called the residue class ring modulo $m$.

Remark: If it is clear which operations are meant, we often write simply $R$ for $(R, +, \cdot)$; for instance, we write $\mathbb{Z}/m\mathbb{Z}$ for the residue class ring (mod $m$).

Definition 2.11. Let $R$ be a ring with identity. An element $a \in R$ is called invertible, or a unit, if it is invertible in the multiplicative semigroup of $R$. An element $a \in R$ is called a zero divisor if $a \ne 0$ and there is a $b \ne 0$, $b \in R$, with $ab = 0$ or $ba = 0$.

Remark: The units of a commutative ring form a group, the group of units of $R$, denoted by $R^*$.

Examples:
(1) $\mathbb{Z}^* = \{-1, 1\}$.
(2) $\mathbb{Z}$ has no zero divisors.
(3) Claim: The zero divisors of the residue class ring $\mathbb{Z}/m\mathbb{Z}$ are the residue classes $a + m\mathbb{Z}$ with $1 < \gcd(a, m) < m$.

Proof. (i) If $a + m\mathbb{Z}$ is a zero divisor of $\mathbb{Z}/m\mathbb{Z}$, then there is a $b \in \mathbb{Z}$ with $ab \equiv 0 \pmod{m}$, but $a \not\equiv 0 \pmod{m}$ and $b \not\equiv 0 \pmod{m}$. Hence $m \mid ab$, but $m \nmid a$ and $m \nmid b$. So $\gcd(a, m)$ must lie strictly between 1 and $m$.
(ii) Suppose that $1 < \gcd(a, m) < m$, and set $b := m / \gcd(a, m)$. Then $a \not\equiv 0 \pmod{m}$, $ab = (a / \gcd(a, m)) \cdot m \equiv 0 \pmod{m}$, and $b \not\equiv 0 \pmod{m}$. Hence $a + m\mathbb{Z}$ is a zero divisor of $\mathbb{Z}/m\mathbb{Z}$.

Example: When $m = 6$, then $2 \cdot 3 \equiv 0 \pmod{6}$; hence $2 + 6\mathbb{Z}$ is a zero divisor.

Consequence: If $m$ is a prime, then $\mathbb{Z}/m\mathbb{Z}$ has no zero divisors.

Definition 2.12. A field is a commutative ring in which every nonzero element is invertible.

Examples: $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ are fields. $\mathbb{Z}$ is not a field.

2.2.3 Division in the residue class ring

Divisibility in rings is defined as in $\mathbb{Z}$:

Definition 2.13. Let $R$ be a ring and $a, n \in R$. We say that $a$ divides $n$ if there is an element $b \in R$ such that $n = a \cdot b$. In this case $a$ is called a divisor of $n$, and $n$ a multiple of $a$, and we write $a \mid n$ (otherwise, $a \nmid n$).

We will now restrict our attention to $R = \mathbb{Z}/m\mathbb{Z}$. Which elements in $\mathbb{Z}/m\mathbb{Z}$ are invertible? Recall: The class $a + m\mathbb{Z}$ is invertible if and only if the congruence
$$ax \equiv 1 \pmod{m} \tag{2.2}$$
is solvable. When is this the case?

Proposition 2.3. The residue class $a + m\mathbb{Z}$ is invertible in $\mathbb{Z}/m\mathbb{Z}$ (i.e., the congruence (2.2) is solvable) if and only if $\gcd(a, m) = 1$. If $\gcd(a, m) = 1$, then the inverse of $a + m\mathbb{Z}$ is uniquely determined (i.e., the solution of (2.2) is uniquely determined modulo $m$).

Proof. Later.

Remarks:
(a) A residue class $a + m\mathbb{Z}$ with $\gcd(a, m) = 1$ is called an invertible residue class modulo $m$. By Proposition 2.3, a residue class $a + m\mathbb{Z}$, $1 \le a < m$, is either a zero divisor or an invertible residue class (i.e., a unit in the residue class ring $\mathbb{Z}/m\mathbb{Z}$).
(b) The solution of (2.2), and thus the inverse of $a + m\mathbb{Z}$ (when $\gcd(a, m) = 1$), can be computed efficiently with the “Euclidean algorithm” (see later).

Example: Let $m = 10$. The class $a + 10\mathbb{Z}$ is invertible if and only if $\gcd(a, 10) = 1$, i.e., when $a = 1, 3, 7, 9$. To obtain the inverses, note that
$$3 \cdot 7 \equiv 1 \pmod{10}, \quad 7 \cdot 3 \equiv 1 \pmod{10}, \quad 9 \cdot 9 \equiv 1 \pmod{10}.$$

Proposition 2.4. $\mathbb{Z}/m\mathbb{Z}$ is a field if and only if $m$ is a prime number.

Proof. By Proposition 2.3, $\mathbb{Z}/m\mathbb{Z}$ is a field if and only if $\gcd(k, m) = 1$ for all $k$, $1 \le k < m$. But this is the case if and only if $m$ is a prime.

Corollary 2.1. The set of all invertible residue classes modulo $m$ is a finite abelian group with respect to multiplication.

Remark: This group is called the multiplicative group of residues modulo $m$, denoted by $(\mathbb{Z}/m\mathbb{Z})^*$.

Definition 2.14. The Euler $\varphi$-function $\varphi : \mathbb{N} \to \mathbb{N}$ is defined as follows: If $m \in \mathbb{N}$, then $\varphi(m)$ is the order of $(\mathbb{Z}/m\mathbb{Z})^*$. In other words: $\varphi(m)$ is the number of integers $a \in \{1, 2, \ldots, m\}$ with $\gcd(a, m) = 1$.

Remarks:
(a) In place of “$\varphi$-function” you will sometimes see “phi-function”; in either case, it is pronounced like “fee”-function. The name Euler is pronounced like “oiler”.
(b) The first few values of the $\varphi$-function are

m              1   2   3   4   5   6   7   8   9   10   11   ...
$\varphi(m)$   1   1   2   2   4   2   6   4   6   4    10   ...

In particular: If $p$ is prime, then $\varphi(p) = p - 1$.

The Euler $\varphi$-function will be studied in greater detail later. For now, here are two formulas for calculating $\varphi(n)$; they will have been derived in most Discrete Mathematics courses:
$$\varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right), \tag{2.3}$$
where the product is taken over all primes dividing $n$. If $n = p_1^{e_1} \cdots p_k^{e_k}$ (the canonical form), then
$$\varphi(n) = \left(p_1^{e_1} - p_1^{e_1 - 1}\right) \cdots \left(p_k^{e_k} - p_k^{e_k - 1}\right). \tag{2.4}$$
For both formulas it is essential that $n$ be completely factored.

Examples:
(a) Let $n = 2000$. The only prime divisors of $n$ are 2 and 5, so by (2.3) we have
$$\varphi(2000) = 2000 \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{5}\right) = 2000 \cdot \frac{1}{2} \cdot \frac{4}{5} = 800.$$
(b) Let $n = 4116$. The factorization is $4116 = 2^2 \cdot 3 \cdot 7^3$.
This time we use (2.4) to obtain
$$\varphi(4116) = (2^2 - 2)(3 - 1)(7^3 - 7^2) = 1176.$$

We close this section with one last result on division in modular arithmetic. We have seen that addition, subtraction and multiplication modulo $m$ are no problem. Also, when $m$ is prime, we can “cancel” modulo $m$. But what happens when $m$ is not prime?

Example: Starting with the congruence $3 \cdot 8 \equiv 3 \cdot 4 \pmod{12}$, we see that $8 \equiv 4 \pmod{12}$ is certainly wrong. However, we do have $8 \equiv 4 \pmod{4}$. This is explained by the following result.

Proposition 2.5. $xa \equiv xb \pmod{m}$ if and only if $a \equiv b \pmod{\frac{m}{\gcd(x, m)}}$.

Proof. Two directions need to be shown:
(a) “$\Rightarrow$”: The first congruence can be rewritten as $m \mid (xa - xb)$, and then $m \mid x(a - b)$. Now denote $d := \gcd(x, m)$ and divide the last relation by $d$. Then $\frac{m}{d} \mid \frac{x}{d}(a - b)$, and since $\gcd\left(\frac{m}{d}, \frac{x}{d}\right) = 1$, we have $\frac{m}{d} \mid (a - b)$, which is equivalent to the second congruence.
(b) “$\Leftarrow$”: The second congruence is equivalent to $\frac{m}{d} \mid (a - b)$, which implies $m \mid d(a - b)$, and also $m \mid x(a - b)$ since $x$ is a multiple of $d$. This last relation is the same as $m \mid (xa - xb)$, and this is equivalent to the first congruence.

2.3 The Affine Cipher

In this section we present a cipher that is easily broken using the methods from the previous sections. It is also the basis of some historical ciphers which we will consider later, in Section 2.8.

2.3.1 Description of the cipher

In Section 2.1 we considered the shift cipher as a special example of a substitution cipher. Its encryption function was
$$E_k(x) = x + b \pmod{m}, \quad x \in \mathbb{Z}_m,$$
with key $k = b \in \mathbb{Z}_m$. We now define the more general affine cipher by the encryption function
$$E_k(x) = ax + b \pmod{m}, \quad x \in \mathbb{Z}_m,$$
with key $k = (a, b) \in \mathbb{Z}_m^2$, and $\gcd(a, m) = 1$.

Why do we have this last condition? Suppose that $\gcd(a, m) = d > 1$. We have seen (see Example (3) after Definition 2.11) that $a + m\mathbb{Z}$ is a zero divisor in $\mathbb{Z}/m\mathbb{Z}$, so the congruence $ax \equiv 0 \pmod{m}$ has at least two solutions, namely $x = 0$ and $x = m/d$.
Hence we would have $E_k(0) = b = E_k(m/d)$, so the encryption function is not injective, which is not allowed.

Suppose now that we do have $\gcd(a, m) = 1$. What is the decryption function? Solve the congruence
\[ ax + b \equiv y \pmod{m} \]
for $x$. First, $ax \equiv y - b \pmod{m}$. Now, since $\gcd(a, m) = 1$, there is a multiplicative inverse $a^{-1} \in \mathbb{Z}_m$ such that $a^{-1} a \equiv 1 \pmod{m}$. So
\[ a^{-1} a x \equiv a^{-1}(y - b) \pmod{m}, \quad\text{or}\quad x \equiv a^{-1}(y - b) \pmod{m}. \]
Therefore, given the encryption key $k = (a, b)$, the decryption function is
\[ D_k(y) = a^{-1}(y - b) \pmod{m}. \]

Example: Let $m = 26$. Alice wants to encipher the word bald with the key $k = (a, b) = (7, 3)$. Note that $\gcd(7, 26) = 1$. We use the translation table from Section 2.1; the steps are summarized in the following table:

    plaintext                         b    a    l    d
    $x$                               1    0   11    3
    $y \equiv 7x + 3 \pmod{26}$      10    3    2   24
    ciphertext                        K    D    C    Y

To decipher, Bob has to find $a^{-1}$. By inspection: $15 \cdot 7 = 105 \equiv 1 \pmod{26}$ (later we will meet an algorithm for doing this). So $a^{-1} = 15$, and the decryption function is
\[ D_k(y) = 15(y - 3) \pmod{26}. \]
The steps are now:

    ciphertext                        K    D    C    Y
    $y$                              10    3    2   24
    $x = 15(y - 3) \pmod{26}$         1    0   11    3
    plaintext                         b    a    l    d

2.3.2 Cryptanalysis of the affine cipher

1. Ciphertext only: Consider an affine cipher with $m = 26$. Given a key $k = (a, b)$, $b$ can take on any value $b \in \mathbb{Z}_{26}$, while for $a$ we require $\gcd(a, 26) = 1$, so there are $\varphi(26) = 12$ possibilities. In total, the key space has only $12 \cdot 26 = 312$ elements, so an exhaustive search is easily possible. A different kind of ciphertext-only attack can be found in the assignments.

2. Known plaintext or known ciphertext:

Example: Suppose that the eavesdropper Oscar knows that the affine cipher (with $m = 26$) with key $k = (a, b)$ maps e to R and s to H. Then he obtains the following congruences:
\[ 4a + b \equiv 17 \pmod{26} \quad (\text{from } e \mapsto R), \]
\[ 18a + b \equiv 7 \pmod{26} \quad (\text{from } s \mapsto H). \]
Subtract the first congruence from the second one: $14a \equiv -10 \equiv 16 \pmod{26}$, or, upon dividing by 2 (using Proposition 2.5), $7a \equiv 8 \pmod{13}$.
If we now multiply by 2 and note that $14 \equiv 1 \pmod{13}$, we get $a \equiv 16 \equiv 3 \pmod{13}$. This means that either $a \equiv 3 \pmod{26}$, or $a \equiv 16 \pmod{26}$. The second case is not possible since $\gcd(16, 26) = 2$, while the first case is allowable and leads to $b \equiv 17 - 4a \equiv 5 \pmod{26}$. So, finally, the encryption key was $k = (3, 5)$, and Oscar has broken the cipher.

2.4 Block Ciphers

Recall from the introduction: One criterion for classifying cryptosystems was “block ciphers” versus “stream ciphers”. Here we are going to discuss block ciphers in general. First we need some more background material.

2.4.1 Alphabets and words

Definition 2.15. An alphabet is a finite nonempty set $\Sigma$. The length of $\Sigma$ is the number of elements in $\Sigma$. The elements of $\Sigma$ are called symbols or letters.

Examples: (1) $\Sigma = \{A, B, \ldots, Z\}$; length 26.
(2) $\Sigma = \{0, 1\}$; length 2.
(3) The ASCII symbols; length 128.

If an alphabet has length $m$, we can (and often will) identify it with $\mathbb{Z}_m = \{0, 1, \ldots, m - 1\}$.

Definition 2.16. A finite sequence is a finite ordered set $(a_1, a_2, \ldots, a_n)$; its elements (some of which may be identical) are called components. Sometimes we also write $a_1 a_2 \ldots a_n$. The empty sequence $()$ has no components.

Example: $(2, 3, 1, 2, 3)$, or 23123.

Definition 2.17. Let $\Sigma$ be an alphabet.
(1) A word or string over $\Sigma$ is a finite sequence of symbols from $\Sigma$ including the empty sequence, which is denoted by $\varepsilon$ and called the empty string.
(2) The length of a word $w$ over $\Sigma$ is the number of its components, and is denoted by $|w|$. Also, $|\varepsilon| = 0$.
(3) The set of all words over $\Sigma$ including the empty string is denoted by $\Sigma^*$.
(4) If $v, w \in \Sigma^*$, then $vw = v \circ w$ is the string that is obtained by concatenating $v$ and $w$; it is called the concatenation of $v$ and $w$. In particular, $v \circ \varepsilon = \varepsilon \circ v = v$.
(5) For $n \in \mathbb{N}$, $\Sigma^n$ denotes the set of all words of length $n$ over $\Sigma$.

Example: Let $\Sigma = \{A, B, \ldots, Z\}$, $v$ = DAL, $w$ = HOUSIE. Then $v \circ w$ = DALHOUSIE.

2.4.2 Permutations

Definition 2.18.
Let $X$ be a set. A permutation of $X$ is a bijective map $f : X \to X$. The set of all permutations of $X$ is denoted by $S(X)$.

Example: Let $X = \{0, 1, \ldots, 5\}$. An element in the first row of the following matrix is mapped to the number below:
\[ \begin{pmatrix} 0 & 1 & 2 & 3 & 4 & 5 \\ 1 & 2 & 4 & 3 & 5 & 0 \end{pmatrix} \]
Permutations can always be represented like this.

Remark: The set $S(X)$, together with composition, forms a group (in general non-commutative). If $n \in \mathbb{N}$, $S_n$ denotes the group of permutations of the set $\{1, 2, \ldots, n\}$.

Example: $S_2$ consists of the two elements $\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}$ and $\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$.

Proposition 2.6. The group $S_n$ has order $n! = 1 \cdot 2 \cdots n$.

Proof. By induction on $n$. (Usually covered in Abstract Algebra and Discrete Mathematics courses.) $\square$

Example: Let $X = \{0, 1\}^n$, the set of all bitstrings of length $n$. A bit permutation is a permutation on $X$ in which just the positions of the bits are permuted. Choose a permutation $\pi \in S_n$. Then put
\[ f : \{0, 1\}^n \to \{0, 1\}^n, \quad b_1 b_2 \ldots b_n \mapsto b_{\pi(1)} b_{\pi(2)} \ldots b_{\pi(n)}. \]
Every bit permutation can be uniquely written in this way. Hence there are $n!$ bit permutations of bitstrings of length $n$.

Example: “Circular leftshift of $i$ positions”: The bitstring $(b_0, b_1, \ldots, b_{n-1})$ is mapped to
\[ (b_{i \bmod n}, b_{(i+1) \bmod n}, \ldots, b_{(i+n-1) \bmod n}). \]
“Circular rightshifts” are defined analogously.

2.4.3 Block ciphers

Definition 2.19. A cryptosystem is called a block cipher if its plaintext space and ciphertext space are $\Sigma^n$ ($n \in \mathbb{N}$), the words of a fixed length $n$ over an alphabet $\Sigma$. The integer $n$ is called the block length.

Example: A shift cipher is a block cipher of length 1. In general, block ciphers of length 1 are called substitution ciphers.

Proposition 2.7. The encryption functions of a block cipher are permutations.

Proof. We know that encryption functions are injective (Assignment 1). But an injective map from a finite set $A \to A$ (here $A = \Sigma^n$) must be bijective, i.e., a permutation. $\square$
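The bit permutations and circular shifts described above, as well as the statement of Proposition 2.7, can be checked by brute force on small block lengths. The following is only a minimal Python sketch: the function names and the sample key $\pi$ are chosen for illustration and do not come from the notes.

```python
from itertools import product


def bit_permutation(pi, bits):
    """Map b1 b2 ... bn to b_pi(1) b_pi(2) ... b_pi(n); pi is given 1-based."""
    return "".join(bits[p - 1] for p in pi)


def circular_left_shift(bits, i):
    """Map (b0, ..., b_{n-1}) to (b_{i mod n}, ..., b_{(i+n-1) mod n})."""
    n = len(bits)
    return "".join(bits[(i + j) % n] for j in range(n))


def is_bijective_on_bitstrings(f, n):
    """Exhaustively check that f maps {0,1}^n onto itself."""
    domain = ["".join(t) for t in product("01", repeat=n)]
    return {f(w) for w in domain} == set(domain)


pi = (2, 3, 4, 1)  # illustrative key in S_4: b1 b2 b3 b4 -> b2 b3 b4 b1
print(bit_permutation(pi, "1011"))
print(circular_left_shift("1011", 1))
print(is_bijective_on_bitstrings(lambda w: bit_permutation(pi, w), 4))
```

For this particular $\pi$ the bit permutation coincides with a circular leftshift by one position, so the first two printed strings agree.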
The most general block cipher can be described as follows:
• Fix block length $n$ and alphabet $\Sigma$.
• Use $P = C = \Sigma^n$.
• Key space is $K = S(\Sigma^n)$.
• Encryption function for a key $\pi \in S(\Sigma^n)$: $E_\pi : \Sigma^n \to \Sigma^n$, $v \mapsto \pi(v)$.
• Decryption function: $D_\pi : \Sigma^n \to \Sigma^n$, $v \mapsto \pi^{-1}(v)$.

The key space is very large; it contains $(|\Sigma|^n)!$ elements. The problem with this is that it is very inefficient to use; it is not clear how to represent and evaluate a permutation $\pi \in S(\Sigma^n)$ efficiently. In practice, therefore, only a subset of all possible permutations of $\Sigma^n$ is used; it is chosen such that the permutations are easy to represent and to evaluate.

Example: The permutation cipher. We use only permutations that permute positions of the symbols. For $\pi \in S_n$, set
\[ E_\pi : \Sigma^n \to \Sigma^n, \quad (v_1, \ldots, v_n) \mapsto (v_{\pi(1)}, \ldots, v_{\pi(n)}); \]
so the key space is the group of permutations $S_n$. Decryption function:
\[ D_\pi : \Sigma^n \to \Sigma^n, \quad (x_1, \ldots, x_n) \mapsto (x_{\pi^{-1}(1)}, \ldots, x_{\pi^{-1}(n)}). \]

Remarks: (a) If $\Sigma = \{0, 1\}$, these are the bit permutations.
(b) The key space has $n!$ elements.

2.5 Modes of Block Cipher Operation

2.5.1 ECB mode

We use a block cipher with alphabet $\Sigma$ and block length $n$. Let $K$ be the key space and $E_k$, $D_k$ the encryption and decryption functions for $k \in K$. The electronic codebook mode (ECB mode) is used as follows:
• An arbitrarily long plaintext is decomposed into blocks of length $n$; if necessary, supplement the plaintext (for instance, by randomly chosen symbols) such that its length is divisible by $n$.
• Each block of length $n$ is encrypted using the encryption function $E_e$.
• The ciphertext is decrypted by applying the decryption function $D_d$ to the blocks of length $n$, where the decryption key $d$ corresponds to the encryption key $e$.
• In short, this mode amounts to each block being “looked up in a codebook”.
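The ECB steps listed above can be sketched in a few lines of Python, here with the permutation cipher as the underlying block cipher. This is an illustration under stated assumptions, not a reference implementation: all function names are hypothetical, bitstrings are represented as Python strings, and the padding uses a fixed symbol instead of randomly chosen ones so that the output is reproducible.

```python
def permute_block(pi, block):
    """Permutation cipher on one block: (v1..vn) -> (v_pi(1)..v_pi(n)), pi 1-based."""
    return "".join(block[p - 1] for p in pi)


def invert_permutation(pi):
    """Return pi^{-1} as a 1-based tuple."""
    inv = [0] * len(pi)
    for position, image in enumerate(pi, start=1):
        inv[image - 1] = position
    return tuple(inv)


def split_blocks(text, n):
    return [text[i:i + n] for i in range(0, len(text), n)]


def ecb_encrypt(pi, plaintext, pad="0"):
    """Pad to a multiple of n (fixed pad symbol: an assumption), encrypt block by block."""
    n = len(pi)
    if len(plaintext) % n:
        plaintext += pad * (n - len(plaintext) % n)
    return "".join(permute_block(pi, b) for b in split_blocks(plaintext, n))


def ecb_decrypt(pi, ciphertext):
    """Apply the inverse permutation to every block."""
    inv = invert_permutation(pi)
    return "".join(permute_block(inv, b) for b in split_blocks(ciphertext, len(pi)))


key = (2, 3, 4, 1)  # illustrative key in S_4
c = ecb_encrypt(key, "101100010100101")
print(c)
print(ecb_decrypt(key, c))  # padded plaintext
```

Because every block is processed independently, equal plaintext blocks always produce equal ciphertext blocks under the same key.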
ECB mode is illustrated in the diagram above.

Example: Take $\Sigma = \{0, 1\}$, block length $n = 4$, and the permutation cipher. So $K = S_4$, and for $\pi \in S_4$,
\[ E_\pi : \{0, 1\}^4 \to \{0, 1\}^4, \quad b_1 b_2 b_3 b_4 \mapsto b_{\pi(1)} b_{\pi(2)} b_{\pi(3)} b_{\pi(4)}. \]
Suppose we have the plaintext $m = 101100010100101$. Decompose it into blocks of length 4. The last block has length 3, so we add a random bit, say 0. So we have the blocks
\[ m_1 = 1011, \quad m_2 = 0001, \quad m_3 = 0100, \quad m_4 = 1010. \]
We use the key $\pi = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}$, so $b_1 b_2 b_3 b_4 \mapsto b_2 b_3 b_4 b_1$. The blocks are encrypted separately, and we get the ciphertext blocks
\[ c_1 = E_\pi(m_1) = 0111, \quad c_2 = E_\pi(m_2) = 0010, \quad c_3 = E_\pi(m_3) = 1000, \quad c_4 = E_\pi(m_4) = 0101. \]
So the ciphertext is $c = 0111001010000101$.

The main weaknesses of ECB mode are as follows:
• The same plaintext blocks give rise to the same ciphertext blocks. Differences in frequencies may therefore be apparent, and be exploited by the eavesdropper Oscar.
• Oscar can also change the ciphertext and not be detected.

2.5.2 CBC mode

The cipherblock chaining (CBC) mode avoids the weaknesses of ECB mode. Here, encryption of a block depends not only on the key, but also on the previous block. In other words: Encryption is “context dependent”, in the sense that equal texts in different contexts are enciphered differently. The ciphertext can therefore not be manipulated; manipulations will be detected.

We need the following fundamental operation:

Definition 2.20. (a) The map
\[ \oplus : \{0, 1\}^2 \to \{0, 1\}, \quad (b, c) \mapsto b \oplus c, \]
called the exclusive or of two bits, or XOR, is defined by the following table:

    $b$   $c$   $b \oplus c$
     0     0        0
     1     0        1
     0     1        1
     1     1        0

(b) If $k \in \mathbb{N}$, $b = (b_1, b_2, \ldots, b_k)$, $c = (c_1, c_2, \ldots, c_k) \in \{0, 1\}^k$, then we set
\[ b \oplus c = (b_1 \oplus c_1, b_2 \oplus c_2, \ldots, b_k \oplus c_k). \]

Remark: If $\{0, 1\}$ is a system of representatives of $\mathbb{Z}/2\mathbb{Z}$, then $\oplus$ is the same as addition in $\mathbb{Z}/2\mathbb{Z}$.

Example: If $b = 0100$ and $c = 1101$, then $b \oplus c = 1001$.
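As a quick check of Definition 2.20 and the remark that follows it, the componentwise XOR of two bitstrings can be written as below. This is a small Python sketch with a hypothetical helper name; bitstrings are again represented as strings of the characters 0 and 1.

```python
def xor_bits(b, c):
    """Componentwise XOR of two bitstrings of equal length."""
    if len(b) != len(c):
        raise ValueError("bitstrings must have the same length")
    return "".join(str(int(x) ^ int(y)) for x, y in zip(b, c))


def add_mod2(b, c):
    """Componentwise addition in Z/2Z, for comparison with XOR."""
    return "".join(str((int(x) + int(y)) % 2) for x, y in zip(b, c))


b, c = "0100", "1101"
print(xor_bits(b, c))
print(xor_bits(b, c) == add_mod2(b, c))  # XOR agrees with addition mod 2
print(xor_bits(xor_bits(b, c), c) == b)  # XOR-ing with c twice gives b back
```

The last line illustrates the identity $(b \oplus c) \oplus c = b$, which is what makes XOR usable for both encryption and decryption.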
CBC mode works as follows: As before, use a block cipher with alphabet $\Sigma = \{0, 1\}$, block length $n$, key space $K$, and encryption and decryption functions $E_k$ and $D_k$ for $k \in K$.
• Use a fixed initialization vector $IV \in \Sigma^n$ which need not be kept secret.
• The plaintext is decomposed into blocks of length $n$ (as in ECB mode), say $m_1, m_2, \ldots, m_t$.
• Alice obtains the ciphertext by setting
\[ c_0 = IV, \quad\text{and}\quad c_j = E_e(c_{j-1} \oplus m_j) \quad\text{for } 1 \le j \le t; \]
the ciphertext is then $c = c_1 c_2 \ldots c_t$.
• To decipher, Bob uses the decryption key $d$ that satisfies $D_d(E_e(w)) = w$ for all plaintext blocks $w$. Then he computes
\[ c_0 = IV, \quad\text{and}\quad m_j = c_{j-1} \oplus D_d(c_j) \quad\text{for } 1 \le j \le t. \tag{2.5} \]

CBC mode can be illustrated as follows:

[Diagram: CBC mode]

Remark: Why does this work? For $1 \le j \le t$,
\[ c_{j-1} \oplus D_d(c_j) = c_{j-1} \oplus D_d(E_e(c_{j-1} \oplus m_j)) = c_{j-1} \oplus (c_{j-1} \oplus m_j) = (c_{j-1} \oplus c_{j-1}) \oplus m_j = 0 \oplus m_j = m_j. \]

Example: Use the same block cipher, same plaintext, and same key as in the previous example, namely plaintext blocks $m_1 = 1011$, $m_2 = 0001$, $m_3 = 0100$, $m_4 = 1010$, and key $\pi = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}$. We choose the initialization vector $IV = 1010$. Then
\[ c_1 = E_\pi(c_0 \oplus m_1) = E_\pi(0001) = 0010, \]
\[ c_2 = E_\pi(c_1 \oplus m_2) = E_\pi(0011) = 0110, \]
\[ c_3 = E_\pi(c_2 \oplus m_3) = E_\pi(0010) = 0100, \]
\[ c_4 = E_\pi(c_3 \oplus m_4) = E_\pi(1110) = 1101. \]
So the ciphertext is $c = 0010011001001101$.

The decryption works as follows (note that here we have $D_d = E_{\pi^{-1}}$, the inverse in the group of permutations):
\[ m_1 = c_0 \oplus E_{\pi^{-1}}(c_1) = 1010 \oplus 0001 = 1011, \]
\[ m_2 = c_1 \oplus E_{\pi^{-1}}(c_2) = 0010 \oplus 0011 = 0001, \]
\[ m_3 = c_2 \oplus E_{\pi^{-1}}(c_3) = 0110 \oplus 0010 = 0100, \]
\[ m_4 = c_3 \oplus E_{\pi^{-1}}(c_4) = 0100 \oplus 1110 = 1010, \]
and we (i.e., Bob) have recovered the plaintext.

Remark: What are the effects of transmission errors? Consider the decryption process (2.5). If the ciphertext block $c_j$ is transmitted incorrectly, then $m_j$ and $m_{j+1}$ may be incorrect, but everything after that will not be influenced.

2.5.3 CFB mode

A disadvantage of the CBC mode is that encryption and decryption of blocks are done sequentially.
Encryption and decryption functions may be expensive to compute, which may lead to time lags. For some applications, however (for instance, secure telephone communications), real-time encryption and decryption are necessary.

This time-lag problem is reduced in cipher feedback mode (or CFB mode). The main difference to CBC mode is that the encryption function is
– not used directly for encrypting plaintext blocks, but
– used for generating a sequence of key blocks.

CFB mode works as follows:
• As before, we use a block cipher with alphabet $\Sigma = \{0, 1\}$, block length $n$, key space $K$, and encryption function $E_k$ for $k \in K$.
• Use a fixed initialization vector $IV \in \Sigma^n$.
• Choose an integer $r$, $1 \le r \le n$, and decompose the plaintext into blocks of length $r$.
• To encrypt the sequence $m_1, m_2, \ldots, m_u$ of plaintext blocks, Alice sets $I_1 = IV$, and for $1 \le j \le u$:
  (1) $O_j = E_k(I_j)$,
  (2) $t_j$ is the string consisting of the first $r$ bits of $O_j$,
  (3) $c_j = m_j \oplus t_j$,
  (4) $I_{j+1} = 2^r I_j + c_j \pmod{2^n}$, i.e., $I_{j+1}$ is obtained by deleting the first $r$ bits in $I_j$ and appending $c_j$.
• The ciphertext is the sequence $c_1, c_2, \ldots, c_u$.
• To decrypt, Bob sets $I_1 = IV$, and for $1 \le j \le u$:
  (1) $O_j = E_k(I_j)$,
  (2) $t_j$ is the string consisting of the first $r$ bits of $O_j$,
  (3) $m_j = c_j \oplus t_j$,
  (4) $I_{j+1} = 2^r I_j + c_j \pmod{2^n}$ (as before).

CFB mode can be illustrated as follows:

[Diagram: CFB mode]

Remarks: (1) Why does this work? We have
\[ c_j \oplus t_j = (m_j \oplus t_j) \oplus t_j = m_j \oplus (t_j \oplus t_j) = m_j. \]
(2) Both Alice and Bob can compute the string $t_{j+1}$ as soon as they know the ciphertext block $c_j$.
(3) Only the XORs are computed sequentially; this is fast.
(4) CFB mode cannot be used with public-key cryptography since both Alice and Bob use the same key.
(5) Transmission errors: Decryption is spoiled as long as parts of the wrong ciphertext block are in the vector $I_j$.

2.5.4 OFB mode

The output feedback mode (OFB mode) is very similar to the CFB mode.
Everything is the same up to the steps performed for $1 \le j \le u$, which now read:
  (1) $O_j = E_k(I_j)$,
  (2) $t_j$ is the string consisting of the first $r$ bits of $O_j$,
  (3) $c_j = m_j \oplus t_j$,
  (4) $I_{j+1} = O_j$.
Decryption works analogously again, with Step (3) replaced by $m_j = c_j \oplus t_j$.

OFB mode can be illustrated as follows:

[Diagram: OFB mode]

Remarks: (1) The key block $t_j$ depends only on $IV$ and the key $k$; it can be computed simultaneously by Alice and Bob. (This is even more efficient than in CFB mode.)
(2) Weakness: While encryption of a plaintext block does depend on its position, it does not depend on the previous plaintext blocks. This makes ciphertext manipulation easier than in CFB mode.
(3) If a bit of ciphertext is incorrectly transmitted, only this one bit of plaintext will be affected.

2.6 Stream Ciphers

A general theory of stream ciphers is described in the textbook (1) by Stinson. Here I will present only a particular example.

With the alphabet $\Sigma = \{0, 1\}$, let the plaintext and ciphertext spaces be $P = C = \Sigma^*$. The key space is $K = \Sigma^n$ for some $n \in \mathbb{N}$. Words in $\Sigma^*$ are encrypted bit by bit as follows:
(1) Let $k = (k_1, k_2, \ldots, k_n) \in K$.
(2) Let $w = \sigma_1 \sigma_2 \ldots \sigma_m$ be a word of length $m$ in $\Sigma^*$.
(3) Alice generates a key stream $z_1, z_2, \ldots, z_m$ by setting $z_1 = k_1$, $z_2 = k_2$, ..., $z_n = k_n$, and for $m > n$,
\[ z_{i+n} = \sum_{j=0}^{n-1} c_j z_{i+j} \pmod{2}, \quad 1 \le i \le m - n, \tag{2.6} \]
where $c_0, c_1, \ldots, c_{n-1}$ are fixed coefficients.
(4) Then the encryption function $E_k$ is defined by
\[ E_k(w) = \sigma_1 \oplus z_1, \ \sigma_2 \oplus z_2, \ \ldots, \ \sigma_m \oplus z_m. \]
(5) The decryption function is the same, namely
\[ D_k(w) = \sigma_1 \oplus z_1, \ \sigma_2 \oplus z_2, \ \ldots, \ \sigma_m \oplus z_m. \]
(It is easy to see that this is a cryptosystem.)

Example: Let $n = 4$, and choose $c_0 = c_1 = 1$, $c_2 = c_3 = 0$, so
\[ z_{i+4} = z_i \oplus z_{i+1}. \tag{2.7} \]
Let the key be $k = (1, 0, 0, 0)$; the key stream is then
\[ 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, \ldots \]
where the first four terms are the key $k$. This is periodic with period length 15.

Remarks: (1) An equation or congruence of type (2.6) is called a linear recurrence of degree $n$.
(2) Such a recurrence can be implemented in hardware by a linear feedback shift register. For instance, the recurrence (2.7) can be illustrated as follows:

[Diagram: linear feedback shift register for the recurrence (2.7)]

2.7 Mathematical Background II: Linear Algebra

In this section $R$ will be a commutative ring with identity 1. The most important examples are the ring of integers $\mathbb{Z}$ and the residue class ring $\mathbb{Z}/m\mathbb{Z}$ for some positive integer $m$. No proofs will be given; for more details, consult any linear algebra book.

2.7.1 Matrices

Definition 2.21. A $k \times n$ matrix over $R$ is a rectangular scheme
\[ A = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{k,1} & a_{k,2} & \cdots & a_{k,n} \end{pmatrix}. \]
We sometimes also write $A = (a_{i,j})$. If $n = k$, the matrix is called a square matrix. The vector $(a_{i,1}, \ldots, a_{i,n})$ is the $i$th row of $A$, $1 \le i \le k$. The vector $(a_{1,j}, \ldots, a_{k,j})$, $1 \le j \le n$, is the $j$th column of $A$. $a_{i,j} \in R$ is called the entry in row $i$ and column $j$. The set of all $k \times n$ matrices over $R$ is denoted by $R^{(k,n)}$.

Example: Let $R = \mathbb{Z}$. Then $\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$ is a $2 \times 3$ matrix over $R$. It has 2 rows and 3 columns.

Definition 2.22. (1) Let $A = (a_{i,j}) \in R^{(k,n)}$ and $v = (v_1, \ldots, v_n) \in R^n$. Then the product $Av$ is defined as the vector $w = (w_1, \ldots, w_k)$ with
\[ w_i = \sum_{j=1}^{n} a_{i,j} v_j, \quad 1 \le i \le k. \]
(2) Let $n \in \mathbb{N}$ and $A, B \in R^{(n,n)}$, $A = (a_{i,j})$, $B = (b_{i,j})$. The sum of $A$ and $B$ is $A + B = (a_{i,j} + b_{i,j})$, and the product of $A$ and $B$ is $A \cdot B = AB = (c_{i,j})$, where
\[ c_{i,j} = \sum_{k=1}^{n} a_{i,k} b_{k,j}, \quad 1 \le i, j \le n. \]

Example 1: Let $A = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}$ and $v = (1, 2)$. Then $Av = (5, 8)$.

Remarks: (1) It is sometimes convenient to write vectors as column vectors:
\[ \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 5 \\ 8 \end{pmatrix}. \]
(2) Matrix addition and multiplication (with certain restrictions) can be defined also for non-square matrices.

Example 2: Let $A = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}$ and $B = \begin{pmatrix} 4 & 5 \\ 6 & 7 \end{pmatrix}$.
Then
\[ A + B = \begin{pmatrix} 5 & 7 \\ 8 & 10 \end{pmatrix}, \quad AB = \begin{pmatrix} 16 & 19 \\ 26 & 31 \end{pmatrix}, \quad BA = \begin{pmatrix} 14 & 23 \\ 20 & 33 \end{pmatrix}. \]

Remark: The set $R^{(n,n)}$ of $n \times n$ matrices over $R$, together with matrix addition and multiplication, itself forms a ring. The additive neutral element is the $n \times n$ zero matrix, all of whose entries are $0 \in R$. The identity in this ring is the $n \times n$ identity matrix $I_n$, which has the identity element of $R$ (e.g., 1 in the case of $R = \mathbb{Z}$) in the main diagonal, and zero elements elsewhere. In general (as Example 2 shows), the ring $(R^{(n,n)}, +, \cdot)$ is not commutative.

Example: When $R = \mathbb{Z}$ and $n = 2$, then $I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, and the zero matrix is $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$.

2.7.2 Determinants and Inverses

Definition 2.23. Let $A \in R^{(n,n)}$. The determinant of $A$ can be defined recursively as follows: For $n = 1$, with $A = (a)$, we have $\det A = a$. Now let $n > 1$. For $i, j \in \{1, 2, \ldots, n\}$, let $A_{i,j}$ denote the matrix that is obtained from $A$ by deleting the $i$th row and the $j$th column. Then, for a fixed $i \in \{1, 2, \ldots, n\}$, set
\[ \det A = \sum_{j=1}^{n} (-1)^{i+j} a_{i,j} \det A_{i,j}. \]

Remarks: (1) The matrices $A_{i,j}$ are called minors.
(2) The value of $\det A$ is independent of the choice of $i$. Also, for all $j \in \{1, 2, \ldots, n\}$ we have
\[ \det A = \sum_{i=1}^{n} (-1)^{i+j} a_{i,j} \det A_{i,j}. \]
This is called a column expansion of the determinant, while the formula in the definition is a row expansion.

Example: If $A = \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{pmatrix}$, then $A_{1,1} = (a_{2,2})$, $A_{1,2} = (a_{2,1})$, $A_{2,1} = (a_{1,2})$, and $A_{2,2} = (a_{1,1})$. Hence
\[ \det A = a_{1,1} a_{2,2} - a_{2,1} a_{1,2}. \]
We will make repeated use of this formula.

Definition 2.24. (a) A matrix $A \in R^{(n,n)}$ which has a multiplicative inverse is called invertible.
(b) The adjoint of $A$ is the $n \times n$ matrix defined by
\[ \operatorname{adj} A = \left( (-1)^{i+j} \det A_{j,i} \right). \]

Example: If $A = \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{pmatrix}$, then $\operatorname{adj} A = \begin{pmatrix} a_{2,2} & -a_{1,2} \\ -a_{2,1} & a_{1,1} \end{pmatrix}$.

Proposition 2.8. (a) A matrix $A \in R^{(n,n)}$ is invertible if and only if $\det A$ is a unit in $R$.
(b) If $A$ is invertible then $A^{-1} = (\det A)^{-1} \operatorname{adj} A$.
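The recursive definition of the determinant and the definition of the adjoint translate almost literally into code. The sketch below works over $R = \mathbb{Z}$ with matrices as lists of rows; it is meant only to illustrate Definitions 2.23 and 2.24 (the helper names are hypothetical, and cofactor expansion is far too slow for large $n$). It checks the identity $A \cdot \operatorname{adj} A = (\det A) I_n$, which is the reason behind Proposition 2.8(b).

```python
def minor(A, i, j):
    """Matrix obtained from A by deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by row expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))


def adj(A):
    """Adjoint: entry (i, j) is (-1)^(i+j) * det(A_{j,i})."""
    n = len(A)
    if n == 1:
        return [[1]]  # convention: the determinant of the empty matrix is 1
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 2], [3, 4]]
print(det(A))             # 1*4 - 3*2
print(adj(A))
print(matmul(A, adj(A)))  # equals det(A) times the identity matrix
```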
The most important special case for our purposes arises when $R = \mathbb{Z}/m\mathbb{Z}$. To deal with this case, we first introduce some

Notation: Let $A = (a_{i,j})$, $B = (b_{i,j}) \in \mathbb{Z}^{(n,n)}$ and $m \in \mathbb{N}$. We write
\[ A \equiv B \pmod{m} \]
if $a_{i,j} \equiv b_{i,j} \pmod{m}$ for all $1 \le i, j \le n$.

The following is an immediate consequence of Proposition 2.8:

Proposition 2.9. Given an $A \in \mathbb{Z}^{(n,n)}$, the congruence
\[ AA' \equiv I_n \pmod{m} \tag{2.8} \]
is solvable if and only if $\gcd(\det A, m) = 1$. In this case, if $a$ is an inverse of $\det A \pmod{m}$, then
\[ A' \equiv a \operatorname{adj} A \pmod{m} \]
is a solution of the congruence (2.8).

Example: Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$. Does the congruence
\[ AA' \equiv I_2 \pmod{11} \]
have a solution? If so, find it.

Solution. We find $\det A = 1 \cdot 4 - 3 \cdot 2 = -2$, which is coprime to 11 (i.e., $\gcd(-2, 11) = 1$), so the congruence is solvable. Note that $(-2)(-6) \equiv 1 \pmod{11}$, so $(\det A)^{-1} \equiv -6 \equiv 5 \pmod{11}$. Now $\operatorname{adj} A = \begin{pmatrix} 4 & -2 \\ -3 & 1 \end{pmatrix}$, and so
\[ A' = 5 \cdot \begin{pmatrix} 4 & -2 \\ -3 & 1 \end{pmatrix} = \begin{pmatrix} 20 & -10 \\ -15 & 5 \end{pmatrix} \equiv \begin{pmatrix} 9 & 1 \\ 7 & 5 \end{pmatrix} \pmod{11}. \]

2.7.3 Affine linear functions

Recall that the affine cipher was, in fact, a block cipher of block length 1. In the next section we generalize affine ciphers to arbitrary block lengths. But first we need some more concepts and results from linear algebra.

Definition 2.25. A function $f : R^n \to R^l$ is called affine linear if there is a matrix $A \in R^{(l,n)}$ and a vector $b \in R^l$ such that
\[ f(v) = Av + b \tag{2.9} \]
for all $v \in R^n$. If $b = 0$, then the function is called linear.

Once again we consider the important special case $R = \mathbb{Z}/m\mathbb{Z}$. Recall that $\mathbb{Z}_m = \{0, 1, \ldots, m - 1\}$, the set of least nonnegative residues mod $m$.

Definition 2.26. A function $f : \mathbb{Z}_m^n \to \mathbb{Z}_m^l$ is called affine linear if there is a matrix $A \in \mathbb{Z}^{(l,n)}$ and a vector $b \in \mathbb{Z}^l$ such that
\[ f(v) = Av + b \pmod{m} \tag{2.10} \]
for all $v \in \mathbb{Z}_m^n$. If $b \equiv 0 \pmod{m}$, then the function is called linear.

Proposition 2.10. The affine linear map (2.9) is bijective if and only if $l = n$ and $\det A$ is a unit in $R$.
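The recipe of Proposition 2.9 is easy to try out for $2 \times 2$ matrices. The following Python sketch is an illustration only: the function name is hypothetical, it is restricted to the $2 \times 2$ case, and it relies on Python's built-in three-argument `pow` with exponent $-1$ (available from Python 3.8) to invert $\det A$ modulo $m$.

```python
from math import gcd


def inverse_mod_2x2(A, m):
    """Solve A A' = I_2 (mod m) via A' = (det A)^(-1) * adj A, as in Proposition 2.9."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if gcd(det, m) != 1:
        raise ValueError("det A is not coprime to m, so no inverse exists")
    det_inv = pow(det % m, -1, m)  # inverse of det A modulo m
    adj = [[d, -b], [-c, a]]
    return [[(det_inv * entry) % m for entry in row] for row in adj]


A = [[1, 2], [3, 4]]
A_inv = inverse_mod_2x2(A, 11)
print(A_inv)

# Check the congruence A A' = I_2 (mod 11) entry by entry.
product = [[sum(A[i][k] * A_inv[k][j] for k in range(2)) % 11 for j in range(2)]
           for i in range(2)]
print(product)
```

With modulus 10 instead of 11 the same matrix has $\gcd(\det A, 10) = 2$, and the function raises an error, in line with the solvability condition of the proposition.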
We recall that a “corollary” is a commonly used mathematical term which means consequence of a previous theorem or proposition. And indeed, the following is an easy consequence of Proposition 2.10.

Corollary 2.2. The affine linear map (2.10) is bijective if and only if $l = n$ and $\gcd(\det A, m) = 1$.

Example: Consider the map $f : \{0, 1\}^2 \to \{0, 1\}^2$ defined by
\[ f(0, 0) = (0, 0), \quad f(1, 0) = (1, 1), \quad f(0, 1) = (1, 0), \quad f(1, 1) = (0, 1). \]
This can be written as
\[ f(v) = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} v \quad\text{for all } v \in \{0, 1\}^2. \]
Hence $f$ is linear.

Proposition 2.11. (1) A function $f : R^n \to R^n$ is linear if and only if
\[ f(av + bw) = a f(v) + b f(w) \]
for all $v, w \in R^n$ and all $a, b \in R$.
(2) The function is affine linear if and only if the function $R^n \to R^n$ defined by $v \mapsto f(v) - f(0)$ is linear.

2.8 Affine Linear Block Cipher

Several famous historical cryptosystems turn out to be “affine linear block ciphers”, and can be easily broken.

2.8.1 The cipher in general

Let $n \in \mathbb{N}$ be the “block length”, and $m \in \mathbb{N}$, $m > 2$.

Definition 2.27. A block cipher with block length $n$ and $P = C = \mathbb{Z}_m^n$ is called affine linear (resp. linear) if its encryption functions are affine linear (resp. linear).

By definition, the encryption functions must be of the form
\[ E : \mathbb{Z}_m^n \to \mathbb{Z}_m^n, \quad v \mapsto Av + b \pmod{m}, \]
where $A \in \mathbb{Z}^{(n,n)}$, $b \in \mathbb{Z}^n$. For this to be a cryptosystem, $E$ must be injective, hence bijective. By the Corollary to Proposition 2.10, $\gcd(\det A, m) = 1$. Hence the encryption function is uniquely determined by the key
\[ (A, b) \in \mathbb{Z}^{(n,n)} \times \mathbb{Z}^n, \quad\text{where } \gcd(\det A, m) = 1. \]
The corresponding decryption function is then
\[ D : \mathbb{Z}_m^n \to \mathbb{Z}_m^n, \quad v \mapsto A'(v - b) \pmod{m}, \]
where $A' = a' \operatorname{adj} A \pmod{m}$, with $a'$ an inverse of $\det A \pmod{m}$.

Remark: The size of the key space depends on the number of invertible matrices modulo $m$. There is a formula, but it is too involved to give here. Note that the number of all $n \times n$ matrices over $\mathbb{Z}_m$ is $m^{n^2}$.

2.8.2 Vigenère, Hill, and permutation ciphers

1.
The Vigenère cipher (Blaise de Vigenère, 1523–1596). The key space is $K = \mathbb{Z}_m^n$. For a $k \in \mathbb{Z}_m^n$ we have
\[ E_k : \mathbb{Z}_m^n \to \mathbb{Z}_m^n, \quad v \mapsto v + k \pmod{m}, \]
and
\[ D_k : \mathbb{Z}_m^n \to \mathbb{Z}_m^n, \quad v \mapsto v - k \pmod{m}. \]
It is clear that $E_k$ and $D_k$ are affine linear ($A = I_n$, so $\det A = 1$). The number of elements in $K$ is $m^n$.

2. The Hill cipher (Lester S. Hill, 1891–1961). The key space is
\[ K = \{ A \in \mathbb{Z}^{(n,n)} \mid \gcd(\det A, m) = 1 \}. \]
For $A \in K$ we have
\[ E_A : \mathbb{Z}_m^n \to \mathbb{Z}_m^n, \quad v \mapsto Av \pmod{m}, \]
and
\[ D_A : \mathbb{Z}_m^n \to \mathbb{Z}_m^n, \quad v \mapsto A'v \pmod{m}, \]
with $A'$ the inverse matrix modulo $m$, as in the previous subsection. In other words, the Hill cipher is the most general linear block cipher.

3. The permutation cipher (see Section 2.4). Let $\pi \in S_n$, and let $e_i$, $1 \le i \le n$, be the unit vectors, namely the row vectors of the identity matrix $I_n$. Now let $E_\pi$ be the $n \times n$ matrix whose $i$th row vector is $e_{\pi(i)}$, $1 \le i \le n$, i.e., the rows of $I_n$ are permuted according to the permutation $\pi$. Now, for any vector $v = (v_1, v_2, \ldots, v_n) \in \mathbb{Z}_m^n$ we have
\[ (v_{\pi(1)}, v_{\pi(2)}, \ldots, v_{\pi(n)}) = E_\pi v. \]
Hence the permutation cipher is a linear cipher (note that $\det E_\pi = \pm 1$), and thus a special case of the Hill cipher.

2.8.3 Cryptanalysis of affine linear block ciphers

For a possible ciphertext-only attack, see the assignments. Here I will describe a known plaintext attack.

Suppose Alice and Bob use an affine linear cipher with key $k = (A, b) \in \mathbb{Z}^{(n,n)} \times \mathbb{Z}^n$; the encryption function is
\[ E_k : \mathbb{Z}_m^n \to \mathbb{Z}_m^n, \quad v \mapsto Av + b \pmod{m}. \]
Suppose that Oscar has $n + 1$ plaintext blocks $w_i$, $0 \le i \le n$, and the corresponding ciphertext blocks
\[ c_i \equiv A w_i + b \pmod{m}, \quad 0 \le i \le n. \tag{2.11} \]
By subtracting the congruence for $i = 0$ from the other $n$ congruences, he gets
\[ c_i - c_0 \equiv A(w_i - w_0) \pmod{m}, \quad 1 \le i \le n. \tag{2.12} \]
Now let $W$ be the matrix
\[ W = (w_1 - w_0, w_2 - w_0, \ldots, w_n - w_0) \pmod{m}, \]
where the columns are the vectors $w_i - w_0 \pmod{m}$, $1 \le i \le n$, and similarly, let $C$ be the matrix
\[ C = (c_1 - c_0, c_2 - c_0, \ldots, c_n - c_0) \pmod{m}. \]
Then by the rules of matrix multiplication Oscar can rewrite (2.12) as
\[ AW \equiv C \pmod{m}. \]
If $\gcd(\det W, m) = 1$, then he can solve this matrix congruence by multiplying by $W'$ from the right; hence, by Proposition 2.9,
\[ A \equiv CW' \equiv C(w' \operatorname{adj} W) \pmod{m}, \]
where $w'$ is the inverse of $\det W$ modulo $m$. Also, from (2.11) he gets
\[ b \equiv c_0 - A w_0 \pmod{m}. \]
Thus, Oscar has obtained the key $(A, b)$ from $n + 1$ pairs of plaintext and ciphertext blocks. If the cipher is linear, then Oscar has $w_0 = c_0 = b = 0$. If he knows about linearity beforehand, then $n$ pairs would suffice.

Remark: This shows that any linear or affine linear cipher is insecure.

Example: We want to break a Hill cipher with block length $n = 2$ and $m = 26$. Suppose we know that hand is encrypted as FOOT. Hence we have the two plaintext-ciphertext pairs (see the table in Section 2.1)
\[ w_1 = \text{ha} = (7, 0) \mapsto c_1 = \text{FO} = (5, 14), \quad w_2 = \text{nd} = (13, 3) \mapsto c_2 = \text{OT} = (14, 19). \]
So $W = \begin{pmatrix} 7 & 13 \\ 0 & 3 \end{pmatrix}$ and $C = \begin{pmatrix} 5 & 14 \\ 14 & 19 \end{pmatrix}$. Now $\det W = 21$, so $\gcd(\det W, 26) = 1$, and also $21^{-1} \equiv 5 \pmod{26}$. So
\[ W' \equiv 5 \cdot \begin{pmatrix} 3 & -13 \\ 0 & 7 \end{pmatrix} \equiv \begin{pmatrix} 15 & 13 \\ 0 & 9 \end{pmatrix} \pmod{26} \]
and
\[ A \equiv CW' \equiv \begin{pmatrix} 5 & 14 \\ 14 & 19 \end{pmatrix} \begin{pmatrix} 15 & 13 \\ 0 & 9 \end{pmatrix} \equiv \begin{pmatrix} 23 & 9 \\ 2 & 15 \end{pmatrix} \pmod{26}, \]
and thus the cipher is broken.

Chapter 3 Probability and Perfect Secrecy

We have seen several historical cryptosystems which, as affine linear ciphers, are insecure. Are there secure cryptosystems? How can we define “perfect secrecy”? To discuss these questions, we need some basic concepts and results from probability theory.

3.1 Basics of Probability Theory

3.1.1 Probability

Definition 3.1. Suppose we have an experiment with a finite number of outcomes. Then the sample space $S$ is the set of these outcomes. Its elements are called elementary events.

Examples: (a) If we flip a coin, then we get either heads (H) or tails (T). So the sample space is $S = \{H, T\}$.
(b) If we throw a die, then the sample space is $S = \{1, 2, 3, 4, 5, 6\}$.

Definition 3.2.
Given the sample space $S$,
(a) an event (for $S$) is a subset of $S$;
(b) the certain event is the set $S$ itself;
(c) the null event is the empty set $\emptyset$;
(d) two events $A$ and $B$ are mutually exclusive if their intersection is empty (i.e., $A \cap B = \emptyset$).

Example: If we throw a die, then obtaining an even number is an event. According to the definition, this is the subset $\{2, 4, 6\}$ of the sample space $S = \{1, 2, \ldots, 6\}$. It excludes the event $\{1, 3, 5\}$ of throwing an odd number.

Definition 3.3. Let $\mathcal{P}(S)$ be the power set of $S$, i.e., the set of all events for $S$. A probability distribution on $S$ is a map $p : \mathcal{P}(S) \to \mathbb{R}$ with the properties
(1) $p(A) \ge 0$ for all events $A \subseteq S$;
(2) $p(S) = 1$;
(3) $p(A \cup B) = p(A) + p(B)$ if the two events $A$, $B$ are mutually exclusive.
If $A$ is an event, $p(A)$ is called the probability of $A$. For an elementary event $a \in S$, we define $p(a) = p(\{a\})$.

Properties: (1) $p(\emptyset) = 0$.
(2) If $A \subseteq B$, then $p(A) \le p(B)$.
(3) $0 \le p(A) \le 1$ for all $A \in \mathcal{P}(S)$.
(4) $p(S \setminus A) = 1 - p(A)$.
(5) If $A_1, A_2, \ldots, A_n$ are pairwise mutually exclusive events, then
\[ p\left( \bigcup_{i=1}^{n} A_i \right) = \sum_{i=1}^{n} p(A_i). \]

Remarks: (1) These properties follow almost directly from the definition. For proofs of (1) and (2), see Assignment 5.
(2) Since $S$ is a finite set, it suffices to define the probability distribution on elementary events, by Property (5).
(3) For the same reason, we have for any event $A \subseteq S$,
\[ p(A) = \sum_{a \in A} p(a). \]
(4) A random variable is some function on the sample space. For instance, if two dice are thrown, the sum of the dots would be a random variable on the set of all pairs $S = \{1, 2, \ldots, 6\}^2$. We will not use this term in this course.

Examples: (a) The probability distribution of “throwing a fair die” is the function $p : \{1, \ldots, 6\} \to \mathbb{R}$ given by $p(a) = \frac{1}{6}$ for all $a \in S = \{1, \ldots, 6\}$.
(b) The probability of the event “even result” is
\[ p(\{2, 4, 6\}) = p(2) + p(4) + p(6) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}. \]
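Since a probability distribution on a finite sample space is determined by its values on elementary events, the fair-die examples can be reproduced with exact rational arithmetic. The sketch below uses Python's `fractions` module; the dictionary representation and the helper name `prob` are illustrative choices, not notation from these notes.

```python
from fractions import Fraction

# Sample space and distribution for throwing a fair die.
S = set(range(1, 7))
p = {a: Fraction(1, 6) for a in S}


def prob(p, A):
    """p(A) as the sum of p(a) over the elementary events a in A."""
    return sum((p[a] for a in A), Fraction(0))


even = {2, 4, 6}
odd = {1, 3, 5}

print(prob(p, even))                                            # p({2,4,6})
print(prob(p, S))                                               # certain event
print(prob(p, S - even) == 1 - prob(p, even))                   # Property (4)
print(prob(p, even | odd) == prob(p, even) + prob(p, odd))      # mutually exclusive events
```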
Remark: The probability distribution that maps each elementary event $a \in S$ to the probability $p(a) = \frac{1}{|S|}$ is called the uniform distribution.

3.1.2 Conditional probability

Example: You throw a fair die, so the sample space is $S = \{1, 2, \ldots, 6\}$, and the probability distribution is $p(a) = \frac{1}{6}$ for all $a \in S$. Now suppose you have thrown one of the numbers $\{4, 5, 6\}$, i.e., the event $B = \{4, 5, 6\}$ has happened. Given this assumption, what is the probability that you have thrown an even number? Each elementary event in $B$ is equally likely, and so has probability $\frac{1}{3}$. Since two numbers in $B$ are even, the probability that you have thrown an even number is $\frac{2}{3}$.

How can we express this symbolically? Let $A = \{2, 4, 6\}$ and, as before, $B = \{4, 5, 6\}$. Then the elementary events in question are the members of the set $A \cap B = \{4, 6\}$. However, the event $B = \{4, 5, 6\}$, which has probability $\frac{1}{2}$, occurred with certainty, so we must divide by $\frac{1}{2}$. Thus, the “probability of $A$, given that $B$ occurs”, is
\[ \frac{p(A \cap B)}{p(B)} = \frac{p(\{4, 6\})}{p(\{4, 5, 6\})} = \frac{1/3}{1/2} = \frac{2}{3}. \]
This suggests the following definition.

Definition 3.4. Let $S$ be a sample space, $A$ and $B$ events for $S$, and $p$ a probability distribution on $S$. If $p(B) > 0$, the conditional probability of “$A$ given that $B$ occurs” is defined to be
\[ p(A \mid B) = \frac{p(A \cap B)}{p(B)}. \tag{3.1} \]

Another term is closely related to this:

Definition 3.5. With $S$ and $p$ as before, two events $A$ and $B$ are called independent if
\[ p(A \cap B) = p(A) p(B). \tag{3.2} \]
If the events are not independent, they are called dependent.

Remark: It is clear from (3.1) and (3.2) that independence of two events $A$ and $B$ is equivalent to $p(A \mid B) = p(A)$.

Examples: (a) We flip two coins. The event “first coin comes up tails” (probability $\frac{1}{2}$) is independent from the event “second coin comes up tails” (probability $\frac{1}{2}$). The probability that both events occur is therefore $\frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$.
(b) If the coins are welded together so that either two heads or two tails come up, then the probability of two tails is $\frac{1}{2} \neq \frac{1}{2} \cdot \frac{1}{2}$. So the two events of (a) are dependent.

Proposition 3.1. (Theorem of Bayes) Let $S$ and $p$ be as before, and let $A$, $B$ be events with $p(A) > 0$ and $p(B) > 0$. Then
\[
p(B)\,p(A \mid B) = p(A)\,p(B \mid A).
\]

Proof. By (3.1) we have $p(B)\,p(A \mid B) = p(A \cap B)$ and $p(A)\,p(B \mid A) = p(A \cap B)$. The result now follows by equating the two. □

3.1.3 An example: the birthday paradox

Problem: Suppose there are $k$ people in a room. What is the probability that two of them have the same birthday? Related to this is the question: How many people do you need to have in the room so that with probability $1/2$ two of them have the same birthday?

The answer is surprisingly low (and is therefore referred to as the birthday paradox): With as few as 23 people the probability is already slightly higher than $1/2$.

To be able to use the solution of this problem for cryptographic purposes (and not just as a neat example of probability theory), we consider a more general situation: Suppose there are $n$ possible birthdays, and $k$ people are in the room. Then the formal setting is as follows:
• The sample space is $S = \{1, 2, \ldots, n\}^k$.
• An elementary event is a $k$-tuple $(b_1, b_2, \ldots, b_k) \in S$, meaning that the $i$th person has birthday $b_i$, $1 \le i \le k$.
• Hence there are $n^k$ elementary events.
• We assume that the elementary events are equally probable, so $p(a) = 1/n^k$ for all $a \in S$.

Now let $p$ be the probability that two people have the same birthday. Then the probability that any two people have different birthdays is $q = 1 - p$. To determine this probability, we consider the event
\[
E = \{(b_1, b_2, \ldots, b_k) \in S \mid b_i \neq b_j \text{ for } 1 \le i < j \le k\}.
\]
Then clearly, by the properties of probability derived earlier in this section,
\[
q = \frac{|E|}{n^k}, \tag{3.3}
\]
since each elementary event has probability $1/n^k$.

We now determine the number $|E|$ of elements of $E$: The entry $b_1$ can be any of the $n$ choices. Once $b_1$ has been fixed, there are $n - 1$ possibilities for $b_2$; then with $b_2$ also fixed, there are $n - 2$ possibilities for $b_3$, etc., up to $n - k + 1$ choices for $b_k$. Hence
\[
|E| = \prod_{i=0}^{k-1} (n - i),
\]
and by (3.3),
\[
q = \frac{1}{n^k} \prod_{i=0}^{k-1} (n - i) = \prod_{i=1}^{k-1} \left(1 - \frac{i}{n}\right). \tag{3.4}
\]
Recall the series expansion for the exponential function:
\[
e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \ldots,
\]
which implies that $1 - x \le e^{-x}$ for all $x \ge 0$ (and in fact for all $x \in \mathbb{R}$). This, with (3.4), gives
\[
q \le \prod_{i=1}^{k-1} e^{-i/n} = \exp\left(-\frac{1}{n} \sum_{i=1}^{k-1} i\right) = \exp\left(-\frac{k(k-1)}{2n}\right),
\]
where we have used the summation formula
\[
\sum_{i=1}^{k-1} i = \frac{(k-1)k}{2}.
\]
Thus we have obtained the following result:

Proposition 3.2. The probability $q$ that among $k$ persons and $n$ possible birthdays no two persons have the same birthday satisfies
\[
q \le \exp\left(-\frac{k(k-1)}{2n}\right). \tag{3.5}
\]

Examples: (a) For $n = 365$ and $k = 23$, we get
\[
q \le \exp\left(-\frac{23 \cdot 22}{2 \cdot 365}\right) \approx 0.49999825 < \frac{1}{2}.
\]
This solves the original problem: If 23 people are in a room, the probability $p = 1 - q$ that two have the same birthday is greater than $1/2$.
(b) More generally, one can verify that for
\[
k \ge \frac{1}{2}\left(1 + \sqrt{1 + 8n \log 2}\right)
\]
the inequality (3.5) implies $q \le 1/2$.
(c) With 60 people in our classroom, the probability that two have the same birthday is at least
\[
1 - \exp\left(-\frac{60 \cdot 59}{2 \cdot 365}\right) \approx 0.99217.
\]

Remark: Proposition 3.2 is the basis for the so-called “birthday attack” on hash functions (see later).

3.2 Perfect Secrecy

Suppose that Alice uses a cryptosystem to send an encrypted message to Bob and that Oscar, the opponent, can read the ciphertext. Roughly, the cryptosystem is said to have perfect secrecy if Oscar learns nothing about the plaintext from the ciphertext. The purpose of this section is to formalize this concept.
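Before perfect secrecy is formalized, the birthday estimates of Section 3.1.3 can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function names `prob_all_distinct`, `birthday_upper_bound`, and `smallest_k_for_half` are hypothetical and chosen only for illustration. It evaluates the exact product (3.4), the bound (3.5), and the threshold from Example (b).

```python
import math


def prob_all_distinct(n: int, k: int) -> float:
    """Exact q = prod_{i=1}^{k-1} (1 - i/n), as in (3.4)."""
    q = 1.0
    for i in range(1, k):
        q *= 1.0 - i / n
    return q


def birthday_upper_bound(n: int, k: int) -> float:
    """Upper bound exp(-k(k-1)/(2n)) on q, as in (3.5)."""
    return math.exp(-k * (k - 1) / (2 * n))


def smallest_k_for_half(n: int) -> int:
    """Smallest integer k with k >= (1 + sqrt(1 + 8 n log 2)) / 2."""
    return math.ceil((1 + math.sqrt(1 + 8 * n * math.log(2))) / 2)


if __name__ == "__main__":
    n = 365
    print(smallest_k_for_half(n))          # threshold from Example (b)
    print(birthday_upper_bound(n, 23))     # bound from Example (a)
    print(prob_all_distinct(n, 23))        # exact q, never above the bound
    print(1 - birthday_upper_bound(n, 60)) # lower bound from Example (c)
```

The exact product is always at most the exponential bound, so comparing the two outputs for the same $n$ and $k$ gives a quick sanity check of Proposition 3.2.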
3.2.1 Shannon’s theory

Suppose the cryptosystem used has
• a finite plaintext space $\mathcal{P}$,
• a finite ciphertext space $\mathcal{C}$,
• a finite key space $\mathcal{K}$, and
• encryption and decryption functions $E_k$ and $D_k$ for $k \in \mathcal{K}$.

We make the following assumptions about probabilities:
1. The probability of a plaintext $w \in \mathcal{P}$ is $p_{\mathcal{P}}(w)$. $p_{\mathcal{P}}$ is a probability distribution on $\mathcal{P}$. It is not normally the uniform distribution; it depends, for example, on the language used for communication.
2. For the encryption of each new plaintext Alice uses a new key, independent of the plaintext to be encrypted.
3. The probability of a key $k \in \mathcal{K}$ is $p_{\mathcal{K}}(k)$. $p_{\mathcal{K}}$ is a probability distribution on $\mathcal{K}$.
4. The probability that a $w \in \mathcal{P}$ occurs and is encrypted with a $k \in \mathcal{K}$ is
\[
p((w, k)) = p_{\mathcal{P}}(w)\,p_{\mathcal{K}}(k), \tag{3.6}
\]
which defines a probability distribution on the sample space $\mathcal{P} \times \mathcal{K}$.

We will now work with this sample space, and define the following events:
A. If $w \in \mathcal{P}$ is a plaintext, then we can identify with $w$ the event $\{(w, k) \mid k \in \mathcal{K}\}$ that $w$ is encrypted. It is clear that we have
\[
p(w) = p_{\mathcal{P}}(w). \tag{3.7}
\]
B. If $k \in \mathcal{K}$ is a key, then we can identify with $k$ the event $\{(w, k) \mid w \in \mathcal{P}\}$ that $k$ is chosen for the encryption. Again, it is clear that
\[
p(k) = p_{\mathcal{K}}(k). \tag{3.8}
\]
Note that, in the above notation, we have $w \cap k = \{(w, k)\}$ with $(w, k) \in \mathcal{P} \times \mathcal{K}$, and with (3.6), (3.7), and (3.8) we get
\[
p(w \cap k) = p(w)\,p(k),
\]
which means that the events $w$ and $k$ are independent.
C. If $c \in \mathcal{C}$ is a ciphertext, we identify with $c$ the event $\{(w, k) \mid E_k(w) = c\}$ that the result of the encryption is $c$.

Perfect secrecy is based on the following: Oscar knows the probability distribution $p_{\mathcal{P}}$ on $\mathcal{P}$ (e.g., he knows which language Alice and Bob use). Now suppose that Oscar sees a ciphertext $c$. If the fact that the event $c$ has occurred makes some plaintexts more likely than they would be according to $p_{\mathcal{P}}$, then Oscar has learned something about $w$ from observing $c$. This motivates the following definition due to Claude Shannon (1916 – 2001).

Definition 3.6.
We say that the cryptosystem defined above has perfect secrecy if the events that a particular ciphertext occurs and that a particular plaintext has been encrypted are independent; in other words, if
\[
p(w \mid c) = p(w) \quad \text{for all } w \in \mathcal{P} \text{ and all } c \in \mathcal{C}.
\]

Example: Consider the following easy cryptosystem. Let $\mathcal{P} = \{0, 1\}$, $\mathcal{C} = \{a, b\}$, and $\mathcal{K} = \{A, B\}$, with the probability distributions
\[
p(0) = \frac{1}{4}, \quad p(1) = \frac{3}{4}, \qquad p(A) = \frac{1}{4}, \quad p(B) = \frac{3}{4},
\]
and the encryption function defined by
\[
E_A(0) = a, \quad E_A(1) = b, \qquad E_B(0) = b, \quad E_B(1) = a.
\]
The probability that the plaintext 1 occurs and is encrypted with $B \in \mathcal{K}$ is
\[
p(1)\,p(B) = \frac{3}{4} \cdot \frac{3}{4} = \frac{9}{16}.
\]
The probability of the ciphertext $a \in \mathcal{C}$ is
\[
p(a) = p((0, A)) + p((1, B)) = \frac{1}{4} \cdot \frac{1}{4} + \frac{3}{4} \cdot \frac{3}{4} = \frac{5}{8}.
\]
The probability of the ciphertext $b \in \mathcal{C}$ is
\[
p(b) = p((1, A)) + p((0, B)) = \frac{3}{4} \cdot \frac{1}{4} + \frac{1}{4} \cdot \frac{3}{4} = \frac{3}{8}.
\]
The conditional probability $p(w \mid c)$ is, by Bayes’ theorem,
\[
p(w \mid c) = \frac{p(w) \cdot p(c \mid w)}{p(c)}.
\]
Here $p(c \mid w)$ is the probability of the key that encrypts $w$ to $c$; for instance, $p(a \mid 0) = p(A) = \frac{1}{4}$. With this we compute
\[
p(0 \mid a) = \frac{p(0) \cdot p(a \mid 0)}{p(a)} = \frac{1/4}{5/8} \cdot \frac{1}{4} = \frac{1}{10}, \qquad
p(1 \mid a) = \frac{p(1) \cdot p(a \mid 1)}{p(a)} = \frac{3/4}{5/8} \cdot \frac{3}{4} = \frac{9}{10},
\]
\[
p(0 \mid b) = \frac{p(0) \cdot p(b \mid 0)}{p(b)} = \frac{1/4}{3/8} \cdot \frac{3}{4} = \frac{1}{2}, \qquad
p(1 \mid b) = \frac{p(1) \cdot p(b \mid 1)}{p(b)} = \frac{3/4}{3/8} \cdot \frac{1}{4} = \frac{1}{2}.
\]
So we see that $p(w \mid c) \neq p(w)$ for all $w$, $c$; so the cryptosystem does not have perfect secrecy. In fact, if Oscar sees the ciphertext $a \in \mathcal{C}$, he can be reasonably certain that it corresponds to the plaintext $1 \in \mathcal{P}$.

Note that perfect secrecy already fails if equality fails for a single pair $w$, $c$.

If we changed the probabilities in the above example to
\[
p(A) = p(B) = \frac{1}{2},
\]
then the system would have perfect secrecy. This is in fact true in general:

Proposition 3.3. (Shannon’s Theorem) Suppose we have a cryptosystem with $|\mathcal{C}| = |\mathcal{K}|$ and $p(w) > 0$ for all $w \in \mathcal{P}$. Then the system has perfect secrecy if and only if
• the probability distribution on $\mathcal{K}$ is the uniform distribution, and
• for any $w \in \mathcal{P}$ and any $c \in \mathcal{C}$ there is exactly one $k \in \mathcal{K}$ with $E_k(w) = c$.
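Before turning to the proof, the example preceding Proposition 3.3 can be verified by brute force. The sketch below is an illustration under stated assumptions, not part of the notes: the names `posterior`, `has_perfect_secrecy`, `ENCRYPT`, `P_PLAIN`, `P_KEY_SKEWED`, and `P_KEY_UNIFORM` are hypothetical. It builds the joint distribution $p((w, k)) = p(w)\,p(k)$ from (3.6), computes every $p(w \mid c)$ with exact fractions, and tests Definition 3.6 directly.

```python
from fractions import Fraction
from itertools import product


def posterior(p_plain, p_key, encrypt):
    """Return {(w, c): p(w | c)} under the joint law p(w, k) = p(w) * p(k)."""
    joint = {}     # p(w and c)
    p_cipher = {}  # p(c)
    for w, k in product(p_plain, p_key):
        c = encrypt[k][w]
        pr = p_plain[w] * p_key[k]
        joint[(w, c)] = joint.get((w, c), Fraction(0)) + pr
        p_cipher[c] = p_cipher.get(c, Fraction(0)) + pr
    return {
        (w, c): joint.get((w, c), Fraction(0)) / p_cipher[c]
        for w in p_plain
        for c in p_cipher
    }


def has_perfect_secrecy(p_plain, p_key, encrypt):
    """Check p(w | c) == p(w) for every plaintext w and every occurring c."""
    post = posterior(p_plain, p_key, encrypt)
    return all(post[(w, c)] == p_plain[w] for (w, c) in post)


# The toy cryptosystem of the example: keys A, B acting on plaintexts 0, 1.
ENCRYPT = {"A": {0: "a", 1: "b"}, "B": {0: "b", 1: "a"}}
P_PLAIN = {0: Fraction(1, 4), 1: Fraction(3, 4)}
P_KEY_SKEWED = {"A": Fraction(1, 4), "B": Fraction(3, 4)}
P_KEY_UNIFORM = {"A": Fraction(1, 2), "B": Fraction(1, 2)}

if __name__ == "__main__":
    print(posterior(P_PLAIN, P_KEY_SKEWED, ENCRYPT))
    print(has_perfect_secrecy(P_PLAIN, P_KEY_SKEWED, ENCRYPT))
    print(has_perfect_secrecy(P_PLAIN, P_KEY_UNIFORM, ENCRYPT))
```

Using `Fraction` instead of floating point keeps the equality test in `has_perfect_secrecy` exact, which matters because the definition demands equality, not approximate agreement.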
Proof. “$\Rightarrow$”: Suppose the cryptosystem has perfect secrecy. Let $w \in \mathcal{P}$.
(i) If there is a $c \in \mathcal{C}$ for which there is no $k \in \mathcal{K}$ with $E_k(w) = c$, then
\[
0 = p(w \mid c) \neq p(w) > 0,
\]
and this contradicts perfect secrecy. Hence for any $c \in \mathcal{C}$ there is at least one $k \in \mathcal{K}$ with $E_k(w) = c$. But there cannot be more than one key since $|\mathcal{K}| = |\mathcal{C}|$. This proves the second assertion.
(ii) Fix a $c \in \mathcal{C}$, and let $k(w)$ be the key with $E_{k(w)}(w) = c$. By the theorem of Bayes we have
\[
p(w \mid c) = \frac{p(c \mid w) \cdot p(w)}{p(c)} = \frac{p(k(w)) \cdot p(w)}{p(c)} \tag{3.9}
\]
for each $w \in \mathcal{P}$. Since the system has perfect secrecy, we have $p(w \mid c) = p(w)$, so (3.9) implies $p(k(w)) = p(c)$, that is, $p(k(w))$ is the same for each $w \in \mathcal{P}$. But any key $k \in \mathcal{K}$ is equal to $k(w)$ for some $w \in \mathcal{P}$. Hence the probability for all keys is the same, so the probability distribution on $\mathcal{K}$ is uniform.

“$\Leftarrow$”: Assume the two statements hold, and let $k(w, c)$ be the unique key $k$ with $E_k(w) = c$. Then by the theorem of Bayes,
\[
p(w \mid c) = \frac{p(w) \cdot p(c \mid w)}{p(c)} = \frac{p(w) \cdot p(k(w, c))}{\sum_{x \in \mathcal{P}} p(x)\,p(k(x, c))}. \tag{3.10}
\]
Now, by assumption we have $p(k(w, c)) = 1/|\mathcal{K}|$, and so
\[
\sum_{x \in \mathcal{P}} p(x)\,p(k(x, c)) = \frac{1}{|\mathcal{K}|} \sum_{x \in \mathcal{P}} p(x) = \frac{1}{|\mathcal{K}|} = p(k(w, c)).
\]
Hence by (3.10) we have $p(w \mid c) = p(w)$, and perfect secrecy follows. □

Example: If the 26 keys of the shift cipher are used with equal probability $\frac{1}{26}$, then the shift cipher has perfect secrecy. Indeed, we have $\mathcal{P} = \mathcal{C} = \mathcal{K} = \mathbb{Z}_{26}$, and the encryption function is $E_k(w) = w + k \pmod{26}$ for $w \in \mathcal{P}$ and $k \in \mathcal{K}$. So, given $w \in \mathcal{P}$ and $c \in \mathcal{C}$, there is exactly one key with $E_k(w) = c$, namely $k \equiv c - w \pmod{26}$. Therefore, by Shannon’s theorem, the shift cipher has perfect secrecy.

Remark: It is important to note that a new random key must be used to encrypt every plaintext character.

3.2.2 The Vernam one-time pad

Gilbert S. Vernam (1890 – 1960) invented and patented in 1917 the famous “Vernam one-time pad”: Let
• $\mathcal{P} = \mathcal{C} = \mathcal{K} = \{0, 1\}^n$.
• For $k \in \mathcal{K}$, the encryption function is $E_k : \{0, 1\}^n \to \{0, 1\}^n$, $w \mapsto w \oplus k$.
• The decryption function is the same.

In practice, to encrypt a plaintext $w \in \{0, 1\}^n$, Alice chooses a key $k$ randomly with uniform distribution from the set $\{0, 1\}^n$, and then computes the ciphertext $c = w \oplus k$.

This cryptosystem has perfect secrecy by Shannon’s theorem: (i) the uniform distribution is used on the key space, and (ii) given a plaintext $w$ and a ciphertext $c$, there is exactly one key $k$ with $c = w \oplus k$. Indeed, XOR $w$ to both sides of this last relation, to obtain
\[
w \oplus c = w \oplus (w \oplus k) = (w \oplus w) \oplus k = k. \tag{3.11}
\]
The main drawback of this system is that the one-time pad is not very efficient: To communicate a plaintext of length $n$, a key of length $n$ must be randomly generated and exchanged. It cannot be reused: If Oscar knows the plaintext $w$ and the corresponding ciphertext $c$, then he can obtain the key $k$ by way of (3.11).

Still, the Vernam one-time pad is useful in situations where perfect secrecy is of the utmost importance.

3.3 Random and Pseudo-random Numbers

We have seen that it is of great importance to have a source of random numbers (or random bits). Systems to create them, also known as random number generators, can be hardware-based or software-based.
1. Hardware-based systems use, for instance,
• the randomness of radioactive decay;
• the thermal noise from a semiconductor resistor;
• the time between two keyboard strokes.
Most of these systems are slow and/or expensive to implement.
2. Software-based systems are typically pseudo-random number (or bit) generators. These are algorithms that, given a short sequence of random bits, produce a long sequence of bits that “look” random.

There are many pseudo-random number generators of various “quality”. We will not deal with this topic in this course; for more details, see the book (1) by Stinson.

Chapter 4 Modern Classical Cryptosystems

In Chapter 2 we defined cryptosystems and studied some historical examples.
Most turned out to be affine linear, and can therefore be easily broken. In Chapter 3, on the other hand, we saw that the Vernam one-time pad is a cryptosystem with perfect secrecy, but it is inefficient to use.

In this chapter we will study two modern cryptosystems that use a key of limited size to encrypt a relatively long string of plaintext (i.e., use the same key for many messages), while ensuring at least computational security; that is, breaking the cipher with any cryptanalytic attack is a computationally difficult problem.

4.1 The Data Encryption Standard (DES)

In 1973 the U.S. National Bureau of Standards (now called the National Institute of Standards and Technology - NIST) issued a call for a cryptosystem that was to become a national standard. In 1975, the DES was first published, and in 1977 it became the official standard. It has now been largely superseded by the Advanced Encryption Standard (AES, which we will study in Section 4.3), but its design principles are still of interest.

4.1.1 Feistel ciphers

The DES belongs to a class of more general cryptosystems called Feistel ciphers, named after Horst Feistel of IBM. A Feistel cipher works as follows.
1. We have a block cipher, which we call the underlying or the internal block cipher, with
• alphabet $\{0, 1\}$;
• block length $t$;
• encryption function $f_K$ for the key $K$ in this block cipher’s key space $\mathcal{K}_{\mathrm{int}}$.
2. The Feistel cipher constructed with this internal block cipher is itself a block cipher as follows:
• The alphabet is again $\{0, 1\}$;
• block length is $2t$;
• there is an integer $r \ge 1$ counting the number of rounds;
• the key space is $\mathcal{K}$;
• given a key $k \in \mathcal{K}$, there is a method that generates a sequence $K_1, K_2, \ldots, K_r$ of “round keys” that belong to the key space $\mathcal{K}_{\mathrm{int}}$ of the internal block cipher.
3. The encryption function $E_k$ for a key $k \in \mathcal{K}$ is then defined as follows:
• Let $w$ be a plaintext of length $2t$.
• Split it into two halves of length $t$, and set $w = (L_0, R_0)$, where $L_0$ is the left half and $R_0$ is the right half.
• For $1 \le i \le r$ construct the sequence
\[
(L_i, R_i) = (R_{i-1},\; L_{i-1} \oplus f_{K_i}(R_{i-1})). \tag{4.1}
\]
• Finally, set $E_k(w) = (R_r, L_r)$.
4. To decrypt, we have to reverse the process, trying to go “backwards”. First, from (4.1) we have that $R_{i-1} = L_i$. Then the second components in (4.1) give, if we “XOR” $f_{K_i}(R_{i-1})$ to both sides,
\[
L_{i-1} = (L_{i-1} \oplus f_{K_i}(R_{i-1})) \oplus f_{K_i}(R_{i-1}) = R_i \oplus f_{K_i}(R_{i-1}) = R_i \oplus f_{K_i}(L_i).
\]
Together, we have for $1 \le i \le r$,
\[
(R_{i-1}, L_{i-1}) = (L_i,\; R_i \oplus f_{K_i}(L_i)).
\]
Using this in $r$ rounds with the reverse key sequence $(K_r, K_{r-1}, \ldots, K_1)$, we obtain the plaintext pair $(R_0, L_0)$ from the ciphertext pair $(R_r, L_r)$.

[Figure 4.1: Feistel encryption and decryption]

Remarks: (1) We see that for the Feistel cipher, encryption and decryption are the same, except that the key sequence is reversed.
(2) The security of the Feistel cipher depends on the security of the underlying internal block cipher.

4.1.2 The DES Algorithm

The DES is a slightly modified Feistel cipher with block length $2t = 64$ and $r = 16$ rounds.

1. Plaintext, ciphertext, and key spaces:
• $\mathcal{P} = \mathcal{C} = \{0, 1\}^{64}$.
• Keys are elements of $\{0, 1\}^{64}$.
• If a key is divided into 8 bytes, then the bit sum of each byte is odd (i.e., 7 bits of each byte determine the eighth bit).
• This means that the key space is
\[
\mathcal{K} = \Big\{(b_1, \ldots, b_{64}) \in \{0, 1\}^{64} \;\Big|\; \sum_{i=1}^{8} b_{8k+i} \equiv 1 \pmod 2,\ 0 \le k \le 7\Big\}.
\]
• Therefore $|\mathcal{K}| = 2^{56} \approx 7.2 \cdot 10^{16}$.

2. The initial permutation: Given a plaintext $w \in \{0, 1\}^{64}$,
• Apply the initial permutation IP, to obtain $\mathrm{IP}(w) \in \{0, 1\}^{64}$.
• Then apply the Feistel cipher to $\mathrm{IP}(w)$, with output $(R_{16}, L_{16})$.
• Finally, use the inverse permutation $\mathrm{IP}^{-1}$: $c = \mathrm{IP}^{-1}(R_{16} L_{16})$.
• The initial permutation is defined in Table 4.1; it is to be read as follows: If $w = w_1 w_2 w_3 \ldots w_{64}$, then $\mathrm{IP}(w) = w_{58} w_{50} w_{42} \ldots w_7$.
For the inverse initial permutation, see Table 4.2.

    58 50 42 34 26 18 10  2
    60 52 44 36 28 20 12  4
    62 54 46 38 30 22 14  6
    64 56 48 40 32 24 16  8
    57 49 41 33 25 17  9  1
    59 51 43 35 27 19 11  3
    61 53 45 37 29 21 13  5
    63 55 47 39 31 23 15  7

Table 4.1: The initial permutation IP

[Figure 4.2: The encryption function of DES]

    40  8 48 16 56 24 64 32
    39  7 47 15 55 23 63 31
    38  6 46 14 54 22 62 30
    37  5 45 13 53 21 61 29
    36  4 44 12 52 20 60 28
    35  3 43 11 51 19 59 27
    34  2 42 10 50 18 58 26
    33  1 41  9 49 17 57 25

Table 4.2: The inverse permutation $\mathrm{IP}^{-1}$

3. The internal block cipher (see Figure 4.3):
• Block length is $t = 32$.
• Key space is $\mathcal{K}_{\mathrm{int}} = \{0, 1\}^{48}$.
The encryption function $f_K : \{0, 1\}^{32} \to \{0, 1\}^{32}$, for a key $K \in \{0, 1\}^{48}$, is defined as follows:
• Let $R \in \{0, 1\}^{32}$ be the string to be encrypted.
• Apply the expansion function $E : \{0, 1\}^{32} \to \{0, 1\}^{48}$ to $R$; see Table 4.3 (i.e., if $R = R_1 R_2 \ldots R_{32}$, then $E(R) = R_{32} R_1 R_2 R_3 R_4 R_5 R_4 R_5 R_6 R_7 R_8 R_9 R_8 R_9 \ldots R_{32} R_1$).
• Compute $E(R) \oplus K$.
• Divide the result into 8 blocks of length 6: $E(R) \oplus K = B_1 B_2 B_3 B_4 B_5 B_6 B_7 B_8$, where $B_i \in \{0, 1\}^6$, $1 \le i \le 8$.
• For each $1 \le i \le 8$, apply an S-box $S_i : \{0, 1\}^6 \to \{0, 1\}^4$ (see below); we obtain the string $C = C_1 C_2 \ldots C_8$, where $C_i = S_i(B_i)$, $1 \le i \le 8$. Then $C \in \{0, 1\}^{32}$.
• Apply the fixed permutation $P$ to $C$ (see Table 4.3).
• The result of the block cipher is then $f_K(R) = P(C)$.

[Figure 4.3: The internal block cipher of DES]

    E                      P
    32  1  2  3  4  5      16  7 20 21
     4  5  6  7  8  9      29 12 28 17
     8  9 10 11 12 13       1 15 23 26
    12 13 14 15 16 17       5 18 31 10
    16 17 18 19 20 21       2  8 24 14
    20 21 22 23 24 25      32 27  3  9
    24 25 26 27 28 29      19 13 30  6
    28 29 30 31 32  1      22 11  4 25

Table 4.3: The functions E and P

4. The S-boxes: Each S-box $S_i$, $1 \le i \le 8$, is represented by a table with 4 rows and 16 columns. Given a string $B = b_1 b_2 \ldots b_6$, $S_i(B)$ is computed as follows:
• The integer with binary expansion $b_1 b_6$ is used as row index.
• The integer with binary expansion $b_2 b_3 b_4 b_5$ is used as column index.
• The entry of the S-box in this row and column is written in binary expansion.
• The result is padded with leading 0s to give it length 4.
• This 4-bit string is then $S_i(B)$.
• For the S-box $S_1$, see Table 4.4. The other S-boxes are similar; see, e.g., the book (1) by Stinson.

          0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
    0    14  4 13  1  2 15 11  8  3 10  6 12  5  9  0  7
    1     0 15  7  4 14  2 13  1 10  6 12 11  9  5  3  8
    2     4  1 14  8 13  6  2 11 15 12  9  7  3 10  5  0
    3    15 12  8  2  4  9  1  7  5 11  3 14 10  0  6 13

Table 4.4: The S-box $S_1$

Example: Find $S_1(001011)$. The “outer” bits are 01, so the row index is 1 (i.e., the second row). The “inner” bits are 0101, so the column index is 5. The corresponding entry in $S_1$ is 2, or 10 in binary. Hence $S_1(001011) = 0010$.

5. The round keys (see Figure 4.4): Let $k \in \mathcal{K} \subset \{0, 1\}^{64}$ be a DES key. First we define:
• For $1 \le i \le 16$, the values
\[
v_i = \begin{cases} 1 & \text{for } i \in \{1, 2, 9, 16\}, \\ 2 & \text{otherwise;} \end{cases}
\]
• the function
\[
\mathrm{PC1} : \{0, 1\}^{64} \to \{0, 1\}^{28} \times \{0, 1\}^{28}, \quad k = k_1 k_2 \ldots k_{64} \mapsto (C, D)
\]
according to Table 4.5, namely $C = k_{57} k_{49} \ldots k_{36}$, $D = k_{63} k_{55} \ldots k_4$;
• the function
\[
\mathrm{PC2} : \{0, 1\}^{28} \times \{0, 1\}^{28} \to \{0, 1\}^{48}
\]
according to Table 4.5, namely if $(C, D) = b_1 b_2 \ldots b_{56}$, then $\mathrm{PC2}(b_1 b_2 \ldots b_{56}) = b_{14} b_{17} \ldots b_{32}$.

[Figure 4.4: Generation of the DES round keys]

    PC1                        PC2
    57 49 41 33 25 17  9       14 17 11 24  1  5
     1 58 50 42 34 26 18        3 28 15  6 21 10
    10  2 59 51 43 35 27       23 19 12  4 26  8
    19 11  3 60 52 44 36       16  7 27 20 13  2
    63 55 47 39 31 23 15       41 52 31 37 47 55
     7 62 54 46 38 30 22       30 40 51 45 33 48
    14  6 61 53 45 37 29       44 49 39 56 34 53
    21 13  5 28 20 12  4       46 42 50 36 29 32

Table 4.5: The functions PC1 and PC2

The round keys $K_i \in \mathcal{K}_{\mathrm{int}} = \{0, 1\}^{48}$, $1 \le i \le 16$, are now generated as follows:
• Let $(C_0, D_0) = \mathrm{PC1}(k)$.
• For $1 \le i \le 16$, do the following:
(a) Let $C_i$ be the string that is obtained from $C_{i-1}$ by a circular left shift of $v_i$ positions.
(b) Let $D_i$ be the string that is obtained from $D_{i-1}$ by a circular left shift of $v_i$ positions.
• Then $K_i = \mathrm{PC2}(C_i, D_i)$.

6. Decryption: As we saw in Section 4.1.1, in order to decrypt a ciphertext, DES is applied with the reverse key sequence. In the process, the permutations IP and $\mathrm{IP}^{-1}$ will cancel each other out.

Example: We apply the first round of DES with hexadecimal plaintext and key
\[
w = \texttt{0123456789ABCDEF}, \qquad k = \texttt{133457799BBCDFF1},
\]
respectively. The left-hand table below is $w$ in binary expansion, and the right-hand table is $\mathrm{IP}(w)$:

    w                      IP(w)
    0 0 0 0 0 0 0 1        1 1 0 0 1 1 0 0
    0 0 1 0 0 0 1 1        0 0 0 0 0 0 0 0
    0 1 0 0 0 1 0 1        1 1 0 0 1 1 0 0
    0 1 1 0 0 1 1 1        1 1 1 1 1 1 1 1
    1 0 0 0 1 0 0 1        1 1 1 1 0 0 0 0
    1 0 1 0 1 0 1 1        1 0 1 0 1 0 1 0
    1 1 0 0 1 1 0 1        1 1 1 1 0 0 0 0
    1 1 1 0 1 1 1 1        1 0 1 0 1 0 1 0

Then we obtain
    L_0 = 11001100000000001100110011111111,
    R_0 = 11110000101010101111000010101010.

The binary expansion of the DES key $k$ is

    0 0 0 1 0 0 1 1
    0 0 1 1 0 1 0 0
    0 1 0 1 0 1 1 1
    0 1 1 1 1 0 0 1
    1 0 0 1 1 0 1 1
    1 0 1 1 1 1 0 0
    1 1 0 1 1 1 1 1
    1 1 1 1 0 0 0 1

To compute the first round key, we find
    C_0 = 1111000011001100101010101111,
    D_0 = 0101010101100110011110001111,
    C_1 = 1110000110011001010101011111,
    D_1 = 1010101011001100111100011110,
and therefore
    K_1 = 000110110000001011101111111111000111000001110010.
Using this key, we obtain
    E(R_0) ⊕ K_1 = 011000010001011110111010100001100110010100100111,
then
    f_{K_1}(R_0) = 00100011010010101010100110111011,
and finally
    R_1 = L_0 ⊕ f_{K_1}(R_0) = 11101111010010100110010101000100.
The other rounds are computed analogously.

4.1.3 Security of DES

1. Several techniques have been developed to attack DES, such as
• “differential cryptanalysis”
• “linear cryptanalysis”
• exhaustive search of the key space.
The last method has proved to be the most successful.
With special hardware or large networks of workstations, DES ciphertexts can now be decrypted in a few hours.

2. Double DES. If we use two different keys $k_1$, $k_2$ and encrypt a plaintext $w$ by $E_{k_2}(E_{k_1}(w))$, does this increase security? We saw that in the case of the affine cipher, doing this is equivalent to an encryption with a third key, say $E_{k_3}(w)$; in this case we say that the cipher is a group. Is DES a group? The answer is no, as can be proven. One might therefore think that Double DES would have the security level of a $2 \cdot 56 = 112$-bit key. However, Double DES is vulnerable to a “meet-in-the-middle attack”, which reduces the security level to that of a 57-bit key.

3. Triple DES. To avoid the weakness of Double DES, one can use Triple DES. To encrypt the plaintext $w$, choose two DES keys $k_1$, $k_2$ and perform
\[
E_{k_1}(D_{k_2}(E_{k_1}(w))).
\]
This is resistant to meet-in-the-middle attacks. Other kinds of attacks have been proposed, but they are impractical.

4.2 Mathematical Background III: Finite Fields

The Rijndael algorithm of the next section requires a certain type of multiplication that is based on finite fields. This concept is also used in other cryptographic applications.

4.2.1 Polynomials

Polynomials have many important applications in cryptography. In particular, they are needed to introduce the concept of a finite field.

Definition 4.1. Let $R$ be a commutative ring with identity $1 \neq 0$. A polynomial (in one variable) over $R$ in $x$ is an expression
\[
f(x) = a_n x^n + a_{n-1} x^{n-1} + \ldots + a_1 x + a_0,
\]
with the coefficients $a_j \in R$, $j = 0, 1, \ldots, n$, and the variable $x$. The set of all polynomials over $R$ in $x$ is denoted by $R[x]$. If $a_n \neq 0$, then $n$ is called the degree of $f$, $n = \deg f$. The coefficients $a_n$ and $a_0$ are called the leading, resp. the constant coefficient of $f$.

Example: The polynomials $2x^3 + x + 1$, $x$, and $1$ are elements of $\mathbb{Z}[x]$ of degrees 3, 1, and 0, respectively.

Definition 4.2. Let $f \in R[x]$.
If $r \in R$, then
\[
f(r) = a_n r^n + a_{n-1} r^{n-1} + \ldots + a_1 r + a_0
\]
is called the value of $f$ at $r$. If $f(r) = 0$, then $r$ is called a zero of $f$.

Examples: (a) The value of $2x^3 + x + 1 \in \mathbb{Z}[x]$ at $-1$ is $-2$.
(b) Denote the elements of $\mathbb{Z}/2\mathbb{Z}$ by 0 and 1. Then $x^2 + 1 \in (\mathbb{Z}/2\mathbb{Z})[x]$; it has the zero 1.

Let $g \in R[x]$ be another polynomial,
\[
g(x) = b_m x^m + \ldots + b_1 x + b_0,
\]
and assume that $n \ge m$. If the “missing” coefficients in $g$ are set to 0, we can write
\[
g(x) = b_n x^n + \ldots + b_1 x + b_0.
\]

Definition 4.3. With $f, g \in R[x]$ as above, we define the sum and the product of $f$ and $g$ by
\[
(f + g)(x) = (a_n + b_n) x^n + \ldots + (a_1 + b_1) x + (a_0 + b_0)
\]
and
\[
(fg)(x) = c_{n+m} x^{n+m} + \ldots + c_1 x + c_0, \quad \text{with} \quad c_k = \sum_{j=0}^{k} a_j b_{k-j}, \quad 0 \le k \le n + m,
\]
where any undefined coefficients $a_i$, $b_i$ are set to 0.

Examples: Let $f(x) = x^3 + 2x^2 + x + 2 \in \mathbb{Z}[x]$, $g(x) = x^2 + x + 1 \in \mathbb{Z}[x]$. Then
\[
(f + g)(x) = x^3 + 3x^2 + 2x + 3
\]
and
\[
(fg)(x) = x^5 + (2 + 1)x^4 + (1 + 2 + 1)x^3 + (2 + 1 + 2)x^2 + (2 + 1)x + 2 = x^5 + 3x^4 + 4x^3 + 5x^2 + 3x + 2.
\]

Remark: With this sum and product, $(R[x], +, \cdot)$ is a commutative ring with identity 1.

In what follows, let $F$ be a field. Then the ring $F[x]$ has no zero divisors. In fact, it is easy to see that
\[
\deg(fg) = \deg f + \deg g
\]
for all $f, g \in F[x]$ with $f, g \neq 0$.

Most of the following results will be stated without proof.

Theorem 4.1. Let $f, g \in F[x]$, $g \neq 0$. Then there are uniquely determined polynomials $q, r \in F[x]$ with $f = qg + r$, where $r = 0$ or $\deg r < \deg g$.

Remark: This is, in fact, division of the polynomials $f$, $g$ with remainder. The polynomial $q$ is called the quotient of $f$ and $g$, and $r$ the remainder. We can use “long division” to find $q$ and $r$.

Example: In $(\mathbb{Z}/2\mathbb{Z})[x]$, let $f(x) = x^4 + x^3 + 1$, $g(x) = x^2 + x + 1$. We divide $f$ by $g$ with remainder; the two steps of the long division are
\[
\begin{array}{rcl}
x^4 + x^3 + 1 &=& x^2 \cdot (x^2 + x + 1) + (x^2 + 1), \\
x^2 + 1 &=& 1 \cdot (x^2 + x + 1) + x.
\end{array}
\]
Hence
\[
x^4 + x^3 + 1 = \underbrace{(x^2 + 1)}_{q(x)} (x^2 + x + 1) + \underbrace{x}_{r(x)}.
\]

Corollary 4.1.
Let $f \in F[x]$, $f \neq 0$.
(a) If $a \in F$ is a zero of $f$, then $f = (x - a)q$, with $q \in F[x]$ (i.e., $f$ is divisible by the polynomial $x - a$).
(b) $f$ has at most $\deg f$ zeros.

Proof. (a) By Theorem 4.1 there are $q, r \in F[x]$ with $f(x) = (x - a)q(x) + r$ and $r = 0$ or $\deg r < 1$. Set $x = a$; then $0 = f(a) = 0 + r$, so $r = 0$ and $f = (x - a)q$.
(b) This can be shown by induction. We skip the details. □

Examples: (a) The polynomial $x^2 + 1 \in (\mathbb{Z}/2\mathbb{Z})[x]$ has the zero 1, and therefore we have $x^2 + 1 = (x - 1)^2 = (x + 1)^2$.
(b) $x^2 + x \in (\mathbb{Z}/2\mathbb{Z})[x]$ has the zeros 0 and 1 in $\mathbb{Z}/2\mathbb{Z}$. By the corollary it cannot have any more zeros.
(c) $x^2 + x + 1 \in (\mathbb{Z}/2\mathbb{Z})[x]$ has no zero at all in $\mathbb{Z}/2\mathbb{Z}$. This shows that part (b) of the corollary is not always sharp.

4.2.2 Finite fields

We have seen the following examples of fields:
• $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$,
• $\mathbb{Z}/p\mathbb{Z}$, where $p$ is a prime.
Among these, only $\mathbb{Z}/p\mathbb{Z}$ is a finite field. Are there other finite fields? We begin with a definition.

Definition 4.4. Let $F$ be a field. If repeated addition of $1 \in F$ to itself never gives $0 \in F$, we say that $F$ has characteristic zero. Otherwise, there is a prime $p$ such that $1 + 1 + \ldots + 1$ ($p$ times) equals 0, and $p$ is called the characteristic of $F$.

Examples: The fields $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ have characteristic 0. $\mathbb{Z}/p\mathbb{Z}$ has characteristic $p$.

Remark: If the characteristic is not 0, it has to be prime; otherwise $F$ would have zero divisors, which is not possible since $F$ is a field.

Definition 4.5. A polynomial $f \in R[x]$ is called irreducible over $R$ if it is impossible to write $f = gh$, with $g, h \in R[x]$, where the degrees of $g$ and $h$ are greater than 0.

Examples: (a) $x^2 + 1$ is irreducible over $\mathbb{R}$, but not over $\mathbb{Z}/2\mathbb{Z}$ (it factors, as we saw in a previous example).
(b) What are the irreducible polynomials of degree 3 over $\mathbb{Z}/2\mathbb{Z}$? It suffices to consider polynomials with constant coefficient 1 (otherwise $x$ would be an obvious factor).
There are only four such polynomials:
\[
f_1(x) = x^3 + 1, \quad f_2(x) = x^3 + x + 1, \quad f_3(x) = x^3 + x^2 + 1, \quad f_4(x) = x^3 + x^2 + x + 1.
\]
Note that each factorization of a polynomial of degree 3 must have a factor of degree 1, which occurs (in this case) if and only if 1 is a zero of the polynomial. Thus we find
\[
f_1(x) = (x + 1)(x^2 + x + 1), \qquad f_4(x) = (x + 1)(x^2 + 1) = (x + 1)^3,
\]
while $f_2$ and $f_3$ are irreducible (since $f_2(1)$, $f_3(1)$ are not 0).

Remark: In general, testing for irreducibility and factoring polynomials is not such an easy matter. In particular, a reducible polynomial of degree 4 need not have a linear factor; some examples can be found in the assignments.

Definition 4.6. Let $f, g, h \in R[x]$. We say that the polynomials $g$ and $h$ are congruent modulo the polynomial $f$ if $g - h$ is a multiple of $f$. We write
\[
g \equiv h \pmod f.
\]

Example: We have $x^3 \equiv 1 \pmod{x^2 + x + 1}$ since $x^3 - 1 = x^3 + 1 = (x + 1)(x^2 + x + 1)$ in $(\mathbb{Z}/2\mathbb{Z})[x]$; see the previous example.

We can now consider residue classes modulo a polynomial $f$; we set $R = \mathbb{Z}/p\mathbb{Z}$ for a fixed prime $p$. The residue class of the polynomial $g$ modulo the polynomial $f$ is
\[
g + f \cdot (\mathbb{Z}/p\mathbb{Z})[x] = \{g + hf \mid h \in (\mathbb{Z}/p\mathbb{Z})[x]\}.
\]
By Theorem 4.1 each residue class modulo $f$ has a uniquely determined representative $g \in (\mathbb{Z}/p\mathbb{Z})[x]$ such that either $g = 0$ or $\deg g < \deg f$.

Next we define operations on the residue classes modulo $f$. Given $g, h \in (\mathbb{Z}/p\mathbb{Z})[x]$, we define the sum, resp. the product of the residue classes of $g$ and $h$ to be the residue classes of $g + h \pmod f$, resp. $gh \pmod f$. It is now clear that this defines a ring with zero element $f \cdot (\mathbb{Z}/p\mathbb{Z})[x]$ and identity $1 + f \cdot (\mathbb{Z}/p\mathbb{Z})[x]$. We can say even more (without proof):

Theorem 4.2. (a) The residue class ring defined above is a field if and only if the polynomial $f \in (\mathbb{Z}/p\mathbb{Z})[x]$ is irreducible.
(b) In this case, and if $\deg f = n$, this field has $p^n$ elements.
(c) For each $n \in \mathbb{N}$ there is an irreducible polynomial of degree $n$ over $\mathbb{Z}/p\mathbb{Z}$.
(d) The fields generated by two irreducible polynomials of the same degree are isomorphic.
(e) Any finite field of characteristic $p$ is of this form.

Remarks: (1) Such a field is also called a Galois field of degree $n$ over $\mathbb{Z}/p\mathbb{Z}$, and is denoted by $\mathrm{GF}(p^n)$.
(2) The inverse of a nonzero residue class $g + f \cdot (\mathbb{Z}/p\mathbb{Z})[x]$ can be found as follows: There are polynomials $r, s \in (\mathbb{Z}/p\mathbb{Z})[x]$, which can be determined by way of the Euclidean algorithm (see later), such that $gr + fs = 1$. Then, clearly, $r + f \cdot (\mathbb{Z}/p\mathbb{Z})[x]$ is the desired inverse.

Example: Let us construct $\mathrm{GF}(2^2)$. Note that $f(x) = x^2 + x + 1 \in (\mathbb{Z}/2\mathbb{Z})[x]$ is an irreducible polynomial of degree 2 (in fact, the only one). Now the elements of $\mathrm{GF}(2^2)$ are the residue classes of the polynomials $0$, $1$, $x$, and $x + 1 \pmod f$. Denote $\alpha = x + f \cdot (\mathbb{Z}/2\mathbb{Z})[x]$; then $\alpha^2 + \alpha + 1 = 0$ in $\mathrm{GF}(2^2)$, and so $\alpha^2 = -\alpha - 1 = \alpha + 1$. We can use this to construct the following addition and multiplication tables. Here are two cases in detail:
\[
\alpha \cdot \alpha = \alpha^2 = \alpha + 1; \qquad \alpha \cdot (\alpha + 1) = \alpha^2 + \alpha = 1.
\]

    +    |  0     1     α     α+1
    -----+------------------------
    0    |  0     1     α     α+1
    1    |  1     0     α+1   α
    α    |  α     α+1   0     1
    α+1  |  α+1   α     1     0

    ·    |  1     α     α+1
    -----+------------------
    1    |  1     α     α+1
    α    |  α     α+1   1
    α+1  |  α+1   1     α

For a construction of $\mathrm{GF}(2^3)$, see the book (1) by Stinson, p. 182.

4.2.3 The finite field GF(2^8)

The field $\mathrm{GF}(2^8)$ is used in the Rijndael algorithm; as defining irreducible polynomial the particular polynomial
\[
f(x) = x^8 + x^4 + x^3 + x + 1 \tag{4.2}
\]
is used. Equivalent results would be obtained with other irreducible polynomials over $\mathbb{Z}/2\mathbb{Z}$ of degree 8.

1. Every element of $\mathrm{GF}(2^8)$ can be uniquely represented as a polynomial
\[
b_7 x^7 + b_6 x^6 + \ldots + b_1 x + b_0, \quad \text{with } b_j \in \{0, 1\},\ j = 0, 1, \ldots, 7.
\]
So the elements of $\mathrm{GF}(2^8)$ can be represented as 8-bit bytes $b_7 b_6 \ldots b_1 b_0$.

2. Addition in $\mathrm{GF}(2^8)$ then corresponds to the XOR operation:
Example:
\[
(x^7 + x^6 + x^3 + x + 1) + (x^4 + x^3 + 1) = x^7 + x^6 + x^4 + x,
\]
\[
11001011 \oplus 00011001 = 11010010.
\]

3.
For multiplying elements in $\mathrm{GF}(2^8)$, we have to keep in mind that we are working modulo the polynomial (4.2), which can be represented by the 9 bits 100011011. Subtracting the polynomial (4.2) is then equivalent to XORing these 9 bits.

Example:
\[
\begin{aligned}
(x^7 + x^6 + x^3 + x + 1)\,x &= x^8 + x^7 + x^4 + x^2 + x \\
&= (x^7 + x^3 + x^2 + 1) + (x^8 + x^4 + x^3 + x + 1) \\
&\equiv x^7 + x^3 + x^2 + 1 \pmod{x^8 + x^4 + x^3 + x + 1}.
\end{aligned}
\]
The corresponding bit operations are

    11001011 → 110010110                (shift left and append a 0)
             → 110010110 ⊕ 100011011    (subtract x^8 + x^4 + x^3 + x + 1)
             = 010001101.

Remarks: (1) This example suggests an easy general algorithm.
(2) If after the left-shift the first bit is 0, we do not XOR the 9-bit string 100011011.
(3) Multiplication by higher powers of $x$ is accomplished by repeating this algorithm.
(4) Finally, multiplication by an arbitrary polynomial (i.e., bit string o...
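The byte-level operations described in items 2 and 3 can be sketched in a few lines of Python. This is an illustration, not code from the notes: the names `MODULUS`, `gf_add`, `xtime`, and `gf_mul` are hypothetical, and `gf_mul` uses the standard shift-and-XOR method for multiplying by an arbitrary element, which is an assumption here and not necessarily the procedure the notes go on to describe.

```python
MODULUS = 0b100011011  # the 9 bits of x^8 + x^4 + x^3 + x + 1, see (4.2)


def gf_add(a: int, b: int) -> int:
    """Addition in GF(2^8): coefficientwise addition mod 2, i.e. XOR."""
    return a ^ b


def xtime(a: int) -> int:
    """Multiply the byte a by x: shift left, then reduce if bit 8 is set."""
    a <<= 1
    if a & 0x100:      # the shifted value has degree 8
        a ^= MODULUS   # subtract (XOR) the defining polynomial
    return a


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) by repeated xtime and XOR."""
    result = 0
    while b:
        if b & 1:          # current bit of b set: add a * x^i
            result ^= a
        a = xtime(a)       # move on to a * x^(i+1)
        b >>= 1
    return result


if __name__ == "__main__":
    # The two worked examples from the text, as 8-bit strings.
    print(format(gf_add(0b11001011, 0b00011001), "08b"))
    print(format(xtime(0b11001011), "08b"))
```

Each call to `xtime` performs exactly the shift and conditional XOR of Remarks (1) and (2); `gf_mul` combines such steps, one per bit of the second factor.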