A preliminary version of this paper appeared in Advances in Cryptology – Eurocrypt 94 Proceedings, Lecture Notes in Computer Science Vol. 950, A. De Santis ed., Springer-Verlag, 1994.
Optimal Asymmetric Encryption
How to Encrypt with RSA
Mihir Bellare
Phillip Rogaway
November 19, 1995
Abstract
Given an arbitrary k-bit to k-bit trapdoor permutation f and a hash function, we exhibit an
encryption scheme for which (i) any string x of length slightly less than k bits can be encrypted
as f(rx), where rx is a simple probabilistic encoding of x depending on the hash function; and (ii)
the scheme can be proven semantically secure assuming the hash function is "ideal." Moreover, a
slightly enhanced scheme is shown to have the property that the adversary can create ciphertexts
only of strings for which she "knows" the corresponding plaintexts; such a scheme is not only
semantically secure but also non-malleable and secure against chosen-ciphertext attack.
Department of Computer Science & Engineering, Mail Code 0114, University of California at San Diego, 9500
Gilman Drive, La Jolla, CA 92093. E-mail: mihir@cs.ucsd.edu
Department of Computer Science, University of California at Davis, Davis, CA 95616, USA. E-mail:
rogaway@cs.ucdavis.edu
1 Introduction
Asymmetric (i.e. public key) encryption is a goal for which there is a large and widely-recognized
gap between practical schemes and provably-secure ones: the practical methods are efficient but not
well-founded, while the provably-secure schemes have more satisfying security properties but are
not nearly as efficient.¹ The goal of this paper is to (nearly) have it all: to do asymmetric encryption
in a way as efficient as any mechanism yet suggested, yet to achieve an assurance benefit almost as
good as that obtained by provable security.
In the setup we consider a sender who holds a k-bit to k-bit trapdoor permutation f and wants
to transmit a message x to a receiver who holds the inverse permutation f^{-1}. We concentrate on
the case which arises most often in cryptographic practice, where n = |x| is at least a little smaller
than k.
What practitioners want is the following: encryption should require just one computation of f;
decryption should require just one computation of f^{-1}; the length of the enciphered text should be
precisely k; and the length n of the text x that can be encrypted should be close to k. Since heuristic schemes
achieving these conditions exist [22, 15], if provable security is provided at the cost of violating any
of these conditions (e.g., two applications of f to encrypt, message length n + k rather than k)
practitioners will prefer the heuristic constructions. Thus to successfully impact practice one must
provide provably-secure schemes which meet the above constraints.
The heuristic schemes invariably take the following form: one (probabilistically, invertibly)
embeds x into a string rx and then takes the encryption of x to be f(rx).² Let's call such a process
a simple-embedding scheme. We will take as our goal to construct provably-good simple-embedding
schemes which allow n to be close to k.
Assuming an ideal hash function and an arbitrary trapdoor permutation, we describe and prove
secure two simple-embedding schemes that are bit-optimal (i.e., the length of the string x that can
be encrypted by f(rx) is almost k). Our first scheme achieves semantic security [11], while our
second scheme achieves a notion of plaintext-aware encryption, which we introduce here. This new
notion is very strong, and in particular implies "ambitious" goals like chosen-ciphertext security
and non-malleability [7] in the ideal-hash model which we assume.
The methods of this paper are simple and completely practical. They provide a good starting
point for an asymmetric encryption/key distribution standard.
Next we describe our schemes and their properties. We refer the reader to Section 1.7 for
discussion of previous work on encryption and comparisons with ours.
1.1 The basic scheme
Recall that k is the security parameter and f, mapping k bits to k bits, is the trapdoor permutation. Let k0
be chosen such that the adversary's running time is significantly smaller than 2^{k0} steps. We fix the
length of the message to encrypt as n = k - k0 bits (shorter messages can be suitably padded to
this length). The scheme makes use of a "generator" G: {0,1}^{k0} → {0,1}^n and a "hash function"
H: {0,1}^n → {0,1}^{k0}.

¹ By a provably-secure scheme we mean here one shown, under some standard complexity-theoretic assumption,
to achieve a notion of security at least as strong as semantic security [11].

² It is well-known that a naive embedding like rx = x is no good: besides the usual deficiencies of any deterministic
encryption, f being a trapdoor permutation does not mean that f(x) conceals all the interesting properties of x.
Indeed it was exactly such considerations that helped inspire ideas like semantic security [11] and hardcore bits [5, 26].

To encrypt x ∈ {0,1}^n choose a random k0-bit r and set
E^{G,H}(x) = f(x ⊕ G(r) || r ⊕ H(x ⊕ G(r))).
Here "||" denotes concatenation and "⊕" bitwise exclusive-or. The decryption function D^{G,H} is defined in the obvious way, and
the pair (E, D) constitutes what we call the "basic" scheme.
We prove security under the assumption that G, H are "ideal." This means G is a random
function from {0,1}^{k0} to {0,1}^n and H is a random function from {0,1}^n to {0,1}^{k0}. The formal
statement of our result is in Theorem 4.1. It says that if f is a trapdoor permutation and G, H are
ideal then the basic scheme achieves the notion of semantic security [11] appropriately adjusted to
take account of the presence of G, H.
In practice, G and H are best derived from some standard cryptographic hash function. (For
example, they can be derived from the compression function of the Secure Hash Algorithm [18]
following the methods described in [2].)
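As a concrete illustration of the preceding paragraph, here is a minimal sketch (not the construction of [2] itself) of how G and H might be built from an off-the-shelf hash function by counter-mode expansion; SHA-256 from Python's hashlib stands in for the compression-function-based constructions, and the names mgf, G and H are ours.

import hashlib

def mgf(seed: bytes, out_len: int, label: bytes) -> bytes:
    """Counter-mode expansion of `seed` into `out_len` bytes (MGF1-style).
    The label keeps the G and H instances independent even though they
    share one underlying hash function."""
    out = b""
    counter = 0
    while len(out) < out_len:
        out += hashlib.sha256(label + counter.to_bytes(4, "big") + seed).digest()
        counter += 1
    return out[:out_len]

# G: {0,1}^{k0} -> {0,1}^n  and  H: {0,1}^n -> {0,1}^{k0}, lengths given in bytes.
def G(r: bytes, n_bytes: int) -> bytes:
    return mgf(r, n_bytes, b"G")

def H(s: bytes, k0_bytes: int) -> bytes:
    return mgf(s, k0_bytes, b"H")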
1.2 The plaintext-aware scheme
A variety of goals for encryption have come to be known which are actually stronger than the
notion of [11]. These include non-malleability [7] and chosen-ciphertext security. We introduce a
new notion of an encryption scheme being plaintext-aware: roughly said, it should be impossible for
a party to produce a valid ciphertext without "knowing" the corresponding plaintext (see Section 5
for a precise definition). In the ideal-hash model that we assume, this notion can be shown to imply
non-malleability and chosen-ciphertext security.
We construct a plaintext-aware encryption scheme by slightly modifying the basic scheme. Let
k and k0 be as before and let k1 be another parameter. This time let n = k - k0 - k1. Let the
generator be G: {0,1}^{k0} → {0,1}^{n+k1} and the hash function H: {0,1}^{n+k1} → {0,1}^{k0}. To encrypt,
choose a random k0-bit r and set
E^{G,H}(x) = f(x0^{k1} ⊕ G(r) || r ⊕ H(x0^{k1} ⊕ G(r))).
The decryption D^{G,H} is defined in the obvious way and the pair constitutes the scheme we call
"plaintext-aware."
The formal statements of our results are in Theorems 6.1 and 6.2. They say that if f is a trapdoor
permutation and G, H are ideal then the plaintext-aware scheme is a semantically secure, plaintext-aware encryption. In practice, again, G and H are derived from some standard cryptographic hash
function.
1.3 Efficiency
The function f can be set to any candidate trapdoor permutation such as RSA [21] or modular
squaring [19, 3]. In such a case the time for computing G and H is negligible compared to the
time for computing f, f^{-1}. Thus complexity is discussed only in terms of f, f^{-1} computations. In
this light our basic encryption scheme requires just a single application of f to encrypt, a single
application of f^{-1} to decrypt, and the length of the ciphertext is k (as long as k ≥ n + k0). Our
plaintext-aware scheme requires a single application of f to encrypt, a single application of f^{-1} to
decrypt, and the length of the ciphertext is still k (as long as k ≥ n + k0 + k1).
A concrete instantiation of our plaintext-aware scheme (using RSA for f and getting G, H from
the Secure Hash Algorithm [18]) is given in Section 7.
1.4 The ideal hash function paradigm
As we indicated above, when proving security we take G, H to be random, and when we want a
concrete scheme, G, H are instantiated by primitives derived from a cryptographic hash function.
In this regard we are following the paradigm of [2] who argue that even though results which
assume an ideal hash function do not provide provable security with respect to the standard model
of computation, assuming an ideal hash function and doing proofs with respect to it provides much
greater assurance benefit than purely ad hoc protocol design. We refer the reader to that paper
for further discussion of the meaningfulness, motivation and history of this ideal hash approach.
1.5 Exact security
We want our results to be meaningful for practice. In particular, this means we should be able to
say meaningful things about the security of our schemes for specific values of the security parameter
(e.g., k = 512). This demands not only that we avoid asymptotics and address security "exactly,"
but also that we strive for security reductions which are as efficient as possible.³
Thus the theorem proving the security of our basic scheme quantifies the resources and success
probability of a potential adversary: let her run for time t, make qgen queries of G and qhash queries
of H, and suppose she could "break" the encryption with advantage ε. It then provides an algorithm
M and numbers t′, ε′ such that M inverts the underlying trapdoor permutation f in time t′ with
probability ε′. The strength of the result is in the values of t′, ε′ which are specified as functions of
t, qgen, qhash, ε and the underlying scheme parameters k, k0, n (k = k0 + n). Now a user with some
idea of the (assumed) strength of a particular f (e.g., RSA on 512 bits) can get an idea of the
resources necessary to break our encryption scheme.
1.6 Extensions
The assumption that n = |x| ≤ k - k0 - k1 can be removed while retaining the bit optimality
of the scheme: the ideas presented here can be extended to design an authenticated encryption
scheme (provably secure in the ideal-hash model assuming an arbitrary trapdoor permutation)
where encryption still requires one application of f on a k-bit input; decryption still requires one
application of f^{-1} on a k-bit input; and now the length of the encrypted text will be max{k, |x| +
k0 + k1}.
1.7 Prior work in encryption
We briefly survey relevant prior art in encryption. In the following, f mapping k bits to k bits
is the trapdoor permutation. As above, the following assumes the length n of the message to be
encrypted is at most k. We begin by discussing work on attaining semantic security, and then move
on to stronger goals.
Goldwasser and Micali [11] first suggested encrypting a message by probabilistically encrypting
each of its bits: if B_f denotes a hard-core predicate [5, 26, 10] for the trapdoor permutation f, then
the encryption of x = x1 ... xn is E_GM(x) = f(r1) || ... || f(rn), where each ri is randomly chosen
from the domain of f subject to B_f(ri) = xi. This yields an encryption of length O(nk) which
requires n evaluations of f to encrypt and n evaluations of f^{-1} to decrypt, which is not practical.

³ Exact security is not new: previous works which address it explicitly include [10, 14, 23, 16, 8, 1]. Moreover,
although it is true that most theoretical works only provide asymptotic security guarantees of the form "the success
probability of a polynomially bounded adversary is negligible" (everything measured as a function of the security
parameter), the exact security can be derived from examination of the proof. (However, a lack of concern with the
exactness means that in many cases the reductions are very inefficient, and the results are not useful for practice.)
The more efficient construction of Blum and Goldwasser [4] is based on the particular choice
of f as the modular squaring function [19]. They achieve encryption size n + k. They require
O(nk^2 / log k) steps to encrypt and O(k^3) steps to decrypt. The encryption is longer than ours by
n bits. To compare the time complexities, take the function f in our scheme to also be squaring.
Then their encryption time is a factor O(n / log k) more than ours. Their decryption time is a
constant factor more than ours.
Of course the above two schemes have the advantage of being based only on standard assumptions, not the use of an ideal hash function.
The discrete log function simultaneously hides a constant fraction of the bits of its pre-image
[24]. But it is not known to have a trapdoor and hence is not usable for the problem we are
considering.
What we have called simple-embedding schemes are prevalent in computing practice. One
example is the RSA Public Key Cryptography Standard #1 [22], where rx in the embedding
x ↦ rx is essentially x in the low-order bit positions and a string of random non-zero bytes in the
remaining bit positions. Another scheme is described in [15]; a simplified version of it is
E^G_IBM(x) = f((x0^{k2} ⊕ G(r)) || r).
Of concern with both of these schemes is that there is no compelling reason to believe that x is
as hard to compute from f(rx) as rx is hard to compute from f(rx), let alone that all interesting
properties of x are well-hidden by f(rx). Indeed whether or not [22, 15] "work" depends on aspects
of f beyond its being one-way, insofar as it is easy to show that if there exists a trapdoor permutation
then there exists one for which encryption as above is completely insecure.⁴
In [2] we suggested the scheme
E^G_BR(x) = f(r) || G(r) ⊕ x
and proved it semantically secure in the same ideal-hash model used here. In comparison with the
schemes given here, the drawback is that the encryption size is n + k rather than k.
Now we turn to stronger goals. Chosen-ciphertext security was provably achieved by [17], but
the scheme is extremely inefficient. More practical encryption schemes which aimed at achieving
chosen-ciphertext security were proposed by Damgård [6] and Zheng and Seberry [27]. The latter
scheme is
E^{G,H}_ZS(x) = f(r) || (G(r) ⊕ (x || H(x))),
matching our plaintext-aware scheme in computation but having bit complexity n + k + k1. Non-malleability is provably achieved by [7], but the scheme is extremely inefficient. An efficient scheme
proven in [2] to achieve both non-malleability and chosen-ciphertext security under the ideal-hash
model is
E^{G,H}_BR(x) = f(r) || G(r) ⊕ x || H(rx).
Again the drawback is a bit complexity of n + k + k1.
⁴ But f is mandated to be RSA in both of [22, 15].
2 Preliminaries
2.1 Probabilistic algorithms
We shall use the notation of [13]. If A is a probabilistic algorithm then A(x, y, ···) refers to the
probability space which to the string σ assigns the probability that A, on inputs x, y, ···, outputs
σ. If S is a probability space we denote its support (the set of elements of positive probability)
by [S]. When S is a probability space, x ← S denotes selecting a random sample from S. We
use x, y ← S as shorthand for x ← S; y ← S. For probability spaces S, T, ..., the notation
Pr[x ← S; y ← T; ··· : p(x, y, ···)] denotes the probability that the predicate p(x, y, ···) is true
after the (ordered) execution of the algorithms x ← S, y ← T, etc. PPT is short for "probabilistic,
polynomial time."
In evaluating the complexity of oracle machines we adopt the usual convention that all oracle
queries receive their answer in unit time.
2.2 Random oracles
We will be discussing schemes which use functions G, H chosen at random from appropriate spaces
(the input and output lengths for G and H depend on parameters of the scheme). When stating
definitions it is convenient to not have to worry about exactly what these spaces may be and just
write G, H ← Ω, the latter being defined as the set of all maps from the set {0,1}* of finite strings
to the set {0,1}^∞ of infinite strings. The notation should be interpreted as appropriate to the
context; for example, if the scheme says G maps {0,1}^a to {0,1}^b then we can interpret G ← Ω
as meaning we choose G from Ω at random, restrict the domain to {0,1}^a, and drop all but the
first b bits of output.
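In proofs and experiments the convention G, H ← Ω is usually realized by lazy sampling: answer each new query with a fresh uniform string and remember it. A minimal sketch (our own illustration, with invented class and method names):

import os

class LazyRandomOracle:
    """A random function from {0,1}* to {0,1}^{out_bits}, sampled lazily:
    each new query receives an independent uniform answer, which is then
    remembered so that repeated queries are answered consistently."""
    def __init__(self, out_bits: int):
        self.out_bytes = out_bits // 8
        self.table = {}

    def query(self, x: bytes) -> bytes:
        if x not in self.table:
            self.table[x] = os.urandom(self.out_bytes)
        return self.table[x]

# For the basic scheme of Section 1.1 one would take, e.g. (lengths in bits),
# G = LazyRandomOracle(n) and H = LazyRandomOracle(k0).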
2.3 Trapdoor permutations and their security
Our encryption schemes require a trapdoor permutation generator. This is a PPT algorithm F such
that F(1^k) outputs a pair of deterministic algorithms (f, f^{-1}) specifying a permutation and its
inverse on {0,1}^k.
We associate to F an evaluation time T_F(·): for all k, all (f, f^{-1}) ∈ [F(1^k)] and all w ∈ {0,1}^k,
the time to compute f(w) (given f and w) is T_F(k). Note the evaluation time depends on the
setting: for example on whether or not there is hardware available to compute f.
We will be interested in two attributes of a (possibly non-uniform) algorithm M trying to invert
F(1^k)-distributed permutations; namely its running time and its success probability.

Definition 2.1  Let F be a trapdoor permutation generator. We say that algorithm M succeeds
in (t, ε)-inverting F(1^k) if
Pr[(f, f^{-1}) ← F(1^k); w ← {0,1}^k; y ← f(w) : M(f, y) = w] ≥ ε,
and, moreover, in the experiment above, M runs in at most t steps.

RSA [21] is a good candidate for a secure trapdoor permutation.⁵

⁵ Candidates like RSA [21] don't quite fit our definition, in that the domain of RSA is some Z*_N, a proper subset
of {0,1}^k. Things can be patched in standard ways.
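For concreteness, here is a toy sketch of a generator F in the spirit of Definition 2.1, using tiny fixed RSA parameters; it is insecure and illustrative only, it glosses over the Z*_N-versus-{0,1}^k mismatch mentioned in the footnote, and every name in it is ours rather than the paper's.

from math import gcd

def F_toy(k_ignored=None):
    """Toy trapdoor permutation generator in the spirit of Definition 2.1.
    Uses small hard-coded RSA parameters (insecure, illustration only) and
    acts on integers modulo N rather than on {0,1}^k."""
    p, q = 1009, 1013                 # toy primes; a real generator samples large random primes
    N, e = p * q, 65537
    phi = (p - 1) * (q - 1)
    assert gcd(e, phi) == 1
    d = pow(e, -1, phi)               # trapdoor: the decryption exponent
    f = lambda w: pow(w, e, N)        # forward direction, computable by anyone
    f_inv = lambda y: pow(y, d, N)    # inverse direction, needs the trapdoor d
    return f, f_inv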
3 Semantically Secure Encryption
We extend the definition of semantic security [11] to the random oracle model in a way which
enables us to discuss exact security.
3.1 Encryption schemes
An asymmetric (i.e. public key) encryption scheme is specified by a probabilistic generator, G,
and an associated plaintext-length function, n(·). On input 1^k, the generator G outputs a pair of
algorithms (E, D), the first of which is probabilistic. Each of these algorithms has oracle access to
two functions, one called G and one called H. A user i runs G to get (E, D) and makes the former
public while keeping the latter secret. To encrypt message x ∈ {0,1}^{n(k)} using functions G, H,
anyone can compute y ← E^{G,H}(x) and send it to i. To decrypt ciphertext y user i computes
x ← D^{G,H}(y). We require D^{G,H}(y) = x for all y ∈ [E^{G,H}(x)]. We further demand that D^{G,H}(y) = ⊥
if there is no x such that y ∈ [E^{G,H}(x)].
An adversary is a (possibly nonuniform) algorithm A with access to oracles G, H. We assume
without loss of generality that an adversary makes no particular G-query more than once and no
particular H-query more than once. For simplicity we assume that the number of G-queries and
H-queries that an adversary makes doesn't depend on its coin tosses but only, say, on the length of
its input.
3.2 Semantic security
The following definition will be used to discuss (exact) security. It captures the notion of semantic
security [11] appropriately lifted to take into account the presence of G, H.
We consider an adversary who runs in two stages. In the find-stage it is given an encryption
algorithm E and outputs a pair x0, x1 of messages. It also outputs a string c which could record,
for example, its history and its inputs. Now we pick at random either x0 or x1 (the choice made
according to a bit b) and encrypt it (under E) to get y. In the guess-stage we provide A the output
x0, x1, c of the previous stage, and y, and we ask it to guess b. (We assume wlog that E is included
in c so that we don't need to explicitly provide it again.) Since even the algorithm which always
outputs a fixed bit will be right half of the time, we measure how well A is doing by 1/2 less than
the fraction of time that A correctly predicts b. We call twice this quantity the advantage which
A has in predicting b. Multiplying by two makes the advantage fall in the range [0, 1] (0 for a
worthless prediction and 1 for an always correct one), instead of [0, 0.5].
Definition 3.1  Let G be a generator for an encryption scheme having plaintext-length function
n(·). An adversary A is said to succeed in (t, qgen, qhash, ε)-breaking G(1^k) if
2 · Pr[(E, D) ← G(1^k); G, H ← Ω; (x0, x1, c) ← A^{G,H}(E, find);
b ← {0,1}; y ← E^{G,H}(xb) : A^{G,H}(y, x0, x1, c) = b] - 1 ≥ ε,
and, moreover, in the experiment above, A runs for at most t steps, makes at most qgen queries to
G, and makes at most qhash queries to H.
Note that t is the total running time; i.e., the sum of the times in the two stages. Similarly qgen, qhash
are the total number of G and H queries, respectively.
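The experiment of Definition 3.1 can be transcribed directly as a Monte Carlo estimate of an adversary's advantage. The sketch below is only illustrative; the interfaces gen_keys, encrypt, adversary_find and adversary_guess are hypothetical names standing in for (E, D), the oracles G, H, and the two stages of A.

import random

def estimate_advantage(gen_keys, encrypt, adversary_find, adversary_guess, trials=1000):
    """Monte Carlo estimate of the quantity 2*Pr[A predicts b] - 1 from
    Definition 3.1.  gen_keys() returns (E, D, G, H) for a fresh key pair and
    fresh oracles; encrypt(E, G, H, x) plays the role of y <- E^{G,H}(x)."""
    wins = 0
    for _ in range(trials):
        E, D, G, H = gen_keys()
        x0, x1, c = adversary_find(E, G, H)           # find-stage
        b = random.randrange(2)
        y = encrypt(E, G, H, x1 if b else x0)         # challenge ciphertext
        if adversary_guess(y, x0, x1, c, G, H) == b:  # guess-stage
            wins += 1
    return 2 * wins / trials - 1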
4 The Basic Encryption Scheme
Let F be a trapdoor permutation generator and k0(·) a positive integer valued function such that
k0(k) < k for all k ≥ 1. The basic scheme G with parameters F and k0(·) has an associated
plaintext-length function of n(k) = k - k0(k). On input 1^k, the generator G runs F(1^k) to obtain
(f, f^{-1}). Then it outputs the pair of algorithms (E, D) determined as follows:
(1) On input x of length n = n(k), algorithm E selects a random r of length k0 = k0(k). It sets
s = x ⊕ G(r) and t = r ⊕ H(s). It sets w = s || t and returns y = f(w).
(2) On input y of length k, algorithm D computes w = f^{-1}(y). Then it sets s to the first n bits
of w and t to the last k0 bits of w. It sets r = t ⊕ H(s), and returns the string x = s ⊕ G(r).
The oracles G and H which E and D reference above have input/output lengths of G: {0,1}^{k0} →
{0,1}^n and H: {0,1}^n → {0,1}^{k0}. We use the encoding of f as the encoding of E and the encoding
of f^{-1} as the encoding of D.
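A sketch of steps (1) and (2) in Python, treating bit strings as byte strings; f and f_inv are assumed to be a length-preserving permutation pair on byte strings (e.g., from a generator as in Section 2.3), and G and H are assumed to take the desired output length as a second argument, as in the Section 1.1 sketch. The helper name xor_bytes is ours.

import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

def E_basic(x: bytes, f, G, H, k0_bytes: int) -> bytes:
    """Step (1): s = x xor G(r), t = r xor H(s), w = s || t, output f(w)."""
    r = os.urandom(k0_bytes)
    s = xor_bytes(x, G(r, len(x)))
    t = xor_bytes(r, H(s, k0_bytes))
    return f(s + t)

def D_basic(y: bytes, f_inv, G, H, k0_bytes: int) -> bytes:
    """Step (2): recover w = f^{-1}(y), split it into s || t, undo the masking."""
    w = f_inv(y)
    s, t = w[:-k0_bytes], w[-k0_bytes:]
    r = xor_bytes(t, H(s, k0_bytes))
    return xor_bytes(s, G(r, len(s)))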
The intuition behind the (semantic) security of this scheme is as follows. We wish to guarantee
that the adversary, given a point y in the range of f, must recover the complete preimage w = rx of
y if she is to say anything meaningful about x itself. Well, if the adversary does not recover all of
the first n bits of the preimage, s, then she will have no idea about the value H(s) which is its hash;
a failure to know anything about H(s) implies a failure to know anything about r = H(s) ⊕ t (where
t is the last k0 bits of w), and therefore G(r), and therefore x = G(r) ⊕ s itself. Now, assuming
the adversary does recover s, a failure to completely recover t will again mean that the adversary
fails to completely recover r, and, in the lack of complete knowledge about r, x ⊕ G(r) is uniformly
distributed and so again the adversary can know nothing about x.
Yet the above discussion masks some subtleties and a formal proof of security is more complex
than it might appear. This is particularly the case when one is interested, as we are here, in
achieving the best possible exact security.
The following theorem says that if there is an adversary A who is able to break the encryption
scheme with some success probability, then there is an algorithm M which can invert the underlying
trapdoor permutation with comparable success probability and in comparable time. This implies
that if the trapdoor permutations can't be inverted in reasonable time (which is the implicit assumption) then our scheme is secure. But the theorem says more: it specifies exactly how the
resources and success of M relate to those of A and to the underlying scheme parameters k, n, k0
(k = n + k0).
The inverting algorithm M can be obtained from A in a "uniform" way; the theorem says there
is a "universal" oracle machine U such that M can be implemented by U with oracle access to
A. It is important for practice that the "description" of U is "small;" this is not made explicit in
the theorem but is clear from the proof. The constant λ depends only on details of the underlying
model of computation. We write n, k0 for n(k), k0(k), respectively, when, as below, k is understood.
Theorem 4.1  Let G be the basic encryption scheme with parameters F, k0 and let n be the associated
plaintext length. Then there exists an oracle machine U and a constant λ such that for each integer k
the following is true. Suppose A succeeds in (t, qgen, qhash, ε)-breaking G(1^k). Then M = U^A succeeds
in (t′, ε′)-inverting F(1^k), where
t′ = t + qgen · qhash · (T_F(k) + λk)
ε′ = ε · (1 - qgen·2^{-k0} - qhash·2^{-n}) - qgen·2^{-k+1}.
The proof of Theorem 4.1 is in Appendix A.
For reasonable values of k (e.g., k ≥ 512) it will be the case that k > n ≫ k0. Thus for
reasonable values of qgen, qhash we'll have ε′ ≈ ε · (1 - qgen·2^{-k0}). Thus the success probability ε′
achieved here is good in the sense that it is only slightly less than ε and close to optimal. Note also
that the expression for ε′ indicates that A will do best by favoring G-oracle queries over H-oracle
queries.
The dominant factor in the time t′ taken by the inverting algorithm to compute f^{-1}(y) is the
time to do qgen · qhash computations of the underlying f. An interesting open question is to find a
scheme under which the number of computations of f is linear in qgen + qhash while retaining a value
of ε′ similar to ours.
5 Plaintext-Aware Encryption
We introduce a new notion of an encryption being "plaintext aware." The idea is that an adversary
is "aware" of the decryption of the messages which she encrypts in the sense that she cannot
produce a ciphertext y without "knowing" the corresponding plaintext. In formalizing this we have
relied on definitional ideas which begin with [12, 9, 25]. Our notion requires that some (universal)
algorithm K (the "knowledge extractor") can usually decrypt whatever ciphertext an adversary B
may output, just by watching the G, H-queries which B makes.
Let B be an adversary which given an encryption algorithm E outputs a string y (intuitively,
the ciphertext). The notation (y, τ) ← run B^{G,H}(E) means the following. We run the algorithm
B^{G,H}(E), which outputs y. We record in the process the transcripts of its interaction with its
oracles. Thus there is a list τ_gen which for each G-oracle query g made by B records g and the
answer G(g); similarly for H:
τ_gen = ((g1, G(g1)), ..., (g_{qgen}, G(g_{qgen})))
τ_hash = ((h1, H(h1)), ..., (h_{qhash}, H(h_{qhash}))).
The pair (τ_gen, τ_hash) constitutes τ.
Definition 5.1  Let G be a generator for an encryption scheme and let B be an adversary that
outputs a string. An algorithm K is said to be a (t, ε)-plaintext extractor for B, G(1^k) if
Pr[(E, D) ← G(1^k); G, H ← Ω; (y, τ) ← run B^{G,H}(E) : K(E, y, τ) ≠ D^{G,H}(y)] ≤ ε,
and K runs in at most t steps in the experiment above.
The information we provide K about B is only B's output y and the transcript τ of her oracle
interactions. We could more generally also provide B's coin tosses; we omit to do this only
because the stronger notion we define above is achieved by our scheme.
Note we don't give K oracle access to G, H: it is required to find the plaintext corresponding
to y given only B's "view" of the oracle. The rest is random anyway so it makes no difference.
A complexity-theoretic notion for a plaintext-aware encryption can be easily created out of the
exact definition given above. Also, a definition for the standard (random oracle devoid) model is
easily obtained. But in this case, we would definitely allow K access to B's coin tosses.
As previously mentioned, demanding awareness of a secure encryption scheme is asking a lot.
In the random oracle model, we can show that a plaintext-aware scheme is non-malleable and also
secure against chosen-ciphertext attack. We omit proofs of this, but the intuition is quite clear.
For example, a chosen-ciphertext attack will not help because the adversary already "knows" the
plaintext of any ciphertext y whose decryption she might request from an available decryption box.
6 The Plaintext-Aware Encryption Scheme
Let F be a trapdoor permutation generator. Let k0(·) and k1(·) be positive integer valued functions
such that k0(k) + k1(k) < k for all k ≥ 1. The plaintext-aware scheme G with parameters F, k0, k1
has an associated plaintext-length function of n(k) = k - k0(k) - k1(k). On input 1^k, the generator
G runs F(1^k) to obtain (f, f^{-1}). Then it outputs the pair of algorithms (E, D) determined as
follows:
(1) On input x of length n = n(k), algorithm E selects a random r of length k0 = k0(k). It sets
s = x0^{k1} ⊕ G(r) and t = r ⊕ H(s). It sets w = s || t and returns y = f(w).
(2) On input y of length k, algorithm D computes w = f^{-1}(y). Then it sets s to the first n + k1
bits of w and t to the last k0 bits of w. It sets r = t ⊕ H(s). It sets x to the first n bits of
s ⊕ G(r) and z to the last k1 bits of s ⊕ G(r). If z = 0^{k1} then it returns x, else it returns ⊥.
The oracles G and H which E and D reference above have input/output lengths of G: {0,1}^{k0} →
{0,1}^{n+k1} and H: {0,1}^{n+k1} → {0,1}^{k0}.
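The same sketch as in Section 4, adjusted for the plaintext-aware scheme: the only changes are the k1 zero bytes of redundancy appended before masking and the corresponding check on decryption. Conventions and assumptions are as in the Section 4 sketch; None stands in for the reject symbol.

import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

def E_pa(x: bytes, f, G, H, k0_bytes: int, k1_bytes: int) -> bytes:
    """Step (1): append k1 zero bits to x, then mask exactly as in the basic scheme."""
    r = os.urandom(k0_bytes)
    padded = x + b"\x00" * k1_bytes                    # x 0^{k1}
    s = xor_bytes(padded, G(r, len(padded)))
    t = xor_bytes(r, H(s, k0_bytes))
    return f(s + t)

def D_pa(y: bytes, f_inv, G, H, k0_bytes: int, k1_bytes: int):
    """Step (2): unmask, then accept only if the k1 redundancy bits are all zero."""
    w = f_inv(y)
    s, t = w[:-k0_bytes], w[-k0_bytes:]
    r = xor_bytes(t, H(s, k0_bytes))
    unmasked = xor_bytes(s, G(r, len(s)))
    x, z = unmasked[:-k1_bytes], unmasked[-k1_bytes:]
    return x if z == b"\x00" * k1_bytes else None      # None plays the role of the reject symbol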
The semantic security of this scheme as given by the following theorem is a consequence of
Theorem 4.1.
Theorem 6.1  Let G be the plaintext-aware encryption scheme with parameters F, k0, k1 and let n
be the associated plaintext length. Then there exists an oracle machine U and a constant λ such that
for each integer k the following is true. Suppose A succeeds in (t, qgen, qhash, ε)-breaking G(1^k). Then
M = U^A succeeds in (t′, ε′)-inverting F(1^k), where
t′ = t + qgen · qhash · (T_F(k) + λk)
ε′ = ε · (1 - qgen·2^{-k0} - qhash·2^{-n-k1}) - qgen·2^{-k+1}.
Proof: Let G′ be the generator for the basic scheme with parameters F and k0; the associated
plaintext-length function is n′(k) = k - k0(k) = n(k) + k1(k). Let A′ be the adversary for G′ who (i)
in the find-stage runs A to get (x0, x1, c) and outputs (x0 0^{k1}, x1 0^{k1}, c); and (ii) in the guess-stage
removes the padded zeroes from the messages and runs A. Now apply Theorem 4.1 to A′.
The intuition for the plaintext awareness of our encryption scheme can be described as follows. Let
y be the string output by B. If she hasn't asked G(r), then almost certainly the first n + k1 bits
of the preimage of y won't end with the right substring 0^{k1}; and if she hasn't asked H(s), then she
can't know r; but if the adversary does know s, then certainly she knows its first n bits, which is x.
To discuss exact security it is convenient to say that adversary B(·) is a (t, qgen, qhash)-adversary
for G(1^k) if for all (E, D) ∈ [G(1^k)], B(E) runs in at most t steps, makes qgen G-queries and makes
qhash H-queries.
Theorem 6.2  Let G be the plaintext-aware encryption scheme with parameters F, k0, k1 and let n be
the associated plaintext length. Then there exists an oracle machine U and a constant λ such that for
each integer k the following is true. Suppose B is a (t, qgen, qhash)-adversary for G(1^k). Then K = U^B
is a (t′, ε′)-plaintext extractor for B, G(1^k), where
t′ = t + qgen · qhash · (T_F(k) + λk)
ε′ = qgen·2^{-k0} + 2^{-k1}.
As before, one interesting open question is to devise a scheme with t′ linear in qgen + qhash rather
than quadratic. Another nice open question is whether one can achieve plaintext-aware encryption
in the standard (random oracle devoid) model given a standard complexity-theoretic assumption.
7 Sample RSA-Based Instantiation
We provide here a concrete instantiation of our plaintext-aware encryption scheme (omitting only
certain minor details). We use RSA as the trapdoor permutation and construct the functions G, H
out of the (revised) NIST Secure Hash Algorithm [18]. (Other hash algorithms such as MD5 [20]
would do as well.)
Let f be the RSA function [21], so f(x) = x^e mod N is specified by (e, N) where N is the
k-bit product of two large primes and gcd(e, φ(N)) = 1. We demand k ≥ 512 bits (larger values are
recommended). Our scheme will allow the encryption of any string msg whose length is at most
k - 320 bits (thus the minimal permitted security parameter allows 192 bits (e.g., three 64-bit keys)
to be encrypted). Let D = {1 ≤ i < N : gcd(i, N) = 1} ⊆ {0,1}^k be the set of valid domain points
for f.
Our probabilistic encryption scheme depends on the message msg to encrypt, an arbitrary-length
string rand coins, the security parameter k, the function f, and a predicate inD(x) which
should return true if and only if x ∈ D. Our scheme further uses a 32-bit string key data (whose
use we do not specify here), and a string desc which provides a complete description of the function
f (i.e., it says "This is RSA using N and e") encoded according to conventions not specified here.
We denote by SHA_σ(x) the 160-bit result of SHA (Secure Hash Algorithm) applied to x, except
that the 160-bit "starting value" in the algorithm description is taken to be ABCDE = σ. Let
SHA^ℓ_σ(x) denote the first ℓ bits of SHA_σ(x). Fix the notation ⟨i⟩ for i encoded as a binary 32-bit
word. We define the function H^σ_ℓ(x), for string x, number ℓ, and 160-bit σ, to be the ℓ-bit prefix of
SHA^{80}_σ(⟨0⟩.x) || SHA^{80}_σ(⟨1⟩.x) || SHA^{80}_σ(⟨2⟩.x) || ···
Let K0 be a fixed, randomly-chosen 160-bit string (which we do not specify here).
Our scheme is depicted in Figure 1. Basically, we augment the string msg which we want
to encrypt by tacking on a word to indicate its length; including k1 = 128 bits of redundancy;
incorporating a 32-bit field key data whose use we do not specify; and adding enough additional
padding to fill out the length of the string we have made to k - 128 bits. The resulting string x now
plays the same role as the x of our basic scheme, and a separate 128-bit r is then used to encrypt
it.
We comment that in the concrete scheme shown in Figure 1 we have elected to make our
generator and hash function sensitive both to our scheme itself (via K0) and to the particular
Encrypt(msg, rand coins)
    σ0 ← SHA_{K0}(desc)
    σ1 ← SHA_{σ0}(⟨1⟩);  σ2 ← SHA_{σ0}(⟨2⟩);  σ3 ← SHA_{σ0}(⟨3⟩)
    i ← 0
    repeat
        r ← H^{σ1}_{128}(⟨i⟩ || rand coins)
        x ← key data || ⟨|msg|⟩ || 0^{128} || 0^{k-320-|msg|} || msg
        x ← x ⊕ H^{σ2}_{|x|}(r)
        r ← r ⊕ H^{σ3}_{128}(x)
        rx ← x || r
        i ← i + 1
    until inD(rx)
    return f(rx)
Figure 1: A sample instantiation of the plaintext-aware encryption scheme.
function f (via desc). Such "key separation" is a generally-useful heuristic to help ensure that,
when the same key is used in multiple (separately-secure) algorithms, the internals of these
algorithms do not interact in such a way as to jointly compromise security. The use of "key variants"
σ1, σ2 and σ3 is motivated similarly. Our choice to only use half the bits of SHA has to do with a
general "deficiency" in the use of SHA-like hash functions to instantiate random oracles; see [2] for
a discussion.
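A loose Python rendering of the outline in Figure 1, under several stated liberties: the keyed variant SHA_σ requires setting SHA's internal starting value, which standard libraries do not expose, so the stand-in H_sigma below simply prepends σ to a SHA-256 counter-mode expansion (truncated to 80-bit pieces, echoing the half-output choice above); K0, DESC and key_data are placeholder values; and f and inD must be supplied by the caller.

import hashlib

K0   = b"\x00" * 20                        # placeholder for the fixed 160-bit string K0
DESC = b"This is RSA using N and e"        # placeholder for desc

def H_sigma(sigma: bytes, ell_bytes: int, x: bytes) -> bytes:
    """Stand-in for H^sigma_ell: counter-mode expansion 'keyed' by sigma.
    The paper keys SHA through its chaining value; lacking that access,
    we simply prepend sigma, and keep 80-bit (10-byte) pieces as above."""
    out, i = b"", 0
    while len(out) < ell_bytes:
        out += hashlib.sha256(sigma + i.to_bytes(4, "big") + x).digest()[:10]
        i += 1
    return out[:ell_bytes]

def encrypt(msg: bytes, rand_coins: bytes, f, inD, k_bytes: int,
            key_data: bytes = b"\x00" * 4) -> bytes:
    sigma0 = H_sigma(K0, 20, DESC)
    sigma1, sigma2, sigma3 = (H_sigma(sigma0, 20, j.to_bytes(4, "big")) for j in (1, 2, 3))
    pad = k_bytes - 40 - len(msg)                                  # 0^{k-320-|msg|}, in bytes
    assert pad >= 0, "msg may be at most k - 320 bits long"
    i = 0
    while True:
        r = H_sigma(sigma1, 16, i.to_bytes(4, "big") + rand_coins)     # 128-bit r
        x = key_data + len(msg).to_bytes(4, "big") + b"\x00" * 16 + b"\x00" * pad + msg
        x = bytes(a ^ b for a, b in zip(x, H_sigma(sigma2, len(x), r)))
        r = bytes(a ^ b for a, b in zip(r, H_sigma(sigma3, 16, x)))
        rx = x + r                                                     # k bytes in total
        i += 1
        if inD(rx):
            return f(rx)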
Acknowledgments
We thank Don Johnson for an early discussion on this problem, where he described the method
of [15]. We thank Silvio Micali for encouraging us to find and present the exact security of our constructions. Thanks also to (anonymous) Eurocrypt reviewers for their comments and corrections.
This work was carried out while the first author was at the IBM T. J. Watson Research Center,
New York, and the second author worked for IBM in Austin, Texas (System Design, Jamil Bissar,
PSP LAN Systems).
References
[1] M. Bellare, J. Kilian and P. Rogaway, "On the security of cipher-block chaining," Advances in Cryptology – Crypto 94 Proceedings, Lecture Notes in Computer Science Vol. 839, Y. Desmedt ed., Springer-Verlag, 1994.
[2] M. Bellare and P. Rogaway, "Random oracles are practical: a paradigm for designing efficient protocols," Proceedings of the First Annual Conference on Computer and Communications Security, ACM, 1993.
[3] L. Blum, M. Blum and M. Shub, "A Simple Unpredictable Pseudo-Random Number Generator," SIAM Journal on Computing 15(2), 364-383, May 1986.
[4] M. Blum and S. Goldwasser, "An efficient probabilistic public-key encryption scheme which hides all partial information," Advances in Cryptology – Crypto 84 Proceedings, Lecture Notes in Computer Science Vol. 196, R. Blakely ed., Springer-Verlag, 1984.
[5] M. Blum and S. Micali, "How to generate cryptographically strong sequences of pseudo-random bits," SIAM Journal on Computing 13(4), 850-864, November 1984.
[6] I. Damgård, "Towards practical public key cryptosystems secure against chosen ciphertext attacks," Advances in Cryptology – Crypto 91 Proceedings, Lecture Notes in Computer Science Vol. 576, J. Feigenbaum ed., Springer-Verlag, 1991.
[7] D. Dolev, C. Dwork and M. Naor, "Non-malleable cryptography," Proceedings of the 23rd Annual Symposium on Theory of Computing, ACM, 1991.
[8] S. Even, O. Goldreich and S. Micali, "On-line/off-line digital signatures," Manuscript. Preliminary version in Advances in Cryptology – Crypto 89 Proceedings, Lecture Notes in Computer Science Vol. 435, G. Brassard ed., Springer-Verlag, 1989.
[9] U. Feige, A. Fiat and A. Shamir, "Zero knowledge proofs of identity," Journal of Cryptology, Vol. 1, pp. 77-94, 1987.
[10] O. Goldreich and L. Levin, "A hard predicate for all one-way functions," Proceedings of the 21st Annual Symposium on Theory of Computing, ACM, 1989.
[11] S. Goldwasser and S. Micali, "Probabilistic Encryption," Journal of Computer and System Sciences 28, 270-299, April 1984.
[12] S. Goldwasser, S. Micali and C. Rackoff, "The knowledge complexity of interactive proof systems," SIAM Journal on Computing, Vol. 18, No. 1, 186-208, February 1989.
[13] S. Goldwasser, S. Micali and R. Rivest, "A digital signature scheme secure against adaptive chosen-message attacks," SIAM Journal on Computing, 17(2):281-308, April 1988.
[14] R. Impagliazzo, L. Levin and M. Luby, "Pseudo-random generation from one-way functions," Proceedings of the 21st Annual Symposium on Theory of Computing, ACM, 1989.
[15] D. Johnson, A. Lee, W. Martin, S. Matyas and J. Wilkins, "Hybrid key distribution scheme giving key record recovery," IBM Technical Disclosure Bulletin, 37(2A), 5-16, February 1994.
[16] T. Leighton and S. Micali, "Provably fast and secure digital signature algorithms based on secure hash functions," Manuscript, March 1993.
[17] M. Naor and M. Yung, "Public-key cryptosystems provably secure against chosen ciphertext attacks," Proceedings of the 22nd Annual Symposium on Theory of Computing, ACM, 1990.
[18] National Institute of Standards, FIPS Publication 180, "Secure Hash Standard," 1993.
[19] M. Rabin, "Digitalized signatures and public-key functions as intractable as factorization," MIT Laboratory for Computer Science TR-212, January 1979.
[20] R. Rivest, "The MD5 message-digest algorithm," IETF Network Working Group, RFC 1321, April 1992.
[21] R. Rivest, A. Shamir and L. Adleman, "A method for obtaining digital signatures and public key cryptosystems," CACM 21 (1978).
[22] RSA Data Security, Inc., "PKCS #1: RSA Encryption Standard," June 1991.
[23] C. Schnorr, "Efficient identification and signatures for smart cards," Advances in Cryptology – Crypto 89 Proceedings, Lecture Notes in Computer Science Vol. 435, G. Brassard ed., Springer-Verlag, 1989.
[24] A. Schrift and A. Shamir, "The discrete log is very discreet," Proceedings of the 22nd Annual Symposium on Theory of Computing, ACM, 1990.
[25] M. Tompa and H. Woll, "Random self-reducibility and zero-knowledge interactive proofs of possession of information," UCSD TR CS92-244, 1992.
[26] A. Yao, "Theory and applications of trapdoor functions," Proceedings of the 23rd Symposium on Foundations of Computer Science, IEEE, 1982.
[27] Y. Zheng and J. Seberry, "Practical approaches to attaining security against adaptively chosen ciphertext attacks," Advances in Cryptology – Crypto 92 Proceedings, Lecture Notes in Computer Science Vol. 740, E. Brickell ed., Springer-Verlag, 1992.
A Proof of Theorem 4.1
We first define the behavior of inverting algorithm M. M is given (an encoding of) a function
f: {0,1}^k → {0,1}^k and a string y ∈ {0,1}^k. It is trying to find w = f^{-1}(y).
(1) M begins by constructing E from f as specified by our basic scheme. It then initializes two
lists, called its G-list and its H-list, to empty. It picks a bit b ← {0,1} at random. Then it
simulates the two stages of A as indicated in the next two steps.
(2) M simulates the find-stage of A by running A on input (E, find). M provides A with fair
random coins and simulates A's random oracles G and H as follows. When A makes an oracle
call h of H, machine M provides A with a random string H_h of length k0, and adds h to the
H-list. Similarly when A makes an oracle call g of G, machine M provides A with a random
string G_g of length n and adds g to the G-list. Let (x0, x1, c) be the output with which A halts.
(3) Now M starts simulating the guess-stage of A. It runs A on input (y, x0, x1, c). It responds to
oracle queries as follows.
(3.1) Suppose A makes H-query h. M provides A with a random string H_h of length k0 and
adds h to the H-list. Then for each g on the G-list M constructs w_{h,g} = h || (g ⊕ H_h) and
computes y_{h,g} = f(w_{h,g}). If there is some h, g such that y_{h,g} = y then M sets w = w_{h,g}.
(3.2) Suppose A makes G-query g. Then for each h on the H-list M constructs the string
w_{h,g} = h || (g ⊕ H_h) and computes y_{h,g} = f(w_{h,g}).
(3.2.1) If there are h, g such that y_{h,g} = y then M sets w = w_{h,g}. It sets G_g = h ⊕ xb,
adds g to the G-list, and returns G_g to A.
(3.2.2) Else (i.e., there are no h, g such that y_{h,g} = y) M provides A with a random string
G_g of length n and adds g to the G-list.
The output of M is w if this string was defined in the above experiment, and fail otherwise. Note
that the H-list and G-list include the queries of both the find and guess stages of A's execution.
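The bookkeeping of steps (3.1) and (3.2) can be pictured as follows; this sketch (our own, with invented names) covers only the guess-stage oracle simulation, treats bit strings as byte strings, and abstracts A away: h plays the role of s and g the role of r.

import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

class InverterSim:
    """Guess-stage oracle simulation used by M (steps (3.1) and (3.2)).
    A candidate preimage of y has the form w = h || (g xor H_h)."""
    def __init__(self, f, y, xb, n_bytes, k0_bytes):
        self.f, self.y, self.xb = f, y, xb
        self.n_bytes, self.k0_bytes = n_bytes, k0_bytes
        self.G_list, self.H_list = {}, {}
        self.w = None                              # becomes f^{-1}(y) if a pair (h, g) explains y

    def H_query(self, h):                          # step (3.1)
        self.H_list[h] = os.urandom(self.k0_bytes)
        for g in self.G_list:
            cand = h + xor_bytes(g, self.H_list[h])
            if self.f(cand) == self.y:
                self.w = cand
        return self.H_list[h]

    def G_query(self, g):                          # step (3.2)
        answer = None
        for h in self.H_list:
            cand = h + xor_bytes(g, self.H_list[h])
            if self.f(cand) == self.y:
                self.w = cand
                answer = xor_bytes(h, self.xb)     # step (3.2.1): G_g = h xor xb
        if answer is None:
            answer = os.urandom(self.n_bytes)      # step (3.2.2): fresh random answer
        self.G_list[g] = answer
        return answer

    def output(self):
        return self.w                              # None plays the role of "fail"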
It is easy to verify that the amount of time t′ to carry out Game 1 is as claimed. It is also easy to
verify that there is a universal machine U such that the computation of M can be done by U^A.
We note that as soon as M successfully finds a point w = f^{-1}(y), it could stop and output w.
Not only do we have it go on, but some variables and actions (such as the usage of the bit b in
Step (3.2.1)) come into play only after w is found. These "unnecessary" actions do not affect
the success probability of M but we put them in to simplify our exposition of the analysis of M's
success probability. The intuition is that A in the above experiment is trying to predict b and M is
trying to make the distribution provided to A look like that which A would expect were A running
under the experiment which defines A's success in breaking the encryption scheme. Unfortunately,
M does not provide A with a simulation which is quite perfect. Let us now proceed to the analysis.
We consider the probability space given by the above experiment. The inputs f, y to M are drawn
at random according to (f, f^{-1}) ← F(1^k); y ← {0,1}^k. We call this "Game 1" and we let Pr1[·]
denote the corresponding probability.
Let w = f^{-1}(y) and write it as w = s || t where |s| = n and |t| = k0. Let r be the random variable
t ⊕ H(s). We consider the following events.
FBAD is true if:
G-oracle query r was made in the find-stage, and
G_r ∉ {s ⊕ x0, s ⊕ x1}.
GBAD is true if:
G-oracle query r was made in the guess-stage, and
at the point in time that it was made, the H-oracle query s was not on the H-list, and
G_r ∉ {s ⊕ x0, s ⊕ x1}.
G = ¬FBAD ∧ ¬GBAD.
We let Pr2[·] = Pr1[· | G] denote the probability distribution, in Game 1, conditioned on G being
true, and call this "Game 2."
Now consider the experiment which defines the advantage of A. Namely, first choose (f*, f*^{-1}) ←
F(1^k) and let E* be the corresponding encryption function under the basic scheme. Then choose
G*, H* ← Ω;  (x0, x1, c) ← A^{G*,H*}(E*, find);  b ← {0,1};  y* ← E*^{G*,H*}(xb),
and run A^{G*,H*}(y*, x0, x1, c). Let Pr1*[·] be the corresponding distribution and Game 1* the game.
Now consider playing Game 1* a little bit differently. As before, choose (f*, f*^{-1}) ← F(1^k) and let
E* be the corresponding encryption function. But now choose y* ← {0,1}^k uniformly at random
first, and then select the rest according to the distribution which makes the outcome the same as
in Game 1*. (This is possible because the distribution on y*-values in Game 1* is indeed uniform.)
We let Game 2* be this different way of playing Game 1*.
We claim that Game 2 and Game 2* are identical in the sense that the view of A at any point in
these two games is the same. Indeed we have chosen the event G so that the oracle queries we are
returning in Game 1 will mimic Game 2* as long as G remains true.
We omit details to formally justify these claims, but a good way to get some intuition is to assume
for simplicity that the find-stage is trivial and A always outputs the same strings x0, x1, c. Now
if y* is fixed then the conditional distribution on G*, H* can be described as follows. Write
f*^{-1}(y*) = s* || t* with |s*| = n and |t*| = k0. Pick H* at random; pick G*(g) to be random whenever
g ≠ t* ⊕ H*(s*). But G*(t* ⊕ H*(s*)) must be constrained to be either s* ⊕ x0 or s* ⊕ x1, the choice
of which being at random.
To proceed further with our analysis (of Game 1), let us introduce the following additional events:
FAskS is true if H-oracle query s was made in the find-stage.
AskR is true if, at the end of the guess-stage, r is on the G-list.
AskS is true if, at the end of the guess-stage, s is on the H-list.
W = AskR ∧ AskS.
The first step is to show that the probability that the good event fails is low.
Lemma A.1  The probability that the good event fails is upper bounded by
Pr1[¬G] ≤ qgen·2^{-k0} + qhash·2^{-n}.
Proof: The intuition is that as long as H-query s has not been made, each G-query has probability
only 2^{-k0} of being r. Now, ¬G = FBAD ∨ GBAD. In GBAD is already included the fact that no
H-query of s has been made before the G-query r. But in FBAD it could be that H-query s was
made. But the probability of FAskS is small since s || t = f^{-1}(y) is determined at random after
the find-stage. The proof that follows captures all this by conditioning on FAskS. We have:
Pr1[¬G] = Pr1[¬G | FAskS]·Pr1[FAskS] + Pr1[¬G | ¬FAskS]·Pr1[¬FAskS]
        ≤ Pr1[FAskS] + Pr1[¬G | ¬FAskS]
        ≤ Pr1[FAskS] + Pr1[AskR | ¬FAskS].
The random choice of y implies that Pr1[FAskS] ≤ qhash·2^{-n} while, on the other hand, we have
Pr1[AskR | ¬FAskS] ≤ qgen·2^{-k0}.
We think of A in Game 1 as trying to predict b. With this in mind, let "A = b" denote the event
that A is successful in predicting bit b. We analyze this probability to show that in Game 2 either
W is true or A has little advantage in predicting b. Notice that if W is true then M successfully
finds w = f^{-1}(y). Following this we will use the equivalence with Game 2* to relate this to ε, and
finally we will use Lemma A.1 to get a conclusion for Game 1.
Recall that k = k0 + n is the "security parameter" of the original trapdoor permutation.
Lemma A.2  The winning probability in Game 2 is bounded below by:
Pr2[W] ≥ 2·Pr2[A = b] - 1 - 2·qgen·2^{-k} / Pr1[G].
Proof: We upper bound Pr2[A = b] by:
Pr2[A = b] = Pr2[A = b | W]·Pr2[W] + Pr2[A = b | ¬AskR]·Pr2[¬AskR]
              + Pr2[A = b | AskR ∧ ¬AskS]·Pr2[AskR ∧ ¬AskS]
           ≤ Pr2[W] + Pr2[A = b | ¬AskR]·Pr2[¬AskR] + Pr2[AskR ∧ ¬AskS]
           = Pr2[W] + Pr2[A = b | ¬AskR]·(1 - Pr2[W] - Pr2[AskR ∧ ¬AskS])
              + Pr2[AskR ∧ ¬AskS].                                          (1)
Now observe that if ¬AskR then A has no advantage in predicting b:
Pr2[A = b | ¬AskR] ≤ 1/2.                                                   (2)
In order to upper bound Pr2[AskR ∧ ¬AskS], let RBS be the event that r is on the G-list and at
the time it was put there, s was not on the H-list. Recall that k = k0 + n. One can check that:
Pr1[AskR ∧ ¬AskS ∧ G] = Pr1[RBS ∧ G_r ∈ {s ⊕ x0, s ⊕ x1}]
                      = Pr1[RBS] · Pr1[G_r ∈ {s ⊕ x0, s ⊕ x1} | RBS]
                      ≤ qgen·2^{-k0} · 2·2^{-n}
                      = 2·qgen·2^{-k}.                                       (3)
Using (3) we have
Pr2[AskR ∧ ¬AskS] = Pr1[AskR ∧ ¬AskS ∧ G] / Pr1[G] ≤ 2·qgen·2^{-k} / Pr1[G].  (4)
Now put the bounds provided by (2) and (4) into (1) to get
Pr2[A = b] ≤ (1/2)·Pr2[W] + 1/2 + qgen·2^{-k} / Pr1[G]
and we may conclude the lemma.
The equivalence of Game 2 and Game 2* implies Pr2[A = b] ≥ ε/2 + 1/2, so that from Lemma A.2
(making the conditioning on G explicit) we get
Pr1[W | G] ≥ ε - 2·qgen·2^{-k} / Pr1[G].                                      (5)
Using (5) and Lemma A.1 we get
Pr1[W] ≥ Pr1[W | G] · Pr1[G]
       ≥ (ε - 2·qgen·2^{-k} / Pr1[G]) · Pr1[G]
       = ε·Pr1[G] - 2·qgen·2^{-k}
       ≥ ε·(1 - qgen·2^{-k0} - qhash·2^{-n}) - 2·qgen·2^{-k}.
However, as we remarked earlier, M succeeds in inverting whenever W occurs, so its success probability is at least Pr1[W] ≥ ε′, and the proof is concluded.
B Proof of Theorem 6.2
We define the plaintext extractor K. Let (f, f^{-1}) ∈ [F(1^k)] and let E be the corresponding
encryption function as constructed by our plaintext-aware scheme. Let τ = (τ_gen, τ_hash) where
τ_gen = ((r1, G1), ..., (r_{qgen}, G_{qgen}))
τ_hash = ((s1, H1), ..., (s_{qhash}, H_{qhash})).
We call r1, ..., r_{qgen} the G-list and s1, ..., s_{qhash} the H-list. The inputs to K are E, y, τ. It proceeds
as follows.
(1) For i = 1, ..., qhash and j = 1, ..., qgen machine K
(1.1) sets x_{i,j} to the first n bits of s_i ⊕ G_j and z_{i,j} to the remaining k1 bits of s_i ⊕ G_j;
(1.2) sets w_{i,j} = s_i || (r_j ⊕ H_i) and computes y_{i,j} = f(w_{i,j}).
(2) If there is an i, j such that y_{i,j} = y and z_{i,j} = 0^{k1} then K outputs x_{i,j}; else it outputs ⊥.
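Steps (1) and (2) amount to a double loop over the two transcripts; a sketch (with names of our choosing, and byte strings standing in for bit strings):

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

def extract(f, y, tau_gen, tau_hash, k1_bytes: int):
    """Sketch of K: tau_gen is the list of pairs (r, G(r)) and tau_hash the
    list of pairs (s, H(s)) recorded from B's oracle queries.  Returns the
    plaintext, or None in place of the reject symbol."""
    for s, Hs in tau_hash:
        for r, Gr in tau_gen:
            unmasked = xor_bytes(s, Gr)                 # candidate x || z
            x, z = unmasked[:-k1_bytes], unmasked[-k1_bytes:]
            w = s + xor_bytes(r, Hs)                    # candidate preimage s || t
            if f(w) == y and z == b"\x00" * k1_bytes:   # step (2)
                return x
    return None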
For the analysis let w = f^{-1}(y) and write it as w = s || t where |s| = n + k1 and |t| = k0. Let r be
the random variable t ⊕ H(s). Let x, z be the random variables defined by writing s ⊕ G(r) = x || z
where |x| = n and |z| = k1. We consider the following events.
FAIL is true if the output of K is different from D^{G,H}(y).
AskR is true if r is on the G-list.
AskS is true if s is on the H-list.
We now bound the failure probability.
Pr[FAIL] = Pr[FAIL | ¬AskR]·Pr[¬AskR]
            + Pr[FAIL | AskR ∧ AskS]·Pr[AskR ∧ AskS]
            + Pr[FAIL | AskR ∧ ¬AskS]·Pr[AskR ∧ ¬AskS]
         ≤ Pr[FAIL | ¬AskR] + Pr[FAIL | AskR ∧ AskS] + Pr[AskR ∧ ¬AskS].
If r is not on the G-list then the probability that z = 0^{k1} is at most 2^{-k1}, so that in this case an
output of ⊥ is success. Thus Pr[FAIL | ¬AskR] ≤ 2^{-k1}.
If r is on the G-list and s is on the H-list then there are i, j such that w = w_{i,j}. So K will decrypt
correctly. That is, Pr[FAIL | AskR ∧ AskS] = 0.
If s is not on the H-list then H(s) is uniformly distributed and hence so is r. So
Pr[AskR ∧ ¬AskS] ≤ Pr[AskR | ¬AskS] ≤ qgen·2^{-k0}.
This concludes the proof.