2015-01-21 Multiplicative inverses modulo a power of two (Permalink: https://mumble.net/~campbell/2015/01/21/inverse-mod-power-of-two) I couldn't find any good citation on the web for how to compute the multiplicative inverse of an odd integer a, modulo a power of two, n = 2^w, which is necessary preparation for Montgomery reduction. The obvious way to compute modular inverses is to use the extended Euclidean algorithm, which, given a and n, computes integers s and t satisfying Bézout's identity a s + n t = gcd(a, n) = 1, so that s is a multiplicative inverse of a modulo n, and t is a multiplicative inverse of n modulo a. (define (euclid a b) (define (loop s s* t t* r r*) ;; a s + b t - r = 0 (if (zero? r*) (values s t r) (let ((q (quotient r r*)) (r** (remainder r r*))) (loop s* (- s (* q s*)) t* (- t (* q t*)) r* r**)))) (if (< a b) (loop 0 1 1 0 b a) (loop 1 0 0 1 a b))) ;; Bezout's identity: (let ((a 3) (b (expt 2 32))) (receive (s t r) (euclid a b) (- (+ (* a s) (* b t)) r))) ;Value: 0 ;; Modular inverse: (receive (s t r) (euclid 3 (expt 2 32)) (modulo s (expt 2 32))) ;Value: 2863311531 (modulo (* 3 2863311531) (expt 2 32)) ;Value: 1 This requires division, and in particular division of n, which, if w is our word size so n cannot itself be represented in a word, is not convenient at the machine level -- e.g., if we were writing code in standard C without bignums. The second obvious way -- if you are keen on number theory -- is to use Euler's theorem, that a^phi(n) = 1 (mod n), or a*a^(phi(n) - 1) = 1 (mod n), so that a^(phi(n) - 1) is a multiplicative inverse of a modulo n, where phi(n) is Euler's totient function, giving the number of positive integers below n and coprime with it. In particular, for any prime q, of the q^w - 1 positive integers below q^w, the only ones sharing factors in common with q^w are the multiples of q, of which there are q^(w - 1) - 1, so phi(q^w) = q^w - 1 - (q^(w - 1) - 1) = q^w - q^(w - 1). In our case, phi(2^w) = 2^w - 2^(w - 1) = 2^(w - 1). Using a standard square-and-multiply algorithm for modular exponentiation quickly computes the inverse: (define (modexp a e n) (if (zero? e) 1 (let loop ((r 1) (b (modulo a n)) (f e)) ;; r b^f = a^e (mod n), f > 0 (let ((r (if (even? f) r (modulo (* r b) n))) (f (quotient f 2))) (if (zero? f) r (loop r (modulo (* b b) n) f)))))) (modexp 3 (- (expt 2 31) 1) (expt 2 32)) ;Value: 2863311531 This general modular exponentiation requires 2 w - 1 w-bit multiplications. If our modulus is an even power of two, i.e. the square of the power of two, we can do a little better. Say n = m^2 for some m and we can find an inverse of a modulo m, a b = 1 (mod m), or a b - 1 = k m for some k. We would like an inverse b' of a modulo n = m^2, so that a b' - 1 = k' m^2 for some k'. Squaring both sides of a b - 1 = k m in an attempt to find a multiple of m^2 gives k^2 m^2 = (a b - 1)^2 = a^2 b^2 - 2 a b + 1 = a b (a b - 2) + 1, or, if we negate both sides, a b (2 - a b) - 1 = -k^2 m^2. Hence we have found an inverse b' = b (2 - a b) of a modulo m^2 = n, with k' = -k^2. A little more generally, if a b = 1 (mod m), then for any power c of m, we can use the binomial expansion (a b - 1)^c = a b (...) + (-1)^c to find b' = (-1)^(c + 1) b (...) so that a b' = 1 (mod m^c). Similarly, for any v and u coprime with a (and not necessarily distinct), if a b = 1 (mod u) and a c = 1 (mod v), so that a b - 1 = i u and a c - 1 = j v for some i and j, then taking their product yields i j u v = (a b - 1) (a c - 1) = a^2 b c - a b - a c + 1 = a (a b c - b - c) + 1, or a (b + c - a b c) - 1 = -i j u v, so that b + c - a b c is an inverse of a modulo u v. If n is a power of two power of two -- that is, 2^(2^i) for some i: 2, 4, 16, 256, 65536, &c. --, as is likely to be the case for machine word sizes in which to do Montgomery reduction, this lends itself to a faster algorithm by starting at a, which, being odd, is its own inverse modulo 2: (define (modinv2^2^i a i) (let loop ((r 0) (b a)) ;; a b = 1 (mod 2^(2^r)) (if (< r i) (loop (+ r 1) (* b (- 2 (* a b)))) b))) (modulo (modinv2^2^i 3 5) (expt 2 32)) ;Value: 2863311531 This approach uses only 2 lg w w-bit multiplications and lg w w-bit additions. (The Scheme code does not reduce the intermediate quantities, but all arithmetic can be done modulo n -- or even modulo 2^(2^(r + 1)) <= n.) For Montgomery reduction, which needs a b = -1 (mod n), we can either negate the result, or compute it directly with the modified step b' = b (2 + a b). The above source code is also available at . [Update, 2015-02-22: I found , which provides algorithms for computing a mod p^w for more general p and w.] -- Copyright (c) 2006--2015, Taylor R. Campbell. Verbatim copying and distribution of this entire article are permitted worldwide, without royalty, in any medium, provided this notice, and the copyright notice, are preserved.