
Why is pow(a, d, n) so much faster than a ** d % n?

optionbox 2020. 8. 7. 08:20



I was trying to implement a Miller-Rabin primality test and was puzzled by why it was taking so long (> 20 seconds) for medium-size numbers (~7 digits). I eventually found the following line of code to be the source of the problem:

x = a**d % n

(where a, d, and n are all similar, but unequal, medium-size numbers, ** is the exponentiation operator, and % is the modulo operator)

I then tried replacing it with the following:

x = pow(a, d, n)

By comparison, it is almost instantaneous.

For context, here is the original function:

from random import randint

def primalityTest(n, k):
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    # write n - 1 as 2**s * d with d odd
    s = 0
    d = n - 1
    while d % 2 == 0:
        s += 1
        d >>= 1
    for i in range(k):
        rand = randint(2, n - 2)
        x = rand**d % n         # offending line
        if x == 1 or x == n - 1:
            continue
        toReturn = True
        for r in range(s):
            x = pow(x, 2, n)
            if x == 1:
                return False
            if x == n - 1:
                toReturn = False
                break
        if toReturn:
            return False
    return True

print(primalityTest(2700643, 1))

An example of timing the calculation:

from timeit import timeit

a = 2505626
d = 1520321
n = 2700643

def testA():
    print(a**d % n)

def testB():
    print(pow(a, d, n))

print("time: %(time)fs" % {"time":timeit("testA()", setup="from __main__ import testA", number=1)})
print("time: %(time)fs" % {"time":timeit("testB()", setup="from __main__ import testB", number=1)})

Output (run with PyPy 1.9.0):

2642565
time: 23.785543s
2642565
time: 0.000030s

Output (run with Python 3.3.0; running with 2.7.2 returns very similar times):

2642565
time: 14.426975s
2642565
time: 0.000021s

And a related question, why is this calculation almost twice as fast when run with Python 2 or 3 than with PyPy, when usually PyPy is much faster?


See the Wikipedia article on modular exponentiation. Basically, when you do a**d % n, you actually have to calculate a**d, which could be quite large. But there are ways of computing a**d % n without having to compute a**d itself, and that is what pow does. The ** operator can't do this because it can't "see into the future" to know that you are going to immediately take the modulus.
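To make this concrete, here is a minimal square-and-multiply sketch of the kind of algorithm three-argument pow uses internally (the function name mod_pow is ours for illustration; CPython's actual implementation is in C and more sophisticated):

```python
def mod_pow(base, exp, mod):
    # Right-to-left binary (square-and-multiply) modular exponentiation.
    # Every intermediate value is reduced mod `mod`, so nothing larger
    # than mod**2 is ever created -- unlike a**d, which can be enormous.
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:                      # this bit of the exponent is set
            result = (result * base) % mod
        base = (base * base) % mod       # square for the next bit
        exp >>= 1
    return result

print(mod_pow(2505626, 1520321, 2700643))  # same answer as pow(2505626, 1520321, 2700643)
```

The loop runs only about log2(d) times, and each multiplication involves numbers smaller than mod**2, which is why this stays fast even for large exponents.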


BrenBarn answered your main question. For your aside:

why is it almost twice as fast when run with Python 2 or 3 than PyPy, when usually PyPy is much faster?

If you read PyPy's performance page, this is exactly the kind of thing PyPy is not good at—in fact, the very first example they give:

Bad examples include doing computations with large longs – which is performed by unoptimizable support code.

Theoretically, turning a huge exponentiation followed by a mod into a modular exponentiation (at least after the first pass) is a transformation a JIT might be able to make… but not PyPy's JIT.

As a side note, if you need to do calculations with huge integers, you may want to look at third-party modules like gmpy, which can sometimes be much faster than CPython's native implementation for uses outside the mainstream, and which also provides a lot of functionality you'd otherwise have to write yourself, at the cost of being less convenient.


There are shortcuts to doing modular exponentiation: for instance, you can compute a**(2**i) mod n for every i from 0 up to log2(d) and multiply together (mod n) only the intermediate results corresponding to the set bits of d. A dedicated modular-exponentiation function like three-argument pow() can leverage such tricks because it knows you're doing modular arithmetic. The Python parser can't recognize this from the bare expression a**d % n, so it performs the full calculation (which takes much longer).
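That trick can be spelled out directly: the sketch below (illustrative only, not pow's actual implementation) precomputes a**(2**i) % n by repeated squaring and then multiplies together just the powers whose bits are set in d:

```python
a, d, n = 2505626, 1520321, 2700643   # the numbers from the timing example above

# Precompute a**(2**i) % n for each bit position i of d by repeated squaring.
squares = []
p = a % n
for _ in range(d.bit_length()):
    squares.append(p)
    p = (p * p) % n

# Multiply together (mod n) the powers corresponding to d's set bits,
# since a**d == product of a**(2**i) over the set bits i of d.
result = 1
for i in range(d.bit_length()):
    if (d >> i) & 1:
        result = (result * squares[i]) % n

assert result == pow(a, d, n)
```

This performs about 2 * log2(d) multiplications of numbers below n, instead of building the astronomically large a**d.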


The way x = a**d % n is calculated is to raise a to the d-th power, then take that huge number modulo n. If a is large, this creates an enormous intermediate value that is then reduced. However, x = pow(a, d, n) is most likely optimized to reduce every intermediate result modulo n as it goes, since only the residue mod n is ever needed for the final answer.
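The key identity that makes this optimization valid is that multiplication mod n depends only on the operands' residues mod n, so every intermediate product can be reduced immediately (the huge operands below are arbitrary values chosen purely for illustration):

```python
n = 2700643
a, b = 10**50 + 7, 10**60 + 9   # arbitrary huge integers, for illustration only

# Reducing before multiplying gives the same answer as reducing after,
# so an exponentiation routine never needs to hold a full-size product.
assert (a * b) % n == ((a % n) * (b % n)) % n
print("identity holds")
```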

Reference URL: https://stackoverflow.com/questions/14133806/why-is-powa-d-n-so-much-faster-than-ad-n
