
A better way for a Python 'for' loop

optionbox 2020. 11. 18. 08:57


The usual way to execute a statement a given number of times in Python is to use a for loop.

The general way to do this is:

# I am assuming iterated list is redundant.
# Just the number of execution matters.
for _ in range(count):
    pass

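I believe nobody will dispute that the code above is the common implementation, but there is another option. It takes advantage of how quickly Python creates a list by multiplying references.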

# Uncommon way.
for _ in [0] * count:
    pass

There is also the old while way.

i = 0
while i < count:
    i += 1

I tested the execution times of these approaches. Here is the code.

import timeit

repeat = 10
total = 10

setup = """
count = 100000
"""

test1 = """
for _ in range(count):
    pass
"""

test2 = """
for _ in [0] * count:
    pass
"""

test3 = """
i = 0
while i < count:
    i += 1
"""

print(min(timeit.Timer(test1, setup=setup).repeat(repeat, total)))
print(min(timeit.Timer(test2, setup=setup).repeat(repeat, total)))
print(min(timeit.Timer(test3, setup=setup).repeat(repeat, total)))

# Results
0.02238852552017738
0.011760978361696095
0.06971727824807639

I would not have started this topic if the difference were small, but the speed difference is about 100%. If the second method is so much more efficient, why doesn't Python encourage it? Is there a better way?

The tests were run with Python 3.6 on Windows 10.

Following @Tim Peters' suggestion,

.
.
.
test4 = """
for _ in itertools.repeat(None, count):
    pass
"""
print(min(timeit.Timer(test1, setup=setup).repeat(repeat, total)))
print(min(timeit.Timer(test2, setup=setup).repeat(repeat, total)))
print(min(timeit.Timer(test3, setup=setup).repeat(repeat, total)))
print(min(timeit.Timer(test4, setup=setup).repeat(repeat, total)))

# Gives
0.02306803115612352
0.013021619340942758
0.06400113461638746
0.008105080015739174

This offers a much better way, and it pretty much answers my question.

Why is this faster than range, since both are generators? Is it because the value never changes?
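For anyone re-running the comparison, here is a minimal self-contained sketch of the four-way benchmark. It assumes the elided setup imports itertools. The helper name `time_loops` and the smaller default repeat count are my own choices, not from the original post, and absolute timings will vary by machine and Python version.

```python
import timeit

# Assumption: the setup string imports itertools so the repeat-based test can run.
SETUP = """
import itertools
count = 100000
"""

# The four loop styles discussed above, keyed by a short label.
TESTS = {
    "range": "for _ in range(count):\n    pass",
    "list_mul": "for _ in [0] * count:\n    pass",
    "while": "i = 0\nwhile i < count:\n    i += 1",
    "repeat": "for _ in itertools.repeat(None, count):\n    pass",
}


def time_loops(repeat=5, number=10):
    """Return the best (minimum) time in seconds for each loop style."""
    return {
        label: min(timeit.Timer(stmt, setup=SETUP).repeat(repeat, number))
        for label, stmt in TESTS.items()
    }


if __name__ == "__main__":
    for label, seconds in time_loops().items():
        print(f"{label:>8}: {seconds:.6f}")
```

Taking the minimum over several repeats, as the original code does, reduces the influence of background noise on the measurement.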


Using

for _ in itertools.repeat(None, count):
    pass  # do something

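is the non-obvious way of getting the best of all worlds: a tiny constant space requirement, and no new objects created per iteration. Under the covers, the C code for repeat uses a native C integer type (not a Python integer object!) to keep track of the count remaining.

For that reason, the count needs to fit in the platform C ssize_t type, which is generally at most 2**31 - 1 on a 32-bit box, and here on a 64-bit box: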

>>> itertools.repeat(None, 2**63)
Traceback (most recent call last):
    ...
OverflowError: Python int too large to convert to C ssize_t

>>> itertools.repeat(None, 2**63-1)
repeat(None, 9223372036854775807)

Which is plenty big for my loops ;-)
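To make the points above concrete, here is a small sketch (not part of the original answer) showing that repeat hands back the same object on every pass and runs exactly count times. The helper name `count_iterations` is hypothetical.

```python
from itertools import repeat


def count_iterations(n):
    """Loop n times with repeat and return how many passes actually ran."""
    passes = 0
    for _ in repeat(None, n):
        passes += 1
    return passes


# Every item produced is the same None object, so nothing new is created per pass.
items = list(repeat(None, 3))
print(items)                                # [None, None, None]
print(all(item is None for item in items))  # True
print(count_iterations(1000))               # 1000

# A count too large for the platform's C ssize_t is rejected up front.
try:
    repeat(None, 2**63)
except OverflowError as exc:
    print("OverflowError:", exc)
```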


The first method (in Python 3) creates a range object, which can be iterated over to produce the values in the range. (It's like a generator object, but you can iterate through it several times.) It doesn't take up much memory because it doesn't contain the entire range of values; it stores only the start, stop and step, and its iterator advances by the step size (default 1) until it reaches or passes the stop value.

Compare the size of range(0, 1000) to the size of list(range(0, 1000)). The former is very memory efficient; it only takes 48 bytes regardless of the size, whereas the size of the list grows linearly with the number of elements.
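A quick way to check both claims yourself is sketched below. It relies on sys.getsizeof, whose exact byte counts are a CPython implementation detail, so the snippet compares sizes rather than hard-coding numbers.

```python
import sys

r = range(0, 1000)

# A range can be iterated more than once...
first_pass = sum(r)
second_pass = sum(r)
print(first_pass, second_pass)  # 499500 499500

# ...whereas a generator is exhausted after a single pass.
gen = (i for i in range(0, 1000))
gen_first, gen_second = sum(gen), sum(gen)
print(gen_first, gen_second)  # 499500 0

# The range object's size does not grow with its length; the list's does.
small, large = range(1000), range(1000000)
print(sys.getsizeof(small) == sys.getsizeof(large))
print(sys.getsizeof(list(small)) < sys.getsizeof(list(large)))
```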

The second method, although faster, pays the memory cost described above, because it builds the whole list up front. (Also, although 0 takes up 24 bytes and None takes 16, lists of 10000 of each have the same size. That is because a list stores pointers to its elements, and the reported size of the list does not include the objects being pointed to.)

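Interestingly enough, [0] * 10000 is smaller than list(range(10000)) by about 10000 bytes. This is probably not because the values are identical, since the list holds only pointers either way; a more likely explanation is that list multiplication allocates exactly the slots it needs, while building a list from an iterable can leave spare capacity. The exact difference may vary between Python versions.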
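Here is a short sketch for checking the pointer explanation. The byte counts it prints are CPython-specific and version-dependent, so only the equality between the two multiplied lists is treated as a firm expectation.

```python
import sys

n = 10000
zeros = [0] * n
nones = [None] * n

# sys.getsizeof counts the list header plus one pointer per slot,
# not the objects those pointers refer to.
print(sys.getsizeof(zeros) == sys.getsizeof(nones))

# Every slot refers to the same object, so no per-element objects were created.
print(all(item is zeros[0] for item in zeros))

# How list(range(n)) compares with [0] * n depends on the CPython version.
print(sys.getsizeof(list(range(n))), sys.getsizeof(zeros))
```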

The third one is also nice because it doesn't require another stack value (whereas calling range requires another spot on the call stack), though since it's about 3 times slower than range and about 6 times slower than the list version in the timings above, it's not worth that.

The last one might be the fastest just because itertools is cool that way :P I think it uses some C-library optimizations, if I remember correctly.


The first two methods need to allocate memory blocks for each iteration while the third one would just make a step for each iteration.

Range is a slow function, and I use it only when I have to run small code that doesn't require speed, for example, range(0,50). I think you can't compare the three methods; they are totally different.

According to a comment below, the first case is only valid for Python 2.7, in Python 3 it works like xrange and doesn't allocate a block for each iteration. I tested it, and he is right.


This answer provides a loop construct for convenience. For additional background about looping with itertools.repeat look up Tim Peters' answer above, Alex Martelli's answer here and Raymond Hettinger's answer here.

# loop.py

"""
Faster for-looping in CPython for cases where intermediate integers
from `range(x)` are not needed.

Example Usage:
--------------

from loop import loop

for _ in loop(10000):
    do_something()

# or:

results = [calc_value() for _ in loop(10000)]
"""

from itertools import repeat
from functools import partial

loop = partial(repeat, None)
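A short usage sketch follows. It repeats the one-line definition so it runs on its own, and `do_something` is a hypothetical stand-in for real work.

```python
from functools import partial
from itertools import repeat

# Same definition as in loop.py above, repeated so this snippet runs on its own.
loop = partial(repeat, None)

calls = []


def do_something():
    """Hypothetical stand-in for real work; records that it was called."""
    calls.append(len(calls))


for _ in loop(3):
    do_something()

print(calls)  # [0, 1, 2]

# loop(n) is just repeat(None, n), so it also works in comprehensions.
squares = [len(calls) ** 2 for _ in loop(2)]
print(squares)  # [9, 9]
```

Since partial only pre-fills the first argument of repeat, the same ssize_t limit on the count applies to loop.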

Reference URL: https://stackoverflow.com/questions/46996315/a-better-way-for-a-python-for-loop