
What is size_t in C?

optionbox 2020. 10. 3. 10:31


I am getting confused by size_t in C. I know that it is returned by the sizeof operator. But what exactly is it? Is it a data type?

Let's say I have a for loop:

for(i = 0; i < some_size; i++)

Should I use int i; or size_t i;?


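From Wikipedia:

According to the 1999 ISO C standard (C99), size_t is an unsigned integer type of at least 16 bits (see sections 7.17 and 7.18.3).

size_t is an unsigned data type defined by several C/C++ standards, e.g. the C99 ISO/IEC 9899 standard, and is defined in stddef.h. It can also be brought in by including stdlib.h, as this file internally sub-includes stddef.h.

This type is used to represent the size of an object. Library functions that take or return sizes expect them to be of type size_t or have size_t as their return type. Further, the most frequently used compiler-based operator, sizeof, should evaluate to a constant value that is compatible with size_t.

As an implication, size_t is a type guaranteed to hold any array index.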


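size_t is an unsigned type. So, it cannot represent any negative values (<0). You use it when you are counting something and are sure that it cannot be negative. For example, strlen() returns a size_t because the length of a string has to be at least 0.

In your example, if your loop index is never going to be negative, it might make sense to use size_t or another unsigned data type.

When you use a size_t object, you have to make sure that in all the contexts in which it is used, including arithmetic, you want non-negative values. For example, let's say you have:

size_t s1 = strlen(str1);
size_t s2 = strlen(str2);

and you want to find the difference of the lengths of str2 and str1. You cannot do:

int diff = s2 - s1; /* bad */

This is because the calculation is done with unsigned types: when s2 < s1 the subtraction cannot produce a negative number and instead wraps around to a very large positive value. Converting that out-of-range value to int is implementation-defined, so diff is not a reliable signed difference. In this case, depending on your use case, you might be better off using int (or long long) for s1 and s2.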

There are some functions in C/POSIX that could or should use size_t, but don't for historical reasons. For example, the second parameter to fgets should ideally be size_t, but is int.


size_t is a type that can hold any array index.

Depending on the implementation, it can be any of:

unsigned char

unsigned short

unsigned int

unsigned long

unsigned long long

Here is how size_t is defined in stddef.h on my machine:

typedef unsigned long size_t;

If you are the empirical type,

echo | gcc -E -xc -include 'stddef.h' - | grep size_t

Output for Ubuntu 14.04 64-bit GCC 4.8:

typedef long unsigned int size_t;

Note that stddef.h is provided by GCC and not glibc, under src/gcc/ginclude/stddef.h in GCC 4.2.

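Interesting C99 appearances

  • malloc takes size_t as an argument, so it determines the maximum size that may be allocated.

    And since it is also returned by sizeof, I think it limits the maximum size of any array.

    See also: What is the maximum size of an array in C?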


The manpage for types.h says:

size_t shall be an unsigned integer type.


Since nobody has yet mentioned it, the primary linguistic significance of size_t is that the sizeof operator returns a value of that type. Likewise, the primary significance of ptrdiff_t is that subtracting one pointer from another will yield a value of that type. Library functions that accept it do so because it will allow such functions to work with objects whose size exceeds UINT_MAX on systems where such objects could exist, without forcing callers to waste code passing a value larger than "unsigned int" on systems where the larger type would suffice for all possible objects.


size_t and int are not interchangeable. For instance on 64-bit Linux size_t is 64-bit in size (i.e. sizeof(void*)) but int is 32-bit.

Also note that size_t is unsigned. If you need a signed version, there is ssize_t on some platforms (it is defined by POSIX, not by ISO C), and it would be more relevant to your example.

As a general rule I would suggest using int for most general cases and only use size_t/ssize_t when there is a specific need for it (with mmap() for example).


To go into why size_t needed to exist and how we got here:

In pragmatic terms, size_t and ptrdiff_t are 64 bits wide on typical 64-bit implementations, 32 bits wide on typical 32-bit implementations, and so on, although the Standard itself does not require this. They could not force any existing type to mean that, on every compiler, without breaking legacy code.

A size_t or ptrdiff_t is not necessarily the same as an intptr_t or uintptr_t. They were different on certain architectures that were still in use when size_t and ptrdiff_t were added to the Standard in the late ’80s, and becoming obsolete when C99 added many new types but not gone yet (such as 16-bit Windows). The x86 in 16-bit protected mode had a segmented memory where the largest possible array or structure could be only 65,536 bytes in size, but a far pointer needed to be 32 bits wide, wider than the registers. On those, intptr_t would have been 32 bits wide but size_t and ptrdiff_t could be 16 bits wide and fit in a register. And who knew what kind of operating system might be written in the future? In theory, the i386 architecture offers a 32-bit segmentation model with 48-bit pointers that no operating system has ever actually used.

The type of a memory offset could not be long because far too much legacy code assumes that long is exactly 32 bits wide. This assumption was even built into the UNIX and Windows APIs. Unfortunately, a lot of other legacy code also assumed that a long is wide enough to hold a pointer, a file offset, the number of seconds that have elapsed since 1970, and so on. POSIX now provides a standardized way to force the latter assumption to be true instead of the former, but neither is a portable assumption to make.

It couldn’t be int because only a tiny handful of compilers in the ’90s made int 64 bits wide. Then they really got weird by keeping long 32 bits wide. The next revision of the Standard declared it illegal for int to be wider than long, but int is still 32 bits wide on most 64-bit systems.

It couldn’t be long long int, which anyway was added later, since that was created to be at least 64 bits wide even on 32-bit systems.

So, a new type was needed. Even if it weren’t, all those other types meant something other than an offset within an array or object. And if there was one lesson from the fiasco of 32-to-64-bit migration, it was to be specific about what properties a type needed to have, and not use one that meant different things in different programs.


In general, if you are starting at 0 and going upward, always use an unsigned type to avoid an overflow taking you into a negative value situation. This is critically important, because if your array bound happens to be less than the max of your loop, but your loop max happens to be greater than the max of your type, you will wrap around to a negative value and you may experience a segmentation fault (SIGSEGV). So, in general, never use int for a loop starting at 0 and going upwards. Use an unsigned type.


size_t is an unsigned integer data type. On systems using the GNU C Library, this will be unsigned int or unsigned long int. size_t is commonly used for array indexing and loop counting.


size_t or any other unsigned type is often used for a loop variable, as loop variables are typically greater than or equal to 0.

When we use a size_t object, we have to make sure that in all the contexts in which it is used, including arithmetic, we want only non-negative values. For instance, the following program gives an unexpected result:

// C program to demonstrate that size_t or
// any unsigned int type should be used
// carefully when used in a loop

#include <stdio.h>

int main(void)
{
    const size_t N = 10;
    int a[N];

    // This is fine
    for (size_t n = 0; n < N; ++n)
        a[n] = (int)n;

    // But reverse cycles are tricky for unsigned
    // types as they can lead to an infinite loop:
    // n >= 0 is always true when n is unsigned
    for (size_t n = N - 1; n >= 0; --n)
        printf("%d ", a[n]);
}

Output
Infinite loop and then segmentation fault

From my understanding, size_t is an unsigned integer whose bit size is large enough to hold a pointer of the native architecture, at least on common flat-memory platforms.

So, on such platforms:

sizeof(size_t) >= sizeof(void*)

Reference URL: https://stackoverflow.com/questions/2550774/what-is-size-t-in-c
