Getting the index of a row in a pandas apply function

optionbox 2020. 9. 10. 07:41

I am trying to access the index of a row in a function applied across an entire pandas DataFrame. I have something like this:

df = pandas.DataFrame([[1,2,3],[4,5,6]], columns=['a','b','c'])
>>> df
   a  b  c
0  1  2  3
1  4  5  6

I'll define a function that accesses elements of a given row:

def rowFunc(row):
    return row['a'] + row['b'] * row['c']

I can apply it like so:

df['d'] = df.apply(rowFunc, axis=1)
>>> df
   a  b  c   d
0  1  2  3   7
1  4  5  6  34

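Awesome! Now what if I want to incorporate the index into my function? Inside the function, row.index for any given row holds the column labels, which is Index([u'a', u'b', u'c', u'd'], dtype='object') once d has been added, but I want the 0 and 1. So I can't just access row.index.

I know I could create a temporary column in the table to store the index, but I'm wondering whether it is already stored somewhere in the row object.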

To access the index in this case, use the row's name attribute:

In [182]:

import pandas as pd
df = pd.DataFrame([[1,2,3],[4,5,6]], columns=['a','b','c'])
def rowFunc(row):
    return row['a'] + row['b'] * row['c']

def rowIndex(row):
    return row.name
df['d'] = df.apply(rowFunc, axis=1)
df['rowIndex'] = df.apply(rowIndex, axis=1)
df
Out[182]:
   a  b  c   d  rowIndex
0  1  2  3   7         0
1  4  5  6  34         1
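Since the question asked how to use the index inside the row function itself, here is a minimal sketch of that, assuming the same example frame. The function name row_func_with_index and the column name d_plus_idx are made up for illustration and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['a', 'b', 'c'])

def row_func_with_index(row):
    # With axis=1 each row arrives as a Series whose .name is that row's index label
    return row['a'] + row['b'] * row['c'] + row.name

# Hypothetical column name, used only for this illustration
df['d_plus_idx'] = df.apply(row_func_with_index, axis=1)
print(df)
```

This only works as written because the index labels here are integers; with string or datetime labels the addition would need to be replaced by whatever operation makes sense for that label type.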

If this is really what you are trying to do, then the following works and is much faster:

In [198]:

df['d'] = df['a'] + df['b'] * df['c']
df
Out[198]:
   a  b  c   d
0  1  2  3   7
1  4  5  6  34

In [199]:

%timeit df['a'] + df['b'] * df['c']
%timeit df.apply(rowIndex, axis=1)
10000 loops, best of 3: 163 µs per loop
1000 loops, best of 3: 286 µs per loop
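The timing above compares the vectorised expression with apply(rowIndex). For a like-for-like comparison against apply(rowFunc), something along these lines could be run as a plain script. It is only a sketch: the run count of 200 is an arbitrary choice, and the absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['a', 'b', 'c'])

def rowFunc(row):
    return row['a'] + row['b'] * row['c']

# Compute the same column both ways and check that the values agree
vectorised = df['a'] + df['b'] * df['c']
applied = df.apply(rowFunc, axis=1)
assert vectorised.tolist() == applied.tolist()

# Time each approach over the same number of runs (200 is an arbitrary choice)
runs = 200
t_vec = timeit.timeit(lambda: df['a'] + df['b'] * df['c'], number=runs)
t_apply = timeit.timeit(lambda: df.apply(rowFunc, axis=1), number=runs)
print(f"vectorised: {t_vec:.4f}s, apply: {t_apply:.4f}s over {runs} runs")
```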

EDIT

Looking at this question 3+ years later, you could just do:

In[15]:
df['d'],df['rowIndex'] = df['a'] + df['b'] * df['c'], df.index
df

Out[15]: 
   a  b  c   d  rowIndex
0  1  2  3   7         0
1  4  5  6  34         1

but assuming it isn't as trivial as this, whatever your rowFunc is really doing, you should look to use the vectorised functions, and then use them against the df index:

In[16]:
df['newCol'] = df['a'] + df['b'] + df['c'] + df.index
df

Out[16]: 
   a  b  c   d  rowIndex  newCol
0  1  2  3   7         0       6
1  4  5  6  34         1      16

apply() isn't the droid you're looking for.

DataFrame.iterrows() allows you to iterate over rows, and access their name:

for name, row in df.iterrows():
    ...
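A minimal sketch of what the loop body might look like, assuming the same example frame as in the question. The labels list is only there for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['a', 'b', 'c'])

labels = []  # hypothetical container, used only to collect the index labels
for name, row in df.iterrows():
    # name is the index label of the row; row is a Series and row.name equals it
    assert row.name == name
    labels.append(name)
    print(name, row['a'] + row['b'] * row['c'])
```

Note that iterrows() builds a Series for every row, so it is a convenience for row-wise logic rather than a speed-up over apply().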

To answer the original question: yes, you can access the index value of a row in apply(). It is available as the row's name attribute and requires that you specify axis=1 (because the lambda then processes the columns of a row and not the rows of a column).

Working example (pandas 0.23.4):

>>> import pandas as pd
>>> df = pd.DataFrame([[1,2,3],[4,5,6]], columns=['a','b','c'])
>>> df.set_index('a', inplace=True)
>>> df
   b  c
a      
1  2  3
4  5  6
>>> df['index_x10'] = df.apply(lambda row: 10*row.name, axis=1)
>>> df
   b  c  index_x10
a                 
1  2  3         10
4  5  6         40

Reference URL: https://stackoverflow.com/questions/26658240/getting-the-index-of-a-row-in-a-pandas-apply-function
