python - Filter by rank of one index level in Pandas?

January 15, 2011

I have a DataFrame with customer id, date, and price, and I want to aggregate the prices, except for the purchases each id made on its first date.

import pandas as pd

df = pd.DataFrame([[1,1,1],[1,1,1],[1,2,1],[1,2,4],[1,3,1],[2,2,1],[2,3,3]], columns=["id", "date", "price"])
s = df.groupby(["id", "date"]).price.sum()
# id  date
# 1   1       2
#     2       5
#     3       1
# 2   2       1
#     3       3

I'd like to sum the prices except the ones on the smallest date for each id (date 1 for id 1, and date 2 for id 2). The result should be 5+1+3=9.

So, would I have to rank on part of the index within groups and then combine the result with the previous aggregation?

Any suggestions?

You can sort by index level as follows:

s = s.sort_index(level=[0, 1])  # Series.sortlevel() was removed in pandas 1.0

Then we can first sum within each group (ignoring the first entry), and then sum the result:

In [153]: s.groupby(level=0).apply(lambda x: sum(x.iloc[1:]))
Out[153]:
id
1    6
2    3
dtype: int64

In [154]: s.groupby(level=0).apply(lambda x: sum(x.iloc[1:])).sum()
Out[154]: 9
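As an alternative to apply, the same filter can probably be written against the original frame by comparing each row's date with the earliest date of its id. This is only a sketch, not part of the answer above; the names first_date, later, per_id and total are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1], [1, 1, 1], [1, 2, 1], [1, 2, 4], [1, 3, 1], [2, 2, 1], [2, 3, 3]],
    columns=["id", "date", "price"],
)

# Earliest date per id, broadcast back onto every row of that id.
first_date = df.groupby("id")["date"].transform("min")

# Keep only the rows that are not on the id's first date.
later = df[df["date"] != first_date]

# Per-id sums and the grand total over the remaining rows.
per_id = later.groupby("id")["price"].sum()
total = int(later["price"].sum())

print(per_id)
print(total)
```

This version does not depend on the index being sorted, since it compares dates directly instead of dropping the first position in each group.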

If you want more advanced selection that does not follow a pattern the iloc[] operator can express directly, you should write a separate function instead of a lambda:

import numpy as np

def is_prime(n):
    # 0 and 1 are not prime
    if n < 2:
        return False
    for i in np.arange(2, n - 1):
        if (n % i) == 0:
            return False
    return True

def select_and_sum(group):
    # sum the entries whose position within the group is a prime number
    n = len(group)
    r = range(n)
    primes = [j for j in r if is_prime(j)]
    return group.iloc[primes].sum()

s.groupby(level=0).apply(select_and_sum)
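For the "rank within groups" idea from the question, one possible sketch is to number the dates inside each id with cumcount() and turn any rule on those positions into a boolean mask. The names pos, kept and total are illustrative, and the sketch assumes the aggregated Series is sorted so that position 0 is the earliest date.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1], [1, 1, 1], [1, 2, 1], [1, 2, 4], [1, 3, 1], [2, 2, 1], [2, 3, 3]],
    columns=["id", "date", "price"],
)
s = df.groupby(["id", "date"]).price.sum().sort_index()

# 0-based position of each date within its id (0 = earliest date).
pos = s.groupby(level="id").cumcount()

# Any rule on positions becomes a mask; here: skip the first date per id.
kept = s[pos >= 1]
total = int(kept.sum())

print(pos)
print(kept)
print(total)
```

A different rule, such as keeping only certain positions, could replace the pos >= 1 condition with something like pos.isin([...]).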
