python - Filter by rank of one index level in Pandas?


I have a DataFrame with customer id, date, and price, and I want to aggregate the prices, excluding each id's purchases on its first date.

    import pandas as pd

    df = pd.DataFrame([[1,1,1],[1,1,1],[1,2,1],[1,2,4],[1,3,1],[2,2,1],[2,3,3]],
                      columns=["id", "date", "price"])
    s = df.groupby(["id", "date"]).price.sum()
    # id  date
    # 1   1       2
    #     2       5
    #     3       1
    # 2   2       1
    #     3       3

I'd like to sum all prices except the ones on the smallest date for each id (date 1 for id 1, and date 2 for id 2). The result should be 5+1+3=9.

So I'd need to rank on one part of the index within groups and then combine the result with the previous aggregation?

Any suggestions?

You can sort by index level as follows:

    s = s.sort_index(level=[0, 1])  # Series.sortlevel was removed in pandas 1.0
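Sorting matters because the next step drops rows by position, so the earliest date has to be the first row of each id. Here is a minimal sketch of what the sort does, assuming a recent pandas where `sort_index` is available; the `unsorted` series is a made-up example and not the data from the question.

```python
import pandas as pd

# Hypothetical, deliberately shuffled (id, date) -> price series.
idx = pd.MultiIndex.from_tuples(
    [(2, 3), (1, 2), (1, 1), (2, 2)], names=["id", "date"]
)
unsorted = pd.Series([3, 5, 2, 1], index=idx, name="price")

# Sort by id first, then by date within each id.
sorted_s = unsorted.sort_index(level=["id", "date"])
print(sorted_s)
```

Note that `groupby(...).sum()` already returns its result sorted by the group keys by default, so for the `s` in the question the sort is just a safeguard.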

We can first sum within each group (ignoring the first entry), and then sum the results:

    In [153]: s.groupby(level=0).apply(lambda x: sum(x.iloc[1:]))
    Out[153]:
    id
    1     6
    2     3
    dtype: int64

    In [154]: s.groupby(level=0).apply(lambda x: sum(x.iloc[1:])).sum()
    Out[154]: 9
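Since the question asks about ranking within groups, the same result can probably also be reached without `apply`. This is a sketch of an alternative, not the approach above: it assumes the series is sorted by date within each id and uses `cumcount` to number the rows of each group from 0. The names `rank` and `total` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1], [1, 1, 1], [1, 2, 1], [1, 2, 4], [1, 3, 1], [2, 2, 1], [2, 3, 3]],
    columns=["id", "date", "price"],
)
s = df.groupby(["id", "date"])["price"].sum().sort_index()

# Position of each row within its id group: 0 marks the first (smallest) date.
rank = s.groupby(level="id").cumcount()

# Keep everything except the first date of each id, then sum.
total = s[rank > 0].sum()
print(total)
```

Because `rank` shares its index with `s`, it can be used directly as a boolean mask, and other cut-offs (for example `rank > 1`) only need a change to the comparison.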

If you want something more advanced that doesn't follow a pattern the iloc[] operator can express, you should write a separate function instead of a lambda:

    import numpy as np

    def is_prime(n):
        # 0 and 1 are not prime
        if n < 2:
            return False
        for i in np.arange(2, n - 1):
            if (n % i) == 0:
                return False
        return True

    def select_and_sum(group):
        # keep only the rows at prime positions within the group
        n = len(group)
        r = range(n)
        primes = [j for j in r if is_prime(j)]
        return group.iloc[primes].sum()

    s.groupby(level=0).apply(select_and_sum)
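A named function can also select on the index values instead of on positions. As a rough sketch for the original question, the hypothetical helper `sum_after_first_date` below drops the rows whose date equals the smallest date in the group, so it does not depend on the series being sorted.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1], [1, 1, 1], [1, 2, 1], [1, 2, 4], [1, 3, 1], [2, 2, 1], [2, 3, 3]],
    columns=["id", "date", "price"],
)
s = df.groupby(["id", "date"])["price"].sum()

def sum_after_first_date(group):
    # Hypothetical helper: sum the prices on every date except the group's earliest.
    dates = group.index.get_level_values("date")
    return group[dates > dates.min()].sum()

per_id = s.groupby(level="id").apply(sum_after_first_date)
print(per_id)
print(per_id.sum())
```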
