python - Filter by rank of one index level in Pandas?
I have a DataFrame with customer id, date, and price columns, and I want to aggregate the prices, excluding the purchases each id made on its first date.
    import pandas as pd

    df = pd.DataFrame([[1,1,1],[1,1,1],[1,2,1],[1,2,4],[1,3,1],[2,2,1],[2,3,3]],
                      columns=["id", "date", "price"])
    s = df.groupby(["id", "date"]).price.sum()
    # id  date
    # 1   1       2
    #     2       5
    #     3       1
    # 2   2       1
    #     3       3
I'd like to sum the prices except the ones on the smallest date for each id (date 1 for id 1, and date 2 for id 2). The result should be 5+1+3=9.
So I'd need a rank on part of the index within groups, and then combine the result with the previous aggregation?
Any suggestions?
You can sort by index level as follows:
    s = s.sort_index(level=[0, 1])  # s.sortlevel([0, 1]) in older pandas versions
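The sort matters because the positional selection used next depends on the dates being in ascending order within each id. Here is a minimal sketch of that, assuming a deliberately shuffled copy of the question's data and `sort=False` in the groupby (both are choices made for illustration, not part of the original question). Note that `groupby` sorts its keys by default, so with the question's own code the series is already in order.

```python
import pandas as pd

# Same rows as in the question, but shuffled (an assumption for illustration).
df = pd.DataFrame(
    [[2, 3, 3], [1, 3, 1], [1, 1, 1], [2, 2, 1], [1, 2, 4], [1, 1, 1], [1, 2, 1]],
    columns=["id", "date", "price"],
)

# sort=False keeps the groups in order of first appearance, so the
# earliest date of each id is no longer guaranteed to come first.
unsorted = df.groupby(["id", "date"], sort=False).price.sum()

# Sorting by both index levels restores ascending (id, date) order.
sorted_s = unsorted.sort_index(level=[0, 1])

print(unsorted)
print(sorted_s)
```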
We can first sum within each group (ignoring the first entry), and then sum over the result:
    In [153]: s.groupby(level=0).apply(lambda x: sum(x.iloc[1:]))
    Out[153]:
    id
    1    6
    2    3
    dtype: int64

    In [154]: s.groupby(level=0).apply(lambda x: sum(x.iloc[1:])).sum()
    Out[154]: 9
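For reference, the steps above can be put together into one self-contained script. This is only a sketch built from the question's data. The names `per_id` and `total` are introduced here for readability, and it uses the pandas `.sum()` method in place of the built-in `sum`, which should give the same result for this data.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1], [1, 1, 1], [1, 2, 1], [1, 2, 4], [1, 3, 1], [2, 2, 1], [2, 3, 3]],
    columns=["id", "date", "price"],
)

# Total price per (id, date), sorted so the earliest date of each id comes first.
s = df.groupby(["id", "date"]).price.sum().sort_index(level=[0, 1])

# Within each id, drop the first (earliest) date and sum the rest.
per_id = s.groupby(level=0).apply(lambda x: x.iloc[1:].sum())

# Sum across ids.
total = per_id.sum()

print(per_id)
print(total)
```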
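If you want something more advanced, where the selection does not follow a logic that the iloc[] operator can work with directly, you should write a separate function instead of a lambda. For example, to sum only the entries at prime positions within each group: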
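    import numpy as np

    def is_prime(n):
        # 0 and 1 are not prime
        if n < 2:
            return False
        # trial division by every candidate divisor from 2 to n-2
        for i in np.arange(2, n-1):
            if (n % i) == 0:
                return False
        return True

    def select_and_sum(group):
        n = len(group)
        r = range(n)
        # keep only the positions (0-based) that are prime numbers
        primes = [j for j in r if is_prime(j)]
        return group.iloc[primes].sum()

    s.groupby(level=0).apply(select_and_sum)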
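When the selection depends only on each row's position within its id, as in the original question, a boolean mask can stand in for a custom function. The sketch below is one possible alternative, not part of the answer above. It numbers the rows within each id with `cumcount()` and keeps everything after position 0. The names `pos` and `total` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1], [1, 1, 1], [1, 2, 1], [1, 2, 4], [1, 3, 1], [2, 2, 1], [2, 3, 3]],
    columns=["id", "date", "price"],
)

# groupby sorts its keys by default, so dates are ascending within each id.
s = df.groupby(["id", "date"]).price.sum()

# Position of each row within its id group: 0 for the earliest date, then 1, 2, ...
pos = s.groupby(level=0).cumcount()

# Keep every row except the first one of each id, then sum.
total = s[pos > 0].sum()

print(pos)
print(total)
```

This acts as the "rank within groups" the question asks about: any condition on `pos` (for example `pos >= 2`) selects rows without a per-group Python function.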