python - How to subtract columns in pandas df based on condition

asked Oct 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have a dataset that looks like the one below. In my new dataset, I want to subtract the principal and remainder columns from the amount columns, matching them by number.

For instance, if there are 4 amount columns, 2 principal columns and 3 remainder columns, then the first principal and first remainder columns must be subtracted from the first amount column, the second principal and second remainder columns from the second amount column, and only the third remainder column from the third amount column (since there is no third principal column). The last column, amount4, must stay as it is and become newamount4.

amount1  amount2  amount3  amount4  principal1  principal2  remainder1  remainder2  remainder3
100      250      150      100      250         100         80          100         100
200      200      350      25       450         100         120         100         50
300      150      450      30       200         100         150         100         100
250      550      550      100      100         200         50          500         200
550      200      650      200      250         200         500         100         500

My new dataset must look like this. Please note that am stands for amount, pr stands for principal and rem stands for remainder.

newamount1           newamount2         newamount3      newamount4
-230 (am1-pr1-rem1)  50 (am2-pr2-rem2)  50 (am3-rem3)   amount4
-370                 0                  300             amount4
-50                  -50                350             amount4
100                  -150               350             amount4
-200                 -100               150             amount4
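For anyone who wants to reproduce this, the sample table can be built as a DataFrame roughly as follows. This is only a sketch: the values are copied from the table above, and the name first_row_check is introduced here purely for illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame(
    {
        "amount1": [100, 200, 300, 250, 550],
        "amount2": [250, 200, 150, 550, 200],
        "amount3": [150, 350, 450, 550, 650],
        "amount4": [100, 25, 30, 100, 200],
        "principal1": [250, 450, 200, 100, 250],
        "principal2": [100, 100, 100, 200, 200],
        "remainder1": [80, 120, 150, 50, 500],
        "remainder2": [100, 100, 100, 500, 100],
        "remainder3": [100, 50, 100, 200, 500],
    }
)

# First row worked by hand: am1 - pr1 - rem1 = 100 - 250 - 80 = -230
first_row_check = (
    df.loc[0, "amount1"] - df.loc[0, "principal1"] - df.loc[0, "remainder1"]
)
```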

Question from: https://stackoverflow.com/questions/66062836/how-to-substract-columns-in-pandas-df-based-on-condition

1 Answer

深蓝 · Answer 1 · 2021-10-06T03:04:27+0000

You can use defaultdict to group common suffixes, then apply a reducing function (np.subtract.reduce) to get your output:

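from collections import defaultdict

import numpy as np
import pandas as pd

# Group the columns by their trailing digit (the shared suffix).
# The original test `column[-1] != 4` compared a string with an int and was
# always true, so every column was appended as a Series; this keeps that behaviour.
mapping = defaultdict(list)
for column in df:
    mapping[f"newamount{column[-1]}"].append(df[column])

# Subtract within each group, left to right; group 4 gets a label instead.
mapping = {
    key: np.subtract.reduce(value) if "4" not in key else "amount4"
    for key, value in mapping.items()
}

pd.DataFrame(mapping)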

   newamount1  newamount2  newamount3  newamount4
0  -230        50          50          amount4
1  -370        0           300         amount4
2  -50         -50         350         amount4
3  100         -150        350         amount4
4  -200        -100        150         amount4
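To make the reducing step concrete: np.subtract.reduce applied to a list of equal-length columns subtracts them in order, so [a, b, c] becomes a - b - c, and a list with a single column comes back unchanged. The sketch below runs that on a two-row excerpt of the question's data. It keeps the actual amount4 values instead of the "amount4" label, which is an assumption about what "must stay as it is" means in the question; the names groups and result are illustrative only.

```python
from collections import defaultdict

import numpy as np
import pandas as pd

# Two-row excerpt of the question's data (suffixes 1, 3 and 4 only).
df = pd.DataFrame(
    {
        "amount1": [100, 200],
        "amount3": [150, 350],
        "amount4": [100, 25],
        "principal1": [250, 450],
        "remainder1": [80, 120],
        "remainder3": [100, 50],
    }
)

# Collect the raw values of every column under its suffix.
groups = defaultdict(list)
for column in df.columns:
    groups[f"newamount{column[-1]}"].append(df[column].to_numpy())

# reduce subtracts along the first axis: [a, b, c] -> a - b - c.
# A group holding a single column (amount4 here) is returned as-is.
result = pd.DataFrame(
    {key: np.subtract.reduce(columns) for key, columns in groups.items()},
    index=df.index,
)
```

Because no group is special-cased, the same loop should carry over unchanged if more single-digit suffixes are added.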

You could also iterate through a groupby:

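# Requires numpy (np) and pandas (pd) to be imported.
# Note: groupby(..., axis=1) is deprecated since pandas 2.1; on newer
# versions, group the transposed frame (df.T) instead.
mapping = {
    f"newamount{key}": frame.agg(np.subtract.reduce, axis=1)
    for key, frame in df.groupby(df.columns.str[-1], axis=1)
}

pd.DataFrame(mapping).assign(newamount4="amount4")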
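Since grouping with axis=1 is deprecated as of pandas 2.1, one possible way to get the same grouping is to transpose first and group the rows of df.T by suffix. This is a sketch under that assumption, not part of the original answer; it uses a two-row excerpt of the question's data, and suffix, mapping and result are illustrative names.

```python
import numpy as np
import pandas as pd

# Two-row excerpt of the question's data (suffixes 1 and 2 only).
df = pd.DataFrame(
    {
        "amount1": [100, 200],
        "amount2": [250, 200],
        "principal1": [250, 450],
        "principal2": [100, 100],
        "remainder1": [80, 120],
        "remainder2": [100, 100],
    }
)

# Last character of each column name, aligned with the rows of df.T.
suffix = df.columns.str[-1]

# Each block holds the transposed columns that share a suffix, in their
# original order (amount, principal, remainder), so reducing along axis 0
# gives amount - principal - remainder for every original row.
mapping = {
    f"newamount{key}": np.subtract.reduce(block.to_numpy(), axis=0)
    for key, block in df.T.groupby(suffix)
}

result = pd.DataFrame(mapping, index=df.index)
```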

The approaches below can be adapted if your columns go beyond suffix 4. One option is the pivot_longer function from pyjanitor, which reshapes the data before grouping and aggregating; at the time of writing, this required installing the latest development version from GitHub:

# install the latest dev version
# pip install git+https://github.com/ericmjl/pyjanitor.git
import janitor
import numpy as np

(
    df.pivot_longer(names_to=".value",
                    names_pattern=r".+(\d)$",  # capture the trailing digit
                    ignore_index=False)
    .fillna(0)
    .add_prefix("newamount")
    .groupby(level=0)
    .agg(np.subtract.reduce)
    .assign(newamount4="amount4") # edit your preferred column
)

Sticking to functions within Pandas only, we can reshape the data by stacking, before grouping and aggregating:

# Split e.g. "amount1" into ("amount", "1"); note this renames df's columns in place.
df.columns = df.columns.str.split(r"(\d)", expand=True).droplevel(-1)
(
    df.stack(0)
    .fillna(0)
    .droplevel(-1)
    .groupby(level=0)
    .agg(np.subtract.reduce)
    .add_prefix("newamount")
    .assign(newamount4="amount4")
)
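If the suffixes can run past one digit, matching on the last character alone no longer works. A more explicit alternative is to look up the partner columns by name. The helper below, subtract_by_suffix, is hypothetical and not part of pandas or of the original answer; it assumes the column prefixes are exactly "amount", "principal" and "remainder", and it keeps the real values for amount columns that have no partners.

```python
import pandas as pd


def subtract_by_suffix(df, base="amount", others=("principal", "remainder")):
    """Hypothetical helper: subtract same-suffix columns from each base column."""
    out = {}
    for column in df.columns:
        if not column.startswith(base):
            continue
        suffix = column[len(base):]  # works for "1" as well as "12"
        values = df[column].copy()
        for prefix in others:
            name = f"{prefix}{suffix}"
            # Only subtract partner columns that actually exist.
            if name in df.columns:
                values = values - df[name]
        out[f"new{column}"] = values
    return pd.DataFrame(out)


# Two-row excerpt of the question's data (suffixes 1, 3 and 4 only).
df = pd.DataFrame(
    {
        "amount1": [100, 200],
        "amount3": [150, 350],
        "amount4": [100, 25],
        "principal1": [250, 450],
        "remainder1": [80, 120],
        "remainder3": [100, 50],
    }
)

result = subtract_by_suffix(df)
```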
