I have two data frames with three identically named columns. I want to subtract the values of the third column wherever the values of the first and second columns match. I've tried the following:
import pandas as pd
# Common column names
columns = ["month", "category", "sum"]
# First data frame
data1 = [("jan", "j", 10), ("feb", "f", 20)]
df1 = pd.DataFrame.from_records(data1, columns=columns)
# Second data frame
data2 = [("jan", "j", 9.5), ("mar", "m", 30)]
df2 = pd.DataFrame.from_records(data2, columns=columns)
print(df1) # Observe order of `month`s: jan, feb
print(df2) # Observe order of `month`s: jan, mar
# Subtract `sum` where `month` and `category` match:
df1.set_index(["month", "category"]).subtract(df2.set_index(["month", "category"])).reset_index()
This yields the following output. Observe that the rows are sorted alphabetically by month:
month category sum
0 feb f NaN
1 jan j 0.5
2 mar m NaN
How can I maintain the row order of the left-hand operand? I.e., how can I get the following output (or similar)?
month category sum
1 jan j 0.5
0 feb f NaN
2 mar m NaN
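For reference, here is a minimal sketch of one way this might be approached. It is not necessarily the idiomatic solution. The idea is to record the desired key order up front (keys of the left frame first, then keys that appear only in the right frame) and to reindex the aligned result by that order. The names `keys`, `order`, and `result` are my own, and the sketch assumes each (`month`, `category`) pair occurs at most once per frame.

```python
import pandas as pd

columns = ["month", "category", "sum"]
df1 = pd.DataFrame.from_records([("jan", "j", 10), ("feb", "f", 20)], columns=columns)
df2 = pd.DataFrame.from_records([("jan", "j", 9.5), ("mar", "m", 30)], columns=columns)

keys = ["month", "category"]

# Desired row order: key pairs from df1 in their original order,
# followed by key pairs that occur only in df2.
# Assumes key pairs are unique within each frame.
order = pd.concat([df1[keys], df2[keys]]).drop_duplicates()

# Subtract on the aligned index (this sorts the keys), then restore the order.
result = (
    df1.set_index(keys)
    .subtract(df2.set_index(keys))
    .reindex(pd.MultiIndex.from_frame(order))
    .reset_index()
)
print(result)
```

With the sample data this should give the rows in the order jan, feb, mar. The row labels are renumbered 0, 1, 2 rather than 1, 0, 2 as in the desired output above.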