There are two dataframes:
import pandas as pd
df1 = pd.DataFrame({'year':[2000, 2001, 2002], 'city':['NY', 'AL', 'TX'], 'zip':[100, 200, 300]})
df2 = pd.DataFrame({'year':[2000, 2001, 2002], 'city':['NY', 'AL', 'TX'], 'zip':["95-150", "160-220", "190-310"], 'value':[10, 20, 30]})
The main df is df1, and I want to add the 'value' column from df2 to df1 based on a matching year, city, and zip. The problem is that the zip in df2 is given as a range, and I want to attach 'value' only if df1's zip falls within that range. I'm not sure how to do this. I've tried a few things, such as:
# Match indices so that new cols will attach when equal indices
df1 = df1.set_index(['year', 'city'])
df2 = df2.set_index(['year', 'city'])
# Split range of zip into a list
df2['zip'] = df2['zip'].str.split("-")
# Attach 'value' to df1 if df1's zip is at least df2's min zip AND at most df2's max zip
df1['value'] = df2.loc[(df2['zip'].str[0].astype(int) <= df1['zip']) &
(df2['zip'].str[1].astype(int) >= df1['zip']), 'value']
This gives me the following error: ValueError: Can only compare identically-labeled Series objects
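For reference, here is a minimal sketch of the result I'm after, written with a merge instead of index alignment. It assumes every range in df2 has the form "min-max" with integer bounds, that both bounds are inclusive, and that ranges for the same year and city don't overlap. The names `ranges`, `candidates`, `matches`, `result`, `zip_min`, and `zip_max` are only illustrative. I can't say whether this is the best approach, only that it produces the output I want on the sample data.

```python
import pandas as pd

df1 = pd.DataFrame({'year': [2000, 2001, 2002],
                    'city': ['NY', 'AL', 'TX'],
                    'zip': [100, 200, 300]})
df2 = pd.DataFrame({'year': [2000, 2001, 2002],
                    'city': ['NY', 'AL', 'TX'],
                    'zip': ["95-150", "160-220", "190-310"],
                    'value': [10, 20, 30]})

# Split each "min-max" string into two integer columns (assumed format).
bounds = df2['zip'].str.split('-', expand=True).astype(int)
ranges = df2.assign(zip_min=bounds[0], zip_max=bounds[1]).drop(columns='zip')

# Pair every df1 row with the df2 ranges that share its year and city.
candidates = df1.merge(ranges, on=['year', 'city'], how='inner')

# Keep only the pairs where df1's zip lies inside the range (inclusive).
in_range = candidates['zip'].between(candidates['zip_min'], candidates['zip_max'])
matches = candidates.loc[in_range, ['year', 'city', 'zip', 'value']]

# Attach 'value' back onto df1; rows with no matching range get NaN.
result = df1.merge(matches, on=['year', 'city', 'zip'], how='left')
print(result)
```

Because the comparison happens row by row inside one merged frame, the two Series being compared always share the same labels, so the label mismatch behind the ValueError should not arise.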
question from:
https://stackoverflow.com/questions/65907950/attach-column-from-one-dataframe-to-another-if-col-value-from-1-within-range-of