Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
211 views
in Technique[技术] by (71.8m points)

python - Assign values to a pandas dataframe column based on intervals

I have a dataframe (df) with a single column of dates and a second dataframe (df_value) with three columns: a start date ('From'), an end date ('To') and an associated value. I want to create a second column in df with the correct value which has been looked up from df_value:

import pandas as pd
df = pd.DataFrame(['30/03/2018', '01/10/2019','03/07/2020', '05/08/2020', '06/08/2020', '10/10/2020'], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y', dayfirst=True).dt.date

df_value = pd.DataFrame([['01/01/2018','31/12/2018',1.286], ['01/01/2019','30/06/2019',1.555], ['01/07/2019','31/12/2019',1.632], ['01/01/2020','31/12/2020',1.864]], columns =['From', 'To', 'Value'])
df_value['From'] = pd.to_datetime(df_value['From'], format='%d/%m/%Y', dayfirst=True).dt.date
df_value['To'] = pd.to_datetime(df_value['To'], format='%d/%m/%Y', dayfirst=True).dt.date

At the moment I have done this through applying the function below to df row-by-row. Although this works I feel that there must be a far more efficient way of doing this:

def fixed_func(df):    
        value = 0
        row_counter = 0
                    
        while value == 0:
            if (df['Date']>= df_value.iloc[row_counter, 0]) & (df['Date']<= df_value.iloc[row_counter, 1]):
                value = df_value.iloc[row_counter, 2]
            else:
                row_counter += 1
            
        return value

df['Value'] = df.apply(fixed_func, axis=1)
question from:https://stackoverflow.com/questions/65843211/assign-values-to-a-pandas-dataframe-column-based-on-intervals

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You can use numpy.where:

import numpy as np

date_col = df.Date.values[:, np.newaxis]
x,y = np.where((date_col >= df_value.From.values) & 
               (date_col <= df_value.To.values))

df["Value"] = df_value.loc[y,"Value"].reset_index(drop = True)
#        Date  Value
#0 2018-03-30  1.286
#1 2019-10-01  1.632
#2 2020-07-03  1.864
#3 2020-08-05  1.864
#4 2020-08-06  1.864
#5 2020-10-10  1.864

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...