Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Add a column to a dataframe based on another column dealing with multiple occurrences

I have a function, get_sun(date), that calls an API and returns the sunrise and sunset times for a given date, where date is a string in the format "%d/%m/%Y".

I have a dataframe with a column Date containing strings of format "%d/%m/%Y".

        Date        Time    Sky temp (C°)   Ambient temp (C°)
0       01/01/2020  00:00:07    -13.01  8.23
1       01/01/2020  00:01:12    -12.93  8.25
2       01/01/2020  00:02:17    -12.91  8.19
3       01/01/2020  00:03:22    -12.75  8.19
4       01/01/2020  00:04:27    -12.99  8.17
... ... ... ... ...
349074  31/10/2020  23:54:44    8.83    8.53
349075  31/10/2020  23:55:49    8.75    8.49
349076  31/10/2020  23:56:54    8.65    8.47
349077  31/10/2020  23:57:59    8.65    8.45
349078  31/10/2020  23:59:04    8.61    8.43

I want to add 'Sunrise' and 'Sunset' columns to my dataframe, but without using apply. If I use dataframe.Date.apply(), it will iterate over every row. Each date has 3000 rows, so it would be much quicker to call get_sun only once per distinct date.

I would like an output of the form:

        Date        Time    Sky temp (C°)   Ambient temp (C°) Sunrise Sunset
0       01/01/2020  00:00:07    -13.01      8.23             7:58:32    18:21:39
1       01/01/2020  00:01:12    -12.93      8.25             7:58:32    18:21:39
2       01/01/2020  00:02:17    -12.91      8.19             7:58:32    18:21:39
3       01/01/2020  00:03:22    -12.75      8.19             7:58:32    18:21:39
4       01/01/2020  00:04:27    -12.99      8.17             7:58:32    18:21:39

My code is the following:

df['Sunrise'] = ""
df['Sunset'] = ""

for i in tqdm(unique(df.Date.values)):
    (sunrise, sunset) = get_sun(i)
    df[df.Date.apply(lambda x : x==i)]['Sunrise'].apply(lambda x : sunrise)
    df[df.Date.apply(lambda x : x==i)]['Sunset']=sunset

df[df.Date.apply(lambda x : x==i)] is my way of selecting only the rows of my dataframe where the date is equal to i. For these rows I would like to set the sunrise and sunset values in the corresponding columns.

Thank you in advance

question from:https://stackoverflow.com/questions/65904887/add-a-column-to-a-dataframe-based-on-another-column-dealing-with-multiple-occurr


1 Answer


I think you are overcomplicating the definition of your new columns. A single call to pandas.DataFrame.apply should suffice for your needs. There is no need to iterate by hand or to find the unique dates.

Here is a simplified example (dates/sunrise/sunset as integers):

import pandas as pd

# Stand-in for your function: returns (sunrise, sunset) for a date
def get_sunrise(date):
    return (date - 1, date + 1)

# Function passed to pandas.DataFrame.apply(..., axis=1); it is called once per row
def fun(row):
    sunrise, sunset = get_sunrise(row['date'])
    row['sunrise'] = sunrise
    row['sunset'] = sunset
    return row

# Mock example
df = pd.DataFrame({'date': [1, 2, 3, 4, 5, 6]})
df = df.apply(fun, axis=1)
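
Note that apply with axis=1 still calls the function once per row. If each call is expensive (an API request, for example), one possible alternative is to call it once per distinct date and then broadcast the results with Series.map. The following is only a minimal sketch: get_sun here is a hypothetical stand-in that returns placeholder strings and records its calls, and the real function is assumed to return a (sunrise, sunset) tuple as described in the question.

```python
import pandas as pd

calls = []  # records each date passed to get_sun, to show how often it runs

def get_sun(date):
    """Hypothetical stand-in for the API-backed function in the question.

    Assumed to take a "%d/%m/%Y" string and return (sunrise, sunset).
    The returned values are placeholders, not real sun times.
    """
    calls.append(date)
    return ("sunrise for " + date, "sunset for " + date)

df = pd.DataFrame({
    "Date": ["01/01/2020", "01/01/2020", "01/01/2020", "31/10/2020", "31/10/2020"],
    "Time": ["00:00:07", "00:01:12", "00:02:17", "23:57:59", "23:59:04"],
})

# One call per distinct date, cached in a dict keyed by the date string
sun_by_date = {date: get_sun(date) for date in df["Date"].unique()}

# Split the cached tuples into one lookup table per new column
sunrise_by_date = {date: sun[0] for date, sun in sun_by_date.items()}
sunset_by_date = {date: sun[1] for date, sun in sun_by_date.items()}

# Broadcast the cached values onto every row with Series.map
df["Sunrise"] = df["Date"].map(sunrise_by_date)
df["Sunset"] = df["Date"].map(sunset_by_date)

print(len(calls))  # number of get_sun calls: one per distinct date
print(df)
```

With this approach the number of get_sun calls depends on the number of distinct dates rather than the number of rows, which is what the question asks for.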


