Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Python: Replace values in a pandas DataFrame with data from another DataFrame

A specific stock has changed its ticker symbol, and I want the new data from the dataframe df_enc (stock symbol CAP.VI) to replace the wrong data in the main dataframe df (stock symbol CAP.DE).

I have already cleaned up the main dataframe (df). Here is my code:

import pandas as pd
import numpy as np
from pandas_datareader import data as web
from datetime import datetime

# Get the stock symbols / tickers in the portfolio
# Assign the weights to the stocks

portfolio_value = 100 
stock_symbols = ['AAPL','GOOG','CAP.DE'] 
portfolio_weights = np.array([40,40,20])
portfolio_weights = (1/portfolio_value)*portfolio_weights

# Get the stock/portfolio starting date
stockStartDate = '2011-01-01'

# Get the stocks ending day (today)
today = datetime.today().strftime('%Y-%m-%d')

# Create a dataframe to store adjusted close price of the stocks
df = pd.DataFrame()

# Store the adjusted close price of the stocks
data_source='yahoo'
df = web.DataReader(name=stock_symbols, data_source=data_source, start=stockStartDate, end=today)['Adj Close']

# Clear false data from CAP.DE
df.loc['2021-01-04':,'CAP.DE'] = np.nan
df_enc = pd.DataFrame()
df_enc = web.DataReader(name = 'CAP.VI', data_source = data_source, start = '2021-01-04', end = today)['Adj Close']

The result from my code is:

Symbols           AAPL         GOOG    CAP.DE
Date                                         
2011-01-03   10.153708   301.046600  1.669983
2011-01-04   10.206702   299.935760  1.669983
2011-01-05   10.290195   303.397797  1.545708
2011-01-06   10.281874   305.604523  1.561246
2011-01-07   10.355506   307.069031  1.561246
               ...          ...       ...
2020-12-31  132.690002  1751.880005       NaN
2021-01-04  129.410004  1728.239990       NaN
2021-01-05  131.009995  1740.920044       NaN
2021-01-06  126.599998  1735.290039       NaN
2021-01-07  130.470001  1779.035034       NaN

The result (df) should look like this:

Symbols           AAPL         GOOG    CAP.DE
Date                                         
2011-01-03   10.153708   301.046600  1.669983
2011-01-04   10.206702   299.935760  1.669983
2011-01-05   10.290195   303.397797  1.545708
2011-01-06   10.281874   305.604523  1.561246
2011-01-07   10.355506   307.069031  1.561246
               ...          ...       ...
2020-12-31  132.690002  1751.880005       NaN
2021-01-04  129.410004  1728.239990   21.049999
2021-01-05  131.009995  1740.920044   21.049999
2021-01-06  126.599998  1735.290039   22.549999
2021-01-07  130.470001  1779.035034   24.299999

Question from: https://stackoverflow.com/questions/65618203/python-panda-dataframe-replace-with-another-dataframe


1 Answer


This is how you replace the nulls with the other data. Since df_enc is a Series indexed by date (only one ticker was requested), fillna can align it with df on the index:

df['CAP.DE'] = df['CAP.DE'].fillna(df_enc)
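
To check the null-filling idea without downloading anything, here is a minimal sketch on a small hand-built frame. The rows are copied from the tables in the question. It assumes df_enc is a pandas Series indexed by date, which is what a single-ticker DataReader call followed by ['Adj Close'] is expected to return.

```python
import numpy as np
import pandas as pd

# A few rows copied from the question's tables (no download needed).
dates = pd.to_datetime([
    "2011-01-07", "2020-12-31",
    "2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07",
])
df = pd.DataFrame(
    {
        "AAPL": [10.355506, 132.690002, 129.410004,
                 131.009995, 126.599998, 130.470001],
        "CAP.DE": [1.561246, np.nan, np.nan, np.nan, np.nan, np.nan],
    },
    index=dates,
)

# Assumption: df_enc is a Series that only covers the new period.
df_enc = pd.Series(
    [21.049999, 21.049999, 22.549999, 24.299999],
    index=dates[2:],
    name="CAP.VI",
)

# fillna with a Series matches on index labels, so the two objects
# do not need the same length; dates absent from df_enc stay NaN.
df["CAP.DE"] = df["CAP.DE"].fillna(df_enc)
print(df)
```

Only the NaN cells that have a matching date in df_enc are filled, so the 2020-12-31 row stays NaN, as in the desired result.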

Or, if you want to replace the data from 2021-01-04 onwards regardless of what is already there:

df.loc['2021-01-04':, 'CAP.DE'] = df_enc
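
The date-based overwrite can be sketched the same way. In this example the -1.0 values are placeholders standing in for the wrong CAP.DE prices (they are not real data), and df_enc is again assumed to be a date-indexed Series. It uses a label slice with .loc, which is one way to do this, not the only one.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime([
    "2011-01-07", "2020-12-31",
    "2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07",
])
# -1.0 is a placeholder for the wrong prices that should be overwritten.
df = pd.DataFrame(
    {"CAP.DE": [1.561246, np.nan, -1.0, -1.0, -1.0, -1.0]},
    index=dates,
)

# Assumption: df_enc is a Series with the prices under the new symbol.
df_enc = pd.Series(
    [21.049999, 21.049999, 22.549999, 24.299999],
    index=dates[2:],
    name="CAP.VI",
)

# Overwrite everything from the cut-over date onwards in one step.
# The assigned Series is matched to the selected rows by date.
df.loc["2021-01-04":, "CAP.DE"] = df_enc
print(df)
```

With this variant there is no need to set the old values to NaN first. As far as I know, any date inside the slice that is missing from df_enc would end up as NaN, so it is worth checking that both frames cover the same trading days.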
