Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Search and replace dots and commas in pandas dataframe

This is my DataFrame:

import pandas as pd

d = {'col1': ['sku 1.1', 'sku 1.2', 'sku 1.3'], 'col2': ['9.876.543,21', 654, '321,01']}
df = pd.DataFrame(data=d)
df

       col1           col2
0   sku 1.1   9.876.543,21
1   sku 1.2            654
2   sku 1.3         321,01

The values in col2 are numbers in a local format (dot as thousands separator, comma as decimal separator), which I would like to convert into:

      col2
9876543.21
       654
    321.01

I tried df['col2'] = pd.to_numeric(df['col2'], downcast='float'), which raises ValueError: Unable to parse string "9.876.543,21" at position 0.

I also tried df = df.apply(lambda x: x.str.replace(',', '.')), which results in ValueError: could not convert string to float: '5.023.654.46'

Thanks for your help!


1 Answer


If possible, the best option is to use the thousands and decimal parameters of read_csv:

df = pd.read_csv(file, thousands='.', decimal=',')
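
To see the read_csv route end to end, here is a minimal sketch. The file contents and the semicolon separator are assumptions made up for illustration (the question does not show the source file); an in-memory io.StringIO buffer stands in for file.

```python
import io
import pandas as pd

# Hypothetical file contents: semicolon-separated, because the comma is
# already used as the decimal separator inside the values.
csv_text = (
    "col1;col2\n"
    "sku 1.1;9.876.543,21\n"
    "sku 1.2;654\n"
    "sku 1.3;321,01\n"
)

# thousands='.' drops the grouping dots, decimal=',' treats the comma as
# the decimal point while the numeric column is parsed.
df = pd.read_csv(io.StringIO(csv_text), sep=';', thousands='.', decimal=',')

print(df)
print(df.dtypes)
```

Only columns that parse as numbers are converted, so text such as 'sku 1.1' in col1 should keep its dot.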

If that is not possible, replace should help:

# The dot must be escaped: with regex=True an unescaped '.' matches any character.
df['col2'] = (df['col2'].replace(r'\.', '', regex=True)
                        .replace(',', '.', regex=True)
                        .astype(float))
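
As a variation on the replace approach, the sketch below rebuilds the DataFrame from the question and uses literal string replacement instead of a regular expression. This is a swapped-in technique, not the method given in the answer: it assumes that casting every value to str first is acceptable, which is how the integer 654 is handled alongside the strings.

```python
import pandas as pd

# DataFrame from the question; col2 mixes strings and an integer.
d = {'col1': ['sku 1.1', 'sku 1.2', 'sku 1.3'],
     'col2': ['9.876.543,21', 654, '321,01']}
df = pd.DataFrame(data=d)

df['col2'] = (df['col2'].astype(str)                     # 654 -> '654'
                        .str.replace('.', '', regex=False)   # drop thousands dots
                        .str.replace(',', '.', regex=False)  # decimal comma -> dot
                        .astype(float))

print(df)
```

With regex=False the dot is treated as a plain character, so no escaping is needed, and col1 is untouched because only col2 is reassigned.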
